Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/03/19 06:44:43 UTC

[jira] [Updated] (LUCENE-4957) Stop IndexWriter from writing broken term vector offset data in 5.0

     [ https://issues.apache.org/jira/browse/LUCENE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4957:
--------------------------------

    Attachment: LUCENE-4957.patch

I started on this today, but then decided to beef up offsets testing in general (LUCENE-4641) and found more issues to fix. So I think we aren't quite there yet.

If we can fix those issues, then I think we just need this patch, plus a generated 4.x index with backwards offsets for TestBackCompat, to ensure codecs can still deal with it.


> Stop IndexWriter from writing broken term vector offset data in 5.0
> -------------------------------------------------------------------
>
>                 Key: LUCENE-4957
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4957
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-4957.patch
>
>
> Today we allow broken offsets in (some analyzers are broken), and only reject them if someone is indexing offsets into the postings lists.
> But we should ban them also when term vectors are enabled. It's time to stop writing this broken data and let broken analyzers be broken.
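
To make "broken offsets" concrete, here is a rough sketch of the kind of sanity check being discussed. It is not the code from the attached patch or from IndexWriter; the class OffsetSanity and the method firstBrokenToken are hypothetical names for illustration only. It assumes each token carries a (startOffset, endOffset) pair and treats a token as broken if its start offset is negative, its end offset is before its start offset, or its start offset goes backwards relative to the previous token.

```java
// Hypothetical illustration only: not Lucene's actual implementation.
final class OffsetSanity {
    private OffsetSanity() {}

    /**
     * Scans token offsets in stream order, where offsets[i] = {startOffset, endOffset}.
     * Returns the index of the first token with broken offsets, or -1 if all are sane.
     */
    static int firstBrokenToken(int[][] offsets) {
        int lastStartOffset = 0;
        for (int i = 0; i < offsets.length; i++) {
            int start = offsets[i][0];
            int end = offsets[i][1];
            // Broken if negative, if end precedes start, or if start goes backwards.
            if (start < 0 || end < start || start < lastStartOffset) {
                return i;
            }
            lastStartOffset = start;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[][] sane = {{0, 5}, {6, 11}, {6, 11}};       // equal start offsets are allowed here
        int[][] backwards = {{0, 5}, {10, 15}, {6, 9}};  // third token starts before the second
        System.out.println(firstBrokenToken(sane));      // prints -1
        System.out.println(firstBrokenToken(backwards)); // prints 2
    }
}
```

Under this sketch, the proposal amounts to applying the same rejection whenever term vector offsets are enabled, not only when offsets are indexed into the postings lists.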



