Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/09/24 01:16:34 UTC
[jira] [Updated] (LUCENE-5975) Lucene can't read 3.0-3.3 deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-5975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-5975:
--------------------------------
Attachment: LUCENE-5975.patch
Patch. The fix is basically a one-liner:
{code}
  if (version >= VERSION_CHECKSUM) {
    CodecUtil.checkFooter(input);
- } else {
+ } else if (version >= VERSION_DGAPS_CLEARED) {
    CodecUtil.checkEOF(input);
- }
+ } // otherwise, before this we cannot even check that we read the entire file due to bugs in those versions!!!!
  assert verifyCount();
{code}
The patch is huge because the test includes all unique released versions of BitVector.java from 3.x.
I think this is fine since it only applies to the 4.10 branch anyway; we don't have to carry this crap in trunk or 5.x.
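For readers who want to see the gating logic in isolation, here is a rough standalone sketch of which end-of-file check applies to which format version. It is not the actual BitVector code: the class name, the enum, and the helper methods are hypothetical, the numeric values of the two constants are placeholders (only their ordering matters), and the footer check is not modeled at all.

```java
// Standalone sketch, not Lucene code. All names here are illustrative assumptions
// except the two constant names, which are borrowed from the patch above.
class TrailingBytesSketch {
    // Placeholder values: only VERSION_DGAPS_CLEARED < VERSION_CHECKSUM matters here.
    static final int VERSION_DGAPS_CLEARED = 1;
    static final int VERSION_CHECKSUM = 2;

    // Which end-of-file verification a given format version gets.
    enum EndCheck { FOOTER, EOF, NONE }

    // Mirrors the if / else-if chain in the patch.
    static EndCheck endCheckFor(int version) {
        if (version >= VERSION_CHECKSUM) {
            return EndCheck.FOOTER;   // newer files: verify the footer
        } else if (version >= VERSION_DGAPS_CLEARED) {
            return EndCheck.EOF;      // middle versions: require that every byte was consumed
        }
        return EndCheck.NONE;         // oldest versions: may carry bogus trailing bytes, so skip
    }

    // True if a file of fileLength bytes, of which bytesRead were consumed,
    // would get past the end check for this version (footer check not modeled).
    static boolean passesEndCheck(int version, long bytesRead, long fileLength) {
        if (endCheckFor(version) == EndCheck.EOF) {
            return bytesRead == fileLength;
        }
        return true;
    }

    public static void main(String[] args) {
        // An old-format file with one trailing junk byte (read 5000 vs 5001).
        System.out.println(passesEndCheck(0, 5000, 5001));
        // The same mismatch in a version where the EOF check is enforced.
        System.out.println(passesEndCheck(VERSION_DGAPS_CLEARED, 5000, 5001));
    }
}
```

The point of the sketch is just that the oldest versions fall through both branches, so a trailing byte no longer turns into an exception for them.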
> Lucene can't read 3.0-3.3 deleted documents
> -------------------------------------------
>
> Key: LUCENE-5975
> URL: https://issues.apache.org/jira/browse/LUCENE-5975
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Priority: Blocker
> Attachments: LUCENE-5975.patch
>
>
> BitVector before Lucene 3.4 had many bugs, particularly that it wrote extra bogus trailing crap at the end.
> But since Lucene 4.8, we check that we read all the bytes... this check can fail for 3.0-3.3 indexes due to the previous bugs in those indexes, so users will get an exception on open like this: CorruptIndexException(did not read all bytes from file: read 5000 vs 5001....
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)