Posted to java-user@lucene.apache.org by Patrick Mi <pa...@touchpoint.co.nz> on 2014/09/23 05:07:58 UTC

How to configure lucene 4.x to read 3.x index files

Hi there,

I understood that Lucene V4 could read 3.x index files by configuring
Lucene3xCodec but what exactly needs to be done here?

I used DEMO code from V4.10.0 to generate v4 index files and could read them
without problem. When I tried to read index files generated from V3 I got
the following errors:

Exception in thread "main" org.apache.lucene.index.CorruptIndexException:
did not read all bytes from file: read 65 vs size 66 (resource:
BufferedChecksumIndexInput(MMapIndexInput(path="C:\indexes\v3\_1os1_5.del")))
	at org.apache.lucene.codecs.CodecUtil.checkEOF(CodecUtil.java:252)
	at org.apache.lucene.codecs.lucene40.BitVector.<init>(BitVector.java:363)
	at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:91)
	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
	at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:913)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)
	at org.apache.lucene.demo.SearchFiles.main(SearchFiles.java:95)

My classpath includes the following jars from V4:
lucene-core-4.10.0.jar
lucene-analyzers-common-4.10.0.jar
lucene-queries-4.10.0.jar
lucene-queryparser-4.10.0.jar
lucene-facet-4.10.0.jar
lucene-expressions-4.10.0.jar

I noticed that META-INF/services/org.apache.lucene.codecs.Codec (part of
lucene-core-4.10.0.jar) contains the following lines:
org.apache.lucene.codecs.lucene40.Lucene40Codec
org.apache.lucene.codecs.lucene3x.Lucene3xCodec
org.apache.lucene.codecs.lucene41.Lucene41Codec
org.apache.lucene.codecs.lucene42.Lucene42Codec
org.apache.lucene.codecs.lucene45.Lucene45Codec
org.apache.lucene.codecs.lucene46.Lucene46Codec
org.apache.lucene.codecs.lucene49.Lucene49Codec
org.apache.lucene.codecs.lucene410.Lucene410Codec

Does that mean Lucene3xCodec will be picked up automatically based on the
index files themselves?

Is there an API to force the code to use the V3 setting? IndexReader and
IndexSearcher don't seem to have anywhere I can pass that in.

I searched but couldn't find any useful resources covering this. I would
much appreciate it if someone could point me in the right direction.

Regards,
Patrick

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to configure lucene 4.x to read 3.x index files

Posted by Robert Muir <rc...@gmail.com>.
I opened an issue with a patch for this:

https://issues.apache.org/jira/browse/LUCENE-5975

Thanks for reporting it!


Re: How to configure lucene 4.x to read 3.x index files

Posted by Michael McCandless <lu...@mikemccandless.com>.
Hi Patrick,

4.10.1 will fix this, so you can read your 3.x indices again.  See
https://issues.apache.org/jira/browse/LUCENE-5975 for details...

Mike McCandless

http://blog.mikemccandless.com



Re: How to configure lucene 4.x to read 3.x index files

Posted by Robert Muir <rc...@gmail.com>.
As reported in the issue, since 4.8 we do better checks when reading
this stuff in.

Unfortunately, 3.0-3.3 indexes had bugs in the way they encode the
deleted documents.

So for those indexes, we have to ignore the trailing garbage at the
end of the file.
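
As a rough illustration of the kind of check involved (a plain-Java sketch, not the actual Lucene code; the class and method names here are hypothetical), a strict end-of-file check rejects a file whose size exceeds the bytes consumed, while a lenient mode for known-buggy legacy files lets the extra bytes through:

```java
import java.io.IOException;

// Illustrative sketch only: these names are hypothetical, not Lucene's API.
final class TrailingBytesCheck {

    /**
     * Verifies that a reader consumed a whole file.
     *
     * @param bytesRead             number of bytes the reader consumed
     * @param fileLength            total size of the file in bytes
     * @param tolerateTrailingBytes true for legacy files known to carry extra bytes
     */
    static void checkFullyRead(long bytesRead, long fileLength, boolean tolerateTrailingBytes)
            throws IOException {
        if (bytesRead > fileLength) {
            throw new IOException(
                "read past end of file: read " + bytesRead + " vs size " + fileLength);
        }
        if (bytesRead < fileLength && !tolerateTrailingBytes) {
            throw new IOException(
                "did not read all bytes from file: read " + bytesRead + " vs size " + fileLength);
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean isAccepted(long bytesRead, long fileLength, boolean tolerateTrailingBytes) {
        try {
            checkFullyRead(bytesRead, fileLength, tolerateTrailingBytes);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The numbers from the reported exception: 65 bytes read, file size 66.
        System.out.println(isAccepted(65, 66, false)); // strict check: false
        System.out.println(isAccepted(65, 66, true));  // lenient check: true
        System.out.println(isAccepted(66, 66, false)); // fully read: true
    }
}
```

The lenient flag would only be set for files written by the affected 3.0-3.3 versions, so newer files still get the strict check.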


RE: How to configure lucene 4.x to read 3.x index files

Posted by Patrick Mi <pa...@touchpoint.co.nz>.
Hi Robert/Uwe,

I have tried V4.8 and V4.9; neither works.

V4.7.0, V4.7.1 and V4.7.2 all work.

Regards,
Patrick


RE: How to configure lucene 4.x to read 3.x index files

Posted by Patrick Mi <pa...@touchpoint.co.nz>.
Hi Robert/Uwe,

Thanks very much for the quick response.

I have tried again with a different index (28k documents), also generated
from V3, and that worked.

But the index I originally tried (30k documents) works in V3 but not in V4.10.
Maybe something in that index causes a problem in V4 but not in V3.

I have also tried an earlier version, V4.7, as Uwe suggested, and V4.7
can open the V3 index that V4.10 failed to open.

Regards,

Patrick




Re: How to configure lucene 4.x to read 3.x index files

Posted by Robert Muir <rc...@gmail.com>.
You should not have to configure anything.

The exception should not happen: can I have this index to debug the issue?


Re: How to configure lucene 4.x to read 3.x index files

Posted by Uwe Schindler <uw...@thetaphi.de>.
Yes, it can read 3.x index files without extra configuration. You cannot force a codec; it is selected automatically.
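
To illustrate the idea of automatic selection: the reader resolves the codec name associated with each segment against the codecs registered through the META-INF/services file (in Lucene this lookup is done by Codec.forName). Below is a toy model of such a name-based lookup in plain Java. It is not Lucene code; the FormatReader interface, the registry map and the description strings are assumptions made up for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: FormatReader and REGISTRY are hypothetical stand-ins.
final class CodecLookupSketch {

    /** Hypothetical stand-in for a codec implementation. */
    interface FormatReader {
        String describe();
    }

    // Stand-in for the entries discovered from the services file.
    private static final Map<String, FormatReader> REGISTRY = new LinkedHashMap<>();
    static {
        REGISTRY.put("Lucene3x", () -> "3.x segment format");
        REGISTRY.put("Lucene410", () -> "4.10 segment format");
    }

    /** Resolves a reader by the name stored with the segment. */
    static FormatReader forName(String name) {
        FormatReader reader = REGISTRY.get(name);
        if (reader == null) {
            throw new IllegalArgumentException("no codec registered under name '" + name + "'");
        }
        return reader;
    }

    public static void main(String[] args) {
        // The caller never chooses a codec; the name comes from the index itself.
        for (String segmentCodecName : new String[] {"Lucene3x", "Lucene410"}) {
            System.out.println(segmentCodecName + " -> " + forName(segmentCodecName).describe());
        }
    }
}
```

Because the lookup is driven by what the index records, an index that mixes old and new segments can be opened with the same call.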

Unfortunately, Lucene 4.10 has some problems, which will be fixed with a bugfix release soon. Those bugs can lead to index corruption.

Maybe try 4.9.1 first.

Are you sure the 3.x index is ok?

Uwe


--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de