Posted to java-user@lucene.apache.org by Andreas Sewe <an...@codetrails.com> on 2017/03/29 16:11:12 UTC

5.x to 6.x migration: replacement for Lucene50Codec

Hi,

I am currently attempting a Lucene 5.x to 6.x migration (from 5.2.1 to
6.1, to be precise), and am looking for a replacement for Lucene50Codec:

  indexWriterConfig.setCodec(new Lucene50Codec(Mode.BEST_COMPRESSION));

The org.apache.lucene.codecs.lucene50 package is still there, so what I
am after is probably possible, but unfortunately not obvious.

Any pointers are greatly appreciated.

Best wishes,

Andreas

-- 
Codetrails GmbH
The knowledge transfer company

Robert-Bosch-Str. 7, 64293 Darmstadt
Phone: +49-6151-276-7092
Mobile: +49-170-811-3791
http://www.codetrails.com/

Managing Director: Dr. Marcel Bruch
Handelsregister: Darmstadt HRB 91940


RE: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,
> >>> I should have mentioned that I for compatibility reasons still need to
> >>> be able to read/write indexes created with the old version, i.e., with
> >>> the 5.0 codec.
> >
> > The old codecs are read-only! As said before, you can only specify the
> codec for IndexWriter. That means new segments added to already existing indexes
> will automatically use the new codec. Old segments already in your index will
> stay with the old codec, until they are merged away, in which case they are
> implicitly upgraded.
> 
> Just to be clear: If lucene-backwards-codecs.jar 6.1 is on the classpath
> (and gives me access to Lucene50Codec), can I specify the Lucene50Codec
> in the IndexWriter's IndexWriterConfig and thus get Lucene 6.1 to write
> an index compatible with Lucene 5.0?

No, this does not work: the Lucene50 codec can only READ indexes; writing is not implemented. The Lucene50 codec is just there to READ indexes that still contain 5.x segments. Lucene does this via Codec.forName(), based on the codec name written to the index files.

If you pass an instance of the Lucene50 codec to IndexWriter, you will hit an UnsupportedOperationException at some point.

> > As the Lucene 5 codec is read only, it is impossible to create a new index (or
> modify an existing index) in a way that it will still be readable with Lucene 5.
> As soon as you touch an index with the new codec, it will be mixed codec
> versions and cannot be read with old version. But Lucene 6 will happily
> handle the mixed codec index - it is designed for that use case (default-codec
> indexes will behave the same way). 😊
> 
> That's a cool feature and may work well for my use case. The only thing
> I worry about is how ServiceLoader-based Codec discovery will work in an
> OSGi environment (specifically, an Eclipse plug-in).

Lucene requires ServiceLoader for reading indexes; there is no way around that. The jar files of Lucene must _all_ reside in the same ClassLoader or, alternatively, be visible to the ContextClassLoader. If Lucene's ServiceLoader lookup did not work in your environment, opening a DirectoryReader would fail immediately with an exception reporting that the codec was not found.
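
As a rough illustration of the ContextClassLoader route, here is a minimal sketch that temporarily swaps the thread's context class loader around a piece of work and restores it afterwards. It uses only the JDK; the class and method names (ContextClassLoaderScope, callWith) are made up for this example, and which loader to pass in an OSGi bundle (for instance, the one that loaded the Lucene classes) is an assumption you would need to verify in your own environment.

```java
import java.util.function.Supplier;

/** Hypothetical helper: runs an action with a given thread context class loader. */
public final class ContextClassLoaderScope {

    private ContextClassLoaderScope() {}

    /**
     * Sets the context class loader of the current thread to {@code loader},
     * runs {@code action}, and always restores the previous loader.
     */
    public static <T> T callWith(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            // Any ServiceLoader lookup that consults the context class loader
            // and is triggered inside the action will see 'loader'.
            return action.get();
        } finally {
            // Restore the original loader even if the action throws.
            current.setContextClassLoader(previous);
        }
    }
}
```

The action would wrap the first code path that touches Lucene's codecs (for example, opening the index). Whether this is sufficient in a given OSGi setup depends on how the bundles are wired, so treat it as a starting point rather than a guaranteed fix.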

Uwe




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Andreas Sewe <an...@codetrails.com>.
Hi Uwe,

>>> I should have mentioned that I for compatibility reasons still need to
>>> be able to read/write indexes created with the old version, i.e., with
>>> the 5.0 codec.
> 
> The old codecs are read-only! As said before, you can only specify the codec for IndexWriter. That means new segments added to already existing indexes will automatically use the new codec. Old segments already in your index will stay with the old codec, until they are merged away, in which case they are implicitly upgraded.

Just to be clear: If lucene-backwards-codecs.jar 6.1 is on the classpath
(and gives me access to Lucene50Codec), can I specify the Lucene50Codec
in the IndexWriter's IndexWriterConfig and thus get Lucene 6.1 to write
an index compatible with Lucene 5.0?

> As the Lucene 5 codec is read only, it is impossible to create a new index (or modify an existing index) in a way that it will still be readable with Lucene 5. As soon as you touch an index with the new codec, it will be mixed codec versions and cannot be read with old version. But Lucene 6 will happily handle the mixed codec index - it is designed for that use case (default-codec indexes will behave the same way). 😊

That's a cool feature and may work well for my use case. The only thing
I worry about is how ServiceLoader-based Codec discovery will work in an
OSGi environment (specifically, an Eclipse plug-in).

Best wishes,

Andreas

-- 
Codetrails GmbH
The knowledge transfer company

Robert-Bosch-Str. 7, 64293 Darmstadt
Phone: +49-6151-276-7092
Mobile: +49-170-811-3791
http://www.codetrails.com/

Managing Director: Dr. Marcel Bruch
Handelsregister: Darmstadt HRB 91940


RE: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

> > I should have mentioned that I for compatibility reasons still need to
> > be able to read/write indexes created with the old version, i.e., with
> > the 5.0 codec.

The old codecs are read-only! As said before, you can only specify the codec for IndexWriter. That means new segments added to already existing indexes will automatically use the new codec. Old segments already in your index will stay with the old codec until they are merged away, in which case they are implicitly upgraded.

As the Lucene 5 codec is read-only, it is impossible to create a new index (or modify an existing index) in a way that it will still be readable with Lucene 5. As soon as you touch an index with the new codec, it will contain mixed codec versions and cannot be read with the old version. But Lucene 6 will happily handle the mixed-codec index - it is designed for that use case (default-codec indexes will behave the same way). 😊

Uwe






RE: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

You only need to specify a codec when indexing, so you can just update that line for the migration. The new codec then applies to all new segments written to your index.

To read indexes, Lucene will automatically load the codec based on the names written to the index files. If you want to open 5.x indexes, lucene-backwards-codecs.jar must be on the classpath, as lucene-core.jar does not contain the old codecs. Otherwise you will get an exception.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Andreas Sewe [mailto:andreas.sewe@codetrails.com]
> Sent: Thursday, March 30, 2017 9:17 AM
> To: java-user@lucene.apache.org
> Cc: Adrien Grand <jp...@gmail.com>
> Subject: Re: 5.x to 6.x migration: replacement for Lucene50Codec
> 
> Hi Adrien,
> 
> > If you move to Lucene 6.1, then this should be Lucene60Codec. More
> > generally that would be the same codec that is returned by
> Codec.getDefault.
> 
> I should have mentioned that I for compatibility reasons still need to
> be able to read/write indexes created with the old version, i.e., with
> the 5.0 codec.
> 
> As the org.apache.lucene.codecs.lucene50 package is still around, I
> think that this should be possible; there is just no ready-made Codec
> for me to use.
> 
> I hope this clarifies things.





Re: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Andreas Sewe <an...@codetrails.com>.
Hi Adrien,

> If you move to Lucene 6.1, then this should be Lucene60Codec. More
> generally that would be the same codec that is returned by Codec.getDefault.

I should have mentioned that, for compatibility reasons, I still need to
be able to read/write indexes created with the old version, i.e., with
the 5.0 codec.

As the org.apache.lucene.codecs.lucene50 package is still around, I
think that this should be possible; there is just no ready-made Codec
for me to use.

I hope this clarifies things.

Best wishes,

Andreas

-- 
Codetrails GmbH
The knowledge transfer company

Robert-Bosch-Str. 7, 64293 Darmstadt
Phone: +49-6151-276-7092
Mobile: +49-170-811-3791
http://www.codetrails.com/

Managing Director: Dr. Marcel Bruch
Handelsregister: Darmstadt HRB 91940


Re: 5.x to 6.x migration: replacement for Lucene50Codec

Posted by Adrien Grand <jp...@gmail.com>.
If you move to Lucene 6.1, then this should be Lucene60Codec. More
generally that would be the same codec that is returned by Codec.getDefault.
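
For reference, a minimal sketch of what the replacement might look like on Lucene 6.0/6.1, assuming the Lucene60Codec constructor that takes a stored-fields compression Mode. The Mode enum still lives in the org.apache.lucene.codecs.lucene50 package (on Lucene50StoredFieldsFormat), which would explain why that package is still around. The class and method names here (CodecMigration, bestCompressionCodec, configure) are made up for illustration.

```java
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.Mode;
import org.apache.lucene.codecs.lucene60.Lucene60Codec;
import org.apache.lucene.index.IndexWriterConfig;

/** Hypothetical helper for the 5.x to 6.1 migration discussed in this thread. */
public final class CodecMigration {

    private CodecMigration() {}

    /** The 6.0/6.1 write codec with best-compression stored fields. */
    public static Codec bestCompressionCodec() {
        return new Lucene60Codec(Mode.BEST_COMPRESSION);
    }

    /** Applies the codec to an existing writer config; affects new segments only. */
    public static IndexWriterConfig configure(IndexWriterConfig config) {
        config.setCodec(bestCompressionCodec());
        return config;
    }
}
```

Later 6.x releases ship a newer default codec class, so it is worth checking what Codec.getDefault() returns for the exact version in use.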

On Wed, 29 Mar 2017 at 18:11, Andreas Sewe <an...@codetrails.com> wrote:

> Hi,
>
> I am currently attempting a Lucene 5.x to 6.x migration (from 5.2.1 to
> 6.1, to be precise), and am looking for a replacement for Lucene50Codec:
>
>   indexWriterConfig.setCodec(new Lucene50Codec(Mode.BEST_COMPRESSION));
>
> The org.apache.lucene.codecs.lucene50 package is still there, so what I
> am after is probably possible, but unfortunately not obvious.
>
> Any pointers are greatly appreciated.
>
> Best wishes,
>
> Andreas
>
> --
> Codetrails GmbH
> The knowledge transfer company
>
> Robert-Bosch-Str. 7, 64293 Darmstadt
> Phone: +49-6151-276-7092
> Mobile: +49-170-811-3791
> http://www.codetrails.com/
>
> Managing Director: Dr. Marcel Bruch
> Handelsregister: Darmstadt HRB 91940
>
>