You are viewing a plain text version of this content. The canonical link for it is here.

Posted to java-user@lucene.apache.org by "Rob Staveley (Tom)" <rs...@seseit.com> on 2009/12/09 13:13:20 UTC

Index file compatibility and a migration plan to lucene 3

I have Lucene 2.3.1 code and indexes deployed in production in a distributed
system and would like to bring everything up to date with 3.0.0 via 2.9.1.

Here's my migration plan:

1. Add a index writer which generates a 2.9.1 "test" index 
2. Have that "test" index writer push that 2.9.1 "test" index into
production and see if the distributed 2.3.1 index readers can cope with it
OK
3. Upgrade my index writers to 2.9.1 (still using evil compressed fields) -
we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if it
works.
4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
but with support for explicit use of CompressionTools for decompression,
where fields have been explicitly compressed with CompressionTools - the
application will knows which need decompression)
5. Add a CompressionTools to my "test" index writer, generating explicitly
compressed fields in the 2.9.1 "test" index
6. Test explicit decompression for relevant fields with CompressionTools in
my 2.9.1 "test" index in my index readers
7. Upgrade my "test" index writer to 3.0.0 
8. Have that "test" index writer push that 3.0.0 "test" index into
production and see if the distributed 2.9.1 index readers can cope with it
OK
9. Go through my index writers and index reader clients and systematically
purge all of the Field.Store.COMPRESS fields and migrate to an explicit
CompressionTools approach where applicable and no compression where
applicable. During this phase I'll expect to have
CompressionTools-compressed fields coexisting with their
Field.Store.COMPRESS predecessors, where index reader client use of
Field.Store.COMPRESS is in transit to the explicit decompression approach.
10. Upgrade my index writers to 3.0.0 
11. Upgrade my index readers to 3.0.0

I've simplified this a bit, because I shan't really be testing straight off
in production(!) - I'll test the migration plan in a test cluster first; but
this gives you the idea about the path.

I wanted to know if I should expect problems with this plan. I'm depending
on newer writers generating indexes for older readers and 3 is a major
number upgrade. It looks like I can get away with this in version 3, but
that's by no means a guarantee according to
http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats

Does this sound like a good plan?


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

RE: Index file compatibility and a migration plan to lucene 3

Posted by Uwe Schindler <uw...@thetaphi.de>.

Exactly.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Nigel [mailto:nigelspleen@gmail.com]
> Sent: Friday, December 11, 2009 2:56 AM
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
> 
> I have a follow-up question to this thread on Field.Store.COMPRESS in
> 2.9.1
> and beyond.  I'm getting a bit confused between the changes in 2.9.1 and
> 3.0
> so I want to make sure I know what's going on.  We also use old-style
> compressed fields and are about to upgrade to 2.9.1.
> 
> Is the following accurate?
> 
> 1) Indexes created in 2.4 with compressed fields can be read by 2.9.1.
> New
> docs can be added in 2.9.1 using compressed fields, if you don't mind the
> deprecation warnings.  Merges and optimizes done in 2.9.1 will preserve
> the
> compressed fields in the same format.
> 
> 2) Indexes created in 2.x with compressed fields can be read by 3.0.  New
> docs cannot by added in 3.0 using compressed fields, since that
> functionality has been removed (use CompressionTools instead).  The first
> time a merge or optimize is performed on an old-format index in 3.0, the
> compressed fields will all be uncompressed (i.e. converted to the new
> format).
> 
> Did I get that right?
> 
> Thanks,
> Chris


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Nigel <ni...@gmail.com>.

I have a follow-up question to this thread on Field.Store.COMPRESS in 2.9.1
and beyond.  I'm getting a bit confused between the changes in 2.9.1 and 3.0
so I want to make sure I know what's going on.  We also use old-style
compressed fields and are about to upgrade to 2.9.1.

Is the following accurate?

1) Indexes created in 2.4 with compressed fields can be read by 2.9.1.  New
docs can be added in 2.9.1 using compressed fields, if you don't mind the
deprecation warnings.  Merges and optimizes done in 2.9.1 will preserve the
compressed fields in the same format.

2) Indexes created in 2.x with compressed fields can be read by 3.0.  New
docs cannot by added in 3.0 using compressed fields, since that
functionality has been removed (use CompressionTools instead).  The first
time a merge or optimize is performed on an old-format index in 3.0, the
compressed fields will all be uncompressed (i.e. converted to the new
format).

Did I get that right?

Thanks,
Chris

RE: Index file compatibility and a migration plan to lucene 3

Posted by "Rob Staveley (Tom)" <rs...@seseit.com>.

> Don't you have a playground to properly test your changes

Yes, I'll be doing a practice run in a DEV cluster. It is the practice run that I'm planning at this point. 

Many thanks for your pointers, Danil. 

-----Original Message-----
From: Danil ŢORIN [mailto:torindan@gmail.com] 
Sent: 09 December 2009 16:03
To: java-user@lucene.apache.org
Subject: Re: Index file compatibility and a migration plan to lucene 3

There are a LOT of deprecated stuff in 2.9.1 (but it's still there)
and your code should run as it is
(however there are some changes in behavior, so read carefully CHANGES.txt)

In 3.0 this old stuff is removed.
Your production readers may not even start (which I guess is more
painful than 2 step transition)

The safest way is to upgrade to 2.9.1, fix ALL deprecation warnings,
and only then move on to 3.0.

Don't you have a playground to properly test your changes before
deploying to production ?

On Wed, Dec 9, 2009 at 17:55, Rob Staveley (Tom) <rs...@seseit.com> wrote:
> COMPRESS is supported (only deprecated) in 2.9.1, so I'm expecting them to be supported http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/document/Field.Store.html#COMPRESS
>
> I guess I should expect optimize() to increase the size of the index as compressed fields are expanded as it says in item 2 at http://lucene.apache.org/java/3_0_0/changes/Changes.html#3.0.0.changes_in_runtime_behavior
>
> That must mean that 3.0.0 can read 2.9.1 indexes, but I'm wondering if I shouldn't simply upgrade the readers straight from 2.3.1 to 3.0.0 now with an immediate optimize handling the conversion. Can I safely assume that 3.0.0 is able to read 2.3.1?
>
> Making code changes to the readers in production is tricky in my infrastructure and making one transition rather than two is very desirable.
>
> -----Original Message-----
> From: Danil ŢORIN [mailto:torindan@gmail.com]
> Sent: 09 December 2009 15:15
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
>
> 2nd point can be simply archived by an optimize (which will read old
> segments and will create a new one)
> But I'm not sure how it handles compressed fields.
>
>
> On Wed, Dec 9, 2009 at 16:50, Rob Staveley (Tom) <rs...@seseit.com> wrote:
>> Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting rather than reindexing everything, which will save a lot of time.
>>
>> So, I should do this:
>>
>> 1. Convert readers to 2.9.1, which should be able to read any 2.x index including the existing 2.3.1 indexes
>> 2. Convert writers to 2.9.1, using Weiwei's idea (converting the index with a 2.9.1 reader+writer conversion utility) to save some time.
>> 3. Have the writers push converted indexes to the readers using the existing production infrastructure
>> 4. Like (9.) in my original plan. [Go through my index writers and index reader clients and systematically purge all of the Field.Store.COMPRESS fields and migrate to an explicit CompressionTools approach where applicable and no compression where applicable. During this phase I'll expect to have CompressionTools-compressed fields coexisting with their Field.Store.COMPRESS predecessors, where index reader client use of Field.Store.COMPRESS is in transit to the explicit decompression approach.]
>> 5. Convert the readers to 3.0.0, which should be able to read 2.9.1, if there are no compressed fields (??)
>> 6. Convert the writers to 3.0.0
>>
>>
>>
>> -----Original Message-----
>> From: Danil ŢORIN [mailto:torindan@gmail.com]
>> Sent: 09 December 2009 13:20
>> To: java-user@lucene.apache.org
>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>
>> You NEED to update your readers first, or else they will be unable to
>> read files created by newer version.
>> And trust me, there are changes in index format from 2.3 -> 2.9
>>
>> On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
>>> Hi, Rob,
>>> I read
>>> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
>>> found no compatibility guarantee for IndexWriter between different version.
>>>
>>> You can run your idea as a test and see the output.
>>> If it doesn't work, i suggest you convert your index to new version as I
>>> said in the last post.
>>>
>>> You can develop a convert tool to do this job automatically(that what i have
>>> done).
>>>
>>> If you do not have full access to the data center, you can read(readonly
>>> mode is preferred) from the data center(through nfs or something like that)
>>> and write to your local disk.
>>>
>>> When all converting is done, you can copy the new index to the data center
>>> with the help of the administrator.
>>>
>>> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>>>
>>>> Thanks for the swift response, Weiwei.
>>>>
>>>> In my deployment, my index readers are in a data centre and therefore more
>>>> difficult to upgrade than the writers. That's why I wanted to start with the
>>>> writers rather than the readers. I realise that it looks the wrong way round
>>>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>>>> situations, but I wanted to know if writer first would work for me for 2.3.1
>>>> -> 3.0.0.
>>>>
>>>> -----Original Message-----
>>>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>>>> Sent: 09 December 2009 12:21
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>>>
>>>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>>>
>>>> What I do is like this:
>>>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>>>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>>>> 3.0.0 IndexWriter to write all the documents into a new index
>>>> 3. Update QueryPaser to 3.0.0
>>>>
>>>> I've redeployed my system and it works fine now.
>>>>
>>>>
>>>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>>>> >wrote:
>>>>
>>>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>>>> > distributed
>>>> > system and would like to bring everything up to date with 3.0.0 via
>>>> 2.9.1.
>>>> >
>>>> > Here's my migration plan:
>>>> >
>>>> > 1. Add a index writer which generates a 2.9.1 "test" index
>>>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>>>> > production and see if the distributed 2.3.1 index readers can cope with
>>>> it
>>>> > OK
>>>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>>>> -
>>>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>>>> it
>>>> > works.
>>>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>>>> > but with support for explicit use of CompressionTools for decompression,
>>>> > where fields have been explicitly compressed with CompressionTools - the
>>>> > application will knows which need decompression)
>>>> > 5. Add a CompressionTools to my "test" index writer, generating
>>>> explicitly
>>>> > compressed fields in the 2.9.1 "test" index
>>>> > 6. Test explicit decompression for relevant fields with CompressionTools
>>>> in
>>>> > my 2.9.1 "test" index in my index readers
>>>> > 7. Upgrade my "test" index writer to 3.0.0
>>>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>>>> > production and see if the distributed 2.9.1 index readers can cope with
>>>> it
>>>> > OK
>>>> > 9. Go through my index writers and index reader clients and
>>>> systematically
>>>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>>>> > CompressionTools approach where applicable and no compression where
>>>> > applicable. During this phase I'll expect to have
>>>> > CompressionTools-compressed fields coexisting with their
>>>> > Field.Store.COMPRESS predecessors, where index reader client use of
>>>> > Field.Store.COMPRESS is in transit to the explicit decompression
>>>> approach.
>>>> > 10. Upgrade my index writers to 3.0.0
>>>> > 11. Upgrade my index readers to 3.0.0
>>>> >
>>>> > I've simplified this a bit, because I shan't really be testing straight
>>>> off
>>>> > in production(!) - I'll test the migration plan in a test cluster first;
>>>> > but
>>>> > this gives you the idea about the path.
>>>> >
>>>> > I wanted to know if I should expect problems with this plan. I'm
>>>> depending
>>>> > on newer writers generating indexes for older readers and 3 is a major
>>>> > number upgrade. It looks like I can get away with this in version 3, but
>>>> > that's by no means a guarantee according to
>>>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>>>> >
>>>> > Does this sound like a good plan?
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>> >
>>>> >
>>>>
>>>>
>>>> --
>>>> Weiwei Wang
>>>> Alex Wang
>>>> 王巍巍
>>>> Room 403, Mengmin Wei Building
>>>> Computer Science Department
>>>> Gulou Campus of Nanjing University
>>>> Nanjing, P.R.China, 210093
>>>>
>>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>> --
>>> Weiwei Wang
>>> Alex Wang
>>> 王巍巍
>>> Room 403, Mengmin Wei Building
>>> Computer Science Department
>>> Gulou Campus of Nanjing University
>>> Nanjing, P.R.China, 210093
>>>
>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Danil ŢORIN <to...@gmail.com>.

There are a LOT of deprecated stuff in 2.9.1 (but it's still there)
and your code should run as it is
(however there are some changes in behavior, so read carefully CHANGES.txt)

In 3.0 this old stuff is removed.
Your production readers may not even start (which I guess is more
painful than 2 step transition)

The safest way is to upgrade to 2.9.1, fix ALL deprecation warnings,
and only then move on to 3.0.

Don't you have a playground to properly test your changes before
deploying to production ?

On Wed, Dec 9, 2009 at 17:55, Rob Staveley (Tom) <rs...@seseit.com> wrote:
> COMPRESS is supported (only deprecated) in 2.9.1, so I'm expecting them to be supported http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/document/Field.Store.html#COMPRESS
>
> I guess I should expect optimize() to increase the size of the index as compressed fields are expanded as it says in item 2 at http://lucene.apache.org/java/3_0_0/changes/Changes.html#3.0.0.changes_in_runtime_behavior
>
> That must mean that 3.0.0 can read 2.9.1 indexes, but I'm wondering if I shouldn't simply upgrade the readers straight from 2.3.1 to 3.0.0 now with an immediate optimize handling the conversion. Can I safely assume that 3.0.0 is able to read 2.3.1?
>
> Making code changes to the readers in production is tricky in my infrastructure and making one transition rather than two is very desirable.
>
> -----Original Message-----
> From: Danil ŢORIN [mailto:torindan@gmail.com]
> Sent: 09 December 2009 15:15
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
>
> 2nd point can be simply archived by an optimize (which will read old
> segments and will create a new one)
> But I'm not sure how it handles compressed fields.
>
>
> On Wed, Dec 9, 2009 at 16:50, Rob Staveley (Tom) <rs...@seseit.com> wrote:
>> Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting rather than reindexing everything, which will save a lot of time.
>>
>> So, I should do this:
>>
>> 1. Convert readers to 2.9.1, which should be able to read any 2.x index including the existing 2.3.1 indexes
>> 2. Convert writers to 2.9.1, using Weiwei's idea (converting the index with a 2.9.1 reader+writer conversion utility) to save some time.
>> 3. Have the writers push converted indexes to the readers using the existing production infrastructure
>> 4. Like (9.) in my original plan. [Go through my index writers and index reader clients and systematically purge all of the Field.Store.COMPRESS fields and migrate to an explicit CompressionTools approach where applicable and no compression where applicable. During this phase I'll expect to have CompressionTools-compressed fields coexisting with their Field.Store.COMPRESS predecessors, where index reader client use of Field.Store.COMPRESS is in transit to the explicit decompression approach.]
>> 5. Convert the readers to 3.0.0, which should be able to read 2.9.1, if there are no compressed fields (??)
>> 6. Convert the writers to 3.0.0
>>
>>
>>
>> -----Original Message-----
>> From: Danil ŢORIN [mailto:torindan@gmail.com]
>> Sent: 09 December 2009 13:20
>> To: java-user@lucene.apache.org
>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>
>> You NEED to update your readers first, or else they will be unable to
>> read files created by newer version.
>> And trust me, there are changes in index format from 2.3 -> 2.9
>>
>> On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
>>> Hi, Rob,
>>> I read
>>> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
>>> found no compatibility guarantee for IndexWriter between different version.
>>>
>>> You can run your idea as a test and see the output.
>>> If it doesn't work, i suggest you convert your index to new version as I
>>> said in the last post.
>>>
>>> You can develop a convert tool to do this job automatically(that what i have
>>> done).
>>>
>>> If you do not have full access to the data center, you can read(readonly
>>> mode is preferred) from the data center(through nfs or something like that)
>>> and write to your local disk.
>>>
>>> When all converting is done, you can copy the new index to the data center
>>> with the help of the administrator.
>>>
>>> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>>>
>>>> Thanks for the swift response, Weiwei.
>>>>
>>>> In my deployment, my index readers are in a data centre and therefore more
>>>> difficult to upgrade than the writers. That's why I wanted to start with the
>>>> writers rather than the readers. I realise that it looks the wrong way round
>>>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>>>> situations, but I wanted to know if writer first would work for me for 2.3.1
>>>> -> 3.0.0.
>>>>
>>>> -----Original Message-----
>>>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>>>> Sent: 09 December 2009 12:21
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>>>
>>>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>>>
>>>> What I do is like this:
>>>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>>>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>>>> 3.0.0 IndexWriter to write all the documents into a new index
>>>> 3. Update QueryPaser to 3.0.0
>>>>
>>>> I've redeployed my system and it works fine now.
>>>>
>>>>
>>>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>>>> >wrote:
>>>>
>>>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>>>> > distributed
>>>> > system and would like to bring everything up to date with 3.0.0 via
>>>> 2.9.1.
>>>> >
>>>> > Here's my migration plan:
>>>> >
>>>> > 1. Add a index writer which generates a 2.9.1 "test" index
>>>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>>>> > production and see if the distributed 2.3.1 index readers can cope with
>>>> it
>>>> > OK
>>>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>>>> -
>>>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>>>> it
>>>> > works.
>>>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>>>> > but with support for explicit use of CompressionTools for decompression,
>>>> > where fields have been explicitly compressed with CompressionTools - the
>>>> > application will knows which need decompression)
>>>> > 5. Add a CompressionTools to my "test" index writer, generating
>>>> explicitly
>>>> > compressed fields in the 2.9.1 "test" index
>>>> > 6. Test explicit decompression for relevant fields with CompressionTools
>>>> in
>>>> > my 2.9.1 "test" index in my index readers
>>>> > 7. Upgrade my "test" index writer to 3.0.0
>>>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>>>> > production and see if the distributed 2.9.1 index readers can cope with
>>>> it
>>>> > OK
>>>> > 9. Go through my index writers and index reader clients and
>>>> systematically
>>>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>>>> > CompressionTools approach where applicable and no compression where
>>>> > applicable. During this phase I'll expect to have
>>>> > CompressionTools-compressed fields coexisting with their
>>>> > Field.Store.COMPRESS predecessors, where index reader client use of
>>>> > Field.Store.COMPRESS is in transit to the explicit decompression
>>>> approach.
>>>> > 10. Upgrade my index writers to 3.0.0
>>>> > 11. Upgrade my index readers to 3.0.0
>>>> >
>>>> > I've simplified this a bit, because I shan't really be testing straight
>>>> off
>>>> > in production(!) - I'll test the migration plan in a test cluster first;
>>>> > but
>>>> > this gives you the idea about the path.
>>>> >
>>>> > I wanted to know if I should expect problems with this plan. I'm
>>>> depending
>>>> > on newer writers generating indexes for older readers and 3 is a major
>>>> > number upgrade. It looks like I can get away with this in version 3, but
>>>> > that's by no means a guarantee according to
>>>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>>>> >
>>>> > Does this sound like a good plan?
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>> >
>>>> >
>>>>
>>>>
>>>> --
>>>> Weiwei Wang
>>>> Alex Wang
>>>> 王巍巍
>>>> Room 403, Mengmin Wei Building
>>>> Computer Science Department
>>>> Gulou Campus of Nanjing University
>>>> Nanjing, P.R.China, 210093
>>>>
>>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>> --
>>> Weiwei Wang
>>> Alex Wang
>>> 王巍巍
>>> Room 403, Mengmin Wei Building
>>> Computer Science Department
>>> Gulou Campus of Nanjing University
>>> Nanjing, P.R.China, 210093
>>>
>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

RE: Index file compatibility and a migration plan to lucene 3

Posted by "Rob Staveley (Tom)" <rs...@seseit.com>.

COMPRESS is supported (only deprecated) in 2.9.1, so I'm expecting them to be supported http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/document/Field.Store.html#COMPRESS

I guess I should expect optimize() to increase the size of the index as compressed fields are expanded as it says in item 2 at http://lucene.apache.org/java/3_0_0/changes/Changes.html#3.0.0.changes_in_runtime_behavior 

That must mean that 3.0.0 can read 2.9.1 indexes, but I'm wondering if I shouldn't simply upgrade the readers straight from 2.3.1 to 3.0.0 now with an immediate optimize handling the conversion. Can I safely assume that 3.0.0 is able to read 2.3.1?

Making code changes to the readers in production is tricky in my infrastructure and making one transition rather than two is very desirable.

-----Original Message-----
From: Danil ŢORIN [mailto:torindan@gmail.com] 
Sent: 09 December 2009 15:15
To: java-user@lucene.apache.org
Subject: Re: Index file compatibility and a migration plan to lucene 3

2nd point can be simply archived by an optimize (which will read old
segments and will create a new one)
But I'm not sure how it handles compressed fields.


On Wed, Dec 9, 2009 at 16:50, Rob Staveley (Tom) <rs...@seseit.com> wrote:
> Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting rather than reindexing everything, which will save a lot of time.
>
> So, I should do this:
>
> 1. Convert readers to 2.9.1, which should be able to read any 2.x index including the existing 2.3.1 indexes
> 2. Convert writers to 2.9.1, using Weiwei's idea (converting the index with a 2.9.1 reader+writer conversion utility) to save some time.
> 3. Have the writers push converted indexes to the readers using the existing production infrastructure
> 4. Like (9.) in my original plan. [Go through my index writers and index reader clients and systematically purge all of the Field.Store.COMPRESS fields and migrate to an explicit CompressionTools approach where applicable and no compression where applicable. During this phase I'll expect to have CompressionTools-compressed fields coexisting with their Field.Store.COMPRESS predecessors, where index reader client use of Field.Store.COMPRESS is in transit to the explicit decompression approach.]
> 5. Convert the readers to 3.0.0, which should be able to read 2.9.1, if there are no compressed fields (??)
> 6. Convert the writers to 3.0.0
>
>
>
> -----Original Message-----
> From: Danil ŢORIN [mailto:torindan@gmail.com]
> Sent: 09 December 2009 13:20
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
>
> You NEED to update your readers first, or else they will be unable to
> read files created by newer version.
> And trust me, there are changes in index format from 2.3 -> 2.9
>
> On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
>> Hi, Rob,
>> I read
>> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
>> found no compatibility guarantee for IndexWriter between different version.
>>
>> You can run your idea as a test and see the output.
>> If it doesn't work, i suggest you convert your index to new version as I
>> said in the last post.
>>
>> You can develop a convert tool to do this job automatically(that what i have
>> done).
>>
>> If you do not have full access to the data center, you can read(readonly
>> mode is preferred) from the data center(through nfs or something like that)
>> and write to your local disk.
>>
>> When all converting is done, you can copy the new index to the data center
>> with the help of the administrator.
>>
>> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>>
>>> Thanks for the swift response, Weiwei.
>>>
>>> In my deployment, my index readers are in a data centre and therefore more
>>> difficult to upgrade than the writers. That's why I wanted to start with the
>>> writers rather than the readers. I realise that it looks the wrong way round
>>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>>> situations, but I wanted to know if writer first would work for me for 2.3.1
>>> -> 3.0.0.
>>>
>>> -----Original Message-----
>>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>>> Sent: 09 December 2009 12:21
>>> To: java-user@lucene.apache.org
>>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>>
>>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>>
>>> What I do is like this:
>>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>>> 3.0.0 IndexWriter to write all the documents into a new index
>>> 3. Update QueryPaser to 3.0.0
>>>
>>> I've redeployed my system and it works fine now.
>>>
>>>
>>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>>> >wrote:
>>>
>>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>>> > distributed
>>> > system and would like to bring everything up to date with 3.0.0 via
>>> 2.9.1.
>>> >
>>> > Here's my migration plan:
>>> >
>>> > 1. Add a index writer which generates a 2.9.1 "test" index
>>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>>> > production and see if the distributed 2.3.1 index readers can cope with
>>> it
>>> > OK
>>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>>> -
>>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>>> it
>>> > works.
>>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>>> > but with support for explicit use of CompressionTools for decompression,
>>> > where fields have been explicitly compressed with CompressionTools - the
>>> > application will knows which need decompression)
>>> > 5. Add a CompressionTools to my "test" index writer, generating
>>> explicitly
>>> > compressed fields in the 2.9.1 "test" index
>>> > 6. Test explicit decompression for relevant fields with CompressionTools
>>> in
>>> > my 2.9.1 "test" index in my index readers
>>> > 7. Upgrade my "test" index writer to 3.0.0
>>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>>> > production and see if the distributed 2.9.1 index readers can cope with
>>> it
>>> > OK
>>> > 9. Go through my index writers and index reader clients and
>>> systematically
>>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>>> > CompressionTools approach where applicable and no compression where
>>> > applicable. During this phase I'll expect to have
>>> > CompressionTools-compressed fields coexisting with their
>>> > Field.Store.COMPRESS predecessors, where index reader client use of
>>> > Field.Store.COMPRESS is in transit to the explicit decompression
>>> approach.
>>> > 10. Upgrade my index writers to 3.0.0
>>> > 11. Upgrade my index readers to 3.0.0
>>> >
>>> > I've simplified this a bit, because I shan't really be testing straight
>>> off
>>> > in production(!) - I'll test the migration plan in a test cluster first;
>>> > but
>>> > this gives you the idea about the path.
>>> >
>>> > I wanted to know if I should expect problems with this plan. I'm
>>> depending
>>> > on newer writers generating indexes for older readers and 3 is a major
>>> > number upgrade. It looks like I can get away with this in version 3, but
>>> > that's by no means a guarantee according to
>>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>>> >
>>> > Does this sound like a good plan?
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>> >
>>> >
>>>
>>>
>>> --
>>> Weiwei Wang
>>> Alex Wang
>>> 王巍巍
>>> Room 403, Mengmin Wei Building
>>> Computer Science Department
>>> Gulou Campus of Nanjing University
>>> Nanjing, P.R.China, 210093
>>>
>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> Weiwei Wang
>> Alex Wang
>> 王巍巍
>> Room 403, Mengmin Wei Building
>> Computer Science Department
>> Gulou Campus of Nanjing University
>> Nanjing, P.R.China, 210093
>>
>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Danil ŢORIN <to...@gmail.com>.

2nd point can be simply archived by an optimize (which will read old
segments and will create a new one)
But I'm not sure how it handles compressed fields.


On Wed, Dec 9, 2009 at 16:50, Rob Staveley (Tom) <rs...@seseit.com> wrote:
> Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting rather than reindexing everything, which will save a lot of time.
>
> So, I should do this:
>
> 1. Convert readers to 2.9.1, which should be able to read any 2.x index including the existing 2.3.1 indexes
> 2. Convert writers to 2.9.1, using Weiwei's idea (converting the index with a 2.9.1 reader+writer conversion utility) to save some time.
> 3. Have the writers push converted indexes to the readers using the existing production infrastructure
> 4. Like (9.) in my original plan. [Go through my index writers and index reader clients and systematically purge all of the Field.Store.COMPRESS fields and migrate to an explicit CompressionTools approach where applicable and no compression where applicable. During this phase I'll expect to have CompressionTools-compressed fields coexisting with their Field.Store.COMPRESS predecessors, where index reader client use of Field.Store.COMPRESS is in transit to the explicit decompression approach.]
> 5. Convert the readers to 3.0.0, which should be able to read 2.9.1, if there are no compressed fields (??)
> 6. Convert the writers to 3.0.0
>
>
>
> -----Original Message-----
> From: Danil ŢORIN [mailto:torindan@gmail.com]
> Sent: 09 December 2009 13:20
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
>
> You NEED to update your readers first, or else they will be unable to
> read files created by newer version.
> And trust me, there are changes in index format from 2.3 -> 2.9
>
> On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
>> Hi, Rob,
>> I read
>> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
>> found no compatibility guarantee for IndexWriter between different version.
>>
>> You can run your idea as a test and see the output.
>> If it doesn't work, i suggest you convert your index to new version as I
>> said in the last post.
>>
>> You can develop a convert tool to do this job automatically(that what i have
>> done).
>>
>> If you do not have full access to the data center, you can read(readonly
>> mode is preferred) from the data center(through nfs or something like that)
>> and write to your local disk.
>>
>> When all converting is done, you can copy the new index to the data center
>> with the help of the administrator.
>>
>> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>>
>>> Thanks for the swift response, Weiwei.
>>>
>>> In my deployment, my index readers are in a data centre and therefore more
>>> difficult to upgrade than the writers. That's why I wanted to start with the
>>> writers rather than the readers. I realise that it looks the wrong way round
>>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>>> situations, but I wanted to know if writer first would work for me for 2.3.1
>>> -> 3.0.0.
>>>
>>> -----Original Message-----
>>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>>> Sent: 09 December 2009 12:21
>>> To: java-user@lucene.apache.org
>>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>>
>>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>>
>>> What I do is like this:
>>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>>> 3.0.0 IndexWriter to write all the documents into a new index
>>> 3. Update QueryPaser to 3.0.0
>>>
>>> I've redeployed my system and it works fine now.
>>>
>>>
>>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>>> >wrote:
>>>
>>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>>> > distributed
>>> > system and would like to bring everything up to date with 3.0.0 via
>>> 2.9.1.
>>> >
>>> > Here's my migration plan:
>>> >
>>> > 1. Add a index writer which generates a 2.9.1 "test" index
>>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>>> > production and see if the distributed 2.3.1 index readers can cope with
>>> it
>>> > OK
>>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>>> -
>>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>>> it
>>> > works.
>>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>>> > but with support for explicit use of CompressionTools for decompression,
>>> > where fields have been explicitly compressed with CompressionTools - the
>>> > application will knows which need decompression)
>>> > 5. Add a CompressionTools to my "test" index writer, generating
>>> explicitly
>>> > compressed fields in the 2.9.1 "test" index
>>> > 6. Test explicit decompression for relevant fields with CompressionTools
>>> in
>>> > my 2.9.1 "test" index in my index readers
>>> > 7. Upgrade my "test" index writer to 3.0.0
>>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>>> > production and see if the distributed 2.9.1 index readers can cope with
>>> it
>>> > OK
>>> > 9. Go through my index writers and index reader clients and
>>> systematically
>>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>>> > CompressionTools approach where applicable and no compression where
>>> > applicable. During this phase I'll expect to have
>>> > CompressionTools-compressed fields coexisting with their
>>> > Field.Store.COMPRESS predecessors, where index reader client use of
>>> > Field.Store.COMPRESS is in transit to the explicit decompression
>>> approach.
>>> > 10. Upgrade my index writers to 3.0.0
>>> > 11. Upgrade my index readers to 3.0.0
>>> >
>>> > I've simplified this a bit, because I shan't really be testing straight
>>> off
>>> > in production(!) - I'll test the migration plan in a test cluster first;
>>> > but
>>> > this gives you the idea about the path.
>>> >
>>> > I wanted to know if I should expect problems with this plan. I'm
>>> depending
>>> > on newer writers generating indexes for older readers and 3 is a major
>>> > number upgrade. It looks like I can get away with this in version 3, but
>>> > that's by no means a guarantee according to
>>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>>> >
>>> > Does this sound like a good plan?
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>> >
>>> >
>>>
>>>
>>> --
>>> Weiwei Wang
>>> Alex Wang
>>> 王巍巍
>>> Room 403, Mengmin Wei Building
>>> Computer Science Department
>>> Gulou Campus of Nanjing University
>>> Nanjing, P.R.China, 210093
>>>
>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> Weiwei Wang
>> Alex Wang
>> 王巍巍
>> Room 403, Mengmin Wei Building
>> Computer Science Department
>> Gulou Campus of Nanjing University
>> Nanjing, P.R.China, 210093
>>
>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

RE: Index file compatibility and a migration plan to lucene 3

Posted by "Rob Staveley (Tom)" <rs...@seseit.com>.

Thanks, Danil. I think you've saved me a lot of time. Weiwei too - converting rather than reindexing everything, which will save a lot of time.

So, I should do this:

1. Convert readers to 2.9.1, which should be able to read any 2.x index including the existing 2.3.1 indexes
2. Convert writers to 2.9.1, using Weiwei's idea (converting the index with a 2.9.1 reader+writer conversion utility) to save some time.
3. Have the writers push converted indexes to the readers using the existing production infrastructure
4. Like (9.) in my original plan. [Go through my index writers and index reader clients and systematically purge all of the Field.Store.COMPRESS fields and migrate to an explicit CompressionTools approach where applicable and no compression where applicable. During this phase I'll expect to have CompressionTools-compressed fields coexisting with their Field.Store.COMPRESS predecessors, where index reader client use of Field.Store.COMPRESS is in transit to the explicit decompression approach.]
5. Convert the readers to 3.0.0, which should be able to read 2.9.1, if there are no compressed fields (??)
6. Convert the writers to 3.0.0



-----Original Message-----
From: Danil ŢORIN [mailto:torindan@gmail.com] 
Sent: 09 December 2009 13:20
To: java-user@lucene.apache.org
Subject: Re: Index file compatibility and a migration plan to lucene 3

You NEED to update your readers first, or else they will be unable to
read files created by newer version.
And trust me, there are changes in index format from 2.3 -> 2.9

On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
> Hi, Rob,
> I read
> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
> found no compatibility guarantee for IndexWriter between different version.
>
> You can run your idea as a test and see the output.
> If it doesn't work, i suggest you convert your index to new version as I
> said in the last post.
>
> You can develop a convert tool to do this job automatically(that what i have
> done).
>
> If you do not have full access to the data center, you can read(readonly
> mode is preferred) from the data center(through nfs or something like that)
> and write to your local disk.
>
> When all converting is done, you can copy the new index to the data center
> with the help of the administrator.
>
> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>
>> Thanks for the swift response, Weiwei.
>>
>> In my deployment, my index readers are in a data centre and therefore more
>> difficult to upgrade than the writers. That's why I wanted to start with the
>> writers rather than the readers. I realise that it looks the wrong way round
>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>> situations, but I wanted to know if writer first would work for me for 2.3.1
>> -> 3.0.0.
>>
>> -----Original Message-----
>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>> Sent: 09 December 2009 12:21
>> To: java-user@lucene.apache.org
>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>
>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>
>> What I do is like this:
>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>> 3.0.0 IndexWriter to write all the documents into a new index
>> 3. Update QueryPaser to 3.0.0
>>
>> I've redeployed my system and it works fine now.
>>
>>
>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>> >wrote:
>>
>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>> > distributed
>> > system and would like to bring everything up to date with 3.0.0 via
>> 2.9.1.
>> >
>> > Here's my migration plan:
>> >
>> > 1. Add a index writer which generates a 2.9.1 "test" index
>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>> > production and see if the distributed 2.3.1 index readers can cope with
>> it
>> > OK
>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>> -
>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>> it
>> > works.
>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>> > but with support for explicit use of CompressionTools for decompression,
>> > where fields have been explicitly compressed with CompressionTools - the
>> > application will knows which need decompression)
>> > 5. Add a CompressionTools to my "test" index writer, generating
>> explicitly
>> > compressed fields in the 2.9.1 "test" index
>> > 6. Test explicit decompression for relevant fields with CompressionTools
>> in
>> > my 2.9.1 "test" index in my index readers
>> > 7. Upgrade my "test" index writer to 3.0.0
>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>> > production and see if the distributed 2.9.1 index readers can cope with
>> it
>> > OK
>> > 9. Go through my index writers and index reader clients and
>> systematically
>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>> > CompressionTools approach where applicable and no compression where
>> > applicable. During this phase I'll expect to have
>> > CompressionTools-compressed fields coexisting with their
>> > Field.Store.COMPRESS predecessors, where index reader client use of
>> > Field.Store.COMPRESS is in transit to the explicit decompression
>> approach.
>> > 10. Upgrade my index writers to 3.0.0
>> > 11. Upgrade my index readers to 3.0.0
>> >
>> > I've simplified this a bit, because I shan't really be testing straight
>> off
>> > in production(!) - I'll test the migration plan in a test cluster first;
>> > but
>> > this gives you the idea about the path.
>> >
>> > I wanted to know if I should expect problems with this plan. I'm
>> depending
>> > on newer writers generating indexes for older readers and 3 is a major
>> > number upgrade. It looks like I can get away with this in version 3, but
>> > that's by no means a guarantee according to
>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>> >
>> > Does this sound like a good plan?
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>>
>>
>> --
>> Weiwei Wang
>> Alex Wang
>> 王巍巍
>> Room 403, Mengmin Wei Building
>> Computer Science Department
>> Gulou Campus of Nanjing University
>> Nanjing, P.R.China, 210093
>>
>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> --
> Weiwei Wang
> Alex Wang
> 王巍巍
> Room 403, Mengmin Wei Building
> Computer Science Department
> Gulou Campus of Nanjing University
> Nanjing, P.R.China, 210093
>
> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Danil ŢORIN <to...@gmail.com>.

You NEED to update your readers first, or else they will be unable to
read files created by newer version.
And trust me, there are changes in index format from 2.3 -> 2.9

On Wed, Dec 9, 2009 at 15:11, Weiwei Wang <ww...@gmail.com> wrote:
> Hi, Rob,
> I read
> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
> found no compatibility guarantee for IndexWriter between different version.
>
> You can run your idea as a test and see the output.
> If it doesn't work, i suggest you convert your index to new version as I
> said in the last post.
>
> You can develop a convert tool to do this job automatically(that what i have
> done).
>
> If you do not have full access to the data center, you can read(readonly
> mode is preferred) from the data center(through nfs or something like that)
> and write to your local disk.
>
> When all converting is done, you can copy the new index to the data center
> with the help of the administrator.
>
> On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:
>
>> Thanks for the swift response, Weiwei.
>>
>> In my deployment, my index readers are in a data centre and therefore more
>> difficult to upgrade than the writers. That's why I wanted to start with the
>> writers rather than the readers. I realise that it looks the wrong way round
>> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
>> situations, but I wanted to know if writer first would work for me for 2.3.1
>> -> 3.0.0.
>>
>> -----Original Message-----
>> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
>> Sent: 09 December 2009 12:21
>> To: java-user@lucene.apache.org
>> Subject: Re: Index file compatibility and a migration plan to lucene 3
>>
>> I’ve finished a upgrade from 2.4.1 to 3.0.0
>>
>> What I do is like this:
>> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
>> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
>> 3.0.0 IndexWriter to write all the documents into a new index
>> 3. Update QueryPaser to 3.0.0
>>
>> I've redeployed my system and it works fine now.
>>
>>
>> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
>> >wrote:
>>
>> > I have Lucene 2.3.1 code and indexes deployed in production in a
>> > distributed
>> > system and would like to bring everything up to date with 3.0.0 via
>> 2.9.1.
>> >
>> > Here's my migration plan:
>> >
>> > 1. Add a index writer which generates a 2.9.1 "test" index
>> > 2. Have that "test" index writer push that 2.9.1 "test" index into
>> > production and see if the distributed 2.3.1 index readers can cope with
>> it
>> > OK
>> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
>> -
>> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
>> it
>> > works.
>> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
>> > but with support for explicit use of CompressionTools for decompression,
>> > where fields have been explicitly compressed with CompressionTools - the
>> > application will knows which need decompression)
>> > 5. Add a CompressionTools to my "test" index writer, generating
>> explicitly
>> > compressed fields in the 2.9.1 "test" index
>> > 6. Test explicit decompression for relevant fields with CompressionTools
>> in
>> > my 2.9.1 "test" index in my index readers
>> > 7. Upgrade my "test" index writer to 3.0.0
>> > 8. Have that "test" index writer push that 3.0.0 "test" index into
>> > production and see if the distributed 2.9.1 index readers can cope with
>> it
>> > OK
>> > 9. Go through my index writers and index reader clients and
>> systematically
>> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
>> > CompressionTools approach where applicable and no compression where
>> > applicable. During this phase I'll expect to have
>> > CompressionTools-compressed fields coexisting with their
>> > Field.Store.COMPRESS predecessors, where index reader client use of
>> > Field.Store.COMPRESS is in transit to the explicit decompression
>> approach.
>> > 10. Upgrade my index writers to 3.0.0
>> > 11. Upgrade my index readers to 3.0.0
>> >
>> > I've simplified this a bit, because I shan't really be testing straight
>> off
>> > in production(!) - I'll test the migration plan in a test cluster first;
>> > but
>> > this gives you the idea about the path.
>> >
>> > I wanted to know if I should expect problems with this plan. I'm
>> depending
>> > on newer writers generating indexes for older readers and 3 is a major
>> > number upgrade. It looks like I can get away with this in version 3, but
>> > that's by no means a guarantee according to
>> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>> >
>> > Does this sound like a good plan?
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>>
>>
>> --
>> Weiwei Wang
>> Alex Wang
>> 王巍巍
>> Room 403, Mengmin Wei Building
>> Computer Science Department
>> Gulou Campus of Nanjing University
>> Nanjing, P.R.China, 210093
>>
>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> --
> Weiwei Wang
> Alex Wang
> 王巍巍
> Room 403, Mengmin Wei Building
> Computer Science Department
> Gulou Campus of Nanjing University
> Nanjing, P.R.China, 210093
>
> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Weiwei Wang <ww...@gmail.com>.

Hi, Rob,
I read
http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats and
found no compatibility guarantee for IndexWriter between different version.

You can run your idea as a test and see the output.
If it doesn't work, i suggest you convert your index to new version as I
said in the last post.

You can develop a convert tool to do this job automatically(that what i have
done).

If you do not have full access to the data center, you can read(readonly
mode is preferred) from the data center(through nfs or something like that)
and write to your local disk.

When all converting is done, you can copy the new index to the data center
with the help of the administrator.

On Wed, Dec 9, 2009 at 8:42 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:

> Thanks for the swift response, Weiwei.
>
> In my deployment, my index readers are in a data centre and therefore more
> difficult to upgrade than the writers. That's why I wanted to start with the
> writers rather than the readers. I realise that it looks the wrong way round
> and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formatseffectively says that changing the reader first is a better idea for most
> situations, but I wanted to know if writer first would work for me for 2.3.1
> -> 3.0.0.
>
> -----Original Message-----
> From: Weiwei Wang [mailto:ww.wang.cs@gmail.com]
> Sent: 09 December 2009 12:21
> To: java-user@lucene.apache.org
> Subject: Re: Index file compatibility and a migration plan to lucene 3
>
> I’ve finished a upgrade from 2.4.1 to 3.0.0
>
> What I do is like this:
> 1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
> 2. Use a 3.0.0 IndexReader to read the old version index and then use a
> 3.0.0 IndexWriter to write all the documents into a new index
> 3. Update QueryPaser to 3.0.0
>
> I've redeployed my system and it works fine now.
>
>
> On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rstaveley@seseit.com
> >wrote:
>
> > I have Lucene 2.3.1 code and indexes deployed in production in a
> > distributed
> > system and would like to bring everything up to date with 3.0.0 via
> 2.9.1.
> >
> > Here's my migration plan:
> >
> > 1. Add a index writer which generates a 2.9.1 "test" index
> > 2. Have that "test" index writer push that 2.9.1 "test" index into
> > production and see if the distributed 2.3.1 index readers can cope with
> it
> > OK
> > 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields)
> -
> > we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if
> it
> > works.
> > 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
> > but with support for explicit use of CompressionTools for decompression,
> > where fields have been explicitly compressed with CompressionTools - the
> > application will knows which need decompression)
> > 5. Add a CompressionTools to my "test" index writer, generating
> explicitly
> > compressed fields in the 2.9.1 "test" index
> > 6. Test explicit decompression for relevant fields with CompressionTools
> in
> > my 2.9.1 "test" index in my index readers
> > 7. Upgrade my "test" index writer to 3.0.0
> > 8. Have that "test" index writer push that 3.0.0 "test" index into
> > production and see if the distributed 2.9.1 index readers can cope with
> it
> > OK
> > 9. Go through my index writers and index reader clients and
> systematically
> > purge all of the Field.Store.COMPRESS fields and migrate to an explicit
> > CompressionTools approach where applicable and no compression where
> > applicable. During this phase I'll expect to have
> > CompressionTools-compressed fields coexisting with their
> > Field.Store.COMPRESS predecessors, where index reader client use of
> > Field.Store.COMPRESS is in transit to the explicit decompression
> approach.
> > 10. Upgrade my index writers to 3.0.0
> > 11. Upgrade my index readers to 3.0.0
> >
> > I've simplified this a bit, because I shan't really be testing straight
> off
> > in production(!) - I'll test the migration plan in a test cluster first;
> > but
> > this gives you the idea about the path.
> >
> > I wanted to know if I should expect problems with this plan. I'm
> depending
> > on newer writers generating indexes for older readers and 3 is a major
> > number upgrade. It looks like I can get away with this in version 3, but
> > that's by no means a guarantee according to
> > http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
> >
> > Does this sound like a good plan?
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
> --
> Weiwei Wang
> Alex Wang
> 王巍巍
> Room 403, Mengmin Wei Building
> Computer Science Department
> Gulou Campus of Nanjing University
> Nanjing, P.R.China, 210093
>
> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang

RE: Index file compatibility and a migration plan to lucene 3

Posted by "Rob Staveley (Tom)" <rs...@seseit.com>.

Thanks for the swift response, Weiwei.

In my deployment, my index readers are in a data centre and therefore more difficult to upgrade than the writers. That's why I wanted to start with the writers rather than the readers. I realise that it looks the wrong way round and http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats effectively says that changing the reader first is a better idea for most situations, but I wanted to know if writer first would work for me for 2.3.1 -> 3.0.0.

-----Original Message-----
From: Weiwei Wang [mailto:ww.wang.cs@gmail.com] 
Sent: 09 December 2009 12:21
To: java-user@lucene.apache.org
Subject: Re: Index file compatibility and a migration plan to lucene 3

I’ve finished a upgrade from 2.4.1 to 3.0.0

What I do is like this:
1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
2. Use a 3.0.0 IndexReader to read the old version index and then use a
3.0.0 IndexWriter to write all the documents into a new index
3. Update QueryPaser to 3.0.0

I've redeployed my system and it works fine now.


On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:

> I have Lucene 2.3.1 code and indexes deployed in production in a
> distributed
> system and would like to bring everything up to date with 3.0.0 via 2.9.1.
>
> Here's my migration plan:
>
> 1. Add a index writer which generates a 2.9.1 "test" index
> 2. Have that "test" index writer push that 2.9.1 "test" index into
> production and see if the distributed 2.3.1 index readers can cope with it
> OK
> 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields) -
> we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if it
> works.
> 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
> but with support for explicit use of CompressionTools for decompression,
> where fields have been explicitly compressed with CompressionTools - the
> application will knows which need decompression)
> 5. Add a CompressionTools to my "test" index writer, generating explicitly
> compressed fields in the 2.9.1 "test" index
> 6. Test explicit decompression for relevant fields with CompressionTools in
> my 2.9.1 "test" index in my index readers
> 7. Upgrade my "test" index writer to 3.0.0
> 8. Have that "test" index writer push that 3.0.0 "test" index into
> production and see if the distributed 2.9.1 index readers can cope with it
> OK
> 9. Go through my index writers and index reader clients and systematically
> purge all of the Field.Store.COMPRESS fields and migrate to an explicit
> CompressionTools approach where applicable and no compression where
> applicable. During this phase I'll expect to have
> CompressionTools-compressed fields coexisting with their
> Field.Store.COMPRESS predecessors, where index reader client use of
> Field.Store.COMPRESS is in transit to the explicit decompression approach.
> 10. Upgrade my index writers to 3.0.0
> 11. Upgrade my index readers to 3.0.0
>
> I've simplified this a bit, because I shan't really be testing straight off
> in production(!) - I'll test the migration plan in a test cluster first;
> but
> this gives you the idea about the path.
>
> I wanted to know if I should expect problems with this plan. I'm depending
> on newer writers generating indexes for older readers and 3 is a major
> number upgrade. It looks like I can get away with this in version 3, but
> that's by no means a guarantee according to
> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>
> Does this sound like a good plan?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Index file compatibility and a migration plan to lucene 3

Posted by Weiwei Wang <ww...@gmail.com>.

I’ve finished a upgrade from 2.4.1 to 3.0.0

What I do is like this:
1. Upgrade my user-defined analyzer, tokenizer and filter to 3.0.0
2. Use a 3.0.0 IndexReader to read the old version index and then use a
3.0.0 IndexWriter to write all the documents into a new index
3. Update QueryPaser to 3.0.0

I've redeployed my system and it works fine now.


On Wed, Dec 9, 2009 at 8:13 PM, Rob Staveley (Tom) <rs...@seseit.com>wrote:

> I have Lucene 2.3.1 code and indexes deployed in production in a
> distributed
> system and would like to bring everything up to date with 3.0.0 via 2.9.1.
>
> Here's my migration plan:
>
> 1. Add a index writer which generates a 2.9.1 "test" index
> 2. Have that "test" index writer push that 2.9.1 "test" index into
> production and see if the distributed 2.3.1 index readers can cope with it
> OK
> 3. Upgrade my index writers to 2.9.1 (still using evil compressed fields) -
> we shall have 2.9.1 writers and 2.3.1 readers during this phase. See if it
> works.
> 4. Upgrade my index readers to 2.9.1 (still using evil compressed fields,
> but with support for explicit use of CompressionTools for decompression,
> where fields have been explicitly compressed with CompressionTools - the
> application will knows which need decompression)
> 5. Add a CompressionTools to my "test" index writer, generating explicitly
> compressed fields in the 2.9.1 "test" index
> 6. Test explicit decompression for relevant fields with CompressionTools in
> my 2.9.1 "test" index in my index readers
> 7. Upgrade my "test" index writer to 3.0.0
> 8. Have that "test" index writer push that 3.0.0 "test" index into
> production and see if the distributed 2.9.1 index readers can cope with it
> OK
> 9. Go through my index writers and index reader clients and systematically
> purge all of the Field.Store.COMPRESS fields and migrate to an explicit
> CompressionTools approach where applicable and no compression where
> applicable. During this phase I'll expect to have
> CompressionTools-compressed fields coexisting with their
> Field.Store.COMPRESS predecessors, where index reader client use of
> Field.Store.COMPRESS is in transit to the explicit decompression approach.
> 10. Upgrade my index writers to 3.0.0
> 11. Upgrade my index readers to 3.0.0
>
> I've simplified this a bit, because I shan't really be testing straight off
> in production(!) - I'll test the migration plan in a test cluster first;
> but
> this gives you the idea about the path.
>
> I wanted to know if I should expect problems with this plan. I'm depending
> on newer writers generating indexes for older readers and 3 is a major
> number upgrade. It looks like I can get away with this in version 3, but
> that's by no means a guarantee according to
> http://wiki.apache.org/lucene-java/BackwardsCompatibility#File_Formats
>
> Does this sound like a good plan?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang