Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2011/05/08 15:10:03 UTC

[jira] [Created] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
------------------------------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-3082
                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
             Project: Lucene - Java
          Issue Type: New Feature
          Components: Index
            Reporter: Uwe Schindler
            Priority: Minor
             Fix For: 3.2, 4.0


Currently, if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized, because optimizing a one-segment index is a no-op and nothing gets rewritten.

I propose to add a method to IndexWriter that's similar to optimize() and uses a custom MergePolicy to upgrade all segments to the latest format. This MergePolicy could also simply ignore all segments that are already up-to-date. All segments in prior formats would be merged to a new segment. The tool could optionally also optimize the index.
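
A minimal sketch of the selection step described above, not the actual patch. It assumes a hypothetical OldSegmentStub type (name plus the version that wrote the segment) and an assumed CURRENT_VERSION constant. Neither is a Lucene API. It only illustrates "ignore up-to-date segments, rewrite everything else":

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a segment: its name and the version that wrote it. */
final class OldSegmentStub {
    final String name;
    final String version;

    OldSegmentStub(String name, String version) {
        this.name = name;
        this.version = version;
    }
}

final class UpgradeSelectionSketch {
    /** Assumed value; a real policy would take this from the running release. */
    static final String CURRENT_VERSION = "3.2";

    /** Returns the names of all segments that are not yet in the current format. */
    static List<String> segmentsToUpgrade(List<OldSegmentStub> segments) {
        List<String> old = new ArrayList<String>();
        for (OldSegmentStub s : segments) {
            // Up-to-date segments are skipped; everything else is selected.
            if (!CURRENT_VERSION.equals(s.version)) {
                old.add(s.name);
            }
        }
        return old;
    }
}
```

Note that a single old-format segment is still selected, which is the case a plain optimize() would skip.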

This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. It's a tool for the end user, not a developer tool.

This addition should also go to Lucene 3.x, as we need to make users with pre-3.0 indexes take the intermediate step through 3.x; otherwise they would not be able to open their index with 4.0. With this tool in 3.x, users could safely upgrade their index without relying on optimize to work on already-optimized indexes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030518#comment-13030518 ] 

Shai Erera commented on LUCENE-3082:
------------------------------------

This is a great idea. We should also allow one to plug in a PayloadProcessorProvider so they can rewrite the payload "on the go" if need be.

Also, while the index is being upgraded, I think it will be useful if we merge the segments that are upgraded, but without doing cascading merges. Since segments are rewritten anyway, we can only gain from the merge. As always, if not everybody agrees on this, we can make it a parameter.

And let's make sure that whatever 'upgrade' means is under the application's control. I.e., upgrade can be simply upgrading from 3x to 4.0, but it can also be using PayloadProcessorProvider, as well as suddenly deciding that all segments should be compound. I'm pretty sure I'll want to control the first two, not so sure about the last one.

It can be a simple 'boolean shouldUpgradeSegment(SegmentInfo)' on this UpgradeMP, which apps can override.
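
To illustrate the hook, here is a rough sketch under stated assumptions: SegmentInfoStub, UpgradePolicySketch and CompoundForcingPolicySketch are hypothetical names, and the version constant is assumed. Only the method name shouldUpgradeSegment comes from the proposal above.

```java
/** Hypothetical stand-in for SegmentInfo with just the fields this sketch needs. */
final class SegmentInfoStub {
    final String version;
    final boolean compound;

    SegmentInfoStub(String version, boolean compound) {
        this.version = version;
        this.compound = compound;
    }
}

/** Default behaviour: upgrade only segments written by another version. */
class UpgradePolicySketch {
    /** Assumed value for illustration only. */
    static final String CURRENT_VERSION = "4.0";

    protected boolean shouldUpgradeSegment(SegmentInfoStub si) {
        return !CURRENT_VERSION.equals(si.version);
    }
}

/** Application override: also rewrite segments that are not compound yet. */
class CompoundForcingPolicySketch extends UpgradePolicySketch {
    @Override
    protected boolean shouldUpgradeSegment(SegmentInfoStub si) {
        return super.shouldUpgradeSegment(si) || !si.compound;
    }
}
```

With a hook like this, the default stays "old format only", while an application can widen the definition of 'upgrade' without touching the merge logic.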




[jira] [Commented] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030483#comment-13030483 ] 

Michael McCandless commented on LUCENE-3082:
--------------------------------------------

Maybe instead of a new method on IW, this is a new tool (eg oal.index.UpgradeIndex)?  That tool would create IW w/ a custom UpgradeMergePolicy that rewrites all segments (or only segments not matching current format, but often that would presumably be all segments).




[jira] [Updated] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082.patch

Small change to the merging of the leftover segments that are not scheduled for merge by the wrapped MergePolicy: they are now merged together into one segment instead of separately. Normally there are only a few of them (e.g. when TieredMergePolicy only optimizes the first 30 segments and leaves the rest for later). As we have no cascading optimize, we merge the remaining segments into one.
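
The leftover handling can be sketched roughly as follows. This is an illustration with plain collections, not the patch code. LeftoverMergeSketch and its method are hypothetical names, and a merge is modelled simply as a list of segment names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class LeftoverMergeSketch {
    /**
     * wrappedMerges: merges proposed by the wrapped policy (each a list of segment names).
     * oldSegments: every segment that still needs an upgrade.
     * Returns the wrapped merges plus, if needed, ONE extra merge that covers
     * all old segments the wrapped policy did not schedule.
     */
    static List<List<String>> addLeftoverMerge(List<List<String>> wrappedMerges,
                                               Collection<String> oldSegments) {
        // Start from all old segments and drop those already scheduled.
        Set<String> leftover = new LinkedHashSet<String>(oldSegments);
        for (List<String> merge : wrappedMerges) {
            leftover.removeAll(merge);
        }
        List<List<String>> result = new ArrayList<List<String>>(wrappedMerges);
        if (!leftover.isEmpty()) {
            // One combined merge instead of one merge per leftover segment.
            result.add(new ArrayList<String>(leftover));
        }
        return result;
    }
}
```

For example, if the wrapped policy schedules only [_0, _1] out of old segments _0 to _3, the sketch adds a single second merge [_2, _3].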

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Issue Comment Edited] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030519#comment-13030519 ] 

Uwe Schindler edited comment on LUCENE-3082 at 5/8/11 6:29 PM:
---------------------------------------------------------------

Patch that implements this with a merge policy.

It does not yet contain the command-line updater. If you want to upgrade an old index, the API code to do this is very simple:

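{code:java}
// Version.LUCENE_XX is a placeholder for the version constant of your release.
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_XX, new KeywordAnalyzer());
// Wrap the configured merge policy with the upgrading one.
iwc = iwc.setMergePolicy(new UpgradeIndexMergePolicy(iwc.getMergePolicy()));
IndexWriter w = new IndexWriter(dir, iwc);
// optimize() asks the merge policy for merges, which rewrites the old segments.
w.optimize();
w.close();
{code}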

The patch contains new tests in TestBackwards that verify the upgrade process:

- It tries to upgrade all old indexes from the well-known list in TestBackwards. When this is done, each of them should contain exactly one segment (because all segments previously in the index are from an older version, so they are merged/optimized together in the new format). It also verifies that all segment versions are Constants.LUCENE_MAIN_VERSION.
- It tries to upgrade two old, already optimized indexes (from the previous version; I changed TestBackwards in my 3.1 checkout to generate those). It verifies the segment versions after the upgrade. This special case is needed, as optimizing a one-segment index is a no-op without the special merge policy.
- It uses the old optimized indexes, opens them using the standard merge policy, and adds some documents to them. After that it upgrades the index with a new IndexWriter using the special merge policy. In that case (as some segments are already in the new version), only the old segments should be merged together; the newly added ones are untouched. So the segment count is verified to be > 1.

  
> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Commented] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030528#comment-13030528 ] 

Michael McCandless commented on LUCENE-3082:
--------------------------------------------

Patch looks great!

The segmentsToOptimize ought to contain every segment in the index; that's only present for the case where optimize() is called in a bg thread but other threads continue to index new documents causing new segments to be flushed.  These new segments would then NOT be in the segmentsToOptimize when the optimize merges need to cascade.

TODO: for the command-line tool, we should make sure the index only has a single commit point (ie, abort if not).  Upgrading an index with more than one commit point is hairy (I think it's fine not to support this case... but we should not remove the commits).
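
A rough sketch of such a guard, working on a plain list of file names rather than on Lucene's own commit listing. It assumes that each commit point shows up as one file named segments_N in the index directory. CommitPointCheckSketch and its methods are hypothetical names, and a real tool would presumably ask Lucene for the commits instead of inspecting file names.

```java
import java.util.Collection;

final class CommitPointCheckSketch {
    /** Counts files that look like commit points (assumption: "segments_N" naming). */
    static int countCommitPoints(Collection<String> fileNames) {
        int count = 0;
        for (String name : fileNames) {
            // "segments.gen" does not match, only "segments_<generation>" does.
            if (name.startsWith("segments_")) {
                count++;
            }
        }
        return count;
    }

    /** Aborts if more than one commit point exists; no commit is removed. */
    static void requireSingleCommit(Collection<String> fileNames) {
        int commits = countCommitPoints(fileNames);
        if (commits > 1) {
            throw new IllegalStateException(
                "index has " + commits + " commit points; refusing to upgrade");
        }
    }
}
```

The guard only refuses to run; it leaves all existing commits in place, in line with the note above.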

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Updated] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082.patch

New patch with renamed class and added documentation as suggested by Mike.

The previous patch also had a bug in the command-line tool: instead of "dir" it still used "args[0]" to invoke the ctor, a relic from an earlier version of the tool.

I also fixed the javadocs and added lucene.experimental to the UpgradeIndexMergePolicy, as we should not make it too public (but it's not really "internal", because there are use cases not covered by the easy-to-use IndexUpgrader tool).

> Add tool to upgrade all segments of an index to last recent supported index format without optimizing
> -----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>
> Currently if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized.
> I propose to add a custom MergePolicy to upgrade all segments to the last format. This MergePolicy could simply also ignore all segments already up-to-date. All segments in prior formats would be merged to a new segment using another MergePolicy's optimize strategy.
> This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. Its a tool for the end user, not a developer tool.
> This addition should also go to Lucene 3.x, as we need to make users with pre-3.0 indexes go the step through 3.x, else they would not be able to open their index with 4.0. With this tool in 3.x the users could safely upgrade their index without relying on optimize to work on already-optimized indexes.



[jira] [Commented] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030525#comment-13030525 ] 

Uwe Schindler commented on LUCENE-3082:
---------------------------------------

Shai:

- The supplied patch should handle all you want. There would be only one addition: the proposed 'boolean shouldUpgradeSegment(SegmentInfo)' method, which is a one-liner. I will upload a new patch for that and make the merge policy non-final.
- It will not do cascading merges, because when the merge policy recognizes that all segments already have the new version, it will not merge anything. After the first iteration all segments will be upgraded, so on the next run this policy will return null merges.

The other ideas, like PayloadProcessor, can be done outside of that in user code (but beware, it will not touch segments already in the new version).

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Updated] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: index.31.optimized.nocfs.zip
                index.31.optimized.cfs.zip

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Updated] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082.patch

Patch with updated and randomized tests, command line tool (oal.index.IndexFormatUpgrader) and javadocs.

I think it's ready to commit.

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>



[jira] [Assigned] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-3082:
-------------------------------------

    Assignee: Uwe Schindler



[jira] [Updated] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082.patch

Updated patch with a protected shouldUpgradeSegment(SegmentInfo) method.



[jira] [Reopened] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reopened LUCENE-3082:
-----------------------------------


We should add a warning to the MergePolicy/IndexUpgrader that this tool reorders segments if the index was partially upgraded before (e.g. by adding new documents). Segments that were already in the new format before the call to the MergePolicy's optimize come first, followed by the newly upgraded ones.
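
To illustrate why this matters for document IDs, here is a small self-contained sketch. It does not use Lucene at all: the Seg class, afterUpgrade() and docBase() are hypothetical names made up for this example, and the ordering simply follows the description in the comment above (already up-to-date segments first, then the merged result). The actual placement is decided by IndexWriter and may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class ReorderSketch {
    /** Toy segment (hypothetical): name, document count, and whether it is already in the current format. */
    static final class Seg {
        final String name;
        final int docCount;
        final boolean upToDate;

        Seg(String name, int docCount, boolean upToDate) {
            this.name = name;
            this.docCount = docCount;
            this.upToDate = upToDate;
        }
    }

    /** Ordering as described in the comment: up-to-date segments keep their relative order, the merged segment follows. */
    static List<Seg> afterUpgrade(List<Seg> before) {
        List<Seg> result = new ArrayList<>();
        int mergedDocs = 0;
        boolean anyOld = false;
        for (Seg s : before) {
            if (s.upToDate) {
                result.add(s);
            } else {
                mergedDocs += s.docCount;
                anyOld = true;
            }
        }
        if (anyOld) {
            result.add(new Seg("merged", mergedDocs, true));
        }
        return result;
    }

    /** First document ID of the named segment, or -1 if no such segment exists. */
    static int docBase(List<Seg> segments, String name) {
        int base = 0;
        for (Seg s : segments) {
            if (s.name.equals(name)) {
                return base;
            }
            base += s.docCount;
        }
        return -1;
    }
}
```

In this model, an index with a 100-document old segment followed by a 10-document new segment has the new segment's documents move from ID 100 onwards to ID 0 onwards after the upgrade, which is the kind of reordering the warning is about.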



[jira] [Resolved] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-3082.
-----------------------------------

    Resolution: Fixed

Committed trunk revision: 1102658
Merged 3.x revision: 1102659



[jira] [Updated] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082-reorder-warnings.patch

Updated patch. Will now be committed.

I added a Version ctor argument, as in 3.x it is used to choose the default merge policy.



[jira] [Commented] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030676#comment-13030676 ] 

Michael McCandless commented on LUCENE-3082:
--------------------------------------------

How about this wording:

Expert: this tool keeps only the last commit in an index; for this
reason, if the incoming index has more than one commit, the tool
refuses to run by default.  Specify -delete-prior-commits to override
this, allowing the tool to delete all but the last commit.

Maybe just call it IndexUpgrader?  (Format seems redundant?)

There's a missing { and } after the "if (commits.size() > 1)"
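
For illustration, the braced check could look roughly like the following sketch. It is self-contained and does not reflect the actual patch: the class name, method name and parameters are hypothetical, and only the condition (more than one commit) and the -delete-prior-commits option are taken from the comment above.

```java
final class CommitCheckSketch {
    /**
     * Returns how many prior commits would be deleted.
     * Refuses to run (throws) when the index has more than one commit
     * and the override was not given.
     */
    static int checkCommits(int commitCount, boolean deletePriorCommits) {
        if (commitCount > 1) {
            if (!deletePriorCommits) {
                throw new IllegalArgumentException(
                    "Index has more than one commit; specify -delete-prior-commits to override");
            }
            // Override given: everything except the last commit would be removed.
            return commitCount - 1;
        }
        // Zero or one commit: nothing to delete.
        return 0;
    }
}
```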




[jira] [Updated] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Description: 
Currently if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized.

I propose to add a custom MergePolicy to upgrade all segments to the latest format. This MergePolicy could simply ignore all segments that are already up-to-date. All segments in prior formats would be merged to a new segment using another MergePolicy's optimize strategy.

This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. It is a tool for the end user, not a developer tool.

This addition should also go to Lucene 3.x, as we need to make users with pre-3.0 indexes take the intermediate step through 3.x, else they would not be able to open their index with 4.0. With this tool in 3.x the users could safely upgrade their index without relying on optimize to work on already-optimized indexes.

  was:
Currently if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized.

I propose to add a method to IndexWriter that is similar to optimize() and uses a custom MergePolicy to upgrade all segments to the latest format. This MergePolicy could simply ignore all segments that are already up-to-date. All segments in prior formats would be merged to a new segment. The tool could optionally also optimize the index.

This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. It is a tool for the end user, not a developer tool.

This addition should also go to Lucene 3.x, as we need to make users with pre-3.0 indexes take the intermediate step through 3.x, else they would not be able to open their index with 4.0. With this tool in 3.x the users could safely upgrade their index without relying on optimize to work on already-optimized indexes.

        Summary: Add tool to upgrade all segments of an index to last recent supported index format without optimizing  (was: Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing)



[jira] [Updated] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082.patch

Patch that implements this with a merge policy:

It does not yet contain the command line updater. If you want to upgrade an old index, the API code to do this is very simple:

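{code:java}
// Version.LUCENE_XX is a placeholder for the version constant of your Lucene release.
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_XX, new KeywordAnalyzer());
// Wrap the configured merge policy so that optimize() also rewrites old-format segments.
iwc = iwc.setMergePolicy(new UpgradeIndexMergePolicy(iwc.getMergePolicy()));
IndexWriter w = new IndexWriter(dir, iwc);
w.optimize();
w.close();
{code}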

The patch contains new tests in TestBackwards that verify the upgrade process:

- It upgrades every old index in the well-known list. Afterwards each of them should contain exactly one segment (all segments previously in the index are from an older version, so they are merged/optimized together into the new format). It also verifies that all segment versions equal Constants.LUCENE_MAIN_VERSION.
- It upgrades two old, already optimized indexes (from the previous version; I changed TestBackwards in my 3.1 checkout to generate them) and verifies the segment versions after the upgrade. This special case is needed because optimizing a one-segment index is a no-op without the special merge policy.
- It opens the old optimized indexes with the standard merge policy and adds some documents to them. It then upgrades the index. Because some segments are already in the new version, only the old segments should be merged together, while the newly added ones stay untouched, so the segment count is > 1.
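
The selection logic that these three tests exercise can be modelled in a few lines. The following is only a toy sketch under stated assumptions, not the code from the patch: Segment, CURRENT_VERSION, selectForUpgrade() and segmentCountAfterUpgrade() are hypothetical names, versions are plain strings, and it assumes that all old segments end up in exactly one merged segment (the wrapped merge policy could in principle decide otherwise).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a segment (hypothetical): a name plus the Lucene version that wrote it. */
final class Segment {
    final String name;
    final String version;

    Segment(String name, String version) {
        this.name = name;
        this.version = version;
    }
}

final class UpgradeSelectionSketch {
    /** Hypothetical stand-in for the current index format version. */
    static final String CURRENT_VERSION = "3.2";

    /** A segment needs upgrading when it was not written by the current version. */
    static boolean shouldUpgradeSegment(Segment s) {
        return !CURRENT_VERSION.equals(s.version);
    }

    /** Returns only the segments that would be handed to the merge; up-to-date ones are skipped. */
    static List<Segment> selectForUpgrade(List<Segment> segments) {
        List<Segment> old = new ArrayList<>();
        for (Segment s : segments) {
            if (shouldUpgradeSegment(s)) {
                old.add(s);
            }
        }
        return old;
    }

    /** Segment count after the upgrade: untouched segments plus one merged segment if any were old. */
    static int segmentCountAfterUpgrade(List<Segment> segments) {
        int old = selectForUpgrade(segments).size();
        return segments.size() - old + (old > 0 ? 1 : 0);
    }
}
```

In this model a single old, already optimized segment is still selected (the second test case), and an index with old segments plus newly added ones ends up with more than one segment (the third test case).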



[jira] [Resolved] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-3082.
-----------------------------------

    Resolution: Fixed

Committed trunk revision: 1101088
Committed 3.x revision: 1101093



[jira] [Commented] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030484#comment-13030484 ] 

Uwe Schindler commented on LUCENE-3082:
---------------------------------------

Here is the discussion from the #lucene-dev IRC channel: [http://colabti.org/irclogger/irclogger_log/lucene-dev?date=2011-05-08#l117]



[jira] [Updated] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3082:
----------------------------------

    Attachment: LUCENE-3082-reorder-warnings.patch

Patch that adds some warnings about reordering of document IDs if the index was partially upgraded before execution.

> Add tool to upgrade all segments of an index to last recent supported index format without optimizing
> -----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3082
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3082
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3082-reorder-warnings.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>
> Currently, if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized.
> I propose to add a custom MergePolicy to upgrade all segments to the latest format. This MergePolicy could simply ignore all segments that are already up-to-date. All segments in prior formats would be merged to a new segment using another MergePolicy's optimize strategy.
> This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. It's a tool for the end user, not a developer tool.
> This addition should also go to Lucene 3.x, as users with pre-3.0 indexes need to go through 3.x as an intermediate step; otherwise they would not be able to open their index with 4.0. With this tool in 3.x, users could safely upgrade their index without relying on optimize to work on already-optimized indexes.
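
The selection step described in the quoted proposal (skip segments that are already up-to-date, merge only those in prior formats) can be illustrated with a minimal, self-contained sketch. This is not the patch's code and does not use the Lucene API: the Segment class, the formatVersion strings, and segmentsToUpgrade are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UpgradeSelectionSketch {

    /** Hypothetical stand-in for per-segment metadata; not a Lucene class. */
    static final class Segment {
        final String name;
        final String formatVersion;

        Segment(String name, String formatVersion) {
            this.name = name;
            this.formatVersion = formatVersion;
        }
    }

    /**
     * Returns the names of all segments whose format differs from
     * currentVersion, in index order. Up-to-date segments are ignored.
     */
    static List<String> segmentsToUpgrade(List<Segment> segments, String currentVersion) {
        List<String> outdated = new ArrayList<String>();
        for (Segment s : segments) {
            if (!currentVersion.equals(s.formatVersion)) {
                outdated.add(s.name);
            }
        }
        return outdated;
    }

    public static void main(String[] args) {
        // Assumed example index: two old segments around one current segment.
        List<Segment> index = Arrays.asList(
                new Segment("_0", "3.0"),
                new Segment("_1", "3.2"),
                new Segment("_2", "2.9"));
        System.out.println(segmentsToUpgrade(index, "3.2")); // prints [_0, _2]
    }
}
```

In this assumed example the outdated segments are not adjacent. That is the partially upgraded situation in which merging them into a new segment can change the relative order of documents, which is what the warning patch is about.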



[jira] [Commented] (LUCENE-3082) Add tool to upgrade all segments of an index to last recent supported index format without optimizing

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030797#comment-13030797 ] 

Uwe Schindler commented on LUCENE-3082:
---------------------------------------

I now also use the fully random IndexWriterConfig, since LUCENE-3083 (Fix MockRandomMergePolicy) has been committed.

I will now commit and merge the test code to produce the optimized indexes.

