Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2010/09/27 23:11:36 UTC

[jira] Created: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Add support for 3.0 indexes in 2.9 branch
-----------------------------------------

Key: LUCENE-2675
URL: https://issues.apache.org/jira/browse/LUCENE-2675
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 2.9.3, 2.9.2, 2.9.1, 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Priority: Minor
Fix For: 2.9.4

There have been a lot of user requests to be able to read Lucene 3.0 indexes with 2.9 as well. This would make migration easier. There is no problem in doing that, as the new stored fields version in Lucene 3.0 is only used to mark a segment's stored fields file as no longer containing compressed fields; the index format itself did not really change. This patch simply allows FieldsReader to accept a Lucene 3.0 version number, but still writes segments in 2.9 format (as you could suddenly turn on compression for added documents).

I added ZIP files with 3.0 indexes for TestBackwards. Without the patch the test does not pass, as FieldsReader complains about an incorrect version number (although it could read the file easily). If we release a 2.9.4 version of Lucene, we should include this patch.
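
The idea described above (accept the newer version number when reading, keep writing the older one) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from LUCENE-2675.patch: the class name, the constant names, and the numeric values are all hypothetical.

```java
import java.io.IOException;

// Illustrative sketch only: names and numeric values are assumptions,
// not taken from the actual patch or from Lucene's source.
final class StoredFieldsVersionCheck {
    // Hypothetical version written by the 2.9 branch.
    static final int FORMAT_2_9 = 1;
    // Hypothetical version written by 3.0; per the issue text it only marks
    // the stored fields file as containing no compressed fields.
    static final int FORMAT_3_0_NO_COMPRESSED_FIELDS = 2;

    // The writer keeps emitting the 2.9 version, because a 2.9 user
    // may still add compressed fields.
    static final int FORMAT_WRITTEN = FORMAT_2_9;
    // The reader additionally tolerates the 3.0 marker.
    static final int FORMAT_MAX_READABLE = FORMAT_3_0_NO_COMPRESSED_FIELDS;

    // True if a reader following this sketch would accept the given version.
    static boolean isReadable(int format) {
        return format >= 0 && format <= FORMAT_MAX_READABLE;
    }

    // Rejects versions newer than the reader understands.
    static void checkReadable(int format) throws IOException {
        if (!isReadable(format)) {
            throw new IOException("Incompatible stored fields version: " + format
                    + " (expected " + FORMAT_MAX_READABLE + " or lower)");
        }
    }

    public static void main(String[] args) throws IOException {
        checkReadable(FORMAT_2_9);                      // accepted
        checkReadable(FORMAT_3_0_NO_COMPRESSED_FIELDS); // accepted as well
        System.out.println("readable up to version " + FORMAT_MAX_READABLE
                + ", writing version " + FORMAT_WRITTEN);
    }
}
```

Keeping the written version separate from the maximum readable version is what lets the reader become more tolerant without changing what ends up on disk.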

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

[jira] Resolved: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-2675.
-----------------------------------

    Resolution: Fixed

Committed revision: 1028723

[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926226#action_12926226 ] 

Michael McCandless commented on LUCENE-2675:
--------------------------------------------

So, first off, Lucene's bw compat policy has never ensured this
"reverse compat".  Promising this would have been exceptionally
costly, in the past.  Once a new major release "kisses" your index, it
cannot be used by older versions of Lucene.

However, now with the switch to pluggable codecs in 4.0, it would be
conceptually possible to make a codec that ensures no change to the
index format, even on upgrade of the software.  And I think such a
codec would be reasonable to offer (we already basically have the
"hard part" done, with the preflex write codec, but it's only exposed
for testing).  But... nobody has stepped up to create this for
4.0... and it's not clear anybody will of course.

That this is so trivial really is a reflection of our crazy major
release criteria in the past (ie nothing of consequence changes going
from X.9 -> (X+1).0!).  Of course this is now changed, ie, 4.0 is
changing tons from 3.x.

I do appreciate the motivation for this, ie to allow an app to "try"
updating the software (Lucene 2.x -> 3.x) fully independently of
updating the index format.  It's a valid use case.

Net/net I'm fine with this change.  But we should advertise very
clearly that this is not in general promised by Lucene.  It'll be on a
case by case basis.  EG even if someone steps up and we make a codec
for 4.0 that can read/write the 3.x index format, that's still not
ensured going forward (unless we make a change to our back-compat
policy, which is a separate discussion).


[jira] Updated: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2675:
----------------------------------

    Attachment: index.30.nocfs.zip
                LUCENE-2675.patch
                index.30.cfs.zip

[jira] Issue Comment Edited: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915496#action_12915496 ] 

Uwe Schindler edited comment on LUCENE-2675 at 9/27/10 5:39 PM:
----------------------------------------------------------------

Forget about Java versions. Almost everybody who migrates to 3.0 already uses Java 1.5 or 1.6. The problem is that during the migration phase (when you undeprecate your code) you cannot switch easily between the two versions: as soon as you touch an index with 3.0, it will no longer open in 2.9. But in reality it's the same index version; there is *no file format change* at all! The version number is simply a marker for SegmentMerger in 3.0 that it can raw-copy documents because they no longer contain compressed fields. If we had not removed compression in 3.0, the file format would have been identical.

As we declare 2.9 and 3.0 to be feature-identical even in the latest version, nobody can understand why they cannot open a 3.0 index with 2.9 and vice versa. For Unicode reasons you should then also disallow opening a 2.9 index with 3.0 :-) I get requests about this quite often (even on java-user, but also from my customers), and one user who wants to migrate to 3.0 through 2.9 asked me again today.

I repeat: the index format is identical!

Maybe there will be other comments. I will only commit this if we reach agreement, and only if we release 2.9.4.

      was (Author: thetaphi):
    Forget about Java versions. Almost everybody who migrates to 3.0 already uses Java 1.5 or 1.6. The problem is that during the migration phase (when you undeprecate your code) you cannot switch easily between the two versions: as soon as you touch an index with 3.0, it will no longer open in 2.9. But in reality it's the same index version; there is *no file format change* at all! The version number is simply a marker for SegmentMerger in 3.0 that it can raw-copy documents because they no longer contain compressed fields. If we had not removed compression in 3.0, the file format would have been identical.

As we declare 2.9 and 3.0 to be feature-identical even in the latest version, nobody can understand why they cannot open a 3.0 index with 2.9 and vice versa. For Unicode reasons you should then also disallow opening a 2.9 index with 3.0 :-) I get requests about this on java-user quite often, and one user who wants to migrate to 3.0 through 2.9 asked me again today.

I repeat: the index format is identical!

Maybe there will be other comments. I will only commit this if we reach agreement, and only if we release 2.9.4.
  
[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915492#action_12915492 ] 

Robert Muir commented on LUCENE-2675:
-------------------------------------

In my opinion this would be very confusing and set a bad precedent.

I don't understand how it would make migration easier... migration backwards?

Personally I think we should let things be. For example, someone will get confused and think they can open their 3.0-created index with 2.9 on Java 1.4, but that is a different version of Unicode.


[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926260#action_12926260 ] 

Uwe Schindler commented on LUCENE-2675:
---------------------------------------

So as Robert also seems to agree, I think I will commit this and modify changes.txt to say explicitly that this only applies to 2.9/3.0, as they share the same codebase and the same bugfix level.

[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915506#action_12915506 ] 

Robert Muir commented on LUCENE-2675:
-------------------------------------

bq. Forget about Java versions

Well, it's important to me, so I won't just forget about it. Users who don't know how their analysis works, in particular, do not know that Java 1.4 is Unicode 3.x and Java 5 is Unicode 4.x and Harmony Java 5 is Unicode 5.2 and Java 6 is Unicode 6.0.

But this is just part of the issue; I don't think we should do this. It adds too much confusion to be officially supported in any release.
Furthermore, it's not as if this can be repeated with 4.0; I would also be against adding 4.x index support to 3.x, let alone in a bugfix release.
Historically Lucene has been held back by backwards compatibility; let's not throw forward compatibility into the mix.

bq. If we would not have removed compression in 3.0, the file format would have been identical.

A great example of why a major release shouldn't be just the removal of deprecations.



[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915498#action_12915498 ] 

Uwe Schindler commented on LUCENE-2675:
---------------------------------------

Additionally, I opened this issue so I can point those people to it and they can patch their Lucene 2.9 to do what they expect, even if this gets rejected somehow. I am simply tired of sending this patch out to everyone.

[jira] Commented: (LUCENE-2675) Add support for 3.0 indexes in 2.9 branch

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/LUCENE-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915496#action_12915496 ] 

Uwe Schindler commented on LUCENE-2675:
---------------------------------------

Forget about Java versions. Almost everybody who migrates to 3.0 already uses Java 1.5 or 1.6. The problem is that during the migration phase (when you undeprecate your code) you cannot switch easily between the two versions: as soon as you touch an index with 3.0, it will no longer open in 2.9. But in reality it's the same index version; there is *no file format change* at all! The version number is simply a marker for SegmentMerger in 3.0 that it can raw-copy documents because they no longer contain compressed fields. If we had not removed compression in 3.0, the file format would have been identical.

As we declare 2.9 and 3.0 to be feature-identical even in the latest version, nobody can understand why they cannot open a 3.0 index with 2.9 and vice versa. For Unicode reasons you should then also disallow opening a 2.9 index with 3.0 :-) I get requests about this on java-user quite often, and one user who wants to migrate to 3.0 through 2.9 asked me again today.

I repeat: the index format is identical!

Maybe there will be other comments. I will only commit this if we reach agreement, and only if we release 2.9.4.
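
As a rough illustration of the "marker" role the comment above attributes to the version number, a merger could consult it like this. This is a sketch under assumptions, not SegmentMerger's actual logic: the class, method, and constant names and values are hypothetical.

```java
// Illustrative sketch only: names and values are assumptions and do not
// reproduce Lucene's SegmentMerger.
final class RawCopyMarkerSketch {
    // Hypothetical version for segments that may still hold compressed fields.
    static final int FORMAT_MAY_CONTAIN_COMPRESSED_FIELDS = 1;
    // Hypothetical version marking segments with no compressed fields.
    static final int FORMAT_NO_COMPRESSED_FIELDS = 2;

    // The bytes on disk are laid out the same way in both cases; the version
    // only tells the merger whether it must inspect each document or may
    // copy the stored fields through unchanged.
    static boolean canRawCopy(int storedFieldsFormat) {
        return storedFieldsFormat >= FORMAT_NO_COMPRESSED_FIELDS;
    }

    public static void main(String[] args) {
        System.out.println("older marker, raw copy allowed: "
                + canRawCopy(FORMAT_MAY_CONTAIN_COMPRESSED_FIELDS));
        System.out.println("newer marker, raw copy allowed: "
                + canRawCopy(FORMAT_NO_COMPRESSED_FIELDS));
    }
}
```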
