Posted to dev@lucene.apache.org by "Robert Muir (Created) (JIRA)" <ji...@apache.org> on 2011/11/15 01:22:52 UTC

[jira] [Created] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

TestBackwardsCompatibility needs terms with U+E000 to U+FFFF
------------------------------------------------------------

                 Key: LUCENE-3576
                 URL: https://issues.apache.org/jira/browse/LUCENE-3576
             Project: Lucene - Java
          Issue Type: Task
            Reporter: Robert Muir
             Fix For: 4.0


We changed the sort order in 4.0 and have sophisticated backwards-compatibility handling for it (e.g. the surrogates dance),
but we don't test this at all in TestBackwardsCompatibility.

For example, nothing handles this case for term vectors...
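
To illustrate why terms in U+E000 to U+FFFF matter here, below is a minimal, self-contained sketch (not Lucene code; the class, methods, and sample terms are made up for illustration). It sorts the same three terms by UTF-16 code unit and by unsigned UTF-8 byte. The two orders disagree only on where U+E000..U+FFFF fall relative to supplementary characters, which is why an index with such terms is needed to exercise the compatibility code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: class, methods, and sample terms are not from Lucene.
final class SortOrderDemo {
    static final List<String> TERMS = Arrays.asList(
        "\uE000",        // U+E000: one UTF-16 code unit, UTF-8 bytes EE 80 80
        "\uD800\uDC00",  // U+10000: a surrogate pair, UTF-8 bytes F0 90 80 80
        "a");            // U+0061

    /** Unsigned byte-wise comparison of the UTF-8 encodings (same as code point order). */
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xff) - (y[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    /** Terms sorted by UTF-16 code unit, which is what String.compareTo does. */
    static List<String> utf16Order() {
        List<String> sorted = new ArrayList<>(TERMS);
        sorted.sort(String::compareTo);
        return sorted;
    }

    /** Terms sorted by unsigned UTF-8 byte. */
    static List<String> utf8Order() {
        List<String> sorted = new ArrayList<>(TERMS);
        sorted.sort(SortOrderDemo::compareUtf8);
        return sorted;
    }

    /** Renders the first code point of each term in U+XXXX notation. */
    static String describe(List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String t : terms) {
            sb.append(String.format("U+%04X ", t.codePointAt(0)));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("UTF-16 order: " + describe(utf16Order()));
        System.out.println("UTF-8 order:  " + describe(utf8Order()));
    }
}
```

In UTF-16 order the surrogate pair for U+10000 (lead unit 0xD800) sorts before U+E000, while in UTF-8 order U+E000 (lead byte 0xEE) sorts before U+10000 (lead byte 0xF0).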

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3576:
--------------------------------

    Attachment: index.36.surrogates.zip
                LUCENE-3576_trunk_test.patch
                LUCENE-3576_3x_createIndex.patch

Here's the patch to make the index from 3.x's testbackwards, and a simple test for trunk that fails. I also attached the zip.

I'll work on fixing the preflex codec's termvectorsreader/writer now.
                



[jira] [Commented] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188482#comment-13188482 ] 

Michael McCandless commented on LUCENE-3576:
--------------------------------------------

+1, we need to test term vectors w/ surrogates better!
                



[jira] [Assigned] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

Posted by "Robert Muir (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir reassigned LUCENE-3576:
-----------------------------------

    Assignee: Robert Muir
    



[jira] [Updated] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3576:
--------------------------------

    Attachment: LUCENE-3576.patch

Here's a patch fixing the bug.

PreFlexRW now writes term vectors in UTF-16 order.
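
The patch itself is not reproduced in this thread. As a rough sketch of what "UTF-16 order" means for terms held as UTF-8 bytes, the comparator below applies the well-known lead-byte fix-up: lead bytes 0xEE and 0xEF (U+E000..U+FFFF) are shifted above 0xF0..0xF4 (supplementary characters) before comparing. The class and method names are hypothetical, and this is an assumption about the general technique, not a copy of what PreFlexRW does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: compares UTF-8 encoded terms
// so that the result agrees with UTF-16 code unit order.
final class Utf8AsUtf16Order {
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                // 0xEE/0xEF start U+E000..U+FFFF; 0xF0..0xF4 start supplementary
                // characters. UTF-16 order puts supplementary characters first,
                // so move 0xEE/0xEF up to 0xFC/0xFD, which valid UTF-8 never uses.
                if (x >= 0xee && y >= 0xee) {
                    if ((x & 0xfe) == 0xee) {
                        x += 0x0e;
                    }
                    if ((y & 0xfe) == 0xee) {
                        y += 0x0e;
                    }
                }
                return x - y;
            }
        }
        // One term is a prefix of the other: the shorter one sorts first.
        return a.length - b.length;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample terms straddling the boundary where the two orders differ.
        String[] terms = {"a", "\uD7FF", "\uE000", "\uFFFF", "\uD800\uDC00", "\uDBFF\uDFFF"};
        for (String s : terms) {
            for (String t : terms) {
                int expected = Integer.signum(s.compareTo(t));
                int actual = Integer.signum(compare(utf8(s), utf8(t)));
                if (expected != actual) {
                    throw new AssertionError("order mismatch for a pair of sample terms");
                }
            }
        }
        System.out.println("byte-level comparator agrees with String.compareTo");
    }
}
```

The main method cross-checks the byte-level comparator against String.compareTo, which compares by UTF-16 code unit, for every pair of sample terms.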
                



[jira] [Resolved] (LUCENE-3576) TestBackwardsCompatibility needs terms with U+E000 to U+FFFF

Posted by "Robert Muir (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-3576.
---------------------------------

    Resolution: Fixed
    
