Posted to dev@lucene.apache.org by "Robert Muir (Created) (JIRA)" <ji...@apache.org> on 2011/12/14 13:13:30 UTC

[jira] [Created] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

DocValues merging is not associative, leading to different results depending upon how merges execute
----------------------------------------------------------------------------------------------------

                 Key: LUCENE-3647
                 URL: https://issues.apache.org/jira/browse/LUCENE-3647
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Robert Muir


Recently I cranked up TestDuelingCodecs to actually test docvalues (previously it wasn't testing them at all).

This test is simple: it indexes the same random content with 2 different IndexWriters, and just allows them
to use different codecs with different IndexWriterConfigs.

Then it asserts the indexes are equal.

Sometimes, always on the BYTES_FIXED_DEREF type, we end up with one reader that has a zero-filled byte[] for a doc,
but that same document in the other reader has no docvalues at all.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3647:
--------------------------------

    Attachment: LUCENE-3647.patch

Updated patch: fixes the test failure, which was a missing break in the switch. I promise I won't write any more patches until I wake up with more coffee or beer.
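
For readers unfamiliar with this class of bug, here is a minimal sketch of how a missing break changes the outcome of a Java switch. It is not taken from the attached patch: the class name, the Kind enum, and the handler strings are all hypothetical and exist only to show the fall-through.

```java
public class SwitchBreakSketch {
    // Hypothetical stand-in for a value type; not a Lucene enum.
    enum Kind { FIXED, VAR }

    /** Buggy variant: the FIXED case has no break and falls through into VAR. */
    static String buggy(Kind kind) {
        String handler = "unset";
        switch (kind) {
            case FIXED:
                handler = "fixed";
                // missing break: execution continues into the next case
            case VAR:
                handler = "var";
                break;
        }
        return handler;
    }

    /** Corrected variant: each case ends with its own break. */
    static String corrected(Kind kind) {
        String handler = "unset";
        switch (kind) {
            case FIXED:
                handler = "fixed";
                break;
            case VAR:
                handler = "var";
                break;
        }
        return handler;
    }

    public static void main(String[] args) {
        System.out.println(buggy(Kind.FIXED));     // the FIXED result is overwritten
        System.out.println(corrected(Kind.FIXED)); // the FIXED result survives
    }
}
```

In the buggy variant the FIXED input ends up handled as if it were VAR, which is the kind of silent misrouting a forgotten break produces.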

                


[jira] [Resolved] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

Posted by "Robert Muir (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-3647.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

All 3 failed Hudson seeds pass now.
                


[jira] [Updated] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3647:
--------------------------------

    Attachment: LUCENE-3647_multi.patch

Hmm, my debugging there is a little bogus (just ignore the previous file; you can use the existing test with that seed), but I think the synopsis is still correct.

I think, as a start, don't we need to be careful when handling these fixed types in all places? Here's a patch for MultiDocValues that should fix some bugs related to this (unfortunately the seed still fails).

MultiDocValues isn't actually used during merging, so we need to investigate other parts too and probably do the same?
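
One possible reading of "being careful" with fixed types, sketched as a toy model: a multi-segment view that treats a segment lacking the field the same way as a segment that zero-filled it, so the answer for a doc no longer depends on where the segment boundaries fall. This is an assumption for illustration only. It is not MultiDocValues and not the content of the attached patch, and the class name and data layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: not MultiDocValues and not the attached patch. */
public class UniformViewSketch {
    /**
     * Each segment is either null (field absent from that segment) or holds one
     * fixed-size byte[] per doc. docCounts[i] is the number of docs in segment i.
     * Returns a printable description of every doc's value across all segments.
     */
    static List<String> view(byte[][][] segments, int[] docCounts) {
        // Find the fixed size from any segment that actually has values.
        int fixedSize = -1;
        for (byte[][] seg : segments) {
            if (seg != null && seg.length > 0) {
                fixedSize = seg[0].length;
                break;
            }
        }
        List<String> out = new ArrayList<>();
        for (int s = 0; s < segments.length; s++) {
            for (int d = 0; d < docCounts[s]; d++) {
                if (fixedSize < 0) {
                    out.add("none"); // no segment has the field at all
                } else if (segments[s] == null) {
                    // Field absent here: report the same default a fill would have produced.
                    out.add(Arrays.toString(new byte[fixedSize]));
                } else {
                    out.add(Arrays.toString(segments[s][d]));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] v = {1, 2, 3, 4};
        // Same three docs, two different segmentations.
        byte[][][] a = { null, { new byte[4], v } }; // segments of 1 and 2 docs
        byte[][][] b = { null, { v } };              // segments of 2 and 1 docs
        System.out.println(view(a, new int[] {1, 2}));
        System.out.println(view(b, new int[] {2, 1}));
    }
}
```

Under this model both segmentations print the same three values. Whether the real fix normalizes toward the default or toward "no value" is not stated in this thread.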
                


[jira] [Commented] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169289#comment-13169289 ] 

Simon Willnauer commented on LUCENE-3647:
-----------------------------------------

My first guess is that this is due to the different IWCs: if you don't specify a value for a field in one IW before the segment is flushed, it will not write anything out. But if, for instance, in your second IW the last doc in a segment has a value, DV fills in default values for the other docs. Is that something which could happen here? The same is true if you merge fields, i.e. if you have slightly different merge policies?
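
The guess above can be made concrete with a toy model. This sketch assumes exactly the behavior described in the comment (a segment with no values writes nothing; a segment with at least one value zero-fills the rest) and is not Lucene code. The class name, the FIXED_SIZE constant, and the "none" marker are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: not Lucene code. */
public class FlushFillSketch {
    static final int FIXED_SIZE = 4; // hypothetical fixed value length

    /**
     * Simulates flushing one segment. Each entry is a doc's value, or null if the
     * doc did not specify one. Returns null when no doc had a value (nothing written).
     */
    static byte[][] flush(byte[][] docs) {
        boolean any = false;
        for (byte[] d : docs) {
            if (d != null) any = true;
        }
        if (!any) return null;
        byte[][] out = new byte[docs.length][];
        for (int i = 0; i < docs.length; i++) {
            // Docs without a value get the zero-filled default.
            out[i] = docs[i] != null ? docs[i].clone() : new byte[FIXED_SIZE];
        }
        return out;
    }

    /** Flushes the doc stream in segments ending at the given boundaries and describes each doc. */
    static List<String> view(byte[][] docs, int... segmentEnds) {
        List<String> result = new ArrayList<>();
        int start = 0;
        for (int end : segmentEnds) {
            byte[][] seg = flush(Arrays.copyOfRange(docs, start, end));
            for (int i = 0; i < end - start; i++) {
                result.add(seg == null ? "none" : Arrays.toString(seg[i]));
            }
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        // Same content for both writers: only the last doc has a value.
        byte[][] docs = { null, null, {1, 2, 3, 4} };
        System.out.println(view(docs, 1, 3)); // writer 1 flushes after doc 0
        System.out.println(view(docs, 2, 3)); // writer 2 flushes after doc 1
    }
}
```

With these inputs, the middle doc shares a segment with the valued doc under the first writer and is zero-filled, while under the second writer it lands in a segment with no values and has none at all. That matches the symptom in the issue description, although the thread does not confirm this as the root cause.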


                


[jira] [Updated] (LUCENE-3647) DocValues merging is not associative, leading to different results depending upon how merges execute

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3647:
--------------------------------

    Attachment: LUCENE-3647_test.patch

Here's the test. Run:

    ant test -Dtestcase=TestDuelingCodecs -Dtestmethod=testEquals -Dtests.seed=-40a075cbf2de8088:-42be31e45e2a3e63:-1340cc72c4576f5a -Dtests.multiplier=3 -Dargs="-Dfile.encoding=ISO8859-1"
                