You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael McCandless (Created) (JIRA)" <ji...@apache.org> on 2012/03/14 23:40:39 UTC

[jira] [Created] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

VarDerefBytesImpl doc values prefix length may fall across two pages
--------------------------------------------------------------------

                 Key: LUCENE-3870
                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Michael McCandless
             Fix For: 4.0


The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.

It does this itself rather than using PagedBytes.copyUsingLengthPrefix...

The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3870:
---------------------------------------

    Attachment: LUCENE-3870.patch

Patch with test case showing the failure:
{noformat}
1) testLengthPrefixAcrossTwoPages(org.apache.lucene.index.TestDocValuesIndexing)
java.lang.ArrayIndexOutOfBoundsException: 32768
	at org.apache.lucene.util.PagedBytes$Reader.fillSliceWithPrefix(PagedBytes.java:204)
	at org.apache.lucene.codecs.lucene40.values.VarDerefBytesImpl$VarDerefSource.getBytes(VarDerefBytesImpl.java:124)
	at org.apache.lucene.index.TestDocValuesIndexing.testLengthPrefixAcrossTwoPages(TestDocValuesIndexing.java:956)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729)
	at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645)
	at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
	at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556)
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
	at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
	at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
	at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at org.junit.runners.Suite.runChild(Suite.java:128)
	at org.junit.runners.Suite.runChild(Suite.java:24)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:136)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
	at org.junit.runner.JUnitCore.runMain(JUnitCore.java:98)
	at org.junit.runner.JUnitCore.runMainAndExit(JUnitCore.java:53)
	at org.junit.runner.JUnitCore.main(JUnitCore.java:45)
{noformat}

Seems like we either have to fix DV to ensure prefix always falls in a single block, or fix PagedBytes to tolerate a 2 byte prefix spanning two blocks...
                
> VarDerefBytesImpl doc values prefix length may fall across two pages
> --------------------------------------------------------------------
>
>                 Key: LUCENE-3870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>             Fix For: 4.0
>
>         Attachments: LUCENE-3870.patch
>
>
> The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.
> It does this itself rather than using PagedBytes.copyUsingLengthPrefix...
> The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

Posted by "Simon Willnauer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3870:
------------------------------------

    Attachment: LUCENE-3870.patch

nice catch mike! Here is a patch fixing PagedBytes.Reader#fillSliceWithPrefix this is a general trap and should be fixed in this method IMO
                
> VarDerefBytesImpl doc values prefix length may fall across two pages
> --------------------------------------------------------------------
>
>                 Key: LUCENE-3870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3870.patch, LUCENE-3870.patch
>
>
> The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.
> It does this itself rather than using PagedBytes.copyUsingLengthPrefix...
> The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231033#comment-13231033 ] 

Michael McCandless commented on LUCENE-3870:
--------------------------------------------

+1, looks good Simon!

Just remember to remove that sop...
                
> VarDerefBytesImpl doc values prefix length may fall across two pages
> --------------------------------------------------------------------
>
>                 Key: LUCENE-3870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3870.patch, LUCENE-3870.patch
>
>
> The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.
> It does this itself rather than using PagedBytes.copyUsingLengthPrefix...
> The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Assigned] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

Posted by "Simon Willnauer (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-3870:
---------------------------------------

    Assignee: Simon Willnauer
    
> VarDerefBytesImpl doc values prefix length may fall across two pages
> --------------------------------------------------------------------
>
>                 Key: LUCENE-3870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3870.patch
>
>
> The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.
> It does this itself rather than using PagedBytes.copyUsingLengthPrefix...
> The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Resolved] (LUCENE-3870) VarDerefBytesImpl doc values prefix length may fall across two pages

Posted by "Simon Willnauer (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer resolved LUCENE-3870.
-------------------------------------

       Resolution: Fixed
    Lucene Fields: New,Patch Available  (was: New)

fixed
                
> VarDerefBytesImpl doc values prefix length may fall across two pages
> --------------------------------------------------------------------
>
>                 Key: LUCENE-3870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3870
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3870.patch, LUCENE-3870.patch
>
>
> The VarDerefBytesImpl doc values encodes the unique byte[] with prefix (1 or 2 bytes) first, followed by bytes, so that it can use PagedBytes.fillSliceWithPrefix.
> It does this itself rather than using PagedBytes.copyUsingLengthPrefix...
> The problem is, it can write an invalid 2 byte prefix spanning two blocks (ie, last byte of block N and first byte of block N+1), which fillSliceWithPrefix won't decode correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org