Posted to dev@lucene.apache.org by "Steve Rowe (JIRA)" <ji...@apache.org> on 2015/08/08 17:12:46 UTC
[jira] [Reopened] (LUCENE-6697) Use 1D KD tree for alternative to
postings based numeric range filters
[ https://issues.apache.org/jira/browse/LUCENE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Rowe reopened LUCENE-6697:
--------------------------------
I'm seeing a 100% reproducible failure on branch_5x:
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestRangeTree -Dtests.method=testMultiValued -Dtests.seed=FD1D848DDE038459 -Dtests.slow=true -Dtests.locale=hr_HR -Dtests.timezone=Europe/Madrid -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 0.05s J8 | TestRangeTree.testMultiValued <<<
[junit4] > Throwable #1: java.lang.IllegalArgumentException: maxValuesSortInHeap must be >= maxValuesInLeafNode; got 1250 vs maxValuesInLeafNode=2013
[junit4] > at __randomizedtesting.SeedInfo.seed([FD1D848DDE038459:293DE0BF10C1C411]:0)
[junit4] > at org.apache.lucene.rangetree.RangeTreeWriter.verifyParams(RangeTreeWriter.java:114)
[junit4] > at org.apache.lucene.rangetree.RangeTreeDocValuesFormat.<init>(RangeTreeDocValuesFormat.java:98)
[junit4] > at org.apache.lucene.rangetree.TestRangeTree.testMultiValued(TestRangeTree.java:128)
[junit4] > at java.lang.Thread.run(Thread.java:745)
[junit4] 2> NOTE: test params are: codec=Asserting(Lucene53): {}, docValues:{}, sim=DefaultSimilarity, locale=hr_HR, timezone=Europe/Madrid
[junit4] 2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_45 (64-bit)/cpus=16,threads=1,free=441475104,total=504365056
[junit4] 2> NOTE: All tests run in this JVM: [TestRangeTree]
{noformat}
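The exception message implies that the format's constructor rejects any parameter pair where maxValuesSortInHeap is smaller than maxValuesInLeafNode, and that this seed made the test draw 1250 vs 2013. Below is a minimal sketch of such a check, plus one way a randomized test could draw a pair that always satisfies it. The class name, method names, and random bounds are hypothetical and only illustrate the constraint; this is not the actual RangeTreeWriter or TestRangeTree code.

```java
import java.util.Random;

class RangeTreeParamsSketch {
    // Hypothetical stand-in for the check implied by the exception message above.
    static void verifyParams(int maxValuesInLeafNode, int maxValuesSortInHeap) {
        if (maxValuesSortInHeap < maxValuesInLeafNode) {
            throw new IllegalArgumentException(
                "maxValuesSortInHeap must be >= maxValuesInLeafNode; got " + maxValuesSortInHeap
                + " vs maxValuesInLeafNode=" + maxValuesInLeafNode);
        }
    }

    // Draw the leaf size first, then derive the heap limit from it, so the
    // pair can never violate the constraint. The bounds are arbitrary.
    static int[] randomValidParams(Random random) {
        int maxValuesInLeafNode = 50 + random.nextInt(3000);
        int maxValuesSortInHeap = maxValuesInLeafNode + random.nextInt(3000);
        return new int[] {maxValuesInLeafNode, maxValuesSortInHeap};
    }

    public static void main(String[] args) {
        int[] params = randomValidParams(new Random(42L));
        verifyParams(params[0], params[1]); // never throws for a derived pair
        try {
            verifyParams(2013, 1250);       // the pair from the failing seed
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Drawing the two values independently, as the failing seed suggests the test may do, can produce an invalid pair; deriving one from the other avoids that.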
> Use 1D KD tree for alternative to postings based numeric range filters
> ----------------------------------------------------------------------
>
> Key: LUCENE-6697
> URL: https://issues.apache.org/jira/browse/LUCENE-6697
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.3, Trunk
>
> Attachments: LUCENE-6697.patch, LUCENE-6697.patch, LUCENE-6697.patch
>
>
> Today Lucene uses postings to index a numeric value at multiple
> precision levels for fast range searching. It's somewhat costly: each
> numeric value is indexed with multiple terms (4 terms by default)
> ... I think a dedicated 1D BKD tree should be more compact and perform
> better.
> It should also easily generalize beyond 64 bits to arbitrary byte[],
> e.g. for LUCENE-5596, but I haven't explored that here.
> A 1D BKD tree just sorts all values, and then indexes adjacent leaf
> blocks of 512-1024 values each (by default), along with their
> docIDs, into a fully balanced binary tree. Building the range filter
> is then just a recursive walk through this tree.
> It's the same structure we use for the 2D lat/lon BKD tree, just in
> 1D. I implemented it as a DocValuesFormat that also writes the
> numeric tree on the side.
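
The structure described in the quoted summary can be sketched in memory as follows: sort (value, docID) pairs by value, cut them into fixed-size leaf blocks, and answer a range query with a recursive walk over a balanced binary tree of block ranges. This is a simplified illustration only. The class name, the in-memory arrays, and the implicit tree are assumptions; the actual patch writes its tree to disk alongside the doc values and is not reproduced here.

```java
import java.util.Arrays;
import java.util.BitSet;

class OneDimTreeSketch {
    private final long[] values;  // sorted ascending
    private final int[] docIDs;   // docIDs[i] is the document owning values[i]
    private final int leafSize;   // values per leaf block
    private final int numLeaves;

    OneDimTreeSketch(long[] vals, int[] docs, int leafSize) {
        // Sort positions by value so each leaf block holds adjacent values.
        Integer[] order = new Integer[vals.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Long.compare(vals[a], vals[b]));
        this.values = new long[vals.length];
        this.docIDs = new int[vals.length];
        for (int i = 0; i < order.length; i++) {
            values[i] = vals[order[i]];
            docIDs[i] = docs[order[i]];
        }
        this.leafSize = leafSize;
        this.numLeaves = (vals.length + leafSize - 1) / leafSize;
    }

    // Collect the docIDs of all values in [min, max], both ends inclusive.
    BitSet range(long min, long max) {
        BitSet hits = new BitSet();
        if (numLeaves > 0) walk(0, numLeaves, min, max, hits);
        return hits;
    }

    // Visit the node covering leaf blocks [leafLo, leafHi).
    private void walk(int leafLo, int leafHi, long min, long max, BitSet hits) {
        int start = leafLo * leafSize;
        int end = Math.min(values.length, leafHi * leafSize);
        long cellMin = values[start];
        long cellMax = values[end - 1];
        if (cellMax < min || cellMin > max) {
            return;                                   // disjoint: skip the subtree
        }
        if (cellMin >= min && cellMax <= max) {
            for (int i = start; i < end; i++) {       // fully inside: no per-value check
                hits.set(docIDs[i]);
            }
        } else if (leafHi - leafLo == 1) {
            for (int i = start; i < end; i++) {       // boundary leaf: filter each value
                if (values[i] >= min && values[i] <= max) hits.set(docIDs[i]);
            }
        } else {
            int mid = (leafLo + leafHi) >>> 1;        // split the block range in half
            walk(leafLo, mid, min, max, hits);
            walk(mid, leafHi, min, max, hits);
        }
    }

    public static void main(String[] args) {
        // Doc 1 is multi-valued (values 1 and 3); leaf blocks hold 2 values each.
        long[] vals = {5, 1, 9, 3, 7, 3};
        int[] docs = {0, 1, 2, 3, 4, 1};
        OneDimTreeSketch tree = new OneDimTreeSketch(vals, docs, 2);
        System.out.println(tree.range(3, 7)); // prints {0, 1, 3, 4}
    }
}
```

Only leaf blocks that straddle a query boundary need per-value filtering; subtrees entirely inside the range contribute all their docIDs, and subtrees entirely outside are skipped.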
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)