Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2016/08/01 12:32:21 UTC

[jira] [Commented] (LUCENE-7401) BKDWriter should ensure all dimensions are indexed

    [ https://issues.apache.org/jira/browse/LUCENE-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401949#comment-15401949 ] 

Michael McCandless commented on LUCENE-7401:
--------------------------------------------

bq. what happens eg. if you want to index all towns in the world alongside their population as a 3rd dimension. Given that there are very large areas that only have small towns, it could happen that the population dimension does not get indexed at all in these areas?

That's a good example!  In that case, with our current splitting, running a range filter for "small population" will be costly.  Though, without other filters (by lat/lon), it would likely be costly anyway, since town population probably follows something like Zipf's law.  I.e., most areas will still have many more small-population towns than big ones.

bq. Hmm this got me curious, why is it an adversarial case if all points are equidistant from an origin?

Oh, it results in long, slivery KD cells, which means queries have to visit too many points.

> BKDWriter should ensure all dimensions are indexed
> --------------------------------------------------
>
>                 Key: LUCENE-7401
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7401
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Priority: Minor
>
> The current heuristic is to use the dimension that has the largest span, so if dimensions have a different distribution of values, we could theoretically (but maybe in practice too?) end up with one dimension that is not indexed at all and queries that are mostly selective on this dimension would need to scan lots of blocks.
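
To make the concern in the issue description concrete, here is a minimal sketch of a "split on the dimension with the largest span" heuristic. It is not Lucene's BKDWriter code: the function names (widest_span_dim, count_splits), the median split, the leaf size, and the synthetic data are all assumptions chosen for illustration, and the values are assumed to be already normalised to a comparable scale.

```python
from typing import List, Sequence

Point = Sequence[float]


def widest_span_dim(points: List[Point]) -> int:
    """Return the index of the dimension whose (max - min) span is largest."""
    num_dims = len(points[0])
    spans = [
        max(p[d] for p in points) - min(p[d] for p in points)
        for d in range(num_dims)
    ]
    return spans.index(max(spans))


def count_splits(points: List[Point], leaf_size: int, counts: List[int]) -> None:
    """Recursively split at the median of the widest dimension and tally
    how many times each dimension was chosen as the split dimension."""
    if len(points) <= leaf_size:
        return
    dim = widest_span_dim(points)
    counts[dim] += 1
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    count_splits(ordered[:mid], leaf_size, counts)
    count_splits(ordered[mid:], leaf_size, counts)


# Hypothetical data: a 16 x 16 grid in the first two dimensions, plus a
# third dimension whose values are squeezed into a very narrow range.
points = [
    (i / 16, j / 16, ((i * 16 + j) % 7) * 0.001)
    for i in range(16)
    for j in range(16)
]

counts = [0, 0, 0]
count_splits(points, leaf_size=4, counts=counts)
print(counts)  # the last entry stays 0: the narrow dimension is never split
```

In this toy data set the third dimension never has the largest span in any cell, so it is never used for a split, and a range filter on that dimension alone could not prune any cell.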


