Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2019/02/22 08:01:00 UTC

[jira] [Updated] (LUCENE-8705) Compress BKD trees by encoding the difference between two dimensions

     [ https://issues.apache.org/jira/browse/LUCENE-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-8705:
---------------------------------
    Summary: Compress BKD trees by encoding the difference between two dimensions  (was: Compress BKD trees by only encoding the difference between two dimensions)

> Compress BKD trees by encoding the difference between two dimensions
> --------------------------------------------------------------------
>
>                 Key: LUCENE-8705
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8705
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Priority: Minor
>
> When serializing BKD trees to disk, we compute the common prefix of each dimension in isolation for every block and encode each of those common prefixes only once for the entire block. Now that range fields and shapes store related data across several dimensions, values might sometimes share longer common prefixes with values in other dimensions. For instance, when indexing narrow ranges in a range field, we might get better compression on the second dimension by encoding only the suffixes that differ from the first dimension. This is also an obvious win when indexing lines or points as shapes, since some dimensions record exactly the same values in that case.
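
The idea in the description can be illustrated with a small, self-contained sketch. It is not Lucene's BKDWriter code: the class name, the method names, the in-memory block layout (byte[point][dimension][byte]) and the plain big-endian int encoding are all assumptions made for illustration, and the issue does not specify how the cross-dimension prefix would actually be stored on disk. The sketch compares the prefix shared by all values of one dimension with the prefix that each value shares with the first dimension of the same point.

```java
import java.util.Arrays;

/** Hypothetical sketch of the idea in this issue; not Lucene's actual BKD code. */
final class CrossDimPrefixSketch {

  /** Index of the first differing byte, or the shorter length if none differs. */
  static int mismatch(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      if (a[i] != b[i]) {
        return i;
      }
    }
    return n;
  }

  /** Existing scheme: prefix shared by all values of one dimension in the block. */
  static int perDimensionPrefix(byte[][][] block, int dim) {
    int len = block[0][dim].length;
    for (byte[][] point : block) {
      len = Math.min(len, mismatch(block[0][dim], point[dim]));
    }
    return len;
  }

  /** Proposed idea (assumed form): prefix each point shares between refDim and dim. */
  static int crossDimensionPrefix(byte[][][] block, int refDim, int dim) {
    int len = block[0][dim].length;
    for (byte[][] point : block) {
      len = Math.min(len, mismatch(point[refDim], point[dim]));
    }
    return len;
  }

  /** Bytes that still need to be written once the first prefixLen bytes are implied. */
  static byte[] suffix(byte[] value, int prefixLen) {
    return Arrays.copyOfRange(value, prefixLen, value.length);
  }

  /** Rebuilds a value from the reference dimension's prefix and the stored suffix. */
  static byte[] restore(byte[] reference, int prefixLen, byte[] suffix) {
    byte[] value = new byte[prefixLen + suffix.length];
    System.arraycopy(reference, 0, value, 0, prefixLen);
    System.arraycopy(suffix, 0, value, prefixLen, suffix.length);
    return value;
  }

  /** Plain big-endian encoding of an int (an assumption, chosen for readability). */
  static byte[] toBytes(int v) {
    return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
  }

  public static void main(String[] args) {
    // Three narrow ranges: dimension 0 is the lower bound, dimension 1 the upper bound.
    byte[][][] block = {
      {toBytes(0x00010203), toBytes(0x00010207)},
      {toBytes(0x00FF1000), toBytes(0x00FF1010)},
      {toBytes(0x12345678), toBytes(0x1234567F)},
    };
    System.out.println("per-dimension prefix of dim 1: " + perDimensionPrefix(block, 1));
    System.out.println("prefix shared with dim 0:      " + crossDimensionPrefix(block, 0, 1));
  }
}
```

With this sample block, the upper bounds have no common prefix among themselves (the first bytes are 0x00, 0x00 and 0x12), so the per-dimension scheme writes all 4 bytes of every upper bound. Each upper bound shares its first 3 bytes with its own lower bound, so under the assumed cross-dimension scheme only 1 suffix byte per point would remain. For lines or points indexed as shapes, where two dimensions hold identical values, the shared prefix would cover the whole value and no suffix bytes would remain.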



