Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2019/02/22 07:57:00 UTC
[jira] [Created] (LUCENE-8705) Compress BKD trees by only encoding
the difference between two dimensions
Adrien Grand created LUCENE-8705:
------------------------------------
Summary: Compress BKD trees by only encoding the difference between two dimensions
Key: LUCENE-8705
URL: https://issues.apache.org/jira/browse/LUCENE-8705
Project: Lucene - Core
Issue Type: Bug
Reporter: Adrien Grand
When serializing BKD trees to disk, for each block we compute the common prefix of each dimension in isolation and encode it only once for the entire block. Now that we have range fields and shapes, where several dimensions store related data, a value might occasionally share a longer prefix with the same point's value in another dimension than with the other values of its own dimension. For instance, when indexing narrow ranges in a range field, we might get better compression on the second dimension by only encoding the suffixes that differ from the first dimension. This is also an obvious win when indexing lines or points as shapes, since some dimensions record exactly the same values in that case.
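To make the idea concrete, here is a minimal sketch in Java. It is not the actual BKD writer code: the class name, the method names and the sample block are all hypothetical, and values are assumed to be fixed-length big-endian byte arrays in a non-empty block. It compares the prefix shared by all values of one dimension in a block with the prefix that each value of the second dimension shares with the first dimension of the same point.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; none of these names exist in Lucene.
public class CrossDimPrefixSketch {

    /** Encodes an int as 4 big-endian bytes (sign handling is ignored in this sketch). */
    static byte[] toBytes(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    /** Length of the common prefix of two equal-length byte arrays. */
    static int commonPrefix(byte[] a, byte[] b) {
        int i = 0;
        while (i < a.length && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Per-dimension approach: prefix shared by all values of {@code dim} in the block. */
    static int perDimensionPrefix(byte[][][] block, int dim) {
        int prefix = block[0][dim].length;
        for (int i = 1; i < block.length; i++) {
            prefix = Math.min(prefix, commonPrefix(block[0][dim], block[i][dim]));
        }
        return prefix;
    }

    /** Cross-dimension idea: prefix that every value of {@code dim} shares with {@code refDim} of the same point. */
    static int crossDimensionPrefix(byte[][][] block, int refDim, int dim) {
        int prefix = Integer.MAX_VALUE; // assumes the block is non-empty
        for (byte[][] point : block) {
            prefix = Math.min(prefix, commonPrefix(point[refDim], point[dim]));
        }
        return prefix;
    }

    /** A made-up block of narrow ranges: block[point][dimension] = {min, max}. */
    static byte[][][] exampleBlock() {
        return new byte[][][] {
            { toBytes(0x00010203), toBytes(0x00010207) },
            { toBytes(0x7A000010), toBytes(0x7A000015) },
            { toBytes(0x33445566), toBytes(0x33445570) },
        };
    }

    public static void main(String[] args) {
        byte[][][] block = exampleBlock();
        System.out.println("per-dimension prefix of dim 1: "
            + perDimensionPrefix(block, 1) + " bytes");
        System.out.println("cross-dimension prefix of dim 1 vs dim 0: "
            + crossDimensionPrefix(block, 0, 1) + " bytes");
    }
}
```

In this made-up block the minimum values are scattered, so the second dimension has no common prefix across the block (0 bytes), yet every maximum shares its first 3 bytes with the minimum of the same range. Under these assumptions only 1 suffix byte per value would be needed for the second dimension instead of 4. How such a prefix would actually be encoded on disk is left open here.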