Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2014/05/12 13:38:15 UTC
[jira] [Created] (LUCENE-5667) Optimize common-prefix across all terms in a field
Michael McCandless created LUCENE-5667:
------------------------------------------
Summary: Optimize common-prefix across all terms in a field
Key: LUCENE-5667
URL: https://issues.apache.org/jira/browse/LUCENE-5667
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
I tested different UUID sources in Lucene
(http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html)
and was surprised to see that Flake IDs were slower than UUID V1.
Both use the same raw sources of info (timestamp, node id, sequence
counter), but a Flake ID preserves total order by keeping the
timestamp "intact" in the leading 64 bits.
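To make the layout concrete, here is a minimal sketch of a Flake-style
ID. Only the "timestamp in the leading 64 bits" part comes from the
description above; the 48-bit node id and 16-bit sequence widths, and
the names FlakeIdSketch, flakeId, compareUnsigned and
sharedPrefixLength, are assumptions made for illustration.

```java
import java.nio.ByteBuffer;

final class FlakeIdSketch {
    // Assumed layout (not taken from the issue): 8 bytes timestamp,
    // 6 bytes node id, 2 bytes sequence counter, all big-endian.
    static byte[] flakeId(long timestampMillis, long nodeId, int sequence) {
        ByteBuffer buf = ByteBuffer.allocate(16); // big-endian by default
        buf.putLong(timestampMillis);
        // Write the low 48 bits of the node id, most significant byte first.
        for (int shift = 40; shift >= 0; shift -= 8) {
            buf.put((byte) (nodeId >>> shift));
        }
        buf.putShort((short) sequence);
        return buf.array();
    }

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int limit = Math.min(a.length, b.length);
        for (int i = 0; i < limit; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Number of leading bytes the two ids have in common.
    static int sharedPrefixLength(byte[] a, byte[] b) {
        int limit = Math.min(a.length, b.length);
        int i = 0;
        while (i < limit && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] earlier = flakeId(1000L, 0xABCDEFL, 5);
        byte[] later = flakeId(1001L, 0x1L, 0);
        System.out.println("order: " + compareUnsigned(earlier, later));
        System.out.println("shared bytes: " + sharedPrefixLength(earlier, later));
    }
}
```

Because the timestamp leads and is written big-endian, unsigned
byte-wise order follows generation time for non-negative timestamps,
and IDs generated close together in time share most of their leading
bytes.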
I suspect the reason is that a Flake ID will typically have a
longish common prefix across all docs. We might be able to optimize
this in block-tree by storing that common prefix outside of the FST,
or maybe by just pre-computing the common prefix on init and storing
the "effective" start node for the FST.
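The "store the common prefix outside" idea can be sketched without
any Lucene classes. The code below is only an illustration: a TreeSet
of suffixes stands in for the terms index, and CommonPrefixSketch and
its methods are hypothetical names, not block-tree or FST APIs. It
relies on the terms being sorted in natural String order, so the
prefix shared by the first and last term is shared by every term in
between.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class CommonPrefixSketch {
    private final String prefix;                       // stored once, outside the index
    private final SortedSet<String> suffixes = new TreeSet<>(); // stand-in for the FST

    // Assumes sortedTerms uses natural String ordering.
    CommonPrefixSketch(SortedSet<String> sortedTerms) {
        // In sorted order, first and last bound the prefix shared by all terms.
        prefix = sortedTerms.isEmpty()
            ? ""
            : commonPrefix(sortedTerms.first(), sortedTerms.last());
        for (String term : sortedTerms) {
            suffixes.add(term.substring(prefix.length()));
        }
    }

    static String commonPrefix(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }

    String prefix() {
        return prefix;
    }

    // One prefix check up front, then only the suffix is looked up.
    boolean contains(String term) {
        return term.startsWith(prefix)
            && suffixes.contains(term.substring(prefix.length()));
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
            Arrays.asList("0000014a3f01", "0000014a3f02", "0000014a40ff"));
        CommonPrefixSketch dict = new CommonPrefixSketch(terms);
        System.out.println("prefix: " + dict.prefix());
        System.out.println("hit: " + dict.contains("0000014a3f02"));
    }
}
```

A lookup that fails the prefix check is rejected without touching the
index at all, and every other lookup skips the shared leading
characters, which is the saving the proposal is after.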