Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2016/02/02 12:01:39 UTC

[jira] [Created] (LUCENE-7007) Reduce block-tree GC/CPU cost when flushing or merging postings

Michael McCandless created LUCENE-7007:
------------------------------------------

             Summary: Reduce block-tree GC/CPU cost when flushing or merging postings
                 Key: LUCENE-7007
                 URL: https://issues.apache.org/jira/browse/LUCENE-7007
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless


Writing postings is currently a GC- and CPU-heavy operation, in part
because of how block tree recursively builds up the tree structure: it
creates many tiny FSTs, which it inefficiently merges together as it
walks up the tree, eventually reaching the root block.
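
To make that allocation pattern concrete, here is a toy sketch. It is
not Lucene's block-tree code: the class name ToyBlockTree, the
maxBlockSize parameter, and the use of a TreeMap as a stand-in for each
tiny per-block FST are all assumptions made for illustration. It only
shows how building one small index per block and copying it into the
parent on the way up multiplies short-lived objects and copy work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Toy model only. A TreeMap stands in for a tiny per-block FST. */
class ToyBlockTree {
    /** Number of small per-block index objects allocated. */
    int tinyIndexCount = 0;
    /** Number of entries copied from child indexes into parent indexes. */
    int copiedEntries = 0;

    /**
     * Builds an index (block prefix -> number of terms held directly by
     * that block) for sorted terms that all start with the given prefix.
     * Blocks larger than maxBlockSize are split by their next character,
     * and every child index is merged into its parent.
     */
    TreeMap<String, Integer> build(List<String> sortedTerms, String prefix, int maxBlockSize) {
        tinyIndexCount++;
        TreeMap<String, Integer> index = new TreeMap<>();
        if (sortedTerms.size() <= maxBlockSize) {
            // Small enough: a single leaf block holds all of these terms.
            index.put(prefix, sortedTerms.size());
            return index;
        }
        int own = 0; // terms equal to the prefix stay in this block
        int start = 0;
        while (start < sortedTerms.size()) {
            String term = sortedTerms.get(start);
            if (term.length() == prefix.length()) {
                own++;
                start++;
                continue;
            }
            // Find the run of terms sharing the same next character.
            char label = term.charAt(prefix.length());
            int end = start + 1;
            while (end < sortedTerms.size()
                    && sortedTerms.get(end).length() > prefix.length()
                    && sortedTerms.get(end).charAt(prefix.length()) == label) {
                end++;
            }
            TreeMap<String, Integer> child =
                build(sortedTerms.subList(start, end), prefix + label, maxBlockSize);
            // The "merge while walking up the tree" step: the child index
            // is copied into the parent and then becomes garbage.
            copiedEntries += child.size();
            index.putAll(child);
            start = end;
        }
        index.put(prefix, own);
        return index;
    }

    public static void main(String[] args) {
        ToyBlockTree tree = new ToyBlockTree();
        List<String> terms = Arrays.asList("a", "ab", "abc", "abd", "b", "ba", "bb", "c");
        TreeMap<String, Integer> root = tree.build(terms, "", 2);
        System.out.println("root index entries: " + root.size());
        System.out.println("tiny indexes allocated: " + tree.tinyIndexCount);
        System.out.println("entries copied during merges: " + tree.copiedEntries);
    }
}
```

In this toy model every entry is copied once per ancestor block, so
deeper trees mean more temporary objects and more copying for the same
set of terms.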

So I tried a quick prototype (patch attached) that uses a less
RAM-efficient approach, but creates far fewer tiny FST-related objects,
when writing postings.

But in some quick indexing performance tests (luceneutil), it makes no
measurable improvement to indexing performance.

So I'm putting my patch up here for posterity ... I don't intend to
commit it unless we can iterate on it further.  It adds code complexity,
it's not committable as-is (we would need to make it conditional, so
that it still uses FSTs for segments with many terms), etc.



