Posted to java-user@lucene.apache.org by Vitaly Funstein <vf...@gmail.com> on 2014/08/24 01:08:52 UTC

SegmentReader heap usage with stored field compression on

Is it reasonable to assume that using stored field compression with a lot
of stored fields per document in a very large index (100+ GB) could
potentially lead to significant heap utilization? If I am reading the
code in CompressingStoredFieldsIndexReader correctly, there is non-trivial
accounting overhead, per segment, to maintain the fields index reader state,
and it appears to be a function of both the compression chunk size and the
overall segment size.
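
If that is indeed the case, would tuning the chunk size via a custom stored
fields format be the right knob? Something roughly like the sketch below is
what I had in mind (assuming Lucene 4.6+ as the base codec; the codec/format
names and the 64 KB chunk size are just placeholders, not tested code):

import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.StoredFieldsFormat;
import org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat;
import org.apache.lucene.codecs.compressing.CompressionMode;
import org.apache.lucene.codecs.lucene46.Lucene46Codec;

// Sketch of a codec that overrides the stored fields chunk size; a larger
// chunk means fewer chunk entries for the per-segment fields index to
// track, at the cost of decompressing more data per document load.
public class LargeChunkCodec extends FilterCodec {

  private final StoredFieldsFormat storedFields =
      new CompressingStoredFieldsFormat("LargeChunkStoredFields",
          CompressionMode.FAST, 64 * 1024);

  public LargeChunkCodec() {
    super("LargeChunkCodec", new Lucene46Codec());
  }

  @Override
  public StoredFieldsFormat storedFieldsFormat() {
    return storedFields;
  }
}

My understanding is that such a codec would have to be registered via SPI
(META-INF/services/org.apache.lucene.codecs.Codec) to be readable later, set
through IndexWriterConfig.setCodec(...), and would only affect newly written
segments. Please correct me if that's off base.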

Not sure if my hunch is correct here, but we have run into situations where
loading stored fields for a relatively small number of search results
(<100K) after a single query against an index of the above size resulted in
an OOME with 5+ GB heap sizes, with the dominating objects in the heap dump
being SegmentReader instances... hence the question.
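
For context, the access pattern is essentially the plain IndexSearcher one,
roughly like the sketch below (the path, field name, query value and hit
count are placeholders, not our actual code):

import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class LoadStoredFields {
  public static void main(String[] args) throws Exception {
    DirectoryReader reader =
        DirectoryReader.open(FSDirectory.open(new File("/path/to/index")));
    try {
      IndexSearcher searcher = new IndexSearcher(reader);
      TopDocs hits =
          searcher.search(new TermQuery(new Term("id", "value")), 100000);
      for (ScoreDoc sd : hits.scoreDocs) {
        // Each doc() call goes through the segment's stored fields reader
        // and decompresses the chunk containing the document.
        Document doc = searcher.doc(sd.doc);
        // ... consume stored field values ...
      }
    } finally {
      reader.close();
    }
  }
}

Thank you.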