Posted to java-user@lucene.apache.org by tsuraan <ts...@gmail.com> on 2010/02/03 22:07:20 UTC

Sort memory usage

Is the cache used by sorting on strings separated by reader, or is it
a global thing?  I'm trying to use the near-realtime search, and I
have a few indices with a million docs apiece.  If I'm opening a new
reader every minute, am I going to have every term in every sort field
read into RAM for each reader that I have open?  Or, is the cache
smarter about that, and at least the strings are interned, or does it
work completely differently from that?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Sort memory usage

Posted by Jake Mannix <ja...@gmail.com>.
On Wed, Feb 3, 2010 at 1:33 PM, tsuraan <ts...@gmail.com> wrote:

> > The FieldCache loads per segment, and the NRT reader is reloading only
> > new segments from disk, so yes, it's "smarter" about this caching in this
> > case.
>
> Ok, so the cache is tied to the index, and not to any particular
> reader.  The actual FieldCacheImpl keeps a mapping from Reader to its
> terms, so are the cached values shared just because the Readers
> actually wrap the same Indices, so the termDocs for all the readers
> are actually almost entirely the same?
>

The FieldCache *is* tied to readers, but it's tied to the SegmentReaders
which comprise an IndexReader.  When you call IndexReader.reopen() or
IndexWriter.getReader(), you get new SegmentReaders inside of the outer
IndexReader only for segments that are new, and only those new ones need
to load up their parts of the FieldCache.  Segments which have already
been read have SegmentReader instances which stick around, and the cache
pieces which are keyed on them stay around as well.

  -jake
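
[Editor's illustration] The per-segment keying described above can be sketched
in plain Java without Lucene on the classpath. This is only a toy model under
stated assumptions: ToySegment, ToyFieldCache, and warm() are hypothetical names
invented for this sketch and are not Lucene classes or methods. The weak-keyed
map is also an assumption of the sketch, chosen so that an entry can be
collected once nothing else references its segment object.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class PerSegmentCacheSketch {

    /** Hypothetical stand-in for a SegmentReader: one segment's sort-field values. */
    static final class ToySegment {
        final String name;
        final String[] values;

        ToySegment(String name, String... values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Hypothetical stand-in for a field cache whose entries are keyed on the segment object. */
    static final class ToyFieldCache {
        // Weak keys (an assumption of this sketch): an entry can go away once
        // no reader holds the segment object any more.
        private final Map<ToySegment, String[]> cache = new WeakHashMap<>();
        int loads = 0; // counts how often values were actually "read from disk"

        String[] getStrings(ToySegment segment) {
            return cache.computeIfAbsent(segment, s -> {
                loads++;
                return s.values.clone();
            });
        }
    }

    /** Touches every segment of a "top-level reader" and returns the total loads so far. */
    static int warm(ToyFieldCache cache, List<ToySegment> reader) {
        for (ToySegment segment : reader) {
            cache.getStrings(segment);
        }
        return cache.loads;
    }

    public static void main(String[] args) {
        ToyFieldCache cache = new ToyFieldCache();
        ToySegment s0 = new ToySegment("_0", "apple", "pear");
        ToySegment s1 = new ToySegment("_1", "fig");

        // First reader: both segments have to be loaded.
        List<ToySegment> reader1 = Arrays.asList(s0, s1);
        System.out.println("loads after first reader: " + warm(cache, reader1));

        // "Reopened" reader: shares s0 and s1, adds one new segment.
        ToySegment s2 = new ToySegment("_2", "kiwi");
        List<ToySegment> reader2 = Arrays.asList(s0, s1, s2);
        System.out.println("loads after reopened reader: " + warm(cache, reader2));
    }
}
```

In this model the second reader triggers only one additional load, for the new
segment, because the entries for the shared segment objects are already cached.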

Re: Sort memory usage

Posted by tsuraan <ts...@gmail.com>.
> The FieldCache loads per segment, and the NRT reader is reloading only
> new segments from disk, so yes, it's "smarter" about this caching in this
> case.

Ok, so the cache is tied to the index, and not to any particular
reader.  The actual FieldCacheImpl keeps a mapping from Reader to its
terms, so are the cached values shared just because the Readers
actually wrap the same Indices, so the termDocs for all the readers
are actually almost entirely the same?



Re: Sort memory usage

Posted by Jake Mannix <ja...@gmail.com>.
The FieldCache loads per segment, and the NRT reader is reloading only
new segments from disk, so yes, it's "smarter" about this caching in this
case.

  -jake

On Wed, Feb 3, 2010 at 1:07 PM, tsuraan <ts...@gmail.com> wrote:

> Is the cache used by sorting on strings separated by reader, or is it
> a global thing?  I'm trying to use the near-realtime search, and I
> have a few indices with a million docs apiece.  If I'm opening a new
> reader every minute, am I going to have every term in every sort field
> read into RAM for each reader that I have open?  Or, is the cache
> smarter about that, and at least the strings are interned, or does it
> work completely differently from that?