Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2012/02/04 06:37:27 UTC

term positions offsets and frequency

I see that adding this information can improve highlighting performance
and enable some other components like mlt, at the expense of disk space.
My question: is this information always loaded into memory, or only when
needed on a per-doc basis? I ask because I can afford the disk space, but
I may not be able to support large increases in memory usage.

Re: term positions offsets and frequency

Posted by Erick Erickson <er...@gmail.com>.
Well, you haven't told us anything about your setup, like how big the
corpus is, how big your index is, what "large increase" means (1G?
16G?).

Have a look at:
http://lucene.apache.org/java/3_5_0/fileformats.html#file-names
and compare the file sizes with and without the information
in order to make an estimate.
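
One way to do that comparison is a small script that totals the index
directory by file extension. This is only a rough sketch: the function
names are made up for illustration, and it assumes the index is not using
the compound file format (with .cfs files the per-type files are packed
together and will not show up separately). The .tvx, .tvd and .tvf
extensions are the term vector files described on the file formats page
above.

```python
import os
from collections import defaultdict

# Term vector file extensions in the Lucene 3.x index format.
TERM_VECTOR_EXTS = {".tvx", ".tvd", ".tvf"}


def size_by_extension(index_dir):
    """Return {extension: total bytes} for the files directly in index_dir."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            ext = os.path.splitext(name)[1]
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def term_vector_bytes(totals):
    """Sum the bytes used by term vector files in a size_by_extension result."""
    return sum(size for ext, size in totals.items() if ext in TERM_VECTOR_EXTS)


def report(index_dir):
    """Print per-extension sizes, largest first, plus the term vector total."""
    totals = size_by_extension(index_dir)
    for ext, size in sorted(totals.items(), key=lambda item: -item[1]):
        print("%-12s %12d bytes" % (ext or "(none)", size))
    print("term vectors %12d bytes" % term_vector_bytes(totals))
```

Run report() against a copy of the index built without the term vector
options and again against one built with them, then compare the two
outputs. Note this only measures disk usage; it says nothing directly
about memory.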

Best
Erick

On Sat, Feb 4, 2012 at 12:37 AM, Jamie Johnson <je...@gmail.com> wrote:
> I see that adding this information can improve performance while
> highlighting and enable some other components like mlt at the expense of
> disk space, but my question is this information always loaded into memory
> or only when needed on a per doc basis? I ask because I can afford the disk
> space but large increases in memory usage I may not be able to support