Posted to java-user@lucene.apache.org by Ariel <is...@gmail.com> on 2009/01/05 19:20:19 UTC

Re: Default and optimal use of RAMDirectory

Did you mean that the people who think the use of RAMDirectory is going to
speed up the indexing process are wrong?

On Sun, Dec 21, 2008 at 10:22 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:

> Let me add to that that I clearly recall having a hard time getting the
> tests for that particular section of LIA1 to clearly and consistently show
> that using the RAMDirectory buffering approach instead of vanilla
> IndexWriter yields faster indexing.  Even back then IndexWriter buffered
> indexed data in memory, though today's IndexWriter is much, much better at
> it.
>
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: Michael McCandless <lu...@mikemccandless.com>
> > To: java-user@lucene.apache.org
> > Sent: Saturday, December 20, 2008 4:25:13 AM
> > Subject: Re: Default and optimal use of RAMDirectory
> >
> > Actually, things have improved since LIA1 was written a few years ago:
> > IndexWriter now does a good job managing the RAM buffer you assign to
> > it, so you should not see much benefit by doing your own buffering
> > with RAMDirectory (and if you somehow do, I'd like to know about
> > it!).
> >
> > Instead you should call IndexWriter.setRAMBufferSizeMB.
> >
> > Also, FSDirectory does no RAM buffering on its own.
> >
> > See here for further ways to tune for indexing throughput:
> >
> > http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> >
> > Mike
> >
> > wrote:
> >
> > >
> > > Hi all,
> > >
> > > First off, I'd like to say I'm quite pleased to be a part of this mailing
> > > list - it's even more exciting to know that we have Otis G. and Erik H.,
> > > authors of (at least in my opinion) the Lucene Bible - Lucene in Action,
> > > actively answering all these inquiries =)
> > >
> > > We're currently in the initial stages of implementing Lucene as part of
> > > our product, and one problem that we need to resolve is optimizing Lucene.
> > > I've been reading the Lucene in Action book, and one of the tips for
> > > optimizing Lucene indexing is to use RAMDirectory as a buffer before
> > > writing to FSDirectory.  According to the book, this is done internally
> > > and automatically when I use FSDirectory.  My questions are: 1.) What's
> > > the default implementation/computation used in allocating RAMDirectory
> > > when we implement FSDirectory, and 2.) What's the optimal way of
> > > customizing RAMDirectory usage? Any tips on how to do it?
> > >
> > > BTW, we're using Lucene 2.3.2
> > >
> > > Thanks for all the help
> > >
> > > Joseph
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >

Re: Default and optimal use of RAMDirectory

Posted by Erick Erickson <er...@gmail.com>.
In general, from what I've seen on this list for the last couple of years,
you're right. You're better off tweaking the various parameters of your
IndexWriter (e.g. MaxBufferedDocs, MergeFactor, MaxMergeDocs, etc.)
than trying to use the blunt tool of RAMDirectory.

Best
Erick
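
For readers who want to see what the advice in this thread looks like in
code, here is a minimal sketch, assuming a Lucene 2.3.x jar on the classpath
(the version Joseph mentions). It writes straight to an FSDirectory and lets
IndexWriter do the buffering via setRAMBufferSizeMB, with no intermediate
RAMDirectory. The class name, the "example-index" path, the field names, the
document count, and the 48 MB figure are all illustrative assumptions, not
values recommended by anyone in the thread.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical example class; written against the Lucene 2.3.x API.
public class RamBufferSketch {
    public static void main(String[] args) throws Exception {
        // Assumed index location; pass a real path as the first argument.
        String indexPath = args.length > 0 ? args[0] : "example-index";

        Directory dir = FSDirectory.getDirectory(indexPath);
        // 'true' creates a new index, replacing any index already at that path.
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        try {
            // Let IndexWriter buffer indexed data in RAM and flush a segment
            // once roughly this much memory is used. 48.0 is an arbitrary
            // illustrative value; measure with your own data.
            writer.setRAMBufferSizeMB(48.0);

            // Add a few synthetic documents directly to the on-disk index.
            for (int i = 0; i < 1000; i++) {
                Document doc = new Document();
                doc.add(new Field("id", Integer.toString(i),
                        Field.Store.YES, Field.Index.UN_TOKENIZED));
                doc.add(new Field("body", "sample text " + i,
                        Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }
        } finally {
            // close() flushes any buffered documents and releases the write lock.
            writer.close();
        }
    }
}
```

Later Lucene releases changed several of these calls (for example how an
FSDirectory is opened and how the writer is configured), so treat this as a
sketch for the 2.3 line only and check the Javadoc for the version you run.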
