Posted to general@lucene.apache.org by Renata Vaccaro <re...@emailtopia.com> on 2013/06/05 00:50:47 UTC

IndexWriter.commit() performance

Hi all,

 

I'm new to this list and hope I'm asking this question in the correct
place.  I upgraded Lucene from a very, very old version to version
4.2.1.  I'm finding that calling IndexWriter.commit() is much slower
than the IndexWriter.close() that I was calling with the old Lucene
(which didn't have a commit call).  Commit takes 500ms-1s, where
previously the close call took about 50ms.  I call commit every
time I add a document.  I am creating the IndexWriter as follows:

 

    Directory dir = FSDirectory.open(index);
    Analyzer analyzer = new MsStandardAnalyzer();
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_42, analyzer);
    iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
    iwc.setRAMBufferSizeMB(256.0);
    IndexWriter writer = new IndexWriter(dir, iwc);

 

Is there something that I can do to make the commit call faster?


RE: IndexWriter.commit() performance

Posted by Renata Vaccaro <re...@emailtopia.com>.
Thanks very much.  I'll look into that.

-----Original Message-----
From: Michael McCandless [mailto:lucene@mikemccandless.com] 
Sent: Wednesday, June 05, 2013 7:45 AM
To: general@lucene.apache.org
Subject: Re: IndexWriter.commit() performance

On Tue, Jun 4, 2013 at 7:31 PM, Renata Vaccaro <re...@emailtopia.com>
wrote:
> Thanks.  I need the documents to be searchable as soon as they are
> added.  I also need the documents added to survive a machine crash.
>
> Soft commits and NRT gets might work, but from what I've read they are
> only available for Solr?

Likely commits got slower on upgrade because on your very, very old
Lucene version fsync was not called, so there was no safety on
OS/hardware crash to ensure the index was intact.

Solr's soft commit uses Lucene's near-real-time APIs, so you can
definitely do this with just Lucene: pass the IndexWriter to
DirectoryReader.open, and then use DirectoryReader.openIfChanged to
reopen (without committing).

This lets you decouple durability to crashes (how often you commit)
from index-to-search latency (how often you reopen the reader).

Mike McCandless

http://blog.mikemccandless.com

Re: IndexWriter.commit() performance

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Jun 4, 2013 at 7:31 PM, Renata Vaccaro <re...@emailtopia.com> wrote:
> Thanks.  I need the documents to be searchable as soon as they are
> added.  I also need the documents added to survive a machine crash.
>
> Soft commits and NRT gets might work, but from what I've read they are
> only available for Solr?

Likely commits got slower on upgrade because on your very, very old
Lucene version fsync was not called, so there was no safety on
OS/hardware crash to ensure the index was intact.
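
To illustrate the fsync point, here is a minimal sketch that uses only the
JDK and no Lucene code. It compares an append that is left in the OS cache
with one that is forced to the storage device via FileChannel.force(true).
The class name FsyncCost, the record format, and the temp file are
hypothetical; actual timings depend on the OS, filesystem, and disk, so none
are shown here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Lucene: shows the cost of forcing writes to disk.
class FsyncCost {

    /**
     * Appends one record to the file. When durable is true, the channel is
     * forced (content and metadata) before returning, which is the step that
     * protects against OS or hardware crashes. Returns elapsed nanoseconds.
     */
    static long append(Path file, String record, boolean durable) throws IOException {
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            if (durable) {
                ch.force(true); // analogous to fsync
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-sketch", ".log");
        long buffered = 0;
        long forced = 0;
        for (int i = 0; i < 100; i++) {
            buffered += append(file, "doc " + i + "\n", false);
            forced += append(file, "doc " + i + "\n", true);
        }
        System.out.println("buffered total: " + (buffered / 1000) + " us");
        System.out.println("forced total:   " + (forced / 1000) + " us");
        Files.delete(file);
    }
}
```

Running main on the machine that hosts the index gives a rough idea of how
much of each commit is spent waiting for the disk.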

Solr's soft commit uses Lucene's near-real-time APIs, so you can
definitely do this with just Lucene: pass the IndexWriter to
DirectoryReader.open, and then use DirectoryReader.openIfChanged to
reopen (without committing).

This lets you decouple durability to crashes (how often you commit)
from index-to-search latency (how often you reopen the reader).
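
A minimal sketch of the approach described above, assuming Lucene 4.2 with
lucene-core and lucene-analyzers-common on the classpath. The class
NrtSketch, the field name "body", and the choice of StandardAnalyzer are
illustrative assumptions; the original question uses a custom analyzer and
FSDirectory.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

// Hypothetical wrapper: near-real-time visibility without a commit per document.
class NrtSketch {
    private final IndexWriter writer;
    private DirectoryReader reader;

    NrtSketch(Directory dir) throws IOException {
        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_42, new StandardAnalyzer(Version.LUCENE_42));
        writer = new IndexWriter(dir, iwc);
        // Near-real-time reader opened from the writer; true = apply deletes.
        reader = DirectoryReader.open(writer, true);
    }

    /** Adds a document and reopens the reader; does not call commit(). */
    synchronized void addAndRefresh(String text) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("body", text, Field.Store.YES));
        writer.addDocument(doc);
        // Returns null when nothing changed since the current reader was opened.
        DirectoryReader newer = DirectoryReader.openIfChanged(reader, writer, true);
        if (newer != null) {
            reader.close();
            reader = newer;
        }
    }

    /** Durability point: call this on whatever schedule crash safety requires. */
    synchronized void commit() throws IOException {
        writer.commit();
    }

    /** Number of documents visible to searches through the current reader. */
    synchronized int visibleDocs() {
        return reader.numDocs();
    }

    synchronized void close() throws IOException {
        reader.close();
        writer.close();
    }
}
```

Documents added since the last commit() are searchable through the reopened
reader but would be lost on a crash, so commit() still has to run as often as
the durability requirement demands. For concurrent searches, Lucene's
SearcherManager wraps the same reopen logic and adds reference counting so
that in-flight searches are not cut off when the old reader is released.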

Mike McCandless

http://blog.mikemccandless.com

RE: IndexWriter.commit() performance

Posted by Renata Vaccaro <re...@emailtopia.com>.
Thanks.  I need the documents to be searchable as soon as they are
added.  I also need the documents added to survive a machine crash.  

Soft commits and NRT gets might work, but from what I've read they are
only available for Solr?

-----Original Message-----
From: Mark Bennett [mailto:mark.bennett@lucidworks.com] 
Sent: Tuesday, June 04, 2013 7:07 PM
To: <ge...@lucene.apache.org>
Subject: Re: IndexWriter.commit() performance

Although it's not exactly what you asked (and I don't mean this as a
sarcastic answer), one idea is to not call it directly from your code.
Or use the options that say "commit within N seconds".

This may not be feasible, depending on your requirements, and I'd
certainly respect that.  But, if you're looking at old examples and that
is what's motivating your question, it's good to know about the other
options available.

There's also soft commits and NRT gets, also interesting reading.

--
Mark Bennett / LucidWorks: Search & Big Data /
mark.bennett@lucidworks.com
Office: 408-898-4201 / Telecommute: 408-733-0387 / Cell: 408-829-6513




Re: IndexWriter.commit() performance

Posted by Mark Bennett <ma...@lucidworks.com>.
Although it's not exactly what you asked (and I don't mean this as a sarcastic answer), one idea is to not call it directly from your code.  Or use the options that say "commit within N seconds".

This may not be feasible, depending on your requirements, and I'd certainly respect that.  But, if you're looking at old examples and that is what's motivating your question, it's good to know about the other options available.

There's also soft commits and NRT gets, also interesting reading.

--
Mark Bennett / LucidWorks: Search & Big Data / mark.bennett@lucidworks.com
Office: 408-898-4201 / Telecommute: 408-733-0387 / Cell: 408-829-6513
