Posted to java-user@lucene.apache.org by Mindaugas Žakšauskas <mi...@gmail.com> on 2008/03/05 18:18:54 UTC

Reusing same IndexSearcher

Hi,

Another newbie here... using Lucene 2.3.1 on Linux. Hopefully someone
could advise me on /subj/.

Both the IndexSearcher Javadoc and the Lucene FAQ say the IndexSearcher
should be reused as it's thread safe. That's OK.
Now if the index has changed, I need to reopen the IndexReader that is
associated with it. How do I do this, as IndexSearcher has no setter
method for IndexReader?

Let's speak in Java. Say, we've got a static singleton accessor method:

...
private static IndexSearcher instance;  // getIndexReader() is declared on IndexSearcher, not Searcher
...
public static Searcher getLuceneSearcher () {
   if( instance == null ) {
      IndexReader reader = IndexReader.open( "/tmp/index_folder" );
      instance = new IndexSearcher( reader );    // simple yet boring
   } else {                                    // here goes the fun part
      IndexReader r_old = instance.getIndexReader();
      IndexReader r_new = r_old.reopen();
      if( r_old != r_new ) {
         r_old.close();                  // thanks for nice Javadoc, guys!
         // what to do now? there's no instance.setIndexReader( r_new )!
      }
   }
   return instance;
}

Of course, I could create a new IndexSearcher on the else branch and
return it. However, this approach resulted in the infamous "too many open
files" exception. Lifting the `ulimit -n` to hundreds of thousands of
files didn't really help, as the same exception was still being thrown
(actual resource usage fluctuating around 2000 open files). Then
from `lsof` output I noticed that the same segment file was being opened
more than once, apparently from different instances of
IndexSearchers/IndexReaders, and went down the path shown above. Maybe I'm
just plain wrong.

Really appreciate your advice.

Regards,
Mindaugas

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reusing same IndexSearcher

Posted by Mindaugas Žakšauskas <mi...@gmail.com>.
Hi,

Thanks for your reply.

I can't think of any way to ensure fair file descriptor usage when
there are many active instances of IndexSearcher (all containing
IndexReader) running. Our project installations tend to run on heavily
loaded sites, where a lot of information is read and written at the
same time.

My original idea was not to operate at IndexReader level, but only
provide a single IndexSearcher (which contains IndexReader).
IndexSearcher being (thread) safe, none of the other code should be
aware of Lucene internals (think encapsulation).
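
One way such an encapsulating accessor might look is sketched below. It
is only an illustration under stated assumptions: `StubReader` and
`SearchFacade` are hypothetical stand-ins, not Lucene classes, the
"search" just returns a placeholder number, and the index version is
passed in by hand so the example runs without an index on disk. Searches
share a read lock and a refresh takes the write lock, so in this sketch
the old reader is closed only when no search is in flight.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an index reader (NOT Lucene's IndexReader).
final class StubReader {
    final long version;
    private boolean closed;

    StubReader(long version) { this.version = version; }

    void close() { closed = true; }

    // Placeholder "search": fails if the reader was already closed.
    int countHits(String term) {
        if (closed) {
            throw new IllegalStateException("reader already closed");
        }
        return term.length(); // dummy result, stands in for real hits
    }
}

// Callers only see search() and refresh(); the reader never leaks out.
final class SearchFacade {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private StubReader reader = new StubReader(0L);

    // Many threads may search concurrently under the shared read lock.
    int search(String term) {
        lock.readLock().lock();
        try {
            return reader.countHits(term);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Swaps the reader under the exclusive write lock, then closes the old one.
    void refresh(long currentIndexVersion) {
        lock.writeLock().lock();
        try {
            if (reader.version != currentIndexVersion) {
                StubReader old = reader;
                reader = new StubReader(currentIndexVersion);
                old.close(); // safe here: no search holds the read lock
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    long version() {
        lock.readLock().lock();
        try {
            return reader.version;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The trade-off in this sketch is that a refresh blocks new searches for
the duration of the swap; whether that is acceptable on a heavily loaded
site is an open question.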

Anyway, if what you are saying is correct, I think the Javadoc and FAQ
should be a little bit more specific on that. Also, I've looked at the
code of IndexSearcher and could not find a single reason why
setting a new IndexReader would hurt. But that's probably just me having a
hard day.

Can anyone comment if this approach is relevant and would help?
http://www.xman.org/jlinux/server.html

Regards,
Mindaugas

On Wed, Mar 5, 2008 at 5:41 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
>
>  Actually, you do need to make a new IndexSearcher every time you
>  reopen a new IndexReader.
>
>  However, that should not lead to leaking file descriptors.  All open
>  files are held by IndexReader (not IndexSearcher), so as long as you
>  are properly closing your IndexReaders you shouldn't use up file
>  descriptors.
>
>  Mike



Re: Reusing same IndexSearcher

Posted by Michael McCandless <lu...@mikemccandless.com>.
Actually, you do need to make a new IndexSearcher every time you
reopen a new IndexReader.

However, that should not lead to leaking file descriptors.  All open
files are held by IndexReader (not IndexSearcher), so as long as you
are properly closing your IndexReaders you shouldn't use up file
descriptors.
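
The advice above could be sketched roughly as follows. This is only an
illustration under stated assumptions: `FakeReader`, `FakeSearcher` and
`SearcherHolder` are hypothetical stand-ins, not Lucene classes, and the
index version is passed in by hand so the example runs without an index
on disk. It tries to show that the searcher is replaced together with
its reader, and that the old reader is closed so only one stays open.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an index reader (NOT Lucene's IndexReader).
// It models only two behaviours from the thread: reopen() returns the same
// instance when nothing changed, and close() releases the "open files".
final class FakeReader {
    static final AtomicInteger OPEN = new AtomicInteger(); // readers currently open
    private final long version;
    private boolean closed;

    FakeReader(long version) {
        this.version = version;
        OPEN.incrementAndGet();
    }

    // Returns this if the index is unchanged, otherwise a fresh reader.
    FakeReader reopen(long currentIndexVersion) {
        return currentIndexVersion == version ? this : new FakeReader(currentIndexVersion);
    }

    void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

// Hypothetical stand-in for a searcher: bound to one reader for its lifetime.
final class FakeSearcher {
    private final FakeReader reader;

    FakeSearcher(FakeReader reader) { this.reader = reader; }

    FakeReader getReader() { return reader; }
}

final class SearcherHolder {
    private FakeSearcher instance;

    synchronized FakeSearcher get(long currentIndexVersion) {
        if (instance == null) {
            instance = new FakeSearcher(new FakeReader(currentIndexVersion));
        } else {
            FakeReader oldReader = instance.getReader();
            FakeReader newReader = oldReader.reopen(currentIndexVersion);
            if (newReader != oldReader) {
                // No setter needed: build a new searcher around the new reader.
                instance = new FakeSearcher(newReader);
                // Assumption: no other thread is still using the old searcher.
                oldReader.close();
            }
        }
        return instance;
    }
}
```

Calling `get(...)` repeatedly with changing versions leaves
`FakeReader.OPEN` at one reader per holder in this sketch. Closing the
old reader immediately is only safe if no search is still running on
the previous searcher, which a real setup would have to guarantee.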

Mike

