Posted to java-user@lucene.apache.org by sol myr <so...@yahoo.com> on 2011/01/13 16:11:25 UTC

Newbie: "Life span" of IndexWriter / IndexSearcher?

Hi,

We're writing a web application, which naturally needs
- "IndexSearcher" when users use our search screen
- "IndexWriter" in a background process that periodically updates and optimizes our index.
Note our writer is exclusive - no other applications/threads ever write to our index files.

What's the common practice in terms of resource creation and sharing?
Specifically:

1) Should I have a single IndexSearcher to serve all (concurrent) users?
I saw such a recommendation in a tutorial, but discovered that an open IndexSearcher prevents 'optimize' from merging my files... so should I close it just before optimization? Or should I open an individual (short-lived) IndexSearcher for each search request?

2) Our tests also imply that IndexWriter.optimize() takes effect only after you close() that writer - which is a shame, because I hoped to keep using the same writer (I hear it's expensive to instantiate). Am I doing something wrong?

Thanks



      

Re: Newbie: "Life span" of IndexWriter / IndexSearcher?

Posted by sol myr <so...@yahoo.com>.
Worked like a charm - thanks a lot.





      

Re: Newbie: "Life span" of IndexWriter / IndexSearcher?

Posted by Raf <r....@gmail.com>.
Look at the JavaDoc:
http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/IndexReader.html#reopen()

The *reopen* method returns a *new reader* if the index has changed since
the original reader was opened.

So, you should do something like this:
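// reopen() returns a new reader if the index has changed
IndexReader newReader = reader.reopen(true);
if (newReader != reader) {
    reader.close();                       // release the old snapshot
    reader = newReader;
    searcher = new IndexSearcher(reader); // search against the new snapshot
}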

instead of
reader.reopen(true);

Bye.
*Raf*
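
The behaviour described in this message can be modelled without Lucene. The following is only a minimal sketch, assuming a reader is an immutable point-in-time view: ToyIndex and ToyReader are hypothetical stand-ins invented for illustration, not Lucene classes. It shows why calling reopen() and discarding the result leaves the existing reader (and any searcher built on it) on the old view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an index: committed documents plus a version counter. */
final class ToyIndex {
    private final List<String> committed = new ArrayList<>();
    private long version = 0;

    /** Adds a document and "commits" it in one step. */
    synchronized void addAndCommit(String doc) {
        committed.add(doc);
        version++;
    }

    /** Opens a point-in-time view of everything committed so far. */
    synchronized ToyReader open() {
        return new ToyReader(this, version, new ArrayList<>(committed));
    }

    synchronized long version() {
        return version;
    }
}

/** Hypothetical stand-in for a reader: its contents never change after it is opened. */
final class ToyReader {
    private final ToyIndex index;
    private final long version;
    private final List<String> docs;

    ToyReader(ToyIndex index, long version, List<String> docs) {
        this.index = index;
        this.version = version;
        this.docs = Collections.unmodifiableList(docs);
    }

    /** Returns this reader if nothing changed, otherwise a new reader. Never mutates this one. */
    ToyReader reopen() {
        return index.version() == version ? this : index.open();
    }

    int numDocs() {
        return docs.size();
    }
}
```

With this model, a reader opened after the first commit still reports one document after a second commit, even if reopen() was called on it. Only the object returned by reopen() reports two.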


RE: Newbie: "Life span" of IndexWriter / IndexSearcher?

Posted by sol myr <so...@yahoo.com>.
Hi,

Thank you kindly for replying. 
Unfortunately, reopen() doesn't help me see the changes.
Here's my test:
First I write & commit a document, and run a search - which correctly finds this document.
Then I write & commit another document, re-open the reader and run another search - this should find 2 documents, but it only finds 1 document (the first one).
BTW if instead of 'reader.reopen()' I instantiate a brand-new searcher (and reader), it correctly finds 2 documents...

// Shared objects:
Directory directory = FSDirectory.open(new File("c:/myDir"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30); 
IndexWriter writer = new IndexWriter(directory, analyzer, 
     IndexWriter.MaxFieldLength.LIMITED);
Query query =  new TermQuery(new Term("title", "hello"));

// Write document #1:
writer.addDocument(makeDoc("hello world 1")); // Field title="hello world 1"
writer.commit();

// First search (yields document #1 as expected):
IndexReader reader=IndexReader.open(directory, true);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs results1 = searcher.search(query, 10000);
printResults(searcher, results1);

// Write document #2:
writer.addDocument(makeDoc("hello world 2")); // Field title="hello world 2"
writer.commit();

// Reopen reader, and search (should yield 2 documents, but I only see 1):
reader.reopen(true);
TopDocs results2 = searcher.search(query, 10000);
printResults(searcher, results2);






      

RE: Newbie: "Life span" of IndexWriter / IndexSearcher?

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,


> We're writing a web application, which naturally needs
> - "IndexSearcher" when users use our search screen
> - "IndexWriter" in a background process that periodically updates and
> optimizes our index.
> Note our writer is exclusive - no other applications/threads ever write to
> our index files.
>
> What's the common practice in terms of resource creation and sharing?
> Specifically:
>
> 1) Should I have a single IndexSearcher to serve all (concurrent) users?
> I saw such a recommendation in a tutorial, but discovered that an open
> IndexSearcher prevents 'optimize' from merging my files... so should I
> close it just before optimization? Or should I open an individual
> (short-lived) IndexSearcher for each search request?

You can leave the IndexWriter and IndexSearcher open all the time. The only
important thing is that changes made by IndexWriter's commit() method are only
seen by IndexSearcher when the underlying IndexReader is reopened (e.g. by
using IndexReader.reopen()). Please note that this only works with direct
access to the IndexReaders, so I would recommend using the constructors of
IndexSearcher that take IndexReaders (the Directory ones are only for easy
beginner's use). See Lucene in Action, 2nd edition, for a good example of a
searcher manager.
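
For a web application with concurrent searches, one way to picture what such a manager has to do is reference counting: a search that is still running keeps the old view alive, and the view is closed only when the last user lets go of it. The following is a minimal sketch under that assumption, not the book's code and not a Lucene API. CountedView and ViewHolder are hypothetical names, and CountedView merely stands in for a reader/searcher pair.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reader or searcher that must be closed exactly once. */
final class CountedView {
    private final String name;
    private final AtomicInteger refs = new AtomicInteger(1); // 1 = the holder's own reference
    private volatile boolean closed = false;

    CountedView(String name) { this.name = name; }

    String name() { return name; }
    boolean isClosed() { return closed; }

    /** Tries to take a reference; fails if the view has already been fully released. */
    boolean tryIncRef() {
        int n;
        while ((n = refs.get()) > 0) {
            if (refs.compareAndSet(n, n + 1)) return true;
        }
        return false;
    }

    /** Drops a reference and closes the view when the last one is gone. */
    void decRef() {
        if (refs.decrementAndGet() == 0) closed = true;
    }
}

/** Hands out the current view to searches and swaps in a new one after a commit. */
final class ViewHolder {
    private volatile CountedView current;

    ViewHolder(CountedView initial) { this.current = initial; }

    /** Called at the start of each search request. */
    CountedView acquire() {
        while (true) {
            CountedView v = current;
            if (v.tryIncRef()) return v;
            // The view was swapped out and closed in between; retry with the new one.
        }
    }

    /** Called in a finally block when the search request is done. */
    void release(CountedView v) { v.decRef(); }

    /** Called by the background process after it has committed and reopened. */
    synchronized void swap(CountedView fresh) {
        CountedView old = current;
        current = fresh;
        old.decRef(); // drops the holder's reference; closes only if no search still uses it
    }
}
```

In a request handler, this would be used as acquire(), then the search, then release() in a finally block. The background updater calls swap() only when a reopen actually produced a new view.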

> 2) Our tests also imply that IndexWriter.optimize() takes effect only
> after you close() that writer - which is a shame, because I hoped to keep
> using the same writer (I hear it's expensive to instantiate). Am I doing
> something wrong?

This is wrong, see above. Because the IndexReader/IndexSearcher keeps using
the segments that existed when it was opened, those segments can't go away
until that "snapshot" view of the IndexReader is closed.

In general, it's not recommended to optimize indexes since 2.9 unless you
are doing things like deleting all documents.

Uwe
 

