
Posted to java-user@lucene.apache.org by melix <ce...@lingway.com> on 2009/10/12 11:24:59 UTC

Realtime search best practices

Hi,

I'm going to replace an old reader/writer synchronization mechanism we had
implemented with the new near realtime search facilities in Lucene 2.9.
However, it's still a bit unclear how to do it efficiently.

Is the following implementation a good way to achieve it? The context
is concurrent reads/writes on an index:

1. create a Directory instance
2. create a writer on this directory
3. on each write request, add document to the writer
4. on each read request, 
 a. use writer.getReader() to obtain an up-to-date reader
 b. create an IndexSearcher with that reader
 c. perform Query
 d. close IndexSearcher
5. on application close
 a. close writer
 b. close directory

While this seems to be ok, I'm really wondering about the performance of
opening a searcher for each request. I could introduce some kind of delay
and cache a searcher for some seconds, but I'm not sure it's the best thing
to do.

Thanks,

Cedric
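
A minimal sketch of the "cache a searcher for some seconds" idea raised above, written against plain JDK types so it runs on its own. ThrottledCache, its constructor parameters and the injected clock are hypothetical names, not Lucene API; in a real application the source would presumably wrap writer.getReader() in an IndexSearcher. It uses Java 8 lambdas for brevity.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hands out a cached value and re-obtains it from the source only when the
 * cached copy is at least maxAgeMillis old. Hypothetical helper, not Lucene API.
 */
final class ThrottledCache<T> {
    private final Supplier<T> source;  // assumption: e.g. () -> new IndexSearcher(writer.getReader())
    private final long maxAgeMillis;
    private final LongSupplier clock;  // injected so the behaviour can be checked without sleeping
    private T cached;
    private long obtainedAt;

    ThrottledCache(Supplier<T> source, long maxAgeMillis, LongSupplier clock) {
        this.source = source;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, refreshing it first if it is missing or too old. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (cached == null || now - obtainedAt >= maxAgeMillis) {
            cached = source.get();
            obtainedAt = now;
        }
        return cached;
    }
}

class ThrottledCacheDemo {
    public static void main(String[] args) {
        long[] now = {0};   // fake clock, in milliseconds
        int[] opens = {0};  // counts how often the source was asked for a new value
        ThrottledCache<String> cache =
            new ThrottledCache<>(() -> "searcher#" + (++opens[0]), 1000, () -> now[0]);

        System.out.println(cache.get()); // first call always opens
        now[0] = 500;
        System.out.println(cache.get()); // younger than 1000 ms: reused
        now[0] = 1500;
        System.out.println(cache.get()); // too old: re-opened
    }
}
```

The sketch deliberately leaves out what happens to the superseded searcher: closing it while other threads are still searching with it would need some form of reference counting, which is not shown here.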


-- 
View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Realtime search best practices

Posted by Jason Rutherglen <ja...@gmail.com>.

Hi Cedric,

There is a wiki page on NRT at:
http://wiki.apache.org/lucene-java/NearRealtimeSearch

Feel free to ask questions if there's not enough information.

-J

Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

Hi Cedric,

  I don't know of anyone with a substantial throughput production system who
is doing realtime search with the 2.9 improvements yet (and in fact, no
serious performance analysis has been done on these even "in the lab" so to
speak: follow https://issues.apache.org/jira/browse/LUCENE-1577 to track work
on this), so some experimentation will be necessary to know how well it fits
in your environment.

  Your approach has the basic components of how to do 2.9 NRT search,
but it's missing the point at which you make your commit() calls.  Your
choices here depend on some tradeoffs, as Lucene provides ACID-like
transactional semantics whereby if you decide to commit() after every add(),
then yes, getReader() will be up-to-date with the most recent commit(), but
at a cost of indexing throughput (and much more frequent segment merges), at
least in comparison to only calling commit() at a slower rate (but calling
commit() less frequently means, of course, that you only have readers as
fresh as your most recent commit).

  Also, you have to be aware that there are no guarantees as far as
realtimeliness is concerned with 2.9 NRT - if there is an addIndexes() going
on in another thread on your IndexWriter, this is another instance where your
getReader() call won't block, but also won't necessarily get access to
all of these new segments if the addIndexes() hasn't completed yet.

  Please post here any results you find with this - this is a very new
feature and seeing how it works in the wild would be very helpful to
everyone else who is interested.

  -jake

Re: Realtime search best practices

Posted by John Wang <jo...@gmail.com>.

I think it was my email Yonik responded to, and he is right: I was being lazy
and didn't read the javadoc very carefully. My bad.
Thanks for the javadoc change.

-John

On Mon, Oct 12, 2009 at 1:57 PM, Yonik Seeley <yo...@lucidimagination.com>wrote:

> On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix <ja...@gmail.com>
> wrote:
> >  It may be surprising, but in fact I have read that
> > javadoc.
>
> It was not your email I responded to.
>
> >  It talks about not needing to close the
> > writer, but doesn't specifically talk about the what
> > the relationship between commit() calls and
> > getReader() calls is.
>
> Do you have a suggestion of how to update the JavaDoc?
> I'm not sure I understand the relationship between commit and
> getReader that you refer to.
>
> > , but why
> > is it so obvious that what could be happening
> > is that it only "returns all changes since the last
> > commit, but without touching disk because it
> > has docs in memory as well"?
>
> Sorry, this seems confusing - I'm not sure what you're trying to say.
> Perhaps we should approach this as proposed javadoc changes?
>
> -Yonik
> http://www.lucidimagination.com

Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

On Mon, Oct 12, 2009 at 1:57 PM, Yonik Seeley <yo...@lucidimagination.com>wrote:

> On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix <ja...@gmail.com>
> wrote:
> >  It may be surprising, but in fact I have read that
> > javadoc.
>
> It was not your email I responded to.
>

Sorry, my bad then - you said "guys" and John and I were the last two to be
asking questions / commenting on this thread.

> >  It talks about not needing to close the
> > writer, but doesn't specifically talk about the what
> > the relationship between commit() calls and
> > getReader() calls is.
>
> Do you have a suggestion of how to update the JavaDoc?
> I'm not sure I understand the relationship between commit and
> getReader that you refer to.
>

I like Mike's clarification to the first two javadocs he just posted,
very concise.

  -jake

Re: Realtime search best practices

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix <ja...@gmail.com> wrote:
>  It may be surprising, but in fact I have read that
> javadoc.

It was not your email I responded to.

>  It talks about not needing to close the
> writer, but doesn't specifically talk about the what
> the relationship between commit() calls and
> getReader() calls is.

Do you have a suggestion of how to update the JavaDoc?
I'm not sure I understand the relationship between commit and
getReader that you refer to.

> , but why
> is it so obvious that what could be happening
> is that it only "returns all changes since the last
> commit, but without touching disk because it
> has docs in memory as well"?

Sorry, this seems confusing - I'm not sure what you're trying to say.
Perhaps we should approach this as proposed javadoc changes?

-Yonik
http://www.lucidimagination.com

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

OK I opened https://issues.apache.org/jira/browse/LUCENE-1976.

Mike

On Tue, Oct 13, 2009 at 6:05 AM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> I agree isCurrent doesn't work right for an NRT reader.  Right now, it
> will always return "true" because it's sharing the segmentInfos in use
> by the writer.
>
> Similarly, getVersion will lie.
>
> I'll open an issue to track how to fix it.
>
> Mike

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

I agree isCurrent doesn't work right for an NRT reader.  Right now, it
will always return "true" because it's sharing the segmentInfos in use
by the writer.

Similarly, getVersion will lie.

I'll open an issue to track how to fix it.

Mike

On Mon, Oct 12, 2009 at 6:12 PM, Yonik Seeley
<yo...@lucidimagination.com> wrote:
> Good point on isCurrent - I think it should only be with respect to
> the latest index commit point? and we should clarify that in the
> javadoc.
>
> [...]
>> // but what does the nrtReader say?
>> // it does not have access to the most recent commit
>> // state, as there's been a commit (with documents)
>> // since it was opened.  But the nrtReader *has* those
>> // documents.
>
> I think we keep it simple - the nrtReader.isCurrent() would return
> false after a commit is called.
> Yes, isCurrent() is no longer such a great name.
>
> -Yonik
> http://www.lucidimagination.com

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Tue, Oct 13, 2009 at 5:23 AM, Ganesh <em...@yahoo.co.in> wrote:

> In case of 2.4.1, the reader after reopen, will be warmed before actual use.

You mean you must warm it after you call reopen, before using it, right?

> In 2.9, public void setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer warmer), does warming when we do getReader().

Right, and this is better than doing your own warming after calling
getReader because warming of newly merged segments won't block your
near real-time turnaround.

> If we call getReader() for every request, will it reduce search performance?

For every search request?  Yes this will always reduce performance,
even worse than simply calling reopen for every search request,
because getReader() forces the writer to flush a new segment.

> Is warming necessarily required in 2.9? Isn't warming once, the very first time, enough? Do we need to do it on every request?

It's not "required", but if you don't do it it means the first search
to land after a getReader will pay that warming cost.

Often this cost is negligible.  But, rarely, once a very large segment
merge has completed, the cost of warming that newly merged segment could
be very large.  This is heavily dependent on the size of your index,
whether your queries are using the FieldCache (doing field sorting, or
using function queries), etc.

Mike
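
The do-it-yourself variant discussed above (warm a freshly obtained reader before any search uses it) can be sketched with plain JDK types. WarmedHolder, opener and warmer are hypothetical names, not Lucene API; in practice the opener would presumably call writer.getReader() or reopen(), and the warmer would run a few representative queries. The sketch does not show IndexWriter.setMergedSegmentWarmer, which moves the warming of merged segments into the writer itself.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Keeps one published value and replaces it only after the replacement has
 * been warmed. Hypothetical helper, not Lucene API.
 */
final class WarmedHolder<T> {
    private final AtomicReference<T> current = new AtomicReference<>();
    private final Supplier<T> opener;  // assumption: obtains a fresh reader/searcher
    private final Consumer<T> warmer;  // assumption: e.g. runs typical sorted queries

    WarmedHolder(Supplier<T> opener, Consumer<T> warmer) {
        this.opener = opener;
        this.warmer = warmer;
        refresh(); // make sure get() never returns null
    }

    /** Intended to be called from a background thread, not from search requests. */
    void refresh() {
        T fresh = opener.get();
        warmer.accept(fresh);  // pay the warming cost while 'fresh' is still private
        current.set(fresh);    // publish: searches started after this see 'fresh'
    }

    /** Cheap for search requests: no opening or warming happens here. */
    T get() {
        return current.get();
    }
}

class WarmedHolderDemo {
    public static void main(String[] args) {
        int[] n = {0};
        WarmedHolder<String> holder = new WarmedHolder<>(
            () -> "reader#" + (++n[0]),
            r -> System.out.println("warming " + r));
        System.out.println("searching with " + holder.get());
        holder.refresh();
        System.out.println("searching with " + holder.get());
    }
}
```

With this arrangement the first search after a refresh does not pay the warming cost, at the price of a longer delay before new documents become visible.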

Re: Realtime search best practices

Posted by Ganesh <em...@yahoo.co.in>.

Hello all,

In 2.4.1, the reader is warmed after reopen, before actual use. In 2.9, public void setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer warmer) does the warming when we call getReader().

If we call getReader() for every request, will it reduce search performance?

Is warming necessarily required in 2.9? Isn't warming once, the very first time, enough? Do we need to do it on every request?

Regards
Ganesh


Re: Realtime search best practices

Posted by Yonik Seeley <yo...@lucidimagination.com>.

Good point on isCurrent - I think it should only be with respect to
the latest index commit point? and we should clarify that in the
javadoc.

[...]
> // but what does the nrtReader say?
> // it does not have access to the most recent commit
> // state, as there's been a commit (with documents)
> // since it was opened.  But the nrtReader *has* those
> // documents.

I think we keep it simple - the nrtReader.isCurrent() would return
false after a commit is called.
Yes, isCurrent() is no longer such a great name.

-Yonik
http://www.lucidimagination.com

Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

I still see some things we might want to document or explain:

We still need to be careful what the call to "isCurrent()"
will mean in the future for IndexReaders - as now there is another
kind of "current" - "current even up to uncommitted changes".

Imagine the following set of IndexReaders floating around an
application:
------
1)  IndexReader reader = IndexReader.open(diskDir);

// this reader is certainly current.
2)  assert(reader.isCurrent());

3)  IndexWriter writer = new IndexWriter(diskDir);
4)  writer.addDocument(doc);

// this reader has access to that doc
5)  IndexReader nrtReader = writer.getReader();

6)  writer.addDocument(doc2);

// now for the isCurrent() semantics... the disk reader is
// still current, as of last commit:
7)  assert(reader.isCurrent());

// as is the nrtReader, even though it has information
// *past* the most recent commit, but not all of it!
8) assert(nrtReader.isCurrent());

// reopen the nrtReader and get access to doc2
9) nrtReader = writer.getReader();

// now nrtReader is not only current, but "maximally current"
10) assert(nrtReader.isCurrent());

// but what about now?
11)  writer.commit();

// the disk index reader follows the old ways:
12)  assert(!reader.isCurrent());

// but what does the nrtReader say?
// it does not have access to the most recent commit
// state, as there's been a commit (with documents)
// since it was opened.  But the nrtReader *has* those
// documents.

13)  assert(!nrtReader.isCurrent());
-----

The results of lines 8 and 13 especially seem to show how
one could get confused about what is meant by current - but
maybe it is just a naming issue (although line 13 seems
to be more than that: the nrtReader in that case really is
up-to-date with disk at this point, and would show exactly
the results which a freshly opened reader would).

Maybe people should be advised not to mix and match
disk readers and IndexWriter-supplied ones, and if they
want NRT search with Lucene 2.9+, they grab a reader from
the IndexWriter upon opening said writer, and then just
continually call reopen() on it as queries come in
throughout the life of their application (being careful not
to close() their writer and thus trigger an
AlreadyClosedException)?

  -jake
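
The two notions of "current" in the walkthrough above can be modelled with a few counters. This is a toy model in plain Java, not Lucene code: ToyWriter, ToyReader, openCommitted() and seesAllChanges() are hypothetical names, and isCurrent() here implements the semantics proposed in this thread ("current with respect to the latest commit point"), not necessarily what Lucene 2.9 does.

```java
/** Toy stand-in for a writer: tracks buffered changes and commit points. */
final class ToyWriter {
    private long changeCount = 0;       // bumped on every addDocument()
    private long committedCount = 0;    // value of changeCount at the last commit()
    private long commitGeneration = 0;  // bumped on every commit that wrote something

    void addDocument() { changeCount++; }

    void commit() {
        if (committedCount != changeCount) {
            committedCount = changeCount;
            commitGeneration++;
        }
    }

    /** Models opening a reader from disk: sees only committed changes. */
    ToyReader openCommitted() { return new ToyReader(this, committedCount, commitGeneration); }

    /** Models an NRT reader: also sees changes that are not committed yet. */
    ToyReader getReader() { return new ToyReader(this, changeCount, commitGeneration); }

    long changeCount() { return changeCount; }
    long commitGeneration() { return commitGeneration; }
}

/** Toy stand-in for a reader: a frozen snapshot of the writer's counters. */
final class ToyReader {
    private final ToyWriter writer;
    private final long visibleChanges;
    private final long openedAtGeneration;

    ToyReader(ToyWriter writer, long visibleChanges, long openedAtGeneration) {
        this.writer = writer;
        this.visibleChanges = visibleChanges;
        this.openedAtGeneration = openedAtGeneration;
    }

    /** Proposed meaning: no commit has happened since this reader was opened. */
    boolean isCurrent() { return openedAtGeneration == writer.commitGeneration(); }

    /** The other notion: this reader sees every change made so far, committed or not. */
    boolean seesAllChanges() { return visibleChanges == writer.changeCount(); }
}

class IsCurrentDemo {
    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        ToyReader reader = writer.openCommitted();
        writer.addDocument();
        ToyReader nrtReader = writer.getReader();
        writer.addDocument();
        // Step 8 of the walkthrough: current, yet missing the second document.
        System.out.println(nrtReader.isCurrent() + " " + nrtReader.seesAllChanges());
        nrtReader = writer.getReader();
        writer.commit();
        // Step 13: not current under the proposed semantics, yet nothing is missing.
        System.out.println(nrtReader.isCurrent() + " " + nrtReader.seesAllChanges());
    }
}
```

Keeping the two questions in separate methods is one way to avoid the naming confusion; whether Lucene should expose something like the second one is outside what this thread settles.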


On Mon, Oct 12, 2009 at 1:56 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> I agree, the javadocs could be improved.  How about something like
> this for the first 2 paragraphs:
>
>   * Returns a readonly reader, covering all committed as
>   * well as un-committed changes to the index.  This
>   * provides "near real-time" searching, in that changes
>   * made during an IndexWriter session can be quickly made
>   * available for searching without closing the writer nor
>   * calling {@link #commit}.
>   *
>   * <p>Note that this is functionally equivalent to calling
>   * {@link #commit} and then using {@link IndexReader#open} to
>   * open a new reader.  But the turnaround time of this
>   * method should be faster since it avoids the potentially
>   * costly {@link #commit}.</p>
>
> Mike
>
> On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix <ja...@gmail.com>
> wrote:
> > Thanks Yonik,
> >
> >  It may be surprising, but in fact I have read that
> > javadoc.  It talks about not needing to close the
> > writer, but doesn't specifically talk about the what
> > the relationship between commit() calls and
> > getReader() calls is.  I suppose I should have
> > interpreted:
> >
> > "@returns a new reader which contains all
> > changes..."
> >
> > to mean "all uncommitted changes", but why
> > is it so obvious that what could be happening
> > is that it only "returns all changes since the last
> > commit, but without touching disk because it
> > has docs in memory as well"?
> >
> >  -jake
> >
> > On Mon, Oct 12, 2009 at 1:26 PM, Yonik Seeley <yonik@lucidimagination.com> wrote:
> >
> >> Guys, please - you're not new at this... this is what JavaDoc is for:
> >>
> >>  /**
> >>   * Returns a readonly reader containing all
> >>   * current updates.  Flush is called automatically.  This
> >>   * provides "near real-time" searching, in that changes
> >>   * made during an IndexWriter session can be made
> >>   * available for searching without closing the writer.
> >>   *
> >>   * <p>It's near real-time because there is no hard
> >>   * guarantee on how quickly you can get a new reader after
> >>   * making changes with IndexWriter.  You'll have to
> >>   * experiment in your situation to determine if it's
> >>   * fast enough.  As this is a new and experimental
> >>   * feature, please report back on your findings so we can
> >>   * learn, improve and iterate.</p>
> >>   *
> >>   * <p>The resulting reader supports {@link
> >>   * IndexReader#reopen}, but that call will simply forward
> >>   * back to this method (though this may change in the
> >>   * future).</p>
> >>   *
> >>   * <p>The very first time this method is called, this
> >>   * writer instance will make every effort to pool the
> >>   * readers that it opens for doing merges, applying
> >>   * deletes, etc.  This means additional resources (RAM,
> >>   * file descriptors, CPU time) will be consumed.</p>
> >>   *
> >>   * <p>For lower latency on reopening a reader, you should
> >>   * call {@link #setMergedSegmentWarmer} to
> >>   * pre-warm a newly merged segment before it's committed
> >>   * to the index.  This is important for minimizing
> >>   * index-to-search delay after a large merge.  </p>
> >>   *
> >>   * <p>If an addIndexes* call is running in another thread,
> >>   * then this reader will only search those segments from
> >>   * the foreign index that have been successfully copied
> >>   * over so far.</p>
> >>   *
> >>   * <p><b>NOTE</b>: Once the writer is closed, any
> >>   * outstanding readers may continue to be used.  However,
> >>   * if you attempt to reopen any of those readers, you'll
> >>   * hit an {@link AlreadyClosedException}.</p>
> >>   *
> >>   * <p><b>NOTE:</b> This API is experimental and might
> >>   * change in incompatible ways in the next release.</p>
> >>   *
> >>   * @return IndexReader that covers entire index plus all
> >>   * changes made so far by this IndexWriter instance
> >>   *
> >>   * @throws IOException
> >>   */
> >>  public IndexReader getReader() throws IOException {
> >>
> >>
> >> -Yonik
> >> http://www.lucidimagination.com
> >>
> >>
> >> On Mon, Oct 12, 2009 at 4:18 PM, John Wang <jo...@gmail.com> wrote:
> >> > Oh, that is really good to know!
> >> > Is this deterministic? E.g., as long as writer.addDocument() is called,
> >> > does the next getReader() reflect the change? Does it work with deletes,
> >> > e.g. writer.deleteDocuments()?
> >> > Thanks Mike for clarifying!
> >> >
> >> > -John
> >> >
> >> > On Mon, Oct 12, 2009 at 12:11 PM, Michael McCandless <
> >> > lucene@mikemccandless.com> wrote:
> >> >
> >> >> Just to clarify: IndexWriter.getReader returns a reader that searches
> >> >> uncommitted changes as well.  I.e., you need not call IndexWriter.commit
> >> >> to make the changes visible.
> >> >>
> >> >> However, if you're opening a reader the "normal" way
> >> >> (IndexReader.open) then it is necessary to first call
> >> >> IndexWriter.commit.
> >> >>
> >> >> Mike

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

OK I just committed it -- thanks!

Mike

On Mon, Oct 12, 2009 at 5:01 PM, Jake Mannix <ja...@gmail.com> wrote:
> That seems a lot more straightforward Mike, thanks.
>
>  -jake
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > View this message in context:
>> >> >>
>> >>
>> http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
>> >> >> > Sent from the Lucene - Java Users mailing list archive at
>> Nabble.com.
>> >> >> >
>> >> >> >
>> >> >> >
>> ---------------------------------------------------------------------
>> >> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> >> >
>> >> >> >
>> >> >>
>> >> >> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> >>
>> >> >>
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>
>> >>
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>


Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

That seems a lot more straightforward Mike, thanks.

  -jake

On Mon, Oct 12, 2009 at 1:56 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> I agree, the javadocs could be improved.  How about something like
> this for the first 2 paragraphs:
>
>   * Returns a readonly reader, covering all committed as
>   * well as un-committed changes to the index.  This
>   * provides "near real-time" searching, in that changes
>   * made during an IndexWriter session can be quickly made
>   * available for searching without closing the writer or
>   * calling {@link #commit}.
>   *
>   * <p>Note that this is functionally equivalent to calling
>   * {@link #commit} and then using {@link IndexReader#open} to
>   * open a new reader.  But the turnaround time of this
>   * method should be faster since it avoids the potentially
>   * costly {@link #commit}.</p>
>
> Mike
>

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

I agree, the javadocs could be improved.  How about something like
this for the first 2 paragraphs:

   * Returns a readonly reader, covering all committed as
   * well as un-committed changes to the index.  This
   * provides "near real-time" searching, in that changes
   * made during an IndexWriter session can be quickly made
   * available for searching without closing the writer or
   * calling {@link #commit}.
   *
   * <p>Note that this is functionally equivalent to calling
   * {@link #commit} and then using {@link IndexReader#open} to
   * open a new reader.  But the turnaround time of this
   * method should be faster since it avoids the potentially
   * costly {@link #commit}.</p>

Mike
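
To make the committed/uncommitted distinction in the proposed wording
concrete, here is a toy model in plain Java. It is not Lucene code and
says nothing about how Lucene implements this: ToyWriter, nrtView() and
committedView() are hypothetical names standing in for IndexWriter,
IndexWriter.getReader() and IndexReader.open(), and documents are
reduced to strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy stand-in for an IndexWriter (hypothetical; not the Lucene class).
class ToyWriter {
    private final List<String> committed = new ArrayList<String>();
    private final List<String> pending = new ArrayList<String>();

    // Buffers a document; it is not yet part of the committed index.
    synchronized void addDocument(String doc) {
        pending.add(doc);
    }

    // Moves every buffered document into the committed index.
    synchronized void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // Models IndexReader.open(): point-in-time view of committed documents only.
    synchronized List<String> committedView() {
        return Collections.unmodifiableList(new ArrayList<String>(committed));
    }

    // Models getReader(): point-in-time view of committed plus uncommitted documents.
    synchronized List<String> nrtView() {
        List<String> all = new ArrayList<String>(committed);
        all.addAll(pending);
        return Collections.unmodifiableList(all);
    }
}
```

In this model a view taken before a later addDocument() does not see
that document, which mirrors the point-in-time behaviour of a reader:
you have to ask for a new view to see newer changes.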

On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix <ja...@gmail.com> wrote:
> Thanks Yonik,
>
>  It may be surprising, but in fact I have read that
> javadoc.  It talks about not needing to close the
> writer, but doesn't specifically talk about what
> the relationship between commit() calls and
> getReader() calls is.  I suppose I should have
> interpreted:
>
> "@returns a new reader which contains all
> changes..."
>
> to mean "all uncommitted changes", but why
> is it so obvious that what could be happening
> is that it only "returns all changes since the last
> commit, but without touching disk because it
> has docs in memory as well"?
>
>  -jake
>

Re: Realtime search best practices

Posted by melix <ce...@lingway.com>.

Ok, thanks for the details. I see I'm not the only one finding the javadoc
hard to understand. While the method is well documented, the exact
semantics of "changes" are still not clear enough: at first I thought it
returned an IndexReader on the *uncommitted changes only*, which meant it
did not include committed ones. Well, it should have been obvious that I
couldn't do anything with such a reader, but you know ;)

I'll try to implement something based on that. I think it won't be too
difficult, as I've got many writes and fewer reads. That means the
performance penalty of creating a searcher should be acceptable. In any
case, I'll keep you posted.
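
If the per-request cost does turn out to matter, the "cache a searcher
for some seconds" idea from the original question could be sketched
roughly as follows. This is a minimal sketch in plain Java, not Lucene
code: TimedCache is a hypothetical helper, and the factory is assumed to
be something that wraps writer.getReader() in a new IndexSearcher. It
deliberately leaves out closing the replaced searcher, which a real
implementation would have to do once in-flight searches have finished.

```java
import java.util.function.Supplier;

// Caches a value and rebuilds it at most once per maxAgeMillis
// (hypothetical helper; the cached value would be a searcher).
class TimedCache<T> {
    private final Supplier<T> factory;
    private final long maxAgeMillis;
    private T current;
    private long createdAt;

    TimedCache(Supplier<T> factory, long maxAgeMillis) {
        this.factory = factory;
        this.maxAgeMillis = maxAgeMillis;
    }

    // 'now' is passed in (e.g. System.currentTimeMillis()) so the
    // caller controls the clock.
    synchronized T get(long now) {
        if (current == null || now - createdAt >= maxAgeMillis) {
            current = factory.get(); // build a fresh value
            createdAt = now;
        }
        return current;
    }
}
```

The trade-off is the usual one: a larger maxAgeMillis means fewer
reopens but staler results, so the value has to be tuned against how
fresh the searches need to be.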


Jake Mannix wrote:
> 
> Thanks Yonik,
> 
>   It may be surprising, but in fact I have read that
> javadoc.  It talks about not needing to close the
> writer, but doesn't specifically talk about what
> the relationship between commit() calls and
> getReader() calls is.  I suppose I should have
> interpreted:
> 
> "@returns a new reader which contains all
> changes..."
> 
> to mean "all uncommitted changes", but why
> is it so obvious that what could be happening
> is that it only "returns all changes since the last
> commit, but without touching disk because it
> has docs in memory as well"?
> 
>   -jake
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25863095.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.



Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

Thanks Yonik,

  It may be surprising, but in fact I have read that
javadoc.  It talks about not needing to close the
writer, but doesn't specifically say what the
relationship between commit() calls and getReader()
calls is.  I suppose I should have interpreted:

"@returns a new reader which contains all
changes..."

to mean "all uncommitted changes", but why is it so
obvious that it could not instead be that it only
"returns all changes since the last commit, but
without touching disk because it has docs in memory
as well"?

  -jake

On Mon, Oct 12, 2009 at 1:26 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> Guys, please - you're not new at this... this is what JavaDoc is for:
>
>  /**
>   * Returns a readonly reader containing all
>   * current updates.  Flush is called automatically.  This
>   * provides "near real-time" searching, in that changes
>   * made during an IndexWriter session can be made
>   * available for searching without closing the writer.
>   *
>   * <p>It's near real-time because there is no hard
>   * guarantee on how quickly you can get a new reader after
>   * making changes with IndexWriter.  You'll have to
>   * experiment in your situation to determine if it's
>   * fast enough.  As this is a new and experimental
>   * feature, please report back on your findings so we can
>   * learn, improve and iterate.</p>
>   *
>   * <p>The resulting reader supports {@link
>   * IndexReader#reopen}, but that call will simply forward
>   * back to this method (though this may change in the
>   * future).</p>
>   *
>   * <p>The very first time this method is called, this
>   * writer instance will make every effort to pool the
>   * readers that it opens for doing merges, applying
>   * deletes, etc.  This means additional resources (RAM,
>   * file descriptors, CPU time) will be consumed.</p>
>   *
>   * <p>For lower latency on reopening a reader, you should
>   * call {@link #setMergedSegmentWarmer} to
>   * pre-warm a newly merged segment before it's committed
>   * to the index.  This is important for minimizing
>   * index-to-search delay after a large merge.  </p>
>   *
>   * <p>If an addIndexes* call is running in another thread,
>   * then this reader will only search those segments from
>   * the foreign index that have been successfully copied
>   * over, so far</p>.
>   *
>   * <p><b>NOTE</b>: Once the writer is closed, any
>   * outstanding readers may continue to be used.  However,
>   * if you attempt to reopen any of those readers, you'll
>   * hit an {@link AlreadyClosedException}.</p>
>   *
>   * <p><b>NOTE:</b> This API is experimental and might
>   * change in incompatible ways in the next release.</p>
>   *
>   * @return IndexReader that covers entire index plus all
>   * changes made so far by this IndexWriter instance
>   *
>   * @throws IOException
>   */
>  public IndexReader getReader() throws IOException {
>
>
> -Yonik
> http://www.lucidimagination.com
>
>

Re: Realtime search best practices

Posted by Yonik Seeley <yo...@lucidimagination.com>.

Guys, please - you're not new at this... this is what JavaDoc is for:

  /**
   * Returns a readonly reader containing all
   * current updates.  Flush is called automatically.  This
   * provides "near real-time" searching, in that changes
   * made during an IndexWriter session can be made
   * available for searching without closing the writer.
   *
   * <p>It's near real-time because there is no hard
   * guarantee on how quickly you can get a new reader after
   * making changes with IndexWriter.  You'll have to
   * experiment in your situation to determine if it's
   * fast enough.  As this is a new and experimental
   * feature, please report back on your findings so we can
   * learn, improve and iterate.</p>
   *
   * <p>The resulting reader supports {@link
   * IndexReader#reopen}, but that call will simply forward
   * back to this method (though this may change in the
   * future).</p>
   *
   * <p>The very first time this method is called, this
   * writer instance will make every effort to pool the
   * readers that it opens for doing merges, applying
   * deletes, etc.  This means additional resources (RAM,
   * file descriptors, CPU time) will be consumed.</p>
   *
   * <p>For lower latency on reopening a reader, you should
   * call {@link #setMergedSegmentWarmer} to
   * pre-warm a newly merged segment before it's committed
   * to the index.  This is important for minimizing
   * index-to-search delay after a large merge.  </p>
   *
   * <p>If an addIndexes* call is running in another thread,
   * then this reader will only search those segments from
   * the foreign index that have been successfully copied
   * over, so far.</p>
   *
   * <p><b>NOTE</b>: Once the writer is closed, any
   * outstanding readers may continue to be used.  However,
   * if you attempt to reopen any of those readers, you'll
   * hit an {@link AlreadyClosedException}.</p>
   *
   * <p><b>NOTE:</b> This API is experimental and might
   * change in incompatible ways in the next release.</p>
   *
   * @return IndexReader that covers entire index plus all
   * changes made so far by this IndexWriter instance
   *
   * @throws IOException
   */
  public IndexReader getReader() throws IOException {

-Yonik
http://www.lucidimagination.com
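
On the original question of whether to open a searcher for every read request: the javadoc above gives no hard guarantee on how quickly a new reader comes back, so one option is to share a single searcher and refresh it at most once per interval. The following is only a sketch of that idea in plain Java (8 or later), not Lucene code. TimedCache, TimedCacheDemo and the injected clock are hypothetical names made up for illustration; in a real application the Supplier would presumably wrap writer.getReader() and new IndexSearcher(reader).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Lucene): hands out one shared value and
 * asks the source for a fresh one only after maxAgeMillis has elapsed.
 */
final class TimedCache<T> {
    private final Supplier<T> source;  // assumed to wrap "open a new searcher"
    private final long maxAgeMillis;   // how stale the shared value may get
    private final LongSupplier clock;  // injectable so the logic can be tested
    private T current;
    private long loadedAt;

    TimedCache(Supplier<T> source, long maxAgeMillis, LongSupplier clock) {
        this.source = source;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, refreshing it first if it is too old. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (current == null || now - loadedAt >= maxAgeMillis) {
            current = source.get();
            loadedAt = now;
        }
        return current;
    }
}

final class TimedCacheDemo {
    public static void main(String[] args) {
        int[] opens = {0};
        // Stand-in for "open a new searcher"; here it only counts the calls.
        TimedCache<String> cache = new TimedCache<>(
                () -> "searcher#" + (++opens[0]), 2000L, System::currentTimeMillis);
        for (int i = 0; i < 5; i++) {
            cache.get(); // five "read requests" in quick succession
        }
        System.out.println("read requests: 5, searchers opened: " + opens[0]);
    }
}
```

The sketch leaves out the hard part: a real version would also have to close the reader it replaces, and only once all in-flight searches on it have finished. Whether a delay of a few seconds is acceptable depends on how fresh the results need to be.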

On Mon, Oct 12, 2009 at 4:18 PM, John Wang <jo...@gmail.com> wrote:
> Oh, that is really good to know!
> Is this deterministic? e.g. as long as writer.addDocument() is called, next
> getReader reflects the change? Does it work with deletes? e.g.
> writer.deleteDocuments()?
> Thanks Mike for clarifying!
>
> -John
>
> On Mon, Oct 12, 2009 at 12:11 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Just to clarify: IndexWriter.getReader returns a reader that searches
>> uncommitted changes as well.  Ie, you need not call IndexWriter.commit
>> to make the changes visible.
>>
>> However, if you're opening a reader the "normal" way
>> (IndexReader.open) then it is necessary to first call
>> IndexWriter.commit.
>>
>> Mike
>>
>> On Mon, Oct 12, 2009 at 5:24 AM, melix <ce...@lingway.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > I'm going to replace an old reader/writer synchronization mechanism we had
>> > implemented with the new near realtime search facilities in Lucene 2.9.
>> > However, it's still a bit unclear on how to efficiently do it.
>> >
>> > Is the following implementation the good way to do achieve it ? The context
>> > is concurrent read/writes on an index :
>> >
>> > 1. create a Directory instance
>> > 2. create a writer on this directory
>> > 3. on each write request, add document to the writer
>> > 4. on each read request,
>> >  a. use writer.getReader() to obtain an up-to-date reader
>> >  b. create an IndexSearcher with that reader
>> >  c. perform Query
>> >  d. close IndexSearcher
>> > 5. on application close
>> >  a. close writer
>> >  b. close directory
>> >
>> > While this seems to be ok, I'm really wondering about the performance of
>> > opening a searcher for each request. I could introduce some kind of delay
>> > and cache a searcher for some seconds, but I'm not sure it's the best thing
>> > to do.
>> >
>> > Thanks,
>> >
>> > Cedric
>> >
>> >
>> > --
>> > View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
>> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: Realtime search best practices

Posted by John Wang <jo...@gmail.com>.

Oh, that is really good to know!
Is this deterministic? e.g. as long as writer.addDocument() is called, next
getReader reflects the change? Does it work with deletes? e.g.
writer.deleteDocuments()?
Thanks Mike for clarifying!

-John

On Mon, Oct 12, 2009 at 12:11 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Just to clarify: IndexWriter.getReader returns a reader that searches
> uncommitted changes as well.  Ie, you need not call IndexWriter.commit
> to make the changes visible.
>
> However, if you're opening a reader the "normal" way
> (IndexReader.open) then it is necessary to first call
> IndexWriter.commit.
>
> Mike
>
> On Mon, Oct 12, 2009 at 5:24 AM, melix <ce...@lingway.com>
> wrote:
> >
> > Hi,
> >
> > I'm going to replace an old reader/writer synchronization mechanism we had
> > implemented with the new near realtime search facilities in Lucene 2.9.
> > However, it's still a bit unclear on how to efficiently do it.
> >
> > Is the following implementation the good way to do achieve it ? The context
> > is concurrent read/writes on an index :
> >
> > 1. create a Directory instance
> > 2. create a writer on this directory
> > 3. on each write request, add document to the writer
> > 4. on each read request,
> >  a. use writer.getReader() to obtain an up-to-date reader
> >  b. create an IndexSearcher with that reader
> >  c. perform Query
> >  d. close IndexSearcher
> > 5. on application close
> >  a. close writer
> >  b. close directory
> >
> > While this seems to be ok, I'm really wondering about the performance of
> > opening a searcher for each request. I could introduce some kind of delay
> > and cache a searcher for some seconds, but I'm not sure it's the best thing
> > to do.
> >
> > Thanks,
> >
> > Cedric
> >
> >
> > --
> > View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

On Mon, Oct 12, 2009 at 12:26 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Mon, Oct 12, 2009 at 3:17 PM, Jake Mannix <ja...@gmail.com>
> wrote:
>
> > Wait, so according to the javadocs, the IndexReader which you got from
> > the IndexWriter forwards calls to reopen() back to IndexWriter.getReader(),
> > which means that if the user has a NRT reader, and the user keeps calling
> > reopen() on it, they're getting uncommitted changes as well, while if they
> > call reopen() on a regular IndexReader, they do not?
>
> That's right.
>

So maybe since it's an "expert" feature, this is ok, but if users are used to
calling isCurrent() on their reader instances, this seems like it might get
confusing, since now some readers are even more current than "current". In
fact, an NRT reader may be current w.r.t. the most recent commit, but calling
reopen() on it will still make it more current, in that it now gets a view of
even more recent uncommitted changes...

  -jake

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Mon, Oct 12, 2009 at 3:17 PM, Jake Mannix <ja...@gmail.com> wrote:

> Wait, so according to the javadocs, the IndexReader which you got from
> the IndexWriter forwards calls to reopen() back to IndexWriter.getReader(),
> which means that if the user has a NRT reader, and the user keeps calling
> reopen() on it, they're getting uncommitted changes as well, while if they
> call reopen() on a regular IndexReader, they do not?

That's right.

> How does this play nicely with the transactional semantics given by
> commit()?

The transactional semantics are still intact... it's just that an NRT
reader sees the uncommitted changes, ie, all changes done since the
last commit.

If disaster strikes (machine/os/jvm crashes, power loss, kill -9,
etc.) then on reboot/restart your index will still only show the last
successful commit.
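
As a rough illustration of the semantics described above, here is a toy model in plain Java. It is not Lucene code and says nothing about how IndexWriter is implemented; ToyWriter and its method names are hypothetical, chosen only to mirror the three cases in this thread: a near-real-time view, a view of the last commit, and a restart after a crash.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only, NOT Lucene: separates committed docs from pending ones. */
final class ToyWriter {
    private final List<String> committed = new ArrayList<>(); // survives a crash
    private final List<String> pending = new ArrayList<>();   // added since last commit

    void addDocument(String doc) {
        pending.add(doc);
    }

    /** Analogue of a reader from getReader(): committed plus pending docs. */
    List<String> nearRealTimeView() {
        List<String> view = new ArrayList<>(committed);
        view.addAll(pending);
        return Collections.unmodifiableList(view);
    }

    /** Analogue of a reader opened the "normal" way: last commit only. */
    List<String> committedView() {
        return Collections.unmodifiableList(new ArrayList<>(committed));
    }

    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Simulates a crash followed by a restart: pending docs are lost. */
    void crashAndRestart() {
        pending.clear();
    }
}

final class ToyWriterDemo {
    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        writer.addDocument("doc1");
        writer.commit();
        writer.addDocument("doc2"); // not committed yet
        System.out.println("NRT view:       " + writer.nearRealTimeView());
        System.out.println("committed view: " + writer.committedView());
        writer.crashAndRestart();
        System.out.println("after restart:  " + writer.nearRealTimeView());
    }
}
```

In this model commit() still decides what survives a restart; the near-real-time view only changes what a search can see in the meantime.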

Mike

Re: Realtime search best practices

Posted by Jake Mannix <ja...@gmail.com>.

Wait, so according to the javadocs, the IndexReader which you got from
the IndexWriter forwards calls to reopen() back to IndexWriter.getReader(),
which means that if the user has a NRT reader, and the user keeps calling
reopen() on it, they're getting uncommitted changes as well, while if they
call reopen() on a regular IndexReader, they do not?

How does this play nicely with the transactional semantics given by
commit()?

On Mon, Oct 12, 2009 at 12:11 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Just to clarify: IndexWriter.getReader returns a reader that searches
> uncommitted changes as well.  Ie, you need not call IndexWriter.commit
> to make the changes visible.
>
> However, if you're opening a reader the "normal" way
> (IndexReader.open) then it is necessary to first call
> IndexWriter.commit.
>
> Mike
>
> On Mon, Oct 12, 2009 at 5:24 AM, melix <ce...@lingway.com>
> wrote:
> >
> > Hi,
> >
> > I'm going to replace an old reader/writer synchronization mechanism we had
> > implemented with the new near realtime search facilities in Lucene 2.9.
> > However, it's still a bit unclear on how to efficiently do it.
> >
> > Is the following implementation the good way to do achieve it ? The context
> > is concurrent read/writes on an index :
> >
> > 1. create a Directory instance
> > 2. create a writer on this directory
> > 3. on each write request, add document to the writer
> > 4. on each read request,
> >  a. use writer.getReader() to obtain an up-to-date reader
> >  b. create an IndexSearcher with that reader
> >  c. perform Query
> >  d. close IndexSearcher
> > 5. on application close
> >  a. close writer
> >  b. close directory
> >
> > While this seems to be ok, I'm really wondering about the performance of
> > opening a searcher for each request. I could introduce some kind of delay
> > and cache a searcher for some seconds, but I'm not sure it's the best thing
> > to do.
> >
> > Thanks,
> >
> > Cedric
> >
> >
> > --
> > View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: Realtime search best practices

Posted by Michael McCandless <lu...@mikemccandless.com>.

Just to clarify: IndexWriter.getReader returns a reader that searches
uncommitted changes as well.  Ie, you need not call IndexWriter.commit
to make the changes visible.

However, if you're opening a reader the "normal" way
(IndexReader.open) then it is necessary to first call
IndexWriter.commit.

Mike

On Mon, Oct 12, 2009 at 5:24 AM, melix <ce...@lingway.com> wrote:
>
> Hi,
>
> I'm going to replace an old reader/writer synchronization mechanism we had
> implemented with the new near realtime search facilities in Lucene 2.9.
> However, it's still a bit unclear on how to efficiently do it.
>
> Is the following implementation the good way to do achieve it ? The context
> is concurrent read/writes on an index :
>
> 1. create a Directory instance
> 2. create a writer on this directory
> 3. on each write request, add document to the writer
> 4. on each read request,
>  a. use writer.getReader() to obtain an up-to-date reader
>  b. create an IndexSearcher with that reader
>  c. perform Query
>  d. close IndexSearcher
> 5. on application close
>  a. close writer
>  b. close directory
>
> While this seems to be ok, I'm really wondering about the performance of
> opening a searcher for each request. I could introduce some kind of delay
> and cache a searcher for some seconds, but I'm not sure it's the best thing
> to do.
>
> Thanks,
>
> Cedric
>
>
> --
> View this message in context: http://www.nabble.com/Realtime-search-best-practices-tp25852756p25852756.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.