You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@lucene.apache.org by Jason Rutherglen <ja...@gmail.com> on 2008/12/24 02:51:43 UTC

Realtime Search

We've discussed realtime search before, it looks like after the next release
we can get some sort of realtime search working.  I was going to open a new
issue but decided it might be best to discuss realtime search on the dev
list.

Lucene can implement realtime search as the ability to add, update, or
delete documents with latency in the sub 5 millisecond range.  A couple of
different options are available.

1) Expose a rolling set of realtime readers over the memory index used by
IndexWriter.  Requires incrementally updating field caches and filters, and
is somewhat unclear how IndexReader versioning would work (for example
versions of the term dictionary).
2) Implement realtime search by incrementally creating and merging readers
in memory.  The system would use MemoryIndex or InstantiatedIndex to quickly
(more quickly than RAMDirectory) create indexes from added documents.  The
in memory indexes would be periodically merged in the background and
according to RAM used write to disk.  Each update would generate a new
IndexReader or MultiSearcher that includes the new updates.  Field caches
and filters could be cached per IndexReader according to how Lucene works
today.  The downside of this approach is the indexing will not be as fast as
#1 because of the in memory merging which similar to the Lucene pre 2.3
which merged in memory segments using RAMDirectory.

Are there other implementation options?

A new patch would focus on providing in memory indexing as part of the core
of Lucene.  The work of LUCENE-1483 and LUCENE-1314 would be used.  I am not
sure if option #2 can become part of core if it relies on a contrib module?
It makes sense to provide a new realtime oriented merge policy that merges
segments based on the number of deletes rather than a merge factor.  The
realtime merge policy would keep the segments within a minimum and maximum
size in kilobytes to limit the time consumed by merging which is assumed
would occur frequently.

LUCENE-1313 which includes a transaction log with rollback and was designed
with distributed search and may be retired or the components split out.

Re: Realtime Search

Posted by "J. Delgado" <jo...@gmail.com>.

One thing that I forgot to mention is that in our implementation the
real-time indexing took place with many "folder-based" listeners writing  to
many  tiny in-memory indexes partitioned by "sub-sources" with fewer
long-term and archive indexes per box. Overall distributed search across
various lucene-based search services was done using a federator component,
very much like shard based searches is done today (I believe).

-- Joaquin.
l


On Fri, Dec 26, 2008 at 10:48 AM, J. Delgado <jo...@gmail.com>wrote:

> The addition of docs into tiny segments using the current data structures
> seems the right way to go. Sometime back one of my engineers implemented
> pseudo real-time using MultiSearcher by having an in-memory (RAM based)
> "short-term" index that auto-merged into a disk-based "long term" index that
> eventually get merged into "archive" indexes. Index optimization would take
> place during these merges. The search we required was very time-sensitive
> (searching last-minute breaking news wires). The advantage of having an
> archive index is that very old documents in our applications were not
> usually searched on unless archives were explicitely selected.
>
> -- Joaquin
>
>
> On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting <cu...@apache.org> wrote:
>
>> Michael McCandless wrote:
>>
>>> So then I think we should start with approach #2 (build real-time on
>>> top of the Lucene core) and iterate from there.  Newly added docs go
>>> into a tiny segments, which IndexReader.reopen pulls in.  Replaced or
>>> deleted docs record the delete against the right SegmentReader (and
>>> LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).
>>>
>>> I would take the simple approach first: use ordinary SegmentReader on
>>> a RAMDirectory for the tiny segments.  If that proves too slow, swap
>>> in Memory/InstantiatedIndex for the tiny segments.  If that proves too
>>> slow, build a reader impl that reads from DocumentsWriter RAM buffer.
>>>
>>
>> +1 This sounds like a good approach to me.  I don't see any fundamental
>> reasons why we need different representations, and fewer implementations of
>> IndexWriter and IndexReader is generally better, unless they get way too
>> hairy.  Mostly it seems that real-time can be done with our existing toolbox
>> of datastructures, but with some slightly different control structures.
>>  Once we have the control structure in place then we should look at
>> optimizing data structures as needed.
>>
>> Doug
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>

Re: Realtime Search

Posted by "J. Delgado" <jo...@gmail.com>.

The addition of docs into tiny segments using the current data structures
seems the right way to go. Sometime back one of my engineers implemented
pseudo real-time using MultiSearcher by having an in-memory (RAM based)
"short-term" index that auto-merged into a disk-based "long term" index that
eventually get merged into "archive" indexes. Index optimization would take
place during these merges. The search we required was very time-sensitive
(searching last-minute breaking news wires). The advantage of having an
archive index is that very old documents in our applications were not
usually searched on unless archives were explicitely selected.

-- Joaquin

On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting <cu...@apache.org> wrote:

> Michael McCandless wrote:
>
>> So then I think we should start with approach #2 (build real-time on
>> top of the Lucene core) and iterate from there.  Newly added docs go
>> into a tiny segments, which IndexReader.reopen pulls in.  Replaced or
>> deleted docs record the delete against the right SegmentReader (and
>> LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).
>>
>> I would take the simple approach first: use ordinary SegmentReader on
>> a RAMDirectory for the tiny segments.  If that proves too slow, swap
>> in Memory/InstantiatedIndex for the tiny segments.  If that proves too
>> slow, build a reader impl that reads from DocumentsWriter RAM buffer.
>>
>
> +1 This sounds like a good approach to me.  I don't see any fundamental
> reasons why we need different representations, and fewer implementations of
> IndexWriter and IndexReader is generally better, unless they get way too
> hairy.  Mostly it seems that real-time can be done with our existing toolbox
> of datastructures, but with some slightly different control structures.
>  Once we have the control structure in place then we should look at
> optimizing data structures as needed.
>
> Doug
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Grant Ingersoll <gs...@apache.org>.

Just thinking out loud...  haven't looked at your patch yet (one of  
these days I will be back up for air)

My initial thought is that you would have a factory that produced both  
the Reader and the Writer as a pair, or was at least aware of what to  
go get from the Writer

Something like:

class IndexFactory{
	IndexWriter getWriter()

	IndexReader getReader()

	//Not sure if this is needed yet, but????
     	IndexReader getReader(IndexWriter)
}

The factory (or whatever you want to call it) is responsible for  
making sure the Writer and Reader have the pieces they need, i.e. the  
SegmentInfos.

The first getReader will get you the plain old Reader that everyone  
knows and loves today (assuming there is a benefit to keeping it  
around), the second one knows what to get off the Writer to create the  
appropriate Reader.

It's nothing particularly hard to implement over what you are  
proposing, I don't think.  Just trying to keep the Reader out of the  
Writer from an API cleanliness standpoint.

-Grant


On Jan 12, 2009, at 12:55 PM, Jason Rutherglen wrote:

> Grant,
>
> Do you have a proposal in mind?  It would help to suggest something  
> like some classes and methods to help understand an alternative to  
> what is being discussed.
>
> -J
>
> On Fri, Jan 9, 2009 at 12:05 PM, Grant Ingersoll  
> <gs...@apache.org> wrote:
> I realize we aren't adding read functionality to the Writer, but it  
> would be coupling the Writer to the Reader nonetheless.  I  
> understand it is brainstorming (like I said, not trying to distract  
> from the discussion), just saying that if the Reader and the Writer  
> both need access to the underlying data structures, then we should  
> refactor to make that possible, not just glom the Reader onto the  
> Writer.  I suspect if that is done, anyway, that it may make the  
> bigger picture a bit clearer, too.
>
>
> On Jan 9, 2009, at 2:53 PM, Michael McCandless wrote:
>
>
> Grant Ingersoll wrote:
>
> We've spent a lot of time up until now getting write functionality  
> out of the Reader, and now we are going to add read functionality  
> into the Writer?
>
> Well... we're not really adding read functionality into IW; instead,
> we are asking IW to open the reader for us, except the reader is
> provided the SegmentInfos it should use from IW (instead of trying to
> find the latest segments_N file in the Directory).
>
> Ie, what IW.getReader returns is an otherwise normal
> MultiSegmentReader.
>
> The goal is to allow an IndexReader to access segments flushed but
> not yet committed by IW.  These segments are normally "private" to IW,
> in memory in its SegmentInfos instance.
>
> And this is all just thinking-out-loud-brainstorming.  There are  
> still many
> details to work through...
>
> Mike
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

Grant,

Do you have a proposal in mind?  It would help to suggest something like
some classes and methods to help understand an alternative to what is being
discussed.

-J

On Fri, Jan 9, 2009 at 12:05 PM, Grant Ingersoll <gs...@apache.org>wrote:

> I realize we aren't adding read functionality to the Writer, but it would
> be coupling the Writer to the Reader nonetheless.  I understand it is
> brainstorming (like I said, not trying to distract from the discussion),
> just saying that if the Reader and the Writer both need access to the
> underlying data structures, then we should refactor to make that possible,
> not just glom the Reader onto the Writer.  I suspect if that is done,
> anyway, that it may make the bigger picture a bit clearer, too.
>
>
> On Jan 9, 2009, at 2:53 PM, Michael McCandless wrote:
>
>
>> Grant Ingersoll wrote:
>>
>>  We've spent a lot of time up until now getting write functionality out of
>>> the Reader, and now we are going to add read functionality into the Writer?
>>>
>>
>> Well... we're not really adding read functionality into IW; instead,
>> we are asking IW to open the reader for us, except the reader is
>> provided the SegmentInfos it should use from IW (instead of trying to
>> find the latest segments_N file in the Directory).
>>
>> Ie, what IW.getReader returns is an otherwise normal
>> MultiSegmentReader.
>>
>> The goal is to allow an IndexReader to access segments flushed but
>> not yet committed by IW.  These segments are normally "private" to IW,
>> in memory in its SegmentInfos instance.
>>
>> And this is all just thinking-out-loud-brainstorming.  There are still
>> many
>> details to work through...
>>
>> Mike
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Grant Ingersoll <gs...@apache.org>.

I realize we aren't adding read functionality to the Writer, but it  
would be coupling the Writer to the Reader nonetheless.  I understand  
it is brainstorming (like I said, not trying to distract from the  
discussion), just saying that if the Reader and the Writer both need  
access to the underlying data structures, then we should refactor to  
make that possible, not just glom the Reader onto the Writer.  I  
suspect if that is done, anyway, that it may make the bigger picture a  
bit clearer, too.

On Jan 9, 2009, at 2:53 PM, Michael McCandless wrote:

>
> Grant Ingersoll wrote:
>
>> We've spent a lot of time up until now getting write functionality  
>> out of the Reader, and now we are going to add read functionality  
>> into the Writer?
>
> Well... we're not really adding read functionality into IW; instead,
> we are asking IW to open the reader for us, except the reader is
> provided the SegmentInfos it should use from IW (instead of trying to
> find the latest segments_N file in the Directory).
>
> Ie, what IW.getReader returns is an otherwise normal
> MultiSegmentReader.
>
> The goal is to allow an IndexReader to access segments flushed but
> not yet committed by IW.  These segments are normally "private" to IW,
> in memory in its SegmentInfos instance.
>
> And this is all just thinking-out-loud-brainstorming.  There are  
> still many
> details to work through...
>
> Mike
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Grant Ingersoll wrote:

> We've spent a lot of time up until now getting write functionality  
> out of the Reader, and now we are going to add read functionality  
> into the Writer?

Well... we're not really adding read functionality into IW; instead,
we are asking IW to open the reader for us, except the reader is
provided the SegmentInfos it should use from IW (instead of trying to
find the latest segments_N file in the Directory).

Ie, what IW.getReader returns is an otherwise normal
MultiSegmentReader.

The goal is to allow an IndexReader to access segments flushed but
not yet committed by IW.  These segments are normally "private" to IW,
in memory in its SegmentInfos instance.

And this is all just thinking-out-loud-brainstorming.  There are still  
many
details to work through...

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Grant Ingersoll <gs...@apache.org>.

On Jan 9, 2009, at 8:39 AM, Michael McCandless wrote:

>
> Jason Rutherglen wrote:
>
>> Patch #1: Expose an IndexWriter.getReader method that returns the  
>> current reader and shares the write lock
>
> I tentatively like this approach so far...
>
> That reader is opened using IndexWriter's SegmentInfos instance, so it
> can read segments & deletions that have been flushed but not
> committed.  It's allowed to do its own deletions & norms updating.
> When reopen() is called, it grabs the writers SegmentInfos again.

Minor design nit...
We've spent a lot of time up until now getting write functionality out  
of the Reader, and now we are going to add read functionality into the  
Writer?  Is that the right thing to do?  Perhaps there is an interface  
or some shared objects to be used/exposed or maybe people should get  
Readers/Writers from a factory and you could have a RT Factory and a  
default Factory?  Not trying to distract from the deeper issues here,  
but I don't think it makes sense to have the Writer coupled to the  
Reader.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

"Patch #2: Implement a realtime ram index class I think this one is
optional, or, rather an optimazation that we can swap in later
if/when necessary? Ie for starters little segments are written into
the main Directory."

John, Zoie could be of use for this patch. In addition, we may want to
implement flushing the IW ram buffer to a RAMDir for reading as M.M.
suggested.

First though the IW -> IR integration LUCENE-1516 needs to be
implemented otherwise it's not possible to properly execute updates
in realtime.


On Fri, Jan 9, 2009 at 5:39 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Jason Rutherglen wrote:
>
>  Patch #1: Expose an IndexWriter.getReader method that returns the current
>> reader and shares the write lock
>>
>
> I tentatively like this approach so far...
>
> That reader is opened using IndexWriter's SegmentInfos instance, so it
> can read segments & deletions that have been flushed but not
> committed.  It's allowed to do its own deletions & norms updating.
> When reopen() is called, it grabs the writers SegmentInfos again.
>
>  Patch #2: Implement a realtime ram index class
>>
>
> I think this one is optional, or, rather an optimazation that we can
> swap in later if/when necessary?  Ie for starters little segments are
> written into the main Directory.
>
>  Patch #3: Implement realtime transactions in IndexWriter or in a subclass
>> of IndexWriter by implementing a createTransaction method that generates a
>> realtime Transaction object.  When the transaction is flushed, the
>> transaction index modifications are available via the getReader method of
>> IndexWriter
>>
>
> Can't this be layered on top?
>
> Or... are you looking to add support for multiple transactions in
> flight at once on IndexWriter?
>
> Mike
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

I think the IW integrated IR needs a rule regarding the behavior of
IW.flush and IR.flush. There will need to be a flush lock that is
shared between the IW and IR. The lock is acquired at the beginning
of a flush and released immediately after a successful or
unsuccessful call. We will need to share this lock down to the
SegmentReader level as presumably IR.getSequentialReaders may be
called and operated on individually.

A few questions that need to be answered as to desired behavior. What
happens when IW flushes w/deletes and IR has pending deletes not
flushed yet? Can we automatically flush the IR deletes? If not
automatically flushed are the IR deletes still valid and can the IR
later flush them and not create a conflict (I think this is doable)?
Or does the reader become readonly and IR.reopen must be called to
obtain the new deletes? In the reverse scenario where IW has pending
deletes and IR flushes deletes, are there issues that arise when IW
later flushes? I think if it's made clear to the user the
implications of using IR and IW in combination for deletes then there
should not be an issue with supporting deletes from IR and IW.

(I found another way to format with hard line breaks
http://emailformattool.com/)

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

> deletes made through reader (by docID) are immediately visible, but
through writer are buffered until a flush or reopen?

This is what I was thinking, IW buffers deletes, IR does not. Making
IW.deletes visible immediately by applying them to the IR makes sense
as well.

What should be the behavior of IW.updateDocument?

LUCENE-1314 is in and we've agreed IR.reopen causes an IW.flush
so I'll continue the LUCENE-1516 patch.

On Fri, Jan 30, 2009 at 6:04 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Jason Rutherglen <ja...@gmail.com> wrote:
>
> > > We'd also need to ensure when a merge kicks off, the SegmentReaders
> > > used by the merging are not newly reopened but also "borrowed" from
> >
> > The IW merge code currently opens the SegmentReader with a 4096
> > buffer size (different than the 1024 default), how will this case be
> > handled?
>
> I think we'd just use 1024 when merging.
>
> > > reopen would then flush any added docs to new segments
> >
> > IR.reopen would call IW.flush?
>
> I think it has to?  (Whether "it" is IR.reopen, or a class that sits
> on top of both IR & IW, I'm not sure).
>
> Ie the interface would be you add/delete/updateDoc, setNorm a bunch of
> times, during which none of these changes are visible to your
> currently open reader, followed by "reopen" to get a reader that then
> sees those changes?
>
> (This is all still brainstorming at this point of course....)
>
> > > When IW.commit is called, it also then asks each SegmentReader to
> > > commit. Ie, IR.commit would not be used.
> >
> > Why is this? SegmentReader.commitChanges would be called instead?
>
> Because IR.commit is doing other stuff (invoking deletion policy,
> syncing newly referenced files, writing new segments file, rollback
> logic on hitting an exception, etc.) that overlaps what IW.commit also
> does.  It'd be great to factor this common stuff out so IW and IR
> would share a single source.  (Yes, SR.commitChanges would be called
> directly, I think).
>
> > > Then when reopen is called, we must internally reopen that clone()
> > > such that its deleted docs are carried over to the newly reopened
> > > reader and newly flushed docs from IW are visible as new
> > > SegmentReaders.
> >
> > If deletes are made to the external reader (meaning the one obtained
> > by IW.getReader), then deletes are made via IW.deleteDocument, then
> > reopen is called, what happens in this case? We will need to merge
> > the del docs from the internal clone into the newly reopened reader?
>
> I guess we could merge them.  Ie, deletes made through reader (by
> docID) are immediately visible, but through through writer are
> buffered until a flush or reopen?
>
> Still, I don't like exposing two ways to do deletions, with two
> different behaviours (buffered or not).  It's weird.  Maybe, instead,
> all deletes done via IW would be immediate?
>
> It seems like either 1) all deletes are buffered until reopen, or 2)
> all deletes are immediately materialized.  I think half/half is too
> strange.
>
> > > the IR becomes transactional as well -- deletes are not visible
> > > immediately until reopen is called
> >
> > Interesting. I'd rather somehow merge the IW and external reader's
> > deletes, otherwise it seems like we're radically changing how IR
> > works. Perhaps the IW keeps a copy of the external IR that has the
> > write lock (thinking of IR.clone where the write lock is passed onto
> > the latest clone). This way IW.getReader is about the same as
> > reopen/clone (because it will call reopen on presumably the latest
> > IR).
>
> We'd only be "radically changing" how the RealTimeReader works.
>
> I think the initial approach here might be to simply open up enough
> package-private APIs or subclass-ability on IR and IW so that we can
> experiment with these realtime ideas.  Then we iterate w/ different
> experiments to see how things flesh out...
>
> Actually could you redo LUCENE-1516 now that LUCENE-1314 is in?
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Jason Rutherglen <ja...@gmail.com> wrote:

> > We'd also need to ensure when a merge kicks off, the SegmentReaders
> > used by the merging are not newly reopened but also "borrowed" from
>
> The IW merge code currently opens the SegmentReader with a 4096
> buffer size (different than the 1024 default), how will this case be
> handled?

I think we'd just use 1024 when merging.

> > reopen would then flush any added docs to new segments
>
> IR.reopen would call IW.flush?

I think it has to?  (Whether "it" is IR.reopen, or a class that sits
on top of both IR & IW, I'm not sure).

Ie the interface would be you add/delete/updateDoc, setNorm a bunch of
times, during which none of these changes are visible to your
currently open reader, followed by "reopen" to get a reader that then
sees those changes?

(This is all still brainstorming at this point of course....)

> > When IW.commit is called, it also then asks each SegmentReader to
> > commit. Ie, IR.commit would not be used.
>
> Why is this? SegmentReader.commitChanges would be called instead?

Because IR.commit is doing other stuff (invoking deletion policy,
syncing newly referenced files, writing new segments file, rollback
logic on hitting an exception, etc.) that overlaps what IW.commit also
does.  It'd be great to factor this common stuff out so IW and IR
would share a single source.  (Yes, SR.commitChanges would be called
directly, I think).

> > Then when reopen is called, we must internally reopen that clone()
> > such that its deleted docs are carried over to the newly reopened
> > reader and newly flushed docs from IW are visible as new
> > SegmentReaders.
>
> If deletes are made to the external reader (meaning the one obtained
> by IW.getReader), then deletes are made via IW.deleteDocument, then
> reopen is called, what happens in this case? We will need to merge
> the del docs from the internal clone into the newly reopened reader?

I guess we could merge them.  Ie, deletes made through reader (by
docID) are immediately visible, but through through writer are
buffered until a flush or reopen?

Still, I don't like exposing two ways to do deletions, with two
different behaviours (buffered or not).  It's weird.  Maybe, instead,
all deletes done via IW would be immediate?

It seems like either 1) all deletes are buffered until reopen, or 2)
all deletes are immediately materialized.  I think half/half is too
strange.

> > the IR becomes transactional as well -- deletes are not visible
> > immediately until reopen is called
>
> Interesting. I'd rather somehow merge the IW and external reader's
> deletes, otherwise it seems like we're radically changing how IR
> works. Perhaps the IW keeps a copy of the external IR that has the
> write lock (thinking of IR.clone where the write lock is passed onto
> the latest clone). This way IW.getReader is about the same as
> reopen/clone (because it will call reopen on presumably the latest
> IR).

We'd only be "radically changing" how the RealTimeReader works.

I think the initial approach here might be to simply open up enough
package-private APIs or subclass-ability on IR and IW so that we can
experiment with these realtime ideas.  Then we iterate w/ different
experiments to see how things flesh out...

Actually could you redo LUCENE-1516 now that LUCENE-1314 is in?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

> We'd also need to ensure when a merge kicks off, the SegmentReaders
used by the merging are not newly reopened but also "borrowed" from

The IW merge code currently opens the SegmentReader with a 4096
buffer size (different than the 1024 default), how will this case be
handled?

> reopen would then flush any added docs to new segments

IR.reopen would call IW.flush?

> When IW.commit is called, it also then asks each SegmentReader to
commit. Ie, IR.commit would not be used.

Why is this? SegmentReader.commitChanges would be called instead?

> Then when reopen is called, we must internally reopen that clone()
such that its deleted docs are carried over to the newly reopened
reader and newly flushed docs from IW are visible as new
SegmentReaders.

If deletes are made to the external reader (meaning the one obtained
by IW.getReader), then deletes are made via IW.deleteDocument, then
reopen is called, what happens in this case? We will need to merge
the del docs from the internal clone into the newly reopened reader?

> the IR becomes transactional as well -- deletes are not visible
immediately until reopen is called

Interesting. I'd rather somehow merge the IW and external reader's
deletes, otherwise it seems like we're radically changing how IR
works. Perhaps the IW keeps a copy of the external IR that has the
write lock (thinking of IR.clone where the write lock is passed onto
the latest clone). This way IW.getReader is about the same as
reopen/clone (because it will call reopen on presumably the latest
IR).




On Sat, Jan 24, 2009 at 4:29 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Jason Rutherglen wrote:
>
>  > "But I think for realtime we don't want to be using IW's deletion at
>> all.  We should do all deletes via the IndexReader.  In fact if IW has
>> handed out a reader (via getReader()) and that reader (or a reopened
>> derivative) remains open we may have to block deletions via IW.  Not
>> sure..."
>>
>> Can't IW use the IR to do it's deletions?  Currently deletions in IW are
>> implemented in DocumentsWriter.applyDeletes by loading a segment with
>> SegmentReader.get() and making the deletions which causes term index load
>> overhead per flush.  If IW has an internal IR then the deletion process can
>> use it (not SegmentReader.get) and there should not be a conflict anymore
>> between the IR and IW deletion processes.
>>
>
> Today, IW quickly opens each SegmentReader, applies deletes, then
> commits & closes it, because we have considered it too costly to leave
> these readers open.
>
> But if you've opened a persistent IR via the IndexWriter anyway, we
> should use the SegmentReaders from that IR instead.
>
> It seems like the joint IR+IW would allow you to do adds, deletes,
> setNorms, all of which are not visible in the exposed IR until
> IR.reopen is called.  reopen would then flush any added docs to new
> segments, materialize any buffered deletes into the BitVectors (or
> future transactional sorted int tree thingy), likewise for norms, and
> then return a new IR.
>
> Ie, the IR becomes transactional as well -- deletes are not visible
> immeidately until reopen is called (unlike today when you delete via
> IR).  I think this means, internally when IW wants to make changes to
> the shared IR, it should make a clone() and do the changes privately
> to that instance.  Then when reopen is called, we must internally
> reopen that clone() such that its deleted docs are carried over to the
> newly reopened reader and newly flushed docs from IW are visible as
> new SegmentReaders.
>
> And on reopen, the deletes should not be flushed to the Directory --
> they only need to be "moved" into each SegmentReader's deletedDocs.
> We'd also need to ensure when a merge kicks off, the SegmentReaders
> used by the merging are not newly reopened but also "borrowed" from
> the already open IR.  This could actually mean that some deleted docs
> get merged away before the deletions ever get flushed to the Directory.
>
>  > "we may have to block deletions via IW"
>>
>> Hopefully they can be buffered.
>>
>> Where else does the write lock need to be coordinated between IR and IW?
>>
>> > "somehow IW & IR have to "split" the write lock else we may
>> need to merge deletions somehow."
>>
>> This is a part I'd like to settle on before start of implementation.  It
>> looks like in IW deletes are buffered as terms or queries until flushed.  I
>> don't think there needs to be a lock until the flush is performed?
>>
>> For the merge changes to the index, the deletionpolicy can be used to
>> insure a reader still has access to the segments it needs from the main
>> directory.
>>
>
> The write lock is held to prevent multiple writers from buffering and
> then writing changes to the index.  Since we will have this joint
> IR/IW share state, as long as we properly synchronize/share things
> between IR/IW, it's fine if they both "share" the write lock.
>
> It seems like IR.reopen suddenly means "have IW materialize all
> pending stuff and give me a new reader", where stuff is adds &
> deletes.  Adds must materialize via the directory.  Deletes can
> materialize entirely in RAM.  Likewise for norms.
>
> When IW.commit is called, it also then asks each SegmentReader to
> commit.  Ie, IR.commit would not be used.
>
>  > "We have to test performance to measure the net add -> search latency.
>> For many apps this approach may be plenty fast.  If your IO system is
>> an SSD it could be extremely fast.  Swapping in RAMDir
>> just makes it faster w/o changing the basic approach."
>>
>> It is true that this is best way to start and in fact may be good enough
>> for many users.  It could help new users to expose a reader from IW so the
>> delineation between them is removed and Lucene becomes easier to use.
>>
>> At the very least this system allows concurrently updateable IR and IW due
>> to sharing the write lock something that has is currently incorrect in
>> Lucene.
>>
>
> I wouldn't call it "incorrect".  It was an explicit design tradeoff to
> make the division between IR & IW, and done for many good reasons.  We
> are now talking about relaxing that and it clearly raises a number of
> "challenging" issues...
>
>  > "Besides the transaction log (for crash recovery), which should fit
>> "above" Lucene nicely, what else is needed for realtime beyond the
>> single-transaction support Lucene already provides?"
>>
>> What we have described above (exposing IR via IW) will be sufficient and
>> realtime will live above it.
>>
>
> OK, good.
>
> In this model, the combined IR+IW is still jointly transactional, in
> that the IW's commit() method still behaves as it does today.  It's just
> that the IR that's linked to the IW is allowed to "see" changes, shared
> only in RAM, that a freshly opened IR on the index would not see until
> commit has been called.
>
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Jason Rutherglen wrote:

> > "But I think for realtime we don't want to be using IW's deletion at
> all.  We should do all deletes via the IndexReader.  In fact if IW has
> handed out a reader (via getReader()) and that reader (or a reopened
> derivative) remains open we may have to block deletions via IW.  Not
> sure..."
>
> Can't IW use the IR to do it's deletions?  Currently deletions in IW  
> are implemented in DocumentsWriter.applyDeletes by loading a segment  
> with SegmentReader.get() and making the deletions which causes term  
> index load overhead per flush.  If IW has an internal IR then the  
> deletion process can use it (not SegmentReader.get) and there should  
> not be a conflict anymore between the IR and IW deletion processes.

Today, IW quickly opens each SegmentReader, applies deletes, then
commits & closes it, because we have considered it too costly to leave
these readers open.

But if you've opened a persistent IR via the IndexWriter anyway, we
should use the SegmentReaders from that IR instead.

It seems like the joint IR+IW would allow you to do adds, deletes,
setNorms, all of which are not visible in the exposed IR until
IR.reopen is called.  reopen would then flush any added docs to new
segments, materialize any buffered deletes into the BitVectors (or
future transactional sorted int tree thingy), likewise for norms, and
then return a new IR.

Ie, the IR becomes transactional as well -- deletes are not visible
immeidately until reopen is called (unlike today when you delete via
IR).  I think this means, internally when IW wants to make changes to
the shared IR, it should make a clone() and do the changes privately
to that instance.  Then when reopen is called, we must internally
reopen that clone() such that its deleted docs are carried over to the
newly reopened reader and newly flushed docs from IW are visible as
new SegmentReaders.

And on reopen, the deletes should not be flushed to the Directory --
they only need to be "moved" into each SegmentReader's deletedDocs.
We'd also need to ensure when a merge kicks off, the SegmentReaders
used by the merging are not newly reopened but also "borrowed" from
the already open IR.  This could actually mean that some deleted docs
get merged away before the deletions ever get flushed to the Directory.

> > "we may have to block deletions via IW"
>
> Hopefully they can be buffered.
>
> Where else does the write lock need to be coordinated between IR and  
> IW?
>
> > "somehow IW & IR have to "split" the write lock else we may
> need to merge deletions somehow."
>
> This is a part I'd like to settle on before start of  
> implementation.  It looks like in IW deletes are buffered as terms  
> or queries until flushed.  I don't think there needs to be a lock  
> until the flush is performed?
>
> For the merge changes to the index, the deletionpolicy can be used  
> to insure a reader still has access to the segments it needs from  
> the main directory.

The write lock is held to prevent multiple writers from buffering and
then writing changes to the index.  Since we will have this joint
IR/IW share state, as long as we properly synchronize/share things
between IR/IW, it's fine if they both "share" the write lock.

It seems like IR.reopen suddenly means "have IW materialize all
pending stuff and give me a new reader", where stuff is adds &
deletes.  Adds must materialize via the directory.  Deletes can
materialize entirely in RAM.  Likewise for norms.

When IW.commit is called, it also then asks each SegmentReader to
commit.  Ie, IR.commit would not be used.

> > "We have to test performance to measure the net add -> search  
> latency.
> For many apps this approach may be plenty fast.  If your IO system is
> an SSD it could be extremely fast.  Swapping in RAMDir
> just makes it faster w/o changing the basic approach."
>
> It is true that this is best way to start and in fact may be good  
> enough for many users.  It could help new users to expose a reader  
> from IW so the delineation between them is removed and Lucene  
> becomes easier to use.
>
> At the very least this system allows concurrently updateable IR and  
> IW due to sharing the write lock something that has is currently  
> incorrect in Lucene.

I wouldn't call it "incorrect".  It was an explicit design tradeoff to
make the division between IR & IW, and done for many good reasons.  We
are now talking about relaxing that and it clearly raises a number of
"challenging" issues...

> > "Besides the transaction log (for crash recovery), which should fit
> "above" Lucene nicely, what else is needed for realtime beyond the
> single-transaction support Lucene already provides?"
>
> What we have described above (exposing IR via IW) will be sufficient  
> and realtime will live above it.

OK, good.

In this model, the combined IR+IW is still jointly transactional, in
that the IW's commit() method still behaves as it does today.  It's just
that the IR that's linked to the IW is allowed to "see" changes, shared
only in RAM, that a freshly opened IR on the index would not see until
commit has been called.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

> "But I think for realtime we don't want to be using IW's deletion at
all.  We should do all deletes via the IndexReader.  In fact if IW has
handed out a reader (via getReader()) and that reader (or a reopened
derivative) remains open we may have to block deletions via IW.  Not
sure..."

Can't IW use the IR to do it's deletions?  Currently deletions in IW are
implemented in DocumentsWriter.applyDeletes by loading a segment with
SegmentReader.get() and making the deletions which causes term index load
overhead per flush.  If IW has an internal IR then the deletion process can
use it (not SegmentReader.get) and there should not be a conflict anymore
between the IR and IW deletion processes.

> "we may have to block deletions via IW"

Hopefully they can be buffered.

Where else does the write lock need to be coordinated between IR and IW?

> "somehow IW & IR have to "split" the write lock else we may
need to merge deletions somehow."

This is a part I'd like to settle on before start of implementation.  It
looks like in IW deletes are buffered as terms or queries until flushed.  I
don't think there needs to be a lock until the flush is performed?

For the merge changes to the index, the deletionpolicy can be used to insure
a reader still has access to the segments it needs from the main directory.


> "We have to test performance to measure the net add -> search latency.
For many apps this approach may be plenty fast.  If your IO system is
an SSD it could be extremely fast.  Swapping in RAMDir
just makes it faster w/o changing the basic approach."

It is true that this is best way to start and in fact may be good enough for
many users.  It could help new users to expose a reader from IW so the
delineation between them is removed and Lucene becomes easier to use.

At the very least this system allows concurrently updateable IR and IW due
to sharing the write lock something that has is currently incorrect in
Lucene.

> "Besides the transaction log (for crash recovery), which should fit
"above" Lucene nicely, what else is needed for realtime beyond the
single-transaction support Lucene already provides?"

What we have described above (exposing IR via IW) will be sufficient and
realtime will live above it.



On Fri, Jan 9, 2009 at 11:15 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Jason Rutherglen <ja...@gmail.com> wrote:
>
> > Are you referring to the IW.pendingCommit SegmentInfos variable?
>
> No, I'm referring to segmentInfos.  (pendingCommit is the "snapshot"
> of segmentInfos taken when committing...).
>
> > When you say "flushed" you are referring to the IW.prepareCommit method?
>
> No, I'm referring to "flush"... it writes a new segment but not a new
> segments_N, does not sync the files, and does not invoke the deletion
> policy.
>
> > I think step #1 is important and should be generally useful outside of
> realtime search, however it's unclear how/when calls to IW.deleteDocument
> will reflect in IW.getReader?
>
> You'd have to flush (to materialize pending deletions inside IW) then
> reopen the reader, to see any deletions done via the writer.  But I
> think instead realtime search would do deletions via the reader
> (because if you use IW you're updating deletes through the Directory =
> too slow).
>
> > Interleaving deletes with documents added isn't possible because if the
> documents are in the IW ram buffer, they are not necessarily deleted
>
> Well, we buffer the delete and then on flush we materialize the
> delete.  So if you add a doc with field X=77, then delete-by-term
> X:77, then flush, you'll flush a 1 document segment whose only
> document is marked as deleted.
>
> But I think for realtime we don't want to be using IW's deletion at
> all.  We should do all deletes via the IndexReader.  In fact if IW has
> handed out a reader (via getReader()) and that reader (or a reopened
> derivative) remains open we may have to block deletions via IW.  Not
> sure... somehow IW & IR have to "split" the write lock else we may
> need to merge deletions somehow.
>
> > If this is swapped in later how is the system realtime except perhaps
> deletes?
>
> We have to test performance to measure the net add -> search latency.
> For many apps this approach may be plenty fast.  If your IO system is
> an SSD it could be extremely fast.  Swapping in RAMDir
> just makes it faster w/o changing the basic approach.
>
> > Adding support for multiple transactions at once on IndexWriter outside
> of the realtime transactions seems to require a lot of refactoring.
>
> Besides the transaction log (for crash recovery), which should fit
> "above" Lucene nicely, what else is needed for realtime beyond the
> single-transaction support Lucene already provides?
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Jason Rutherglen <ja...@gmail.com> wrote:

> Are you referring to the IW.pendingCommit SegmentInfos variable?

No, I'm referring to segmentInfos.  (pendingCommit is the "snapshot"
of segmentInfos taken when committing...).

> When you say "flushed" you are referring to the IW.prepareCommit method?

No, I'm referring to "flush"... it writes a new segment but not a new
segments_N, does not sync the files, and does not invoke the deletion
policy.

> I think step #1 is important and should be generally useful outside of realtime search, however it's unclear how/when calls to IW.deleteDocument will reflect in IW.getReader?

You'd have to flush (to materialize pending deletions inside IW) then
reopen the reader, to see any deletions done via the writer.  But I
think instead realtime search would do deletions via the reader
(because if you use IW you're updating deletes through the Directory =
too slow).

> Interleaving deletes with documents added isn't possible because if the documents are in the IW ram buffer, they are not necessarily deleted

Well, we buffer the delete and then on flush we materialize the
delete.  So if you add a doc with field X=77, then delete-by-term
X:77, then flush, you'll flush a 1 document segment whose only
document is marked as deleted.

But I think for realtime we don't want to be using IW's deletion at
all.  We should do all deletes via the IndexReader.  In fact if IW has
handed out a reader (via getReader()) and that reader (or a reopened
derivative) remains open we may have to block deletions via IW.  Not
sure... somehow IW & IR have to "split" the write lock else we may
need to merge deletions somehow.

> If this is swapped in later how is the system realtime except perhaps deletes?

We have to test performance to measure the net add -> search latency.
For many apps this approach may be plenty fast.  If your IO system is
an SSD it could be extremely fast.  Swapping in RAMDir
just makes it faster w/o changing the basic approach.

> Adding support for multiple transactions at once on IndexWriter outside of the realtime transactions seems to require a lot of refactoring.

Besides the transaction log (for crash recovery), which should fit
"above" Lucene nicely, what else is needed for realtime beyond the
single-transaction support Lucene already provides?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

M.M.: "That reader is opened using IndexWriter's SegmentInfos instance, so
it
can read segments & deletions that have been flushed but not
committed.  It's allowed to do its own deletions & norms updating.
When reopen() is called, it grabs the writers SegmentInfos again."

Are you referring to the IW.pendingCommit SegmentInfos variable?  When you
say "flushed" you are referring to the IW.prepareCommit method?

I think step #1 is important and should be generally useful outside of
realtime search, however it's unclear how/when calls to IW.deleteDocument
will reflect in IW.getReader?  I assumed that IW.commit would result in
IW.deleteDocument changes showing up in IW.getReader.  Calls to
Transaction.deleteDocument/flush would show up immediately otherwise it's
generally unclear to the user the semantics of realtime indexing vs. IW
based batch indexing use cases.  With IW indexing one adds documents and
deletes documents then does a global commit to the main directory.
Interleaving deletes with documents added isn't possible because if the
documents are in the IW ram buffer, they are not necessarily deleted, so it
seems that if the semantics are such that IW.commit or IW.prepareCommit
expose deletes via IW.getReader, what is the difference compared to
IndexReader.reopen on the index except the shared write lock?  Ok, perhaps
this is all one gets and as you mentioned the rest is placed on a level
above IW which hopefully does not confuse the user.

M.M.: "

> Patch #2: Implement a realtime ram index class
>

I think this one is optional, or, rather an optimazation that we can
swap in later if/when necessary?  Ie for starters little segments are
written into the main Directory."

If this is swapped in later how is the system realtime except perhaps
deletes?

M.M.: "Can't this be layered on top?
Or... are you looking to add support for multiple transactions in
flight at once on IndexWriter?"

The initial version can be layered on top, that will make testing easier.
Adding support for multiple transactions at once on IndexWriter outside of
the realtime transactions seems to require a lot of refactoring.

On Fri, Jan 9, 2009 at 5:39 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Jason Rutherglen wrote:
>
>  Patch #1: Expose an IndexWriter.getReader method that returns the current
>> reader and shares the write lock
>>
>
> I tentatively like this approach so far...
>
> That reader is opened using IndexWriter's SegmentInfos instance, so it
> can read segments & deletions that have been flushed but not
> committed.  It's allowed to do its own deletions & norms updating.
> When reopen() is called, it grabs the writers SegmentInfos again.
>
>  Patch #2: Implement a realtime ram index class
>>
>
> I think this one is optional, or, rather an optimazation that we can
> swap in later if/when necessary?  Ie for starters little segments are
> written into the main Directory.
>
>  Patch #3: Implement realtime transactions in IndexWriter or in a subclass
>> of IndexWriter by implementing a createTransaction method that generates a
>> realtime Transaction object.  When the transaction is flushed, the
>> transaction index modifications are available via the getReader method of
>> IndexWriter
>>
>
> Can't this be layered on top?
>
> Or... are you looking to add support for multiple transactions in
> flight at once on IndexWriter?
>
> Mike
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Jason Rutherglen wrote:

> Patch #1: Expose an IndexWriter.getReader method that returns the  
> current reader and shares the write lock

I tentatively like this approach so far...

That reader is opened using IndexWriter's SegmentInfos instance, so it
can read segments & deletions that have been flushed but not
committed.  It's allowed to do its own deletions & norms updating.
When reopen() is called, it grabs the writers SegmentInfos again.

> Patch #2: Implement a realtime ram index class

I think this one is optional, or, rather an optimazation that we can
swap in later if/when necessary?  Ie for starters little segments are
written into the main Directory.

> Patch #3: Implement realtime transactions in IndexWriter or in a  
> subclass of IndexWriter by implementing a createTransaction method  
> that generates a realtime Transaction object.  When the transaction  
> is flushed, the transaction index modifications are available via  
> the getReader method of IndexWriter

Can't this be layered on top?

Or... are you looking to add support for multiple transactions in
flight at once on IndexWriter?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

Based on our discussions, it seems best to get realtime search going in
small steps.  Below are some possible steps to take.

Patch #1: Expose an IndexWriter.getReader method that returns the current
reader and shares the write lock
Patch #2: Implement a realtime ram index class
Patch #3: Implement realtime transactions in IndexWriter or in a subclass of
IndexWriter by implementing a createTransaction method that generates a
realtime Transaction object.  When the transaction is flushed, the
transaction index modifications are available via the getReader method of
IndexWriter

The remaining question is how to synchronize the flushes to disk with
IndexWriter's other index update locking mechanisms.  The flushing could
simply use IW.addIndexes which has in place a locking mechanism.  After
flushing to disk, queued deletes would be applied to the newly copied disk
segments.  I think this entails opening the newly copied disk segments and
applying deletes that occurred to the corresponding ram segments by cloning
the new disk segments and replacing the deleteddocs bitvector then flushing
the deleteddocs to disk.  This system would allow us to avoid using UID in
documents.

The API needs to clearly separate realtime transactions vs. the existing
index update method such as addDocument, deleteDocuments, and
updateDocument.  I don't think it's possible to transparently implement both
because the underlying implementations behave differently.  It is expected
that multiple transaction may be created at once however the
Transaction.flush method would block.

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

+1 Agreed, the initial version should use RAMDirectory in order to keep
things simple and to benchmark against other MemoryIndex like index
representations.

On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting <cu...@apache.org> wrote:

> Michael McCandless wrote:
>
>> So then I think we should start with approach #2 (build real-time on
>> top of the Lucene core) and iterate from there.  Newly added docs go
>> into a tiny segments, which IndexReader.reopen pulls in.  Replaced or
>> deleted docs record the delete against the right SegmentReader (and
>> LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).
>>
>> I would take the simple approach first: use ordinary SegmentReader on
>> a RAMDirectory for the tiny segments.  If that proves too slow, swap
>> in Memory/InstantiatedIndex for the tiny segments.  If that proves too
>> slow, build a reader impl that reads from DocumentsWriter RAM buffer.
>>
>
> +1 This sounds like a good approach to me.  I don't see any fundamental
> reasons why we need different representations, and fewer implementations of
> IndexWriter and IndexReader is generally better, unless they get way too
> hairy.  Mostly it seems that real-time can be done with our existing toolbox
> of datastructures, but with some slightly different control structures.
>  Once we have the control structure in place then we should look at
> optimizing data structures as needed.
>
> Doug
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Doug Cutting <cu...@apache.org>.

Michael McCandless wrote:
> So then I think we should start with approach #2 (build real-time on
> top of the Lucene core) and iterate from there.  Newly added docs go
> into a tiny segments, which IndexReader.reopen pulls in.  Replaced or
> deleted docs record the delete against the right SegmentReader (and
> LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).
> 
> I would take the simple approach first: use ordinary SegmentReader on
> a RAMDirectory for the tiny segments.  If that proves too slow, swap
> in Memory/InstantiatedIndex for the tiny segments.  If that proves too
> slow, build a reader impl that reads from DocumentsWriter RAM buffer.

+1 This sounds like a good approach to me.  I don't see any fundamental 
reasons why we need different representations, and fewer implementations 
of IndexWriter and IndexReader is generally better, unless they get way 
too hairy.  Mostly it seems that real-time can be done with our existing 
toolbox of datastructures, but with some slightly different control 
structures.  Once we have the control structure in place then we should 
look at optimizing data structures as needed.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

I think the necessary low-level changes to Lucene for real-time are
actually already well underway...

The biggest barrier is how we now ask for FieldCache values a the
Multi*Reader level.  This makes reopen cost catastrophic for a large
index.

Once we succeed in making FieldCache usage within Lucene
segment-centric (LUCENE-1483 = sorting becomes segment-centric),
LUCENE-831 (= deprecate old FieldCache API in favor of segment-centric
or iteration API), we are most of the way there.  LUCENE-1231 (column
stride fields) should make initing the per-segment FieldCache much
faster, though I think that's a "nice to have" for real-time search
(because either 1) warming will happen in the BG, or 2) the segment is
tiny).

So then I think we should start with approach #2 (build real-time on
top of the Lucene core) and iterate from there.  Newly added docs go
into a tiny segments, which IndexReader.reopen pulls in.  Replaced or
deleted docs record the delete against the right SegmentReader (and
LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

I would take the simple approach first: use ordinary SegmentReader on
a RAMDirectory for the tiny segments.  If that proves too slow, swap
in Memory/InstantiatedIndex for the tiny segments.  If that proves too
slow, build a reader impl that reads from DocumentsWriter RAM buffer.

One challenge is reopening after a big merge finishes... we'd need a
way to 1) allow the merge to be committed, then 2) start warming a new
reader in the BG, but 3) allow newly flushed segments to use the old
SegmentReaders reading the segments that were merged (because they are
still warm), and 4) once new reader is warm, we decref old segments
and use the new reader going forwards.

Alternatively, and maybe simpler, a merge is not allowed to commit
until a new SegmentReader has been warmed against the newly merged
segment.

I'm not sure how best to do this... we may need more info in
SegmentInfo[s] to track the genealogy of each segment, or something.
We may need to have IndexWriter give more info when it's modifying
SegmentInfos, eg we'd need the reader to access newly flushed segments
(IndexWriter does not write a new segments_N until commit).  Maybe
IndexWriter needs to warm readers... maybe IndexReader.open/reopen
needs to be given an IndexWriter and then access its un-flushed
in-memory SegmentInfos... not sure.  We'd need to fix
SegmentReader.get to provide single instance for a given segment.

I agree we'd want a specialized merge policy.  EG it should merge RAM
segments w/ higher priority, and probably not merge mixed RAM & disk
segments.

Mike
Jason Rutherglen <ja...@gmail.com> wrote:

> We've discussed realtime search before, it looks like after the next
> release we can get some sort of realtime search working.  I was going to
> open a new issue but decided it might be best to discuss realtime search on
> the dev list.
>
> Lucene can implement realtime search as the ability to add, update, or
> delete documents with latency in the sub 5 millisecond range.  A couple of
> different options are available.
>
> 1) Expose a rolling set of realtime readers over the memory index used by
> IndexWriter.  Requires incrementally updating field caches and filters, and
> is somewhat unclear how IndexReader versioning would work (for example
> versions of the term dictionary).
> 2) Implement realtime search by incrementally creating and merging readers
> in memory.  The system would use MemoryIndex or InstantiatedIndex to quickly
> (more quickly than RAMDirectory) create indexes from added documents.  The
> in memory indexes would be periodically merged in the background and
> according to RAM used write to disk.  Each update would generate a new
> IndexReader or MultiSearcher that includes the new updates.  Field caches
> and filters could be cached per IndexReader according to how Lucene works
> today.  The downside of this approach is the indexing will not be as fast as
> #1 because of the in memory merging which similar to the Lucene pre 2.3
> which merged in memory segments using RAMDirectory.
>
> Are there other implementation options?
>
> A new patch would focus on providing in memory indexing as part of the core
> of Lucene.  The work of LUCENE-1483 and LUCENE-1314 would be used.  I am not
> sure if option #2 can become part of core if it relies on a contrib module?
> It makes sense to provide a new realtime oriented merge policy that merges
> segments based on the number of deletes rather than a merge factor.  The
> realtime merge policy would keep the segments within a minimum and maximum
> size in kilobytes to limit the time consumed by merging which is assumed
> would occur frequently.
>
> LUCENE-1313 which includes a transaction log with rollback and was designed
> with distributed search and may be retired or the components split out.
>
>

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

On Dec 24, 2008, at 12:23 PM, Jason Rutherglen wrote:

> > Also, what are the requirements?  Must a document be visible to  
> search within 10ms of being added?
>
> 0-5ms.  Otherwise it's not realtime, it's batch indexing.  The  
> realtime system can support small batches by encoding them into  
> RAMDirectories if they are of sufficient size.
>
> > Or must it be visible to search from the time that the call to  
> add it returns?
>
> Most people probably expect the update latency offered by SQL  
> databases.

This is the problem spot. In an SQL database, when an update/add  
occurs, the same connection/transaction will see the changes when  
requested IMMEDIATELY - there is 0 latency.

In order to do this you MUST have the concept of transactions and/or  
connections.

OR you must make it so that every update/add is immediately available  
- this is probably simpler.

You just need to always search the ram and the disk index. The  
deletions must be mapped to the disk index, and the "latest" version  
of the document must be obtained from the ram index (if it is there).

You just need to merge the ram and disk in the background... and  
continually create new/merged ram disks.

The memory requirements are going to go up, but you can always add a  
"block" so that if the background merger gets too far behind, the  
system blocks any current requests (to avoid the system running out  
of memory).

>
>
> > As a baseline, how fast is it to simply use RAMDirectory?
>
> It depends on how fast searches over the realtime index need to  
> be.  The detriment to speed occurs with having many small segments  
> that are continuously decoded (terms, postings, etc).  The  
> advantage of MemoryIndex and InstantiatedIndex is an actual  
> increase in search speed compared with RAMDirectory (see the  
> Performance Notes at http://hudson.zones.apache.org/hudson/job/ 
> Lucene-trunk/javadoc//org/apache/lucene/index/memory/ 
> MemoryIndex.html and )and no need to continuously decode segments  
> that are short lived.
>
> Anecdotal tests indicated the merging overhead of using  
> RAMDirectory as compared with MI or II is significant enough to  
> make it only useful for doing batches in the 1000s which does not  
> seem to be what people expect from realtime search.
>
> On Wed, Dec 24, 2008 at 9:53 AM, Doug Cutting <cu...@apache.org>  
> wrote:
> Jason Rutherglen wrote:
> 2) Implement realtime search by incrementally creating and merging  
> readers in memory.  The system would use MemoryIndex or  
> InstantiatedIndex to quickly (more quickly than RAMDirectory)  
> create indexes from added documents.
>
> As a baseline, how fast is it to simply use RAMDirectory?  If one,  
> e.g., flushes changes every 10ms or so, and has a background thread  
> that uses IndexReader.reopen() to keep a fresh version for reading?
>
> Also, what are the requirements?  Must a document be visible to  
> search within 10ms of being added?  Or must it be visible to search  
> from the time that the call to add it returns?  In the latter case  
> one might still use an approach like the above.  Writing a small  
> new segment to a RAMDirectory and then, with no merging, calling  
> IndexReader.reopen(), should be quite fast.  All merging could be  
> done in the background, as should post-merge reopens() that involve  
> large segments.
>
> In short, I wonder if new reader and writer implementations are in  
> fact required or whether, perhaps with a few optimizations, the  
> existing implementations might meet this need.
>
> Doug
>
> ---------------------------------------------------------------------
>
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Jason Rutherglen <ja...@gmail.com>.

> Also, what are the requirements?  Must a document be visible to search
within 10ms of being added?

0-5ms.  Otherwise it's not realtime, it's batch indexing.  The realtime
system can support small batches by encoding them into RAMDirectories if
they are of sufficient size.

> Or must it be visible to search from the time that the call to add it
returns?

Most people probably expect the update latency offered by SQL databases.

> As a baseline, how fast is it to simply use RAMDirectory?

It depends on how fast searches over the realtime index need to be.  The
detriment to speed occurs with having many small segments that are
continuously decoded (terms, postings, etc).  The advantage of MemoryIndex
and InstantiatedIndex is an actual increase in search speed compared with
RAMDirectory (see the Performance Notes at
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/memory/MemoryIndex.htmland
)and no need to continuously decode segments that are short lived.

Anecdotal tests indicated the merging overhead of using RAMDirectory as
compared with MI or II is significant enough to make it only useful for
doing batches in the 1000s which does not seem to be what people expect from
realtime search.

On Wed, Dec 24, 2008 at 9:53 AM, Doug Cutting <cu...@apache.org> wrote:

> Jason Rutherglen wrote:
>
>> 2) Implement realtime search by incrementally creating and merging readers
>> in memory.  The system would use MemoryIndex or InstantiatedIndex to quickly
>> (more quickly than RAMDirectory) create indexes from added documents.
>>
>
> As a baseline, how fast is it to simply use RAMDirectory?  If one, e.g.,
> flushes changes every 10ms or so, and has a background thread that uses
> IndexReader.reopen() to keep a fresh version for reading?
>
> Also, what are the requirements?  Must a document be visible to search
> within 10ms of being added?  Or must it be visible to search from the time
> that the call to add it returns?  In the latter case one might still use an
> approach like the above.  Writing a small new segment to a RAMDirectory and
> then, with no merging, calling IndexReader.reopen(), should be quite fast.
>  All merging could be done in the background, as should post-merge reopens()
> that involve large segments.
>
> In short, I wonder if new reader and writer implementations are in fact
> required or whether, perhaps with a few optimizations, the existing
> implementations might meet this need.
>
> Doug
>
> ---------------------------------------------------------------------
>
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Realtime Search

Posted by Doug Cutting <cu...@apache.org>.

Jason Rutherglen wrote:
> 2) Implement realtime search by incrementally creating and merging 
> readers in memory.  The system would use MemoryIndex or 
> InstantiatedIndex to quickly (more quickly than RAMDirectory) create 
> indexes from added documents.

As a baseline, how fast is it to simply use RAMDirectory?  If one, e.g., 
flushes changes every 10ms or so, and has a background thread that uses 
IndexReader.reopen() to keep a fresh version for reading?

Also, what are the requirements?  Must a document be visible to search 
within 10ms of being added?  Or must it be visible to search from the 
time that the call to add it returns?  In the latter case one might 
still use an approach like the above.  Writing a small new segment to a 
RAMDirectory and then, with no merging, calling IndexReader.reopen(), 
should be quite fast.  All merging could be done in the background, as 
should post-merge reopens() that involve large segments.

In short, I wonder if new reader and writer implementations are in fact 
required or whether, perhaps with a few optimizations, the existing 
implementations might meet this need.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

Also, if you are thinking that accessing the "buffer" directly will  
be faster than "parsing" the packed structure, I'm not so sure.

You can review the source for the various buffers, and since the is  
no "struct" support in Java, you end up combining bytes to make  
longs, etc. Also, a lot of the accesses are though Unsafe, which is  
slower than the indirection on a Java object to access a field.

My review of these classes makes me think that parsing the "skip to"  
index once into java objects for later use is going to be a lot  
faster overall than accessing the entire mapped file directly on  
every invocation.

On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote:

> On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
>> Is there something that I am missing?
>
> Yes.
>
>> I see lots of references to  using "memory mapped" files to  
>> "dramatically"
>> improve performance.
>
> There have been substantial discussions about this design in JIRA,
> notably LUCENE-1458.
>
> The "dramatic" improvement is WRT to opening/reopening an IndexReader.
> Presently in both KS and Lucene, certain data structures have to be  
> read at
> IndexReader startup and unpacked into process memory -- in  
> particular, the
> term dictionary index and sort caches.  If those data structures  
> can be
> represented by a memory mapped file rather than built up from  
> scratch, we save
> big.
>
> Marvin Humphrey
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote:
> As I understood this discussion though, it was an attempt to remove  
> the in memory 'skip to' index, to avoid the reading of this during  
> index open/reopen.

No.  That idea was entertained briefly and quickly discarded.  There seems to
be an awful lot of irrelevant noise in the current thread arising due to lack
of familiarity with the ongoing discussions in JIRA.

Marvin Humphrey

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

As I pointed out in another email, I understand the benefits of  
compression (compressed disks vs. uncompressed, etc.). PFOR is  
definitely a winner !

As I understood this discussion though, it was an attempt to remove  
the in memory 'skip to' index, to avoid the reading of this during  
index open/reopen.

I was attempting to point out that this in-memory index is still  
needed, but there are ways to improve the current process.

I don't think a mapped file for the term index is going to work for a  
variety of reasons. Mapped files are designed as a programming  
simplification - mainly for older systems that use line delimited  
files - rather than having to create "page/section" caches when  
processing very large files (when only a small portion is used at any  
given time - ie. the data visible on the screen). When you end up  
visiting a large portion of the file anyway (to do a full  
repagination), an in-process intelligent cache is going to be far  
superior.

My review of the Java Buffer related classes does not give me the  
impression it is going to be faster - in fact it will be slower- than  
a single copy into user space, and process/decompress there. The  
Buffer system is suitable when perform little inspection, and then  
direct copy to another buffer (think reading from a file, and sending  
out on a socket). If you end up inspecting the buffer, it is going to  
be very slow.

On Dec 24, 2008, at 11:33 AM, Paul Elschot wrote:

>
> Op Wednesday 24 December 2008 17:51:04 schreef robert engels:
>> Thinking about this some more, you could use fixed length pages for
>> the term index, with a page header containing a count of entries, and
>> use key compression (to avoid the constant entry size).
>>
>> The problem with this is that you still have to decode the entries
>> (slowing the processing - since a simple binary search on the page is
>> not possible).
>
> The cache between the pages and the cpu is also a bottleneck nowadays.
> See here:
>
> Super-Scalar RAM-CPU Cache Compression
> M Zukowski, S Heman, N Nes, P Boncz - cwi.nl
>
> currently available from this link:
>
> http://www.cwi.nl/htbin/ins1/publications? 
> request=pdfgz&key=ZuHeNeBo:ICDE:06
>
> Also, some preliminary results on lucene indexes
> are available at LUCENE-1410.
>
> Regards,
> Paul Elschot
>
>
>>
>> But, if you also add a 'least term and greatest term' to the page
>> header (you can avoid the duplicate storage of these entries as
>> well), you can perform a binary search of the term index much faster.
>> You only need to decode the index page containing (maybe) the desired
>> entry.
>>
>> If you were doing a prefix/range search, you will still end up
>> decoding lots of pages...
>>
>> This is why a database has their own page cache, and usually caches
>> the decoded form (for index pages) for faster processing - at the
>> expense of higher memory usage. Usually data pages are not cached in
>> the decoded/uncompressed form. In most cases the database vendor will
>> recommend removing the OS page cache on the database server, and
>> allocating all of the memory to the database process.
>>
>> You may be able to avoid some of the warm-up of an index using memory
>> mapped files, but with proper ordering of the writing of the index,
>> it probably isn't necessary. Beyond that, processing the term index
>> directly using NIO does not appear that it will be faster than using
>> an in-process cache of the term index (similar to the skip-to memory
>> index now).
>>
>> The BEST approach is probably to have the index writer build the
>> memory "skip to" structure as it writes the segment, and then include
>> this in the segment during the reopen - no warming required !. As
>> long as the reader and writer are in the same process, it will be a
>> winner !
>>
>> On Dec 23, 2008, at 11:02 PM, robert engels wrote:
>>> Seems doubtful you will be able to do this without increasing the
>>> index size dramatically. Since it will need to be stored
>>> "unpacked" (in order to have random access), yet the terms are
>>> variable length - leading to using a maximum=minimum size for every
>>> term.
>>>
>>> In the end I highly doubt it will make much difference in speed -
>>> here's several reasons why...
>>>
>>> 1. with "fixed" size terms, the additional IO (larger pages)
>>> probably offsets a lot of the random access benefit. This is why
>>> "compressed" disks on a fast machine (CPU) are often faster than
>>> "uncompressed" - more data is read during every IO access.
>>>
>>> 2. with a reopen, only new segments are "read", and since it is a
>>> new segment, it is most likely already in the disk cache (from the
>>> write), so the reopen penalty is negligible (especially if the term
>>> index "skip to" is written last).
>>>
>>> 3. If the reopen is after an optimize - when the OS cache has
>>> probably been obliterated, then the warm up time is going to be
>>> similar in most cases anyway, since the "index" pages will also not
>>> be in core (in the case of memory mapped files). Again, writing the
>>> "skip to" last can help with this.
>>>
>>> Just because a file is memory mapped does not mean its pages will
>>> have an greater likelihood to be in the cache. The locality of
>>> reference is going to control this, just as the most/often access
>>> controls it in the OS disk cache.  Also, most OSs will take real
>>> memory from the virtual address space and add it to the disk cache
>>> if the process is doing lots of IO.
>>>
>>> If you have a memory mapped "term index", you are still going to
>>> need to perform a binary search to find the correct term "page",
>>> and after an optimize the visited pages will not be in the cache
>>> (or in core).
>>>
>>> On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote:
>>>> On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
>>>>> Is there something that I am missing?
>>>>
>>>> Yes.
>>>>
>>>>> I see lots of references to  using "memory mapped" files to
>>>>> "dramatically"
>>>>> improve performance.
>>>>
>>>> There have been substantial discussions about this design in JIRA,
>>>> notably LUCENE-1458.
>>>>
>>>> The "dramatic" improvement is WRT to opening/reopening an
>>>> IndexReader.
>>>> Presently in both KS and Lucene, certain data structures have to
>>>> be read at
>>>> IndexReader startup and unpacked into process memory -- in
>>>> particular, the
>>>> term dictionary index and sort caches.  If those data structures
>>>> can be
>>>> represented by a memory mapped file rather than built up from
>>>> scratch, we save
>>>> big.
>>>>
>>>> Marvin Humphrey
>>>>
>>>>
>>>> ------------------------------------------------------------------
>>>> --- To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>> -------------------------------------------------------------------
>>> -- To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Paul Elschot <pa...@xs4all.nl>.

Op Wednesday 24 December 2008 17:51:04 schreef robert engels:
> Thinking about this some more, you could use fixed length pages for
> the term index, with a page header containing a count of entries, and
> use key compression (to avoid the constant entry size).
>
> The problem with this is that you still have to decode the entries
> (slowing the processing - since a simple binary search on the page is
> not possible).

The cache between the pages and the cpu is also a bottleneck nowadays.
See here:

Super-Scalar RAM-CPU Cache Compression
M Zukowski, S Heman, N Nes, P Boncz - cwi.nl

currently available from this link:

http://www.cwi.nl/htbin/ins1/publications?request=pdfgz&key=ZuHeNeBo:ICDE:06

Also, some preliminary results on lucene indexes
are available at LUCENE-1410.

Regards,
Paul Elschot


>
> But, if you also add a 'least term and greatest term' to the page
> header (you can avoid the duplicate storage of these entries as
> well), you can perform a binary search of the term index much faster.
> You only need to decode the index page containing (maybe) the desired
> entry.
>
> If you were doing a prefix/range search, you will still end up
> decoding lots of pages...
>
> This is why a database has their own page cache, and usually caches
> the decoded form (for index pages) for faster processing - at the
> expense of higher memory usage. Usually data pages are not cached in
> the decoded/uncompressed form. In most cases the database vendor will
> recommend removing the OS page cache on the database server, and
> allocating all of the memory to the database process.
>
> You may be able to avoid some of the warm-up of an index using memory
> mapped files, but with proper ordering of the writing of the index,
> it probably isn't necessary. Beyond that, processing the term index
> directly using NIO does not appear that it will be faster than using
> an in-process cache of the term index (similar to the skip-to memory
> index now).
>
> The BEST approach is probably to have the index writer build the
> memory "skip to" structure as it writes the segment, and then include
> this in the segment during the reopen - no warming required !. As
> long as the reader and writer are in the same process, it will be a
> winner !
>
> On Dec 23, 2008, at 11:02 PM, robert engels wrote:
> > Seems doubtful you will be able to do this without increasing the
> > index size dramatically. Since it will need to be stored
> > "unpacked" (in order to have random access), yet the terms are
> > variable length - leading to using a maximum=minimum size for every
> > term.
> >
> > In the end I highly doubt it will make much difference in speed -
> > here's several reasons why...
> >
> > 1. with "fixed" size terms, the additional IO (larger pages)
> > probably offsets a lot of the random access benefit. This is why
> > "compressed" disks on a fast machine (CPU) are often faster than
> > "uncompressed" - more data is read during every IO access.
> >
> > 2. with a reopen, only new segments are "read", and since it is a
> > new segment, it is most likely already in the disk cache (from the
> > write), so the reopen penalty is negligible (especially if the term
> > index "skip to" is written last).
> >
> > 3. If the reopen is after an optimize - when the OS cache has
> > probably been obliterated, then the warm up time is going to be
> > similar in most cases anyway, since the "index" pages will also not
> > be in core (in the case of memory mapped files). Again, writing the
> > "skip to" last can help with this.
> >
> > Just because a file is memory mapped does not mean its pages will
> > have an greater likelihood to be in the cache. The locality of
> > reference is going to control this, just as the most/often access
> > controls it in the OS disk cache.  Also, most OSs will take real
> > memory from the virtual address space and add it to the disk cache
> > if the process is doing lots of IO.
> >
> > If you have a memory mapped "term index", you are still going to
> > need to perform a binary search to find the correct term "page",
> > and after an optimize the visited pages will not be in the cache
> > (or in core).
> >
> > On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote:
> >> On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
> >>> Is there something that I am missing?
> >>
> >> Yes.
> >>
> >>> I see lots of references to  using "memory mapped" files to
> >>> "dramatically"
> >>> improve performance.
> >>
> >> There have been substantial discussions about this design in JIRA,
> >> notably LUCENE-1458.
> >>
> >> The "dramatic" improvement is WRT to opening/reopening an
> >> IndexReader.
> >> Presently in both KS and Lucene, certain data structures have to
> >> be read at
> >> IndexReader startup and unpacked into process memory -- in
> >> particular, the
> >> term dictionary index and sort caches.  If those data structures
> >> can be
> >> represented by a memory mapped file rather than built up from
> >> scratch, we save
> >> big.
> >>
> >> Marvin Humphrey
> >>
> >>
> >> ------------------------------------------------------------------
> >>--- To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> > -------------------------------------------------------------------
> >-- To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

Thinking about this some more, you could use fixed length pages for  
the term index, with a page header containing a count of entries, and  
use key compression (to avoid the constant entry size).

The problem with this is that you still have to decode the entries  
(slowing the processing - since a simple binary search on the page is  
not possible).

But, if you also add a 'least term and greatest term' to the page  
header (you can avoid the duplicate storage of these entries as  
well), you can perform a binary search of the term index much faster.  
You only need to decode the index page containing (maybe) the desired  
entry.

If you were doing a prefix/range search, you will still end up  
decoding lots of pages...

This is why a database has their own page cache, and usually caches  
the decoded form (for index pages) for faster processing - at the  
expense of higher memory usage. Usually data pages are not cached in  
the decoded/uncompressed form. In most cases the database vendor will  
recommend removing the OS page cache on the database server, and  
allocating all of the memory to the database process.

You may be able to avoid some of the warm-up of an index using memory  
mapped files, but with proper ordering of the writing of the index,  
it probably isn't necessary. Beyond that, processing the term index  
directly using NIO does not appear that it will be faster than using  
an in-process cache of the term index (similar to the skip-to memory  
index now).

The BEST approach is probably to have the index writer build the  
memory "skip to" structure as it writes the segment, and then include  
this in the segment during the reopen - no warming required !. As  
long as the reader and writer are in the same process, it will be a  
winner !

On Dec 23, 2008, at 11:02 PM, robert engels wrote:

> Seems doubtful you will be able to do this without increasing the  
> index size dramatically. Since it will need to be stored  
> "unpacked" (in order to have random access), yet the terms are  
> variable length - leading to using a maximum=minimum size for every  
> term.
>
> In the end I highly doubt it will make much difference in speed -  
> here's several reasons why...
>
> 1. with "fixed" size terms, the additional IO (larger pages)  
> probably offsets a lot of the random access benefit. This is why  
> "compressed" disks on a fast machine (CPU) are often faster than  
> "uncompressed" - more data is read during every IO access.
>
> 2. with a reopen, only new segments are "read", and since it is a  
> new segment, it is most likely already in the disk cache (from the  
> write), so the reopen penalty is negligible (especially if the term  
> index "skip to" is written last).
>
> 3. If the reopen is after an optimize - when the OS cache has  
> probably been obliterated, then the warm up time is going to be  
> similar in most cases anyway, since the "index" pages will also not  
> be in core (in the case of memory mapped files). Again, writing the  
> "skip to" last can help with this.
>
> Just because a file is memory mapped does not mean its pages will  
> have an greater likelihood to be in the cache. The locality of  
> reference is going to control this, just as the most/often access  
> controls it in the OS disk cache.  Also, most OSs will take real  
> memory from the virtual address space and add it to the disk cache  
> if the process is doing lots of IO.
>
> If you have a memory mapped "term index", you are still going to  
> need to perform a binary search to find the correct term "page",  
> and after an optimize the visited pages will not be in the cache  
> (or in core).
>
> On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote:
>
>> On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
>>> Is there something that I am missing?
>>
>> Yes.
>>
>>> I see lots of references to  using "memory mapped" files to  
>>> "dramatically"
>>> improve performance.
>>
>> There have been substantial discussions about this design in JIRA,
>> notably LUCENE-1458.
>>
>> The "dramatic" improvement is WRT to opening/reopening an  
>> IndexReader.
>> Presently in both KS and Lucene, certain data structures have to  
>> be read at
>> IndexReader startup and unpacked into process memory -- in  
>> particular, the
>> term dictionary index and sort caches.  If those data structures  
>> can be
>> represented by a memory mapped file rather than built up from  
>> scratch, we save
>> big.
>>
>> Marvin Humphrey
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Tue, Dec 23, 2008 at 11:02:56PM -0600, robert engels wrote:
> Seems doubtful you will be able to do this without increasing the  
> index size dramatically. Since it will need to be stored  
> "unpacked" (in order to have random access), yet the terms are  
> variable length - leading to using a maximum=minimum size for every  
> term.

Wow.  That's a spectacularly awful design.  Its worst case -- one outlier
term, say, 1000 characters in length, in a field where the average term length
is in the single digits -- would explode the index size and incur wasteful IO
overhead, just as you say.

Good thing we've never considered it.  :)

I'm hoping we can improve on this, but for now, we've ended up at a two-file
design for the term dictionary index.

  1) Stacked 64-bit file pointers.
  2) Variable length character and term info data, interpreted using a 
     pluggable codec.

In the index at least, each entry would contain the full term text, encoded as
UTF-8.  Probably the primary term dictionary would continue to use string
diffs. 

That design offers no significant benefits other than those that flow from
compatibility with mmap: faster IndexReader open/reaopen, lower RAM usage
under multiple processes by way of buffer sharing.  IO bandwidth requirements
and speed are probably a little better, but lookups on the term dictionary
index are not a significant search-time bottleneck.

Additionally, sort caches would be written at index time in three files, and
memory mapped as laid out in 
<https://issues.apache.org/jira/browse/LUCENE-831?focusedCommentId=12656150#action_12656150>.

  1) Stacked 64-bit file pointers.
  2) Character data.
  3) Doc num to ord mapping.

Marvin Humphrey

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

Seems doubtful you will be able to do this without increasing the  
index size dramatically. Since it will need to be stored  
"unpacked" (in order to have random access), yet the terms are  
variable length - leading to using a maximum=minimum size for every  
term.

In the end I highly doubt it will make much difference in speed -  
here's several reasons why...

1. with "fixed" size terms, the additional IO (larger pages) probably  
offsets a lot of the random access benefit. This is why "compressed"  
disks on a fast machine (CPU) are often faster than "uncompressed" -  
more data is read during every IO access.

2. with a reopen, only new segments are "read", and since it is a new  
segment, it is most likely already in the disk cache (from the  
write), so the reopen penalty is negligible (especially if the term  
index "skip to" is written last).

3. If the reopen is after an optimize - when the OS cache has  
probably been obliterated, then the warm up time is going to be  
similar in most cases anyway, since the "index" pages will also not  
be in core (in the case of memory mapped files). Again, writing the  
"skip to" last can help with this.

Just because a file is memory mapped does not mean its pages will  
have an greater likelihood to be in the cache. The locality of  
reference is going to control this, just as the most/often access  
controls it in the OS disk cache.  Also, most OSs will take real  
memory from the virtual address space and add it to the disk cache if  
the process is doing lots of IO.

If you have a memory mapped "term index", you are still going to need  
to perform a binary search to find the correct term "page", and after  
an optimize the visited pages will not be in the cache (or in core).

On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote:

> On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
>> Is there something that I am missing?
>
> Yes.
>
>> I see lots of references to  using "memory mapped" files to  
>> "dramatically"
>> improve performance.
>
> There have been substantial discussions about this design in JIRA,
> notably LUCENE-1458.
>
> The "dramatic" improvement is WRT to opening/reopening an IndexReader.
> Presently in both KS and Lucene, certain data structures have to be  
> read at
> IndexReader startup and unpacked into process memory -- in  
> particular, the
> term dictionary index and sort caches.  If those data structures  
> can be
> represented by a memory mapped file rather than built up from  
> scratch, we save
> big.
>
> Marvin Humphrey
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote:
> Is there something that I am missing? 

Yes. 

> I see lots of references to  using "memory mapped" files to "dramatically"
> improve performance.

There have been substantial discussions about this design in JIRA,
notably LUCENE-1458.

The "dramatic" improvement is WRT to opening/reopening an IndexReader.
Presently in both KS and Lucene, certain data structures have to be read at
IndexReader startup and unpacked into process memory -- in particular, the
term dictionary index and sort caches.  If those data structures can be
represented by a memory mapped file rather than built up from scratch, we save
big.

Marvin Humphrey

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by robert engels <re...@ix.netcom.com>.

Is there something that I am missing? I see lots of references to  
using "memory mapped" files to "dramatically" improve performance.

I don't think this is the case at all. At the lowest levels, it is  
somewhat more efficient from a CPU standpoint, but with a decent OS  
cache the IO performance difference is going to negligible.

The primary benefit of memory mapped files is simplicity in code  
(although in Java there is another layer needed - think C ), and the  
file can be treated as a random accessible memory array.

 From my OS design experience, the page at http://en.wikipedia.org/ 
wiki/Memory-mapped_file is incorrect.

Even if the memory mapped file is mapped into the virtual memory  
space, unless you specialized memory controllers and disk systems,  
when a page fault occurs, the OS loads the page just as any other.

The difference with direct IO, is that there is first a simple  
translation from position to disk page, and the OS disk page cache is  
checked. Almost exactly the same thing occurs with a memory mapped file.

The memory addressed is accessed, if not in memory, a page fault  
occurs, and the page is loaded from the file (it may be loaded from  
the OS disk cache in this process).

The point being, if the page is not in the cache (which is probably  
the case with a large index), the time to load the page is far  
greater than the difference between the IO address translation and  
the memory address lookup.

If all of the pages of the index can fit in memory, a properly  
configured system is going to have them in the page cache anyway....

On Dec 23, 2008, at 8:22 PM, Marvin Humphrey wrote:

> On Tue, Dec 23, 2008 at 05:51:43PM -0800, Jason Rutherglen wrote:
>
>> Are there other implementation options?
>
> Here's the plan for Lucy/KS:
>
>   1) Design index formats that can be memory mapped rather than  
> slurped,
>      bringing the cost of opening/reopening an IndexReader down to a
>      negligible level.
>   2) Enable segment-centric sorted search. (LUCENE-1483)
>   3) Implement tombstone-based deletions, so that the cost of deleting
>      documents scales with the number of deletions rather than the  
> size of the
>      index.
>   4) Allow 2 concurrent writers: one for small, fast updates, and  
> one for
>      big background merges.
>
> Marvin Humphrey
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Marvin Humphrey <ma...@rectangular.com> wrote:

> The goal is to improve worst-case write performance.
> ...
> In between the time when the background merge writer starts up and the time it finishes consolidating segment data, we assume that the primary writer will have modified the index.
>
>   * New docs have been added in new segments.
>   * Tombstones have been added which suppress documents in segments which didn't even exist when the background merge writer started up.
>   * Tombstones have been added which suppress documents in segments which existed when the background merge writer started up, but were not merged.
>   * Tombstones have been added which suppress documents in segments which have just been merged.
>
> Only the last category of deletions matters.
>
> At this point, the background merge writer aquires an exclusive write lock on the index. It examines recently added tombstones, translates the document numbers and writes a tombstone file against itself. Then it writes the snapshot file to commit its changes and releases the write lock.

OK, now I understand KS's two-writer model.  Lucene has already solved
this with the ConcurrentMergeScheduler -- all segment merges are done
in the BG (by default).

We also have to compute the deletions against the new segment to
include deletions that happened to the merged segments after the merge
kicked off.

Still, it's not a panacea since often the IO system has horrible
degradation in performance while a merge is running.  If only we could
mark all IO (reads & writes) associated with merging as low priority
and have the OS actually do the right thing...

> It's true that we are decoupling the process of making logical changes to the index from the process of internal consolidation. I probably wouldn't describe that as being done from the reader's standpoint, though.

Right, we have a different problem in Lucene (because we must warm a
reader before using it): after a large merge, warming the new
IndexReader that includes that segment can be costly (though that cost
is going down with LUCENE-1483, and eventually column-stride fields).

But we can solve this by allowing a reopened reader to use the old
segments, until the new segment is warmed.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Fri, Dec 26, 2008 at 06:22:23AM -0500, Michael McCandless wrote:
> >  4) Allow 2 concurrent writers: one for small, fast updates, and one for
> >     big background merges.
> 
> Marvin can you describe more detail here? 

The goal is to improve worst-case write performance.  

Currently, writes are quick most of the time, but occassionally you'll trigger
a big merge and get stuck.  To solve this problem, we can assign a merge
policy to our primary writer which tells it to merge no more than
mergeThreshold documents.  The value of mergeTheshold will need tuning
depending on document size, change rate, and so on, but the idea is that we
want this writer to do as much merging as it can while still keeping
worst-case write performance down to an acceptable number.

Doing only small merges just puts off the day of reckoning, of course.  By
avoiding big consolidations, we are slowly accumulating small-to-medium sized
segments and causing a gradual degradation of search-time performance.

What we'd like is a separate write process, operating (mostly) in the
background, dedicated solely to merging segments which contain at least
mergeThreshold docs.

If all we have to do is add documents to the index, adding that second write
process isn't a big deal.  We have to worry about competion for segment,
snapshot, and temp file names, but that's about it.

Deletions make matters more complicated, but with a tombstone-based deletions
mechanism, the problems are solvable.

When the background merge writer starts up, it will see a particular view of
the index in time, including deletions.  It will perform nearly all of its
operations based on this view of the index, mapping around documents which
were marked as deleted at init time.

In between the time when the background merge writer starts up and the time it
finishes consolidating segment data, we assume that the primary writer will
have modified the index.

  * New docs have been added in new segments.
  * Tombstones have been added which suppress documents in segments which
    didn't even exist when the background merge writer started up.
  * Tombstones have been added which suppress documents in segments which
    existed when the background merge writer started up, but were not merged.
  * Tombstones have been added which suppress documents in segments which have
    just been merged.

Only the last category of deletions matters.

At this point, the background merge writer aquires an exclusive write lock on
the index.  It examines recently added tombstones, translates the document
numbers and writes a tombstone file against itself.  Then it writes the
snapshot file to commit its changes and releases the write lock.

Worst case update performance for the system is now the sum of the time it
takes the background merge writer consolidate tombstones and worst-case
performance of the primary writer.

> It sounds like this is your solution for "decoupling" segments changes due
> to merges from changes from docs being indexed, from a reader's standpoint?

It's true that we are decoupling the process of making logical changes to the
index from the process of internal consolidation.  I probably wouldn't
describe that as being done from the reader's standpoint, though.

With mmap and data structures optimized for it, we basically solve the
read-time responsiveness cost problem.  From the client perspective, the delay
between firing off a change order and seeing that change made live is now
dominated by the time it takes to actually update the index.  The time between
the commit and having an IndexReader which can see that commit is negligible
in comparision.

> Since you are using mmap to achieve near zero brand-new IndexReader
> creation, whereas in Lucene we are moving towards achieving real-time
> by always reopening a current IndexReader (not a brand new one), it
> seems like you should not actually have to worry about the case of
> reopening a reader after a large merge has finished?

Even though we can rely on mmap rather than slurping, there are potentially a
lot of files to open and a lot of JSON-encoded metadata to parse, so I'm not
certain that Lucy/KS will never have to worry about the time it takes to open
a new IndexReader.  Fortunately, we can implement reopen() if we need to.

> We need to deal with this case (background the warming) because
> creating that new SegmentReader (on the newly merged segment) can take
> a non-trivial amount of time.

Yes.  Without mmap or some other solution, I think improvements to worst-case
update performance in Lucene will continue to be constrained by post-commit
IndexReader opening costs.

Marvin Humphrey

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Realtime Search

Posted by Michael McCandless <lu...@mikemccandless.com>.

Marvin Humphrey <ma...@rectangular.com> wrote:

>  4) Allow 2 concurrent writers: one for small, fast updates, and one for
>     big background merges.

Marvin can you describe more detail here?  It sounds like this is your
solution for "decoupling" segments changes due to merges from changes
from docs being indexed, from a reader's standpoint?

Since you are using mmap to achieve near zero brand-new IndexReader
creation, whereas in Lucene we are moving towards achieving real-time
by always reopening a current IndexReader (not a brand new one), it
seems like you should not actually have to worry about the case of
reopening a reader after a large merge has finished?

We need to deal with this case (background the warming) because
creating that new SegmentReader (on the newly merged segment) can take
a non-trivial amount of time.

Mike

Re: Realtime Search

Posted by Marvin Humphrey <ma...@rectangular.com>.

On Tue, Dec 23, 2008 at 05:51:43PM -0800, Jason Rutherglen wrote:

> Are there other implementation options?

Here's the plan for Lucy/KS:

  1) Design index formats that can be memory mapped rather than slurped,
     bringing the cost of opening/reopening an IndexReader down to a
     negligible level.
  2) Enable segment-centric sorted search. (LUCENE-1483) 
  3) Implement tombstone-based deletions, so that the cost of deleting
     documents scales with the number of deletions rather than the size of the
     index.
  4) Allow 2 concurrent writers: one for small, fast updates, and one for
     big background merges.

Marvin Humphrey


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org