Posted to java-user@lucene.apache.org by Michael McCandless <lu...@mikemccandless.com> on 2006/08/21 12:43:27 UTC

Re: index update with database insertion

> In my project, I want to update the Lucene index whenever a row is inserted into the database, so that my users can search the new information immediately after someone inserts it. Could someone suggest how to implement this? Thank you very much!

Have a look at this recent thread:

  http://www.gossamer-threads.com/lists/lucene/java-user/38636

That was about DB2 in particular but it should carry over to your DB.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: index update with database insertion

Posted by Michael McCandless <lu...@mikemccandless.com>.
Jason Polites wrote:
> I'm not sure about the solution in the referenced thread.  It will work, but
> doesn't it run the risk of breaching the transaction isolation of the
> database write?
> 
> The issue is when the index is notified of a database update.  If it is
> notified prior to the transaction commit, and the commit fails, then you are
> out of sync.  If it is notified after commit, and the notify fails, you are
> out of sync.
> 
> Maybe more of a "publish/subscribe" model would work (or.. maybe this is
> what was suggested and I am way off!).  That is, when the DB write occurs,
> and within the same transaction, an "index" task is published to some new
> "IndexTask" table which your index process polls.  This way if the index
> process is offline.. when it comes back up it simply has a list of tasks to
> perform.
> 
> Apologies if this is what you meant Michael.
> 
> Or.. better still (for transaction isolation, but arguably not performance)
> do the index update within the same transaction.  I have just been
> implementing some test code using Compass (Open Symphony) which provides
> full transaction isolation for Lucene.

That is an excellent point, Jason. The thread I linked to doesn't
address how to preserve transaction isolation with your DB updates.

I like the idea of a separate "IndexTask" table.  This is better
than a simple timestamp column added to all content tables because it
can handle deleted docs correctly: a deleted row leaves no timestamp
behind to poll, whereas a delete task is still there to be picked up.
It also fully decouples when the DB transaction is committed from when
Lucene commits its corresponding index updates.
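
A rough sketch of how such a task table could be polled, including the
delete case.  This is only an illustration under assumptions: it uses
Python with an in-memory SQLite database for brevity, the table and
function names (articles, index_task, save, delete, drain_tasks) are
made up for the example, and a plain dict stands in for the Lucene
index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE index_task (
        task_id INTEGER PRIMARY KEY AUTOINCREMENT,
        doc_id  INTEGER NOT NULL,
        op      TEXT NOT NULL CHECK (op IN ('upsert', 'delete'))
    );
""")

search_index = {}  # hypothetical stand-in for the Lucene index: doc_id -> body


def save(conn, doc_id, body):
    """Write the row and queue an 'upsert' task in one transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO articles (id, body) VALUES (?, ?)",
            (doc_id, body),
        )
        conn.execute(
            "INSERT INTO index_task (doc_id, op) VALUES (?, 'upsert')", (doc_id,)
        )


def delete(conn, doc_id):
    """Delete the row and queue a 'delete' task in one transaction."""
    with conn:
        conn.execute("DELETE FROM articles WHERE id = ?", (doc_id,))
        conn.execute(
            "INSERT INTO index_task (doc_id, op) VALUES (?, 'delete')", (doc_id,)
        )


def drain_tasks(conn, index):
    """Apply pending tasks in order; remove each task after it is applied."""
    tasks = conn.execute(
        "SELECT task_id, doc_id, op FROM index_task ORDER BY task_id"
    ).fetchall()
    for task_id, doc_id, op in tasks:
        if op == "delete":
            index.pop(doc_id, None)
        else:
            row = conn.execute(
                "SELECT body FROM articles WHERE id = ?", (doc_id,)
            ).fetchone()
            if row is not None:  # the row may already be gone again
                index[doc_id] = row[0]
        with conn:
            conn.execute("DELETE FROM index_task WHERE task_id = ?", (task_id,))
    return len(tasks)


save(conn, 1, "first")
save(conn, 2, "second")
delete(conn, 1)
applied = drain_tasks(conn, search_index)
print(applied, search_index)
```

Each task is removed only after the index change has been applied, so
a crash in between means the task is replayed on the next poll.  That
is safe here because both operations are idempotent.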

I think ideally you would indeed want to commit both at once (your
last paragraph above), meaning your end users see changes to the DB
at the same time as the corresponding changes in Lucene search
results.  Compass sounds like it will address this nicely!

Solr also adds some transactional semantics on top of Lucene, in that
you add/delete documents, but these changes are not visible to the end
user until a "commit" is issued.
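
To make the "not visible until commit" idea concrete, here is a toy
sketch in Python.  It is not Solr's or Lucene's API; the BufferedIndex
class and its methods are invented for illustration, and the "search"
is just a substring match over an in-memory dict.

```python
class BufferedIndex:
    """Toy index: adds and deletes are buffered until commit() is called."""

    def __init__(self):
        self._visible = {}   # what searchers currently see: doc_id -> text
        self._pending = []   # buffered (op, doc_id, text) changes

    def add(self, doc_id, text):
        self._pending.append(("add", doc_id, text))

    def delete(self, doc_id):
        self._pending.append(("delete", doc_id, None))

    def commit(self):
        """Apply all buffered changes and publish them in one step."""
        snapshot = dict(self._visible)
        for op, doc_id, text in self._pending:
            if op == "add":
                snapshot[doc_id] = text
            else:
                snapshot.pop(doc_id, None)
        self._visible = snapshot
        self._pending = []

    def search(self, term):
        """Return ids of visible documents whose text contains the term."""
        return sorted(d for d, t in self._visible.items() if term in t)


idx = BufferedIndex()
idx.add(1, "fresh news")
before = idx.search("fresh")   # nothing is visible yet
idx.commit()
after = idx.search("fresh")    # the document is visible now
print(before, after)
```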

Mike



Re: index update with database insertion

Posted by Jason Polites <ja...@gmail.com>.
I'm not sure about the solution in the referenced thread.  It will work, but
doesn't it run the risk of breaching the transaction isolation of the
database write?

The issue is when the index is notified of a database update.  If it is
notified prior to the transaction commit, and the commit fails, then you are
out of sync.  If it is notified after commit, and the notify fails, you are
out of sync.

Maybe more of a "publish/subscribe" model would work (or.. maybe this is
what was suggested and I am way off!).  That is, when the DB write occurs,
and within the same transaction, an "index" task is published to some new
"IndexTask" table which your index process polls.  This way if the index
process is offline.. when it comes back up it simply has a list of tasks to
perform.
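
As a rough sketch of what I mean (Python with an in-memory SQLite
database just to keep it short; the articles and index_task tables and
the insert_article function are names invented for this example), the
content row and its task are written in one transaction, so a failed
commit leaves neither behind:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE index_task (
        task_id INTEGER PRIMARY KEY AUTOINCREMENT,
        doc_id  INTEGER NOT NULL,
        op      TEXT NOT NULL
    );
""")


def insert_article(conn, doc_id, body, fail=False):
    """Write the row and its index task in a single transaction."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO articles (id, body) VALUES (?, ?)", (doc_id, body)
            )
            conn.execute(
                "INSERT INTO index_task (doc_id, op) VALUES (?, 'add')", (doc_id,)
            )
            if fail:
                # Simulated failure before the commit happens.
                raise RuntimeError("simulated failure before commit")
    except RuntimeError:
        pass  # the rollback has already discarded both inserts


insert_article(conn, 1, "hello lucene")
insert_article(conn, 2, "never committed", fail=True)

pending = conn.execute(
    "SELECT doc_id, op FROM index_task ORDER BY task_id"
).fetchall()
print(pending)
```

Only the committed insert leaves a task for the index process to pick
up, so the index process never hears about a write that was rolled
back.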

Apologies if this is what you meant Michael.

Or.. better still (for transaction isolation, but arguably not performance)
do the index update within the same transaction.  I have just been
implementing some test code using Compass (Open Symphony) which provides
full transaction isolation for Lucene.
