Posted to java-user@lucene.apache.org by Sergey Kabashnyuk <ks...@gmail.com> on 2008/02/22 10:42:54 UTC

Transactions in Lucene

Hi.
I have a question about transactions in Lucene.

Let's say I have 1000 documents and want to add either all of them or none
of them (if something goes wrong) to the index.

What is the best strategy for doing this in a multithreaded environment?

Sergey Kabashnyuk.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
Lucene only allows for one open transaction at a time per index (IndexWriter).

So if you have multiple threads, and each of them is making changes,
all those changes will fall under one transaction, controlled by the
single writer the threads are sharing.

If you really want separate transactions, I think you have to do this
against separate indices, and then merge the indices together
(IndexWriter.addIndexes) into a single one, or, leave them separate
and use multiple searchers over them.
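
A minimal sketch of the "one shared writer, one transaction" model described above, written in plain Java with no Lucene dependency. SharedToyWriter and AllOrNothingBatch are hypothetical names, and the empty-string failure condition is invented for illustration. The toy only mimics the behaviour in this thread: several threads add through one writer, and the batch is committed or discarded as a whole.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy stand-in for the single writer that all threads share.
// NOT Lucene's API: all names here are hypothetical.
class SharedToyWriter {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    synchronized void add(String doc) { pending.add(doc); }
    synchronized void commit() { committed.addAll(pending); pending.clear(); }
    synchronized void abort() { pending.clear(); }
    synchronized int committedCount() { return committed.size(); }
}

class AllOrNothingBatch {
    // Adds all docs through one shared writer from several threads, then
    // commits only if every add succeeded; otherwise discards the whole batch.
    static boolean indexAll(SharedToyWriter writer, List<String> docs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletableFuture<?>[] tasks = docs.stream()
                .map(doc -> CompletableFuture.runAsync(() -> {
                    // Hypothetical failure condition, for illustration only.
                    if (doc.isEmpty()) throw new IllegalArgumentException("empty document");
                    writer.add(doc);
                }, pool))
                .toArray(CompletableFuture[]::new);

            boolean ok = true;
            try {
                // allOf completes only after every task has completed.
                CompletableFuture.allOf(tasks).join();
            } catch (CompletionException e) {
                ok = false; // at least one add failed
            }
            // A single thread decides the outcome for the whole batch.
            if (ok) writer.commit(); else writer.abort();
            return ok;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        SharedToyWriter writer = new SharedToyWriter();
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) docs.add("doc" + i);
        // prints "true 1000"
        System.out.println(indexAll(writer, docs, 4) + " " + writer.committedCount());
    }
}
```

Because there is only one transaction, a failure in any thread discards the pending changes of every thread. Independent transactions would need the separate-indices approach mentioned above.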

Mike


Re: Transactions in Lucene

Posted by Sergey Kabashnyuk <ks...@gmail.com>.
Thanks, Mike, for your reply.

What about multithreading? Suppose one transaction can both add and
delete documents, and more than one transaction can be open at the same
time. Should each thread use its own index, which is then somehow merged
after the transaction commits, or must all threads use one index?

Sergey Kabashnyuk.




Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
Super!  Thanks for testing this & posting...

Mike



RE: Transactions in Lucene

Posted by sp...@gmx.eu.
> I don't think creating an IndexWriter is very expensive at all.

Ah, OK. I tested it. Creating an IndexWriter on an index with 10,000 docs
(about 15 MB) takes about 200 ms.

This is a very cheap operation for me ;)

I had only seen the many calls in init() that read files and so on, and
therefore thought it could be expensive.

Thank you!




Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
I don't think creating an IndexWriter is very expensive at all.

Especially compared to creating an IndexReader, for example.

Mike



RE: Transactions in Lucene

Posted by sp...@gmx.eu.
> > When is the 2.4 release planned?
>
> Not really sure at this point ...

Hm. Digging into IndexWriter#init, it seems that this is a really expensive
operation, and thus so is my self-made "commit" (closing and re-opening the
writer). Isn't it?




Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
<sp...@gmx.eu> wrote:

> When is the 2.4 release planned?

Not really sure at this point ...

Mike



RE: Transactions in Lucene

Posted by sp...@gmx.eu.
> In 2.4, commit() sets the rollback point.  So abort() will roll the index
> back to the last time you called commit() (or to when the writer was
> opened if you haven't called commit).
>
> In 2.3, your only choice is to close & re-open the writer to reset
> the rollback point.

OK, thank you.

When is the 2.4 release planned?




Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
<sp...@gmx.eu> wrote:

>> Then, you can call close() to commit the changes to the index, or
>> abort() to rollback the index to the starting state (when the writer
>> was opened).
>
> As I understand the docs, the index will get rolled back to the state
> as it was when the index was opened.
>
> How can I achieve a rollback which only goes back to the state of the
> last flush (2.3) / commit (2.4/3.0)?

In 2.4, commit() sets the rollback point.  So abort() will roll the index
back to the last time you called commit() (or to when the writer was
opened if you haven't called commit()).

In 2.3, your only choice is to close & re-open the writer to reset
the rollback point.
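
To make the rollback-point rule concrete, here is a small in-memory sketch in plain Java. It is a toy model, not Lucene code: ToyWriter and its methods are hypothetical names that only mimic the behaviour described above, where commit() moves the rollback point and abort() returns to it. It assumes nothing about how IndexWriter is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the rollback-point rule. NOT Lucene's IndexWriter:
// every name here is hypothetical and exists only for illustration.
class ToyWriter {
    private final List<String> committed;                   // state at the rollback point
    private final List<String> pending = new ArrayList<>(); // changes since that point

    // "Opening" the writer fixes the first rollback point.
    ToyWriter(List<String> existingDocs) {
        this.committed = new ArrayList<>(existingDocs);
    }

    void add(String doc) {
        pending.add(doc);
    }

    // Mimics commit() as described for 2.4: publish pending changes
    // and move the rollback point forward.
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // Mimics abort(): drop everything since the last commit(), or since
    // the writer was opened if commit() was never called.
    void abort() {
        pending.clear();
    }

    // What a freshly opened reader would see: committed documents only.
    List<String> visibleDocs() {
        return new ArrayList<>(committed);
    }
}

class RollbackPointDemo {
    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter(Arrays.asList("a"));
        writer.add("b");
        writer.commit();   // rollback point is now [a, b]
        writer.add("c");
        writer.abort();    // "c" is discarded, "b" survives
        System.out.println(writer.visibleDocs()); // prints [a, b]
    }
}
```

In terms of this toy, the 2.3 workaround corresponds to throwing the writer away after publishing its changes and constructing a new ToyWriter from visibleDocs(), which fixes a fresh rollback point.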

Mike



RE: Transactions in Lucene

Posted by sp...@gmx.eu.
> Then, you can call close() to commit the changes to the index, or
> abort() to rollback the index to the starting state (when the writer
> was opened).

As I understand the docs, the index will get rolled back to the state as it
was when the index was opened.

How can I achieve a rollback which only goes back to the state of the last
flush (2.3) / commit (2.4/3.0)?

Until now I have called flush() to commit, but I do not know how to roll back...

Thank you.




Re: Transactions in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
You should open the IndexWriter with autoCommit=false, then make
changes.  During this time, any reader that opens the index will not
see any changes you are making.

Then, you can call close() to commit the changes to the index, or
abort() to roll the index back to its starting state (when the writer
was opened).

Note that in 3.0, IndexWriter will be hardwired to autoCommit=false
(in trunk the constructors taking autoCommit are deprecated), and a new
commit() method can be used to commit periodically without closing, if
you want to.

Mike
