Posted to solr-user@lucene.apache.org by Phani Chaitanya <pv...@gmail.com> on 2013/09/13 23:21:00 UTC

Committing when indexing in parallel

I'm wondering what happens on a commit while we are indexing in parallel in
Solr. Are the indexing update requests blocked until the commit finishes?

Let's say I have a process P1 which has issued a commit request, and there is
another process P2 which is still indexing to the same index. What happens
to the index in that scenario? Are P2's indexing requests blocked until P1's
commit request finishes?

I'm just wondering what Solr's behavior is in the above case.



-----
Phani Chaitanya
--
View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Committing when indexing in parallel

Posted by Phani Chaitanya <pv...@gmail.com>.
Thanks Jack. In this case my question is just out of curiosity around what
happens in the scenario I mentioned. Nothing else.

Regards,
Phani.



-----
Phani Chaitanya
--
View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953p4090243.html

Re: Committing when indexing in parallel

Posted by Jack Krupansky <ja...@basetechnology.com>.
You really are barking up the wrong tree here. Solr is a search engine, 
designed for batch update and "eventual consistency". This fantasy you have 
of knowing exactly when a document is committed is completely inappropriate 
with Solr. Sure, you can in fact do a hard commit at any time to guarantee 
that recent updates are immediately searchable, but that is strongly 
discouraged for performance reasons, since Solr is batch update oriented.

And the level of detail you are requesting is merely how Solr happens to 
work today and is not necessarily guaranteed for future releases - since the 
guaranteed model is only for commits with eventual consistency.

It appears that you are trying to imagine Solr as a traditional, 
transaction-based database, when that is not the case.

I've asked you before to disclose what problem you are really trying to 
solve, and so far you have not yet let us in on your secret. You are 
certainly welcome to peruse the Solr and Lucene source code if your goal is 
merely idle curiosity, but you really should not be designing a Solr-based 
application around this non-guaranteed level of detail.

Soft commit, commitWithin, hard commit, real-time get, and eventual
consistency are the proper tools to design a Solr-based application around.
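
For readers unfamiliar with those tools, here is a minimal sketch of how they usually appear as HTTP request parameters on Solr's update and real-time get handlers. The host, port, core name (collection1) and document id (doc1) are assumptions made up for illustration, and the snippet only builds the URLs; it does not contact a server.

```python
from urllib.parse import urlencode

# Assumed location of a Solr core; adjust to your own installation.
BASE = "http://localhost:8983/solr/collection1"


def update_url(**params):
    """Build an update-handler URL carrying the given request parameters."""
    return f"{BASE}/update?{urlencode(params)}"


# Hard commit: flushes to stable storage (and, by default, opens a new searcher).
hard_commit = update_url(commit="true")

# Soft commit: makes recent updates searchable without the full durability cost.
soft_commit = update_url(softCommit="true")

# commitWithin: ask Solr to commit these updates within 10,000 ms on its own.
commit_within = update_url(commitWithin=10000)

# Real-time get: fetch the latest version of a document by id, committed or not
# (requires the update log to be enabled in solrconfig.xml).
realtime_get = f"{BASE}/get?{urlencode({'id': 'doc1'})}"

for url in (hard_commit, soft_commit, commit_within, realtime_get):
    print(url)
```

In practice commitWithin (or autoCommit settings in solrconfig.xml) is generally preferred over explicit hard commits from each client, which matches the batch-oriented advice above.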

And if you wish to have multiple, uncoordinated update streams, you need to 
relax your requirements for eventual consistency even further.

In short, Solr is not a transaction-based database, but 
instead is a batch-oriented search engine with eventual consistency. Focus 
on exploiting Solr's strengths, not trying to treat Solr as something that 
it is not.

-- Jack Krupansky

-----Original Message----- 
From: Phani Chaitanya
Sent: Saturday, September 14, 2013 7:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Committing when indexing in parallel

Thanks Erick. I completely did not get the point you are trying to make w.r.t
my question. I'll add it again according to your example.

In the example you gave w.r.t P1.1, P2.1, P1.2, P2.2 and P1 issues a commit,
I understand that all documents are committed.

Now what happens when P1 issues a commit and P2 sends another document, say
P2.3, to index at the same time ?

As you said the requests are treated as if they are serial - if P1.commit is
the first one among P1.commit & P2.3, P2.3 will be indexed into a new
segment ?

Regards,
Phani.



-----
Phani Chaitanya
--
View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953p4090120.html


Re: Committing when indexing in parallel

Posted by Phani Chaitanya <pv...@gmail.com>.
Thanks Erick. I did not quite get the point you are making w.r.t.
my question, so I'll restate it in terms of your example.

In your example with P1.1, P2.1, P1.2, P2.2 followed by a commit from P1,
I understand that all the documents are committed.

Now what happens when P1 issues a commit and P2 sends another document, say
P2.3, to index at the same time?

As you said, the requests are treated as if they were serial. So if P1.commit
comes first among P1.commit and P2.3, will P2.3 be indexed into a new
segment?

Regards,
Phani.



-----
Phani Chaitanya
--
View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953p4090120.html

Re: Committing when indexing in parallel

Posted by Erick Erickson <er...@gmail.com>.
First, there is no blocking; your P1 and P2 indexing
processes should continue on just fine. That nit aside...

No matter how many processes are indexing to Solr,
the documents are treated as though they came in
serial requests as far as committing is concerned.
Let's say you send the data to Solr from P1 and P2
interleaved, and let's claim they're perfectly interleaved as
P1.1
P2.1
P1.2
P2.2
commit from P1
All the docs from both processes are committed.

Committing from the indexing side is essentially flushing
anything currently in memory to segments on disk
and opening a new segment where subsequent updates will
be written. There's a tiny bit of delay while this happens,
I guess.
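
As a rough illustration of the description above, here is a toy model in Python. It is only a sketch: ToyIndex and its methods are hypothetical names invented here, not Solr or Lucene classes, and real segment handling is far more involved.

```python
class ToyIndex:
    """Toy stand-in for an index writer; not Solr's or Lucene's real code."""

    def __init__(self):
        self.buffer = []    # documents added but not yet committed
        self.segments = []  # each commit flushes the buffer into one segment

    def add(self, doc):
        self.buffer.append(doc)

    def commit(self):
        # A commit flushes everything buffered so far, whoever sent it,
        # and later adds start filling a fresh buffer (a "new segment").
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer = []


index = ToyIndex()
for doc in ["P1.1", "P2.1", "P1.2", "P2.2"]:  # perfectly interleaved adds
    index.add(doc)

index.commit()         # issued by P1, but covers P2's documents too
index.add("P2.3")      # arrives after the commit, waits for the next one

print(index.segments)  # [['P1.1', 'P2.1', 'P1.2', 'P2.2']]
print(index.buffer)    # ['P2.3']
```

The point of the sketch is that the commit has no notion of which process sent which document; it simply covers whatever has been added up to that moment.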

HTH
Erick


On Fri, Sep 13, 2013 at 5:21 PM, Phani Chaitanya <pv...@gmail.com> wrote:

>
> I'm wondering what happens to commit while we are indexing in parallel in
> Solr. Are the indexing update requests blocked until the commit finishes ?
>
> Lets say I've a process P1 which issued a commit request and there is
> another process P2 which is still indexing to the same index. What happens
> to the index in that scenario. Are the P2 indexing requests blocked until
> P1
> commit request finishes ?
>
> I'm just wondering about what is the behavior of Solr in the above case.
>
>
>
> -----
> Phani Chaitanya
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Committing when indexing in parallel

Posted by Amit Jha <sh...@gmail.com>.
Hi,

As far as I know, any number of requests can be issued in parallel to index documents. Any commit request will write them to the index.

So if P1 issues a commit, all of P2's documents that are eligible at that point get committed, and the remaining documents are committed by a later commit request.


Rgds
AJ

On 14-Sep-2013, at 2:51, Phani Chaitanya <pv...@gmail.com> wrote:

> 
> I'm wondering what happens to commit while we are indexing in parallel in
> Solr. Are the indexing update requests blocked until the commit finishes ?
> 
> Lets say I've a process P1 which issued a commit request and there is
> another process P2 which is still indexing to the same index. What happens
> to the index in that scenario. Are the P2 indexing requests blocked until P1
> commit request finishes ?
> 
> I'm just wondering about what is the behavior of Solr in the above case.
> 
> 
> 
> -----
> Phani Chaitanya
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Committing when indexing in parallel

Posted by Phani Chaitanya <pv...@gmail.com>.
Thanks Yonik.



-----
Phani Chaitanya
--
View this message in context: http://lucene.472066.n3.nabble.com/Committing-when-indexing-in-parallel-tp4089953p4090246.html

Re: Committing when indexing in parallel

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Fri, Sep 13, 2013 at 5:21 PM, Phani Chaitanya <pv...@gmail.com> wrote:
> I'm wondering what happens to commit while we are indexing in parallel in
> Solr. Are the indexing update requests blocked until the commit finishes ?

Nope.  The add (updates) and commit can proceed in parallel.
Because of this, the add that is happening at the same time as the
commit may or may not make it into the commit.  The add would make it
into the next commit after the add completes of course.
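
The race described here can be imitated with two threads. This is a sketch under stated assumptions: ConcurrentToyIndex is a hypothetical toy class, and the lock around the buffer swap is a simplification chosen for the example, not a description of Solr's internals.

```python
import threading


class ConcurrentToyIndex:
    """Toy model only: adds and commits may be called from different threads."""

    def __init__(self):
        self._lock = threading.Lock()  # guards only the brief buffer swap
        self.pending = []              # added but not yet committed
        self.commits = []              # one list of documents per commit

    def add(self, doc):
        with self._lock:
            self.pending.append(doc)

    def commit(self):
        with self._lock:
            flushed, self.pending = self.pending, []
        self.commits.append(flushed)


index = ConcurrentToyIndex()
for doc in ["P1.1", "P2.1", "P1.2", "P2.2"]:
    index.add(doc)

# P1's commit and P2's add of P2.3 run concurrently; their order is not fixed.
t_commit = threading.Thread(target=index.commit)
t_add = threading.Thread(target=index.add, args=("P2.3",))
t_commit.start()
t_add.start()
t_commit.join()
t_add.join()

index.commit()  # the next commit picks up P2.3 if the first one missed it

in_first = "P2.3" in index.commits[0]
in_second = "P2.3" in index.commits[1]
print("P2.3 in first commit:", in_first, "| in second commit:", in_second)
```

Which of the two flags is true can vary from run to run, but exactly one of them always is: the concurrent add lands in either the racing commit or the one after it.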

-Yonik
http://lucidworks.com