Posted to solr-user@lucene.apache.org by Renaud Delbru <re...@deri.org> on 2010/12/16 15:39:26 UTC

Why does Solr commit block indexing?

Hi,

See log at [1].
We are using the latest snapshot of lucene_branch3.1. We have configured 
Solr to use the ConcurrentMergeScheduler:
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>

When a commit() runs, it blocks indexing (all incoming update requests
are blocked until the commit operation is finished). At the end of
the log we notice a 4-minute gap during which none of the Solr clients
trying to add data receive any attention.
This is a bit annoying, as it leads to timeout exceptions on the client
side. Here, the commit time is only 4 minutes, but it can be longer if
there are merges of large segments.
I thought Solr was able to handle commits and updates at the same time:
the commit operation should run in the background while the server
continues to receive update requests (maybe at a slower rate than
normal). But that does not appear to be the case. Is this normal behaviour?

[1] http://pastebin.com/KPkusyVb

Regards
-- 
Renaud Delbru

Re: Why does Solr commit block indexing?

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Fri, Dec 17, 2010 at 8:05 AM, Grant Ingersoll <gs...@apache.org> wrote:
> I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past.  It does indeed need to be fixed.

It stems from the APIs that were available at the time in Lucene 1.4.
IIRC, Mark worked up a patch that avoided ever closing the reader I
think, and delegated more of the concurrency control to Lucene (since
it can handle it these days).  I think maybe there was just a problem
with rollback or something...

-Yonik
http://www.lucidimagination.com





Re: Why does Solr commit block indexing?

Posted by Renaud Delbru <re...@deri.org>.
Hi Grant,

Looking forward to a fix ;o). Such a fix would considerably improve
Solr's update throughput (even though its performance is already
quite impressive).

cheers
-- 
Renaud Delbru



Re: Why does Solr commit block indexing?

Posted by Grant Ingersoll <gs...@apache.org>.
I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past.  It does indeed need to be fixed.

-Grant


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search


Re: Why does Solr commit block indexing?

Posted by Renaud Delbru <re...@deri.org>.
Hi Michael,

Thanks for your answer.
Is the Solr team aware of the problem? Is there an issue open for
this, or any ongoing work on it?

Regards,
-- 
Renaud Delbru



Re: Why does Solr commit block indexing?

Posted by Michael McCandless <lu...@mikemccandless.com>.
Unfortunately, (I think?) Solr currently commits by closing the
IndexWriter, which must wait for any running merges to complete, and
then opening a new one.

This is really rather silly because IndexWriter has had its own commit
method (which blocks neither ongoing indexing nor merging) for quite
some time now.

I'm not sure why we haven't switched over already... there must be
some trickiness involved.

Mike
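
A minimal sketch of the difference described above, written as a toy Java model rather than Lucene or Solr code. Every name in it (ToyUpdateHandler, tryAdd, commitByClosing, commitInPlace) is hypothetical, and the read/write lock is only an assumption about how a close-and-reopen commit ends up excluding adds. It is not taken from DirectUpdateHandler2. The running merge is simulated with a latch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only: none of these names are Lucene or Solr APIs. */
public class CommitBlockingSketch {

    /** Hypothetical stand-in for an update handler that wraps an index writer. */
    static class ToyUpdateHandler {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final List<String> pending = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();

        /** Returns false if the add could not start within timeoutMs (a client-side timeout). */
        boolean tryAdd(String doc, long timeoutMs) throws InterruptedException {
            if (!lock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;
            }
            try {
                synchronized (pending) {
                    pending.add(doc);
                }
                return true;
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Commit modelled as close-and-reopen: the exclusive lock is held until merges finish. */
        void commitByClosing(CountDownLatch lockHeld, CountDownLatch mergesDone)
                throws InterruptedException {
            lock.writeLock().lock();
            try {
                lockHeld.countDown();   // tell the caller that the commit has started
                mergesDone.await();     // "closing the writer" waits for running merges
                flushPending();
            } finally {
                lock.writeLock().unlock();
            }
        }

        /** Commit modelled as an in-place commit: no exclusive lock, no waiting on merges. */
        void commitInPlace() {
            flushPending();
        }

        private void flushPending() {
            synchronized (pending) {
                committed.addAll(pending);
                pending.clear();
            }
        }

        int committedCount() {
            synchronized (pending) {
                return committed.size();
            }
        }
    }

    /** Returns {add accepted during closing commit, after closing commit, after in-place commit}. */
    static boolean[] demo() throws InterruptedException {
        ToyUpdateHandler handler = new ToyUpdateHandler();
        handler.tryAdd("doc-1", 50);

        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch mergesDone = new CountDownLatch(1);
        Thread committer = new Thread(() -> {
            try {
                handler.commitByClosing(lockHeld, mergesDone);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        committer.start();
        lockHeld.await();                                   // commit now holds the exclusive lock

        boolean during = handler.tryAdd("doc-2", 50);       // simulated merge still running
        mergesDone.countDown();                             // simulated merge finishes
        committer.join();
        boolean after = handler.tryAdd("doc-3", 50);

        handler.commitInPlace();                            // never takes the exclusive lock
        boolean afterInPlace = handler.tryAdd("doc-4", 50);
        return new boolean[] { during, after, afterInPlace };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = demo();
        System.out.println("add accepted during close-style commit: " + r[0]);
        System.out.println("add accepted after close-style commit:  " + r[1]);
        System.out.println("add accepted after in-place commit:     " + r[2]);
    }
}
```

In this model the add issued while the close-style commit waits on the simulated merge times out, which mirrors the client timeouts reported in the original post. The in-place variant never takes the exclusive lock, so adds are not held up by it.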
