Posted to solr-user@lucene.apache.org by Renaud Delbru <re...@deri.org> on 2010/12/16 15:39:26 UTC
Why does Solr commit block indexing?
Hi,
See log at [1].
We are using the latest snapshot of lucene_branch3.1. We have configured
Solr to use the ConcurrentMergeScheduler:
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
When a commit() runs, it blocks indexing (all incoming update requests
are blocked until the commit operation is finished) ... at the end of
the log we notice a 4 minute gap during which none of the Solr clients
trying to add data receive any attention.
This is a bit annoying as it leads to timeout exceptions on the client
side. Here, the commit time is only 4 minutes, but it can be longer if
there are merges of large segments.
I thought Solr was able to handle commits and updates at the same time:
the commit operation should be done in the background, and the server
should still continue to receive update requests (maybe at a slower rate
than normal). But it looks like that is not the case. Is this normal behaviour?
[1] http://pastebin.com/KPkusyVb
Regards
--
Renaud Delbru
Re: Why does Solr commit block indexing?
Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Fri, Dec 17, 2010 at 8:05 AM, Grant Ingersoll <gs...@apache.org> wrote:
> I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.
It stems from the APIs that were available at the time in Lucene 1.4.
IIRC, Mark worked up a patch that avoided ever closing the writer, I
think, and delegated more of the concurrency control to Lucene (since
it can handle it these days). I think maybe there was just a problem
with rollback or something...
-Yonik
http://www.lucidimagination.com
Re: Why does Solr commit block indexing?
Posted by Renaud Delbru <re...@deri.org>.
Hi Grant,
Looking forward to a fix ;o). Such a fix would considerably improve
Solr's update throughput (even if its performance is already quite
impressive).
cheers
--
Renaud Delbru
On 17/12/10 13:05, Grant Ingersoll wrote:
> I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.
>
> -Grant
Re: Why does Solr commit block indexing?
Posted by Grant Ingersoll <gs...@apache.org>.
I'm not sure if there is an issue open, but I know I've talked w/ Yonik about this and a few other changes to the DirectUpdateHandler2 in the past. It does indeed need to be fixed.
-Grant
On Dec 17, 2010, at 7:04 AM, Renaud Delbru wrote:
> Hi Michael,
>
> thanks for your answer.
> Is the Solr team aware of the problem? Is there an issue open for this, or ongoing work on it?
>
> Regards,
> --
> Renaud Delbru
>
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Why does Solr commit block indexing?
Posted by Renaud Delbru <re...@deri.org>.
Hi Michael,
thanks for your answer.
Is the Solr team aware of the problem? Is there an issue open
for this, or ongoing work on it?
Regards,
--
Renaud Delbru
On 16/12/10 16:45, Michael McCandless wrote:
> Unfortunately, (I think?) Solr currently commits by closing the
> IndexWriter, which must wait for any running merges to complete, and
> then opening a new one.
>
> This is really rather silly because IndexWriter has had its own commit
> method (which does not block ongoing indexing nor merging) for quite
> some time now.
>
> I'm not sure why we haven't switched over already... there must be
> some trickiness involved.
>
> Mike
Re: Why does Solr commit block indexing?
Posted by Michael McCandless <lu...@mikemccandless.com>.
Unfortunately, (I think?) Solr currently commits by closing the
IndexWriter, which must wait for any running merges to complete, and
then opening a new one.
This is really rather silly because IndexWriter has had its own commit
method (which does not block ongoing indexing nor merging) for quite
some time now.
I'm not sure why we haven't switched over already... there must be
some trickiness involved.
Mike
On Thu, Dec 16, 2010 at 9:39 AM, Renaud Delbru <re...@deri.org> wrote:
> Hi,
>
> See log at [1].
> We are using the latest snapshot of lucene_branch3.1. We have configured
> Solr to use the ConcurrentMergeScheduler:
> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>
> When a commit() runs, it blocks indexing (all incoming update requests are
> blocked until the commit operation is finished) ... at the end of the log we
> notice a 4 minute gap during which none of the Solr clients trying to add
> data receive any attention.
> This is a bit annoying as it leads to timeout exceptions on the client side.
> Here, the commit time is only 4 minutes, but it can be longer if there are
> merges of large segments.
> I thought Solr was able to handle commits and updates at the same time: the
> commit operation should be done in the background, and the server should
> still continue to receive update requests (maybe at a slower rate than normal).
> But it looks like that is not the case. Is this normal behaviour?
>
> [1] http://pastebin.com/KPkusyVb
>
> Regards
> --
> Renaud Delbru
>
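
Editor's sketch: the difference Mike describes above (committing by
closing and reopening the writer versus committing in place) can be
illustrated with a toy model in plain Java. This is not Solr or Lucene
code. ToyUpdateHandler, tryAdd, commit and addProceedsDuringCommit are
hypothetical names invented for this example, and the read/write lock
is only an assumed stand-in for "the writer is unavailable while it is
being closed". A latch stands in for the time spent waiting on running
merges.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CommitBlockingDemo {

    /** Toy stand-in for an update handler. Hypothetical; not Solr or Lucene code. */
    static class ToyUpdateHandler {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final boolean closeAndReopen;

        ToyUpdateHandler(boolean closeAndReopen) {
            this.closeAndReopen = closeAndReopen;
        }

        /** Returns true if an add could proceed right now, false if it would have to wait. */
        boolean tryAdd() {
            if (!lock.readLock().tryLock()) {
                return false; // an exclusive commit is in progress
            }
            try {
                return true; // a real handler would hand the document to the writer here
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Simulates a long commit: 'started' fires once it is under way, 'mayFinish' ends it. */
        void commit(CountDownLatch started, CountDownLatch mayFinish) throws InterruptedException {
            if (closeAndReopen) {
                lock.writeLock().lock(); // modelling "close the writer": excludes all adds
            }
            try {
                started.countDown();
                mayFinish.await(); // stands in for waiting on running merges
            } finally {
                if (closeAndReopen) {
                    lock.writeLock().unlock();
                }
            }
        }
    }

    /** Runs one commit in the background and reports whether an add can proceed meanwhile. */
    static boolean addProceedsDuringCommit(boolean closeAndReopen) throws InterruptedException {
        ToyUpdateHandler handler = new ToyUpdateHandler(closeAndReopen);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch mayFinish = new CountDownLatch(1);
        Thread committer = new Thread(() -> {
            try {
                handler.commit(started, mayFinish);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        committer.start();
        started.await();                      // the commit is now in progress
        boolean proceeded = handler.tryAdd(); // probe while the commit is still running
        mayFinish.countDown();                // let the commit finish
        committer.join();
        return proceeded;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("close-and-reopen commit: add proceeds = "
                + addProceedsDuringCommit(true));
        System.out.println("in-place commit:         add proceeds = "
                + addProceedsDuringCommit(false));
    }
}
```

In this model the close-and-reopen variant reports that the add cannot
proceed while the commit is running, and the in-place variant reports
that it can. It says nothing about how the real DirectUpdateHandler2
synchronizes; it only shows why holding exclusive access for the whole
commit (including the wait for merges) stalls every incoming update.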