Posted to solr-user@lucene.apache.org by Colin Bartolome <co...@e-e.com> on 2014/02/18 02:06:44 UTC
Preventing multiple on-deck searchers without causing failed commits
We're using Solr version 4.2.1, in case new functionality has helped with
this issue.
We have our Solr servers doing automatic soft commits with maxTime=1000.
We also have a scheduled job that triggers a hard commit every fifteen
minutes. When one of these hard commits happens while a soft commit is
already in progress, we get that ubiquitous warning:
PERFORMANCE WARNING: Overlapping onDeckSearchers=2
Recently, we had an occasion to have a second scheduled job also issue a
hard commit every now and then. Since our maxWarmingSearchers value was
set to the default, 2, we occasionally had a hard commit trigger when two
other searchers were already warming up, which led to this:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request
as the servers started responding with a 503 HTTP response.
It seems like automatic soft commits wait until the hard commits are out
of the way before they proceed. Is there a way to do the same for hard
commits? Since we're passing waitSearcher=true in the update request that
triggers the hard commits, I would expect the request to block until the
server had enough headroom to service the commit. I did not expect that
we'd start getting 503 responses.
Is there a way to pull this off, either via some extra request parameters
or via some server-side configuration?
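
For reference, the soft-commit side of the setup described above corresponds to a solrconfig.xml fragment like this (a sketch showing only the relevant element; placement inside updateHandler assumed):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Open a new searcher at most once per second (maxTime is in ms). -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```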
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Greg Walters <gr...@answers.com>.
> A quick peek at the code (branch_4x, SolrCore.java, starting at line 1647) seems to confirm this.
It seems my understanding of that option was wrong! Thanks for correcting me Shawn.
Greg
On Feb 19, 2014, at 11:19 AM, Shawn Heisey <so...@elyograg.org> wrote:
> On 2/19/2014 8:59 AM, Greg Walters wrote:
>> I believe that there's a configuration option that'll make on-deck searchers be used if they're needed even if they're not fully warmed yet. You might try that option and see if it doesn't solve your 503 errors.
>
> I'm fairly sure that this option (useColdSearcher) only applies to warming queries defined in solrconfig.xml, and that it only applies to situations when the searcher that is warming up is the *ONLY* searcher that exists. The only time that should happen is at Solr startup and core reload. At that time, the only warming queries that will be executed are those configured for the firstSearcher event.
>
> A quick peek at the code (branch_4x, SolrCore.java, starting at line 1647) seems to confirm this. I did not do an in-depth analysis.
>
> Thanks,
> Shawn
>
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Shawn Heisey <so...@elyograg.org>.
On 2/19/2014 8:59 AM, Greg Walters wrote:
> I believe that there's a configuration option that'll make on-deck searchers be used if they're needed even if they're not fully warmed yet. You might try that option and see if it doesn't solve your 503 errors.
I'm fairly sure that this option (useColdSearcher) only applies to
warming queries defined in solrconfig.xml, and that it only applies to
situations when the searcher that is warming up is the *ONLY* searcher
that exists. The only time that should happen is at Solr startup and
core reload. At that time, the only warming queries that will be
executed are those configured for the firstSearcher event.
A quick peek at the code (branch_4x, SolrCore.java, starting at line
1647) seems to confirm this. I did not do an in-depth analysis.
Thanks,
Shawn
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Greg Walters <gr...@answers.com>.
I believe that there's a configuration option that'll make on-deck searchers be used if they're needed even if they're not fully warmed yet. You might try that option and see if it doesn't solve your 503 errors.
Thanks,
Greg
On Feb 18, 2014, at 9:05 PM, Erick Erickson <er...@gmail.com> wrote:
> Colin:
>
> Stop. Back up. The automatic soft commits will make updates available to
> your users every second. Those documents _include_ anything from your "hard
> commit" jobs. What could be faster? Parenthetically I'll add that 1 second
> soft commits are rarely an actual requirement, but that's your decision.
>
> For the hard commits. Fine. Do them if you insist. Just set
> openSearcher=false. The documents will be searchable the next time the soft
> commit happens, within one second. The key is openSearcher=false. That
> prevents starting a brand new searcher.
>
> BTW, your commits are not failing. It's just that _after_ the commit
> happens, the warming searcher limit is exceeded.
>
> You can even wait until the segments are flushed to disk. All without
> opening a searcher.
>
> Shawn is spot on in his recommendations to not fixate on the commits. Solr
> handles that. Here's a long blog about all the details of durability vs.
> visibility.
> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> You're over-thinking the problem here, trying to control commits with a
> sledgehammer when you don't need to, just use the built-in capabilities.
>
> Best,
> Erick
>
>
>
> On Tue, Feb 18, 2014 at 10:33 AM, Colin Bartolome <co...@e-e.com> wrote:
>
>> On 02/18/2014 10:15 AM, Shawn Heisey wrote:
>>
>>> If you want to be completely in control like that, get rid of the
>>> automatic soft commits and just do the hard commits.
>>>
>>> I would personally choose another option for your setup -- get rid of
>>> *all* explicit commits entirely, and just configure autoCommit and
>>> autoSoftCommit in the server config. Since you're running 4.x, you really
>>> should have the transaction log (updateLog in the config) enabled. You
>>> can rely on the transaction log to replay updates since the last hard
>>> commit if there's ever a crash.
>>>
>>> I would also recommend upgrading to 4.6.1, but that's a completely
>>> separate item.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>> We use the automatic soft commits to get search index updates to our users
>> faster, via Near Realtime Searching. We have the updateLog enabled. I'm not
>> worried that the Solr side of the equation will lose data; I'm worried that
>> the communication from our web servers and scheduled jobs to the Solr
>> servers will break down and nothing will come along to make sure everything
>> is up to date. It sounds like what we're picturing is not currently
>> supported, so I'll file the RFE.
>>
>> Will upgrading to 4.6.1 help at all with this issue?
>>
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Colin Bartolome <co...@e-e.com>.
Inline quoting ahead, sorry:
> Colin:
>
> Stop. Back up. The automatic soft commits will make updates available to
> your users every second. Those documents _include_ anything from your "hard
> commit" jobs. What could be faster? Parenthetically I'll add that 1 second
> soft commits are rarely an actual requirement, but that's your decision.
The one-second commits are not my decision, per se; it's the default value
in solrconfig.xml and is also suggested as a "common configuration" in
the "Near Real Time Searching" section of the reference guide.
(Our users at Experts Exchange used to have to wait up to five minutes
before the search index updated with the latest content. While switching
to Solr, we saw that the recommended configuration would refresh the
index in seconds, rather than minutes, and rejoiced. We'd rather not
increase the latency too far to solve this problem.)
> For the hard commits. Fine. Do them if you insist. Just set
> openSearcher=false. The documents will be searchable the next time the soft
> commit happens, within one second. The key is openSearcher=false. That
> prevents starting a brand new searcher.
Are you saying that the automatic soft commit will trigger no matter
what, even after our code has explicitly requested a hard commit? That
is, if the soft commit fires even when no additional update requests
have come in since the hard commit, then great! We'll do that!
> BTW, your commits are not failing. It's just that _after_ the commit
> happens, the warming searcher limit is exceeded.
My commits may indeed be succeeding, but the server is returning an HTTP
503 response, which leads to SolrJ throwing a SolrServerException with
the message "No live SolrServers available to handle this request." Our
code, understandably, interprets that as a failed request. This causes
our job to abort and try again the next time it runs.
> You can even wait until the segments are flushed to disk. All without
> opening a searcher.
We will go with this if the automatic soft commit does indeed trigger
after the explicit hard commit, thanks.
> Shawn is spot on in his recommendations to not fixate on the commits. Solr
> handles that. Here's a long blog about all the details of durability vs.
> visibility.
> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> You're over-thinking the problem here, trying to control commits with a
> sledgehammer when you don't need to, just use the built-in capabilities.
I get what you both are saying. If the problem is that I'm doing
explicit hard commits, the solution is that I should stop doing explicit
hard commits.
That's not really a solution, though.
What if, for whatever reason, I absolutely *had to* perform explicit
hard commits? (I know you're saying I *don't* have to, but please
indulge me for a moment.) Fortunately, the SolrJ client provides a way I
can do this. But now my Solr server logs are full of "Overlapping
onDeckSearchers" performance warnings. Fine, I'll turn
maxWarmingSearchers down to 1. Now the server returns HTTP 503 responses
every now and then and SolrJ throws an exception.
I think that's a problem that the servers can solve: just queue up the
request until the number of warming searchers is under the limit. So I
filed that RFE. Even when all the above suggestions work perfectly and
fix our issues, it's still a valid RFE.
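
For context, the knob discussed here lives in the query section of solrconfig.xml (a sketch; 2 is the default value):

```xml
<query>
  <!-- Maximum searchers that may be warming concurrently. Lowering
       this to 1 suppresses the "Overlapping onDeckSearchers" warning,
       but as described above, a commit that exceeds the limit is
       rejected rather than queued. -->
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```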
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Erick Erickson <er...@gmail.com>.
Colin:
Stop. Back up. The automatic soft commits will make updates available to
your users every second. Those documents _include_ anything from your "hard
commit" jobs. What could be faster? Parenthetically I'll add that 1 second
soft commits are rarely an actual requirement, but that's your decision.
For the hard commits. Fine. Do them if you insist. Just set
openSearcher=false. The documents will be searchable the next time the soft
commit happens, within one second. The key is openSearcher=false. That
prevents starting a brand new searcher.
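
As a sketch, the hard commit Erick describes can be issued either as request parameters on the update URL or as an XML update message (the host, port, and core name here are placeholders; the openSearcher attribute on the XML commit command is assumed to be available in your 4.x version, and the request-parameter form is the more widely documented one):

```xml
<!-- XML update message: hard commit that does not open a new searcher. -->
<commit openSearcher="false" waitSearcher="false"/>
<!-- Equivalent request-parameter form:
     http://localhost:8983/solr/collection1/update?commit=true&openSearcher=false -->
```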
BTW, your commits are not failing. It's just that _after_ the commit
happens, the warming searcher limit is exceeded.
You can even wait until the segments are flushed to disk. All without
opening a searcher.
Shawn is spot on in his recommendations to not fixate on the commits. Solr
handles that. Here's a long blog about all the details of durability vs.
visibility.
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
You're over-thinking the problem here, trying to control commits with a
sledgehammer when you don't need to, just use the built-in capabilities.
Best,
Erick
On Tue, Feb 18, 2014 at 10:33 AM, Colin Bartolome <co...@e-e.com> wrote:
> On 02/18/2014 10:15 AM, Shawn Heisey wrote:
>
>> If you want to be completely in control like that, get rid of the
>> automatic soft commits and just do the hard commits.
>>
>> I would personally choose another option for your setup -- get rid of
>> *all* explicit commits entirely, and just configure autoCommit and
>> autoSoftCommit in the server config. Since you're running 4.x, you really
>> should have the transaction log (updateLog in the config) enabled. You
>> can rely on the transaction log to replay updates since the last hard
>> commit if there's ever a crash.
>>
>> I would also recommend upgrading to 4.6.1, but that's a completely
>> separate item.
>>
>> Thanks,
>> Shawn
>>
>>
> We use the automatic soft commits to get search index updates to our users
> faster, via Near Realtime Searching. We have the updateLog enabled. I'm not
> worried that the Solr side of the equation will lose data; I'm worried that
> the communication from our web servers and scheduled jobs to the Solr
> servers will break down and nothing will come along to make sure everything
> is up to date. It sounds like what we're picturing is not currently
> supported, so I'll file the RFE.
>
> Will upgrading to 4.6.1 help at all with this issue?
>
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Colin Bartolome <co...@e-e.com>.
On 02/18/2014 10:15 AM, Shawn Heisey wrote:
> If you want to be completely in control like that, get rid of the
> automatic soft commits and just do the hard commits.
>
> I would personally choose another option for your setup -- get rid of
> *all* explicit commits entirely, and just configure autoCommit and
> autoSoftCommit in the server config. Since you're running 4.x, you really
> should have the transaction log (updateLog in the config) enabled. You
> can rely on the transaction log to replay updates since the last hard
> commit if there's ever a crash.
>
> I would also recommend upgrading to 4.6.1, but that's a completely
> separate item.
>
> Thanks,
> Shawn
>
We use the automatic soft commits to get search index updates to our users
faster, via Near Realtime Searching. We have the updateLog enabled. I'm
not worried that the Solr side of the equation will lose data; I'm worried
that the communication from our web servers and scheduled jobs to the Solr
servers will break down and nothing will come along to make sure
everything is up to date. It sounds like what we're picturing is not
currently supported, so I'll file the RFE.
Will upgrading to 4.6.1 help at all with this issue?
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Shawn Heisey <so...@elyograg.org>.
On 2/18/2014 10:59 AM, Colin Bartolome wrote:
> I'll describe a bit more about our setup, so I can say why I don't
> think that'll work for us:
>
> * Our web servers send update requests to Solr via a background
> thread, so HTTP requests don't have to wait for the request to complete.
> * That background thread has a small chance of failing. If it does,
> the update request won't happen until our "hard commit" job runs.
> * Other scheduled jobs can send update requests to Solr. Some jobs
> suppress this, because they do a lot of updating, instead relying on
> the "hard commit" job.
> * The "hard commit" job does a batch of updates, waits for the commit
> to complete, then sets some flags in our database to indicate that the
> content has been successfully indexed.
>
> It's that last point that leads us to want to do explicit hard
> commits. By setting those flags in our database, we're assuring
> ourselves that, no matter if any other steps failed along the way,
> we're absolutely sure the content was indexed properly.
If you want to be completely in control like that, get rid of the
automatic soft commits and just do the hard commits.
I would personally choose another option for your setup -- get rid of
*all* explicit commits entirely, and just configure autoCommit and
autoSoftCommit in the server config. Since you're running 4.x, you
really should have the transaction log (updateLog in the config)
enabled. You can rely on the transaction log to replay updates since
the last hard commit if there's ever a crash.
I would also recommend upgrading to 4.6.1, but that's a completely
separate item.
Thanks,
Shawn
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Colin Bartolome <co...@e-e.com>.
On 02/17/2014 09:46 PM, Shawn Heisey wrote:
> I think I put too much information in my reply. Apologies. Here's the
> most important information to deal with first:
>
> Don't send hard commits at all. Configure autoCommit in your server
> config, with the all-important openSearcher parameter set to false.
> That will take care of all your hard commit needs, but those commits
> will never open a new searcher, so they cannot cause an overlap with the
> soft commits that DO open a new searcher.
>
> Thanks,
> Shawn
>
I'll describe a bit more about our setup, so I can say why I don't think
that'll work for us:
* Our web servers send update requests to Solr via a background thread, so
HTTP requests don't have to wait for the request to complete.
* That background thread has a small chance of failing. If it does, the
update request won't happen until our "hard commit" job runs.
* Other scheduled jobs can send update requests to Solr. Some jobs
suppress this, because they do a lot of updating, instead relying on the
"hard commit" job.
* The "hard commit" job does a batch of updates, waits for the commit to
complete, then sets some flags in our database to indicate that the
content has been successfully indexed.
It's that last point that leads us to want to do explicit hard commits. By
setting those flags in our database, we're assuring ourselves that, no
matter if any other steps failed along the way, we're absolutely sure the
content was indexed properly.
If there's no other way to do this, I'm okay with filing an RFE in JIRA
and continuing to ignore the "multiple on-deck searchers" warning for now.
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Shawn Heisey <so...@elyograg.org>.
On 2/17/2014 7:06 PM, Colin Bartolome wrote:
> Increasing the maxTime value doesn't actually solve the problem, though;
> it just makes it a little less likely. Really, the soft commits aren't
> the problem here, as far as we can tell. It's that a request that
> triggers a hard commit simply fails when the server is already at
> maxWarmingSearchers. I would expect the request to queue up and wait
> until the server could handle it.
I think I put too much information in my reply. Apologies. Here's the
most important information to deal with first:
Don't send hard commits at all. Configure autoCommit in your server
config, with the all-important openSearcher parameter set to false.
That will take care of all your hard commit needs, but those commits
will never open a new searcher, so they cannot cause an overlap with the
soft commits that DO open a new searcher.
Thanks,
Shawn
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Colin Bartolome <cb...@experts-exchange.com>.
On 02/17/2014 05:38 PM, Shawn Heisey wrote:
> On 2/17/2014 6:06 PM, Colin Bartolome wrote:
>> We're using Solr version 4.2.1, in case new functionality has helped
>> with this issue.
>>
>> We have our Solr servers doing automatic soft commits with maxTime=1000.
>> We also have a scheduled job that triggers a hard commit every fifteen
>> minutes. When one of these hard commits happens while a soft commit is
>> already in progress, we get that ubiquitous warning:
>>
>> PERFORMANCE WARNING: Overlapping onDeckSearchers=2
>>
>> Recently, we had an occasion to have a second scheduled job also issue a
>> hard commit every now and then. Since our maxWarmingSearchers value was
>> set to the default, 2, we occasionally had a hard commit trigger when
>> two other searchers were already warming up, which led to this:
>>
>> org.apache.solr.client.solrj.SolrServerException: No live SolrServers
>> available to handle this request
>>
>> as the servers started responding with a 503 HTTP response.
>>
>> It seems like automatic soft commits wait until the hard commits are out
>> of the way before they proceed. Is there a way to do the same for hard
>> commits? Since we're passing waitSearcher=true in the update request
>> that triggers the hard commits, I would expect the request to block
>> until the server had enough headroom to service the commit. I did not
>> expect that we'd start getting 503 responses.
>
> Remember this mantra: Hard commits are about durability, soft commits
> are about visibility. You might already know this, but it is the key to
> figuring out how to handle commits, whether they are user-triggered or
> done automatically by the server.
>
> With Solr 4.x, it's best to *always* configure autoCommit with
> openSearcher=false. This does a hard commit but does not open a new
> searcher. The result: Data is flushed to disk and the current
> transaction log is closed. New documents will not be searchable after
> this kind of commit. For maxTime and maxDocs, pick values that won't
> result in huge transaction logs, which increase Solr startup time.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
>
> For document visibility, you can rely on autoSoftCommit, and you
> indicated that you already have it configured. Decide how long you can
> wait for new content that has just been indexed. Do you *really* need
> new data to be searchable within one second? If so, you're good. If
> not, increase the maxTime value here. Be sure to make the value at
> least a little bit longer than the amount of time it takes for a soft
> commit to finish, including cache warmup time.
>
> Thanks,
> Shawn
>
Increasing the maxTime value doesn't actually solve the problem, though;
it just makes it a little less likely. Really, the soft commits aren't
the problem here, as far as we can tell. It's that a request that
triggers a hard commit simply fails when the server is already at
maxWarmingSearchers. I would expect the request to queue up and wait
until the server could handle it.
Re: Preventing multiple on-deck searchers without causing failed commits
Posted by Shawn Heisey <so...@elyograg.org>.
On 2/17/2014 6:06 PM, Colin Bartolome wrote:
> We're using Solr version 4.2.1, in case new functionality has helped
> with this issue.
>
> We have our Solr servers doing automatic soft commits with maxTime=1000.
> We also have a scheduled job that triggers a hard commit every fifteen
> minutes. When one of these hard commits happens while a soft commit is
> already in progress, we get that ubiquitous warning:
>
> PERFORMANCE WARNING: Overlapping onDeckSearchers=2
>
> Recently, we had an occasion to have a second scheduled job also issue a
> hard commit every now and then. Since our maxWarmingSearchers value was
> set to the default, 2, we occasionally had a hard commit trigger when
> two other searchers were already warming up, which led to this:
>
> org.apache.solr.client.solrj.SolrServerException: No live SolrServers
> available to handle this request
>
> as the servers started responding with a 503 HTTP response.
>
> It seems like automatic soft commits wait until the hard commits are out
> of the way before they proceed. Is there a way to do the same for hard
> commits? Since we're passing waitSearcher=true in the update request
> that triggers the hard commits, I would expect the request to block
> until the server had enough headroom to service the commit. I did not
> expect that we'd start getting 503 responses.
Remember this mantra: Hard commits are about durability, soft commits
are about visibility. You might already know this, but it is the key to
figuring out how to handle commits, whether they are user-triggered or
done automatically by the server.
With Solr 4.x, it's best to *always* configure autoCommit with
openSearcher=false. This does a hard commit but does not open a new
searcher. The result: Data is flushed to disk and the current
transaction log is closed. New documents will not be searchable after
this kind of commit. For maxTime and maxDocs, pick values that won't
result in huge transaction logs, which increase Solr startup time.
http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
For document visibility, you can rely on autoSoftCommit, and you
indicated that you already have it configured. Decide how long you can
wait for new content that has just been indexed. Do you *really* need
new data to be searchable within one second? If so, you're good. If
not, increase the maxTime value here. Be sure to make the value at
least a little bit longer than the amount of time it takes for a soft
commit to finish, including cache warmup time.
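
Put together, the approach Shawn describes might look like this solrconfig.xml sketch (the maxTime values are illustrative, not prescriptions; updateLog placement assumed):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Durability: hard commit regularly, but never open a searcher,
       so it cannot overlap with the soft commits below. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Visibility: only the soft commit opens a new searcher. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
  <!-- Transaction log, for replaying updates after a crash. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```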
Thanks,
Shawn