Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2018/01/04 18:21:24 UTC
SolrCmdDistributor retries.
Down in SolrCmdDistributor.doRetriesIfNeeded there is a series of specific
response codes that we retry on, here:
  if (isRetry) {
    if (rspCode == 404 || rspCode == 403 || rspCode == 503) {
      doRetry = true;
    }
    ...
Absent is a 401. What I think I'm seeing
in the field is a timeout during a
distributed update, with this reported:
Invalid key request timestamp: 1513273351295 ,
received timestamp: 1513273356700 , TTL: 5000
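To make the numbers in that message concrete, here is a minimal sketch of the arithmetic, assuming the check is simply "received minus request exceeds TTL". The class and method names are hypothetical and are not taken from the Solr source; the real check may differ in detail.

```java
// Hypothetical helper, not Solr code: reproduces the arithmetic
// implied by the log message above.
class PkiTtlCheck {

    // Milliseconds between the timestamp carried in the request
    // and the time the receiving node saw it.
    static long gapMs(long requestTs, long receivedTs) {
        return receivedTs - requestTs;
    }

    // Assumption: the request is rejected when the gap exceeds the TTL.
    static boolean isExpired(long requestTs, long receivedTs, long ttlMs) {
        return gapMs(requestTs, receivedTs) > ttlMs;
    }
}
```

With the values from the log, the gap is 1513273356700 - 1513273351295 = 5405 ms, so the request missed the 5000 ms TTL by about 400 ms and would have passed with a TTL of 10000.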
This appears to only happen very occasionally.
The problem is that this leads to the leader putting the
follower into LIR (leader-initiated recovery), with all the
problems that entails.
Certainly the timeout can be lengthened with:
-Dpkiauth.ttl=10000
or whatever, but my question is whether it makes
sense to retry in this case in SolrCmdDistributor.
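For discussion, a rough sketch of what I mean, assuming the expired-key failure really does surface as a 401. This is a standalone illustration, not a patch: the class name and the retryOn401 flag are made up, and it only models the part of the condition quoted above, not whatever follows the "..." in the real method.

```java
// Hypothetical illustration, not Solr code.
class RetryDecision {

    // Mirrors the quoted check, with 401 optionally added.
    static boolean shouldRetry(boolean isRetry, int rspCode, boolean retryOn401) {
        if (!isRetry) {
            return false;
        }
        // Codes retried today, per the snippet quoted earlier.
        if (rspCode == 404 || rspCode == 403 || rspCode == 503) {
            return true;
        }
        // The proposed addition: treat 401 as retryable too.
        return retryOn401 && rspCode == 401;
    }
}
```

Keeping the 401 case behind a flag here is only to make the before/after behavior easy to compare; whether a retry would pick up a fresh timestamp is part of what I'm asking.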
NOTE: I only have logs from 6.3 for this, but see
no evidence this has been changed since then.
Comments?