Posted to solr-user@lucene.apache.org by Matteo Grolla <ma...@gmail.com> on 2018/09/20 14:18:02 UTC

Re: Could not load collection from ZK:

Hi everybody,
    I'm facing the same problem on Solr 7.3.
Requesting a longer ZooKeeper session timeout (the default of 10s seems too
short) will probably solve the problem, but I'm puzzled that SolrJ reports
this error as a SolrException with status code 400 (BAD_REQUEST).
This is the relevant code in ZkStateReader:

  public static DocCollection getCollectionLive(ZkStateReader zkStateReader, String coll) {
    try {
      return zkStateReader.fetchCollectionState(coll, null);
    } catch (KeeperException e) {
      throw new SolrException(ErrorCode.BAD_REQUEST, "Could not load collection from ZK: " + coll, e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new SolrException(ErrorCode.BAD_REQUEST, "Could not load collection from ZK: " + coll, e);
    }
  }


Retrying the request could solve the problem, but a client shouldn't retry a
BAD_REQUEST.
Why isn't this reported as a 503 (SERVICE_UNAVAILABLE)?
I think SolrJ should distinguish two cases:
A: communication problem with ZK -> 503
B: user asked for a nonexistent collection -> 400
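
To make the distinction between cases A and B concrete, here is a minimal sketch of the mapping I have in mind. It is not Solr code: ZkCommunicationException, StatusException and StateFetcher are hypothetical stand-ins for KeeperException, SolrException and the ZK lookup, and it assumes the lookup returns null when the collection does not exist.

```java
public class ZkErrorMapping {

    static final int BAD_REQUEST = 400;
    static final int SERVICE_UNAVAILABLE = 503;

    /** Hypothetical stand-in for ZooKeeper's KeeperException (not the real class). */
    static class ZkCommunicationException extends Exception {
        ZkCommunicationException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for SolrException: carries an HTTP-like status code. */
    static class StatusException extends RuntimeException {
        final int code;

        StatusException(int code, String message, Throwable cause) {
            super(message, cause);
            this.code = code;
        }
    }

    /** Hypothetical lookup; assumed to return null for a nonexistent collection. */
    interface StateFetcher {
        String fetch(String coll) throws ZkCommunicationException, InterruptedException;
    }

    static String getCollectionLive(StateFetcher fetcher, String coll) {
        try {
            String state = fetcher.fetch(coll);
            if (state == null) {
                // Case B: the caller asked for something that is not there -> 400
                throw new StatusException(BAD_REQUEST, "Collection not found: " + coll, null);
            }
            return state;
        } catch (ZkCommunicationException e) {
            // Case A: trouble talking to ZK, the request itself may be fine -> 503
            throw new StatusException(SERVICE_UNAVAILABLE,
                    "Could not load collection from ZK: " + coll, e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag, then report the failure as transient too
            Thread.currentThread().interrupt();
            throw new StatusException(SERVICE_UNAVAILABLE,
                    "Interrupted while loading collection from ZK: " + coll, e);
        }
    }

    /** A client would retry only failures that are flagged as transient. */
    static boolean isRetryable(StatusException e) {
        return e.code == SERVICE_UNAVAILABLE;
    }

    public static void main(String[] args) {
        try {
            getCollectionLive(c -> {
                throw new ZkCommunicationException("Session expired");
            }, "productCollection");
        } catch (StatusException e) {
            System.out.println(e.code + " retryable=" + isRetryable(e));
        }
    }
}
```

With a mapping like this, a client could safely retry on 503 and fail fast on 400.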

Thanks


On Fri, 25 May 2018 at 05:02, Aman Singh <
amandeep.cool99@gmail.com> wrote:

> Hi Shawn & Alessandro,
> We tried increasing the heap as well, but we still faced the same issue.
> After moving ZK off the Solr server onto its own dedicated server, the
> problem went away. Yes, when we were facing this issue the GC activity
> was high, around 60-70% out of 400%.
> Regards,
> Aman Deep Singh
>
> On 25/05/18, 5:08 AM, "Shawn Heisey" <ap...@elyograg.org> wrote:
>
>     On 6/20/2017 9:46 AM, Aman Deep Singh wrote:
>     > Sorry Shawn,
>     > It didn't copy the entire stacktrace. I put the stacktrace at
>     > https://www.dropbox.com/s/zf8b87m24ei2ils/solr%20exception2?dl=0
>     >
>     > Note: I have shaded the solr library under com.gdn.solr620, so all solr
>     > classes will appear as com.gdn.solr620.org.apache.solr.*
>
>     Wow, I really dropped the ball here.  The thread is nearly a year old.
>     I somehow missed the reply.  I am sorry about that!  Thank Alessandro
>     for reviving the thread and making it clear that I never replied.
>
>     This is the innermost cause:
>
>     Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException:
>     KeeperErrorCode = Session expired for
>     /collections/productCollection/state.json
>
>     Either there are network issues talking to ZooKeeper, or something else
>     caused a timeout.  Solr's default ZK client timeout when it is not
>     configured is 15 seconds.  In recent versions, the example
>     configurations have an explicit setting of 30 seconds.  Solr's
>     zkClientTimeout is used to set ZooKeeper's sessionTimeout, and that's
>     what is exceeded when a session expires.
>
>     When this kind of error happens, it means something has gone VERY wrong
>     -- 15 seconds is a REALLY long time when programs are trying to talk to
>     each other.
>
>     One common cause of problems like this is extreme GC pauses.  Typically
>     a pause problem capable of causing a ZK timeout would be due to the heap
>     being too small, but it's always possible that it could happen because
>     the heap is VERY large.
>
>     Errors on the client side may not be as informative as corresponding
>     errors in the solr.log file on the server(s).  It would be a good idea
>     to check solr.log for errors as well.
>
>     Thanks,
>     Shawn