Posted to dev@lucene.apache.org by "Marius Grama (JIRA)" <ji...@apache.org> on 2015/05/25 17:51:17 UTC

[jira] [Comment Edited] (SOLR-7566) Search requests should return the shard name that is down

    [ https://issues.apache.org/jira/browse/SOLR-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558347#comment-14558347 ] 

Marius Grama edited comment on SOLR-7566 at 5/25/15 3:51 PM:
-------------------------------------------------------------

[~shalinmangar] thank you. I was able to reproduce the issue and have also found its cause.
In a distributed search, the URLs of the usable replicas of each shard are taken from the cluster state and joined with '|' in the HttpShardHandler#checkDistributed(ResponseBuilder) method:
{code:title=HttpShardHandler#checkDistributed}
// ...
            StringBuilder sliceShardsStr = new StringBuilder();
            for (Replica replica : sliceShards.values()) {
              if (!clusterState.liveNodesContain(replica.getNodeName())
                  || replica.getState() != Replica.State.ACTIVE) {
                continue;
              }
              if (first) {
                first = false;
              } else {
                sliceShardsStr.append('|');
              }
              String url = ZkCoreNodeProps.getCoreUrl(replica);
              sliceShardsStr.append(url);
            }

            rb.shards[i] = sliceShardsStr.toString();
{code}

When none of a shard's replicas is on a live node and in the ACTIVE state, the loop skips all of them and the string holding the shard's addresses stays empty.
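
To make the empty-string case concrete, here is a minimal, self-contained sketch of the same filter-and-join logic. It does not use Solr classes: ReplicaInfo, joinUsableReplicaUrls and the host URLs are hypothetical stand-ins chosen for illustration, and StringJoiner replaces the StringBuilder with the _first_ flag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

class ShardJoinSketch {
  // Hypothetical stand-in for Solr's Replica; it carries only what the filter needs.
  static final class ReplicaInfo {
    final String url;
    final boolean onLiveNode;
    final boolean active;

    ReplicaInfo(String url, boolean onLiveNode, boolean active) {
      this.url = url;
      this.onLiveNode = onLiveNode;
      this.active = active;
    }
  }

  // Joins the URLs of usable replicas with '|', skipping the same replicas
  // the quoted loop skips (node not live, or state not ACTIVE).
  static String joinUsableReplicaUrls(List<ReplicaInfo> replicas) {
    StringJoiner joined = new StringJoiner("|");
    for (ReplicaInfo replica : replicas) {
      if (!replica.onLiveNode || !replica.active) {
        continue;
      }
      joined.add(replica.url);
    }
    return joined.toString();
  }

  public static void main(String[] args) {
    List<ReplicaInfo> allDown = Arrays.asList(
        new ReplicaInfo("http://host1:8983/solr/collection1/", false, true),
        new ReplicaInfo("http://host2:8983/solr/collection1/", true, false));
    // Every replica is filtered out, so the joined string is empty.
    System.out.println("[" + joinUsableReplicaUrls(allDown) + "]"); // prints "[]"
  }
}
```

With at least one usable replica the result is a '|'-separated URL list; with none, it is the empty string that later ends up in the error message.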

In the SearchHandler#handleRequestBody method, the empty shard string is simply forwarded to the HttpShardHandler to be processed asynchronously (SearchHandler.java, line 352):
{code:language=java}
shardHandler1.submit(sreq, shard, params);
{code}
The HttpShardHandler#submit() method then throws the exception, and because the shard string is empty, the message ends without a shard name:
{code:title=HttpShardHandler#submit|language=java}
          // if there are no shards available for a slice, urls.size()==0
          if (urls.size()==0) {
            // TODO: what's the right error code here? We should use the same thing when
            // all of the servers for a shard are down.
            throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, "no servers hosting shard: " + shard);
          }
{code}
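
The effect on the message can be reproduced outside Solr with a small sketch. The assumptions are stated plainly: splitUrls is a hypothetical splitter that drops empty pieces (the real URL parsing in HttpShardHandler is not shown in the excerpt above), and IllegalStateException stands in for SolrException.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyShardMessageSketch {
  // Hypothetical splitter: breaks "url1|url2" into URLs and skips empty pieces.
  static List<String> splitUrls(String shard) {
    List<String> urls = new ArrayList<>();
    for (String part : shard.split("\\|")) {
      if (!part.isEmpty()) {
        urls.add(part);
      }
    }
    return urls;
  }

  // Mirrors the quoted check: an empty URL list raises an error whose
  // message appends the shard string, which is itself empty here.
  static String pickFirstUrl(String shard) {
    List<String> urls = splitUrls(shard);
    if (urls.isEmpty()) {
      throw new IllegalStateException("no servers hosting shard: " + shard);
    }
    return urls.get(0);
  }

  public static void main(String[] args) {
    try {
      pickFirstUrl("");
    } catch (IllegalStateException e) {
      // The shard name is missing because the only identifier passed in was the empty string.
      System.out.println("'" + e.getMessage() + "'"); // prints "'no servers hosting shard: '"
    }
  }
}
```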


One solution would be to throw the SolrException directly in the HttpShardHandler#checkDistributed method when the _sliceShardsStr_ StringBuilder is still empty after the loop. This seems to me the easiest way to handle the situation.
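
A rough sketch of that idea, not a patch: joinOrFail and its sliceName parameter are hypothetical, the sketch assumes the slice name is in scope at the point where the replicas are joined (the excerpt above does not show this), and IllegalStateException again stands in for SolrException with SERVICE_UNAVAILABLE.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FailFastSketch {
  // Joins the usable replica URLs and fails immediately, naming the slice,
  // when there is nothing to join.
  static String joinOrFail(String sliceName, List<String> usableReplicaUrls) {
    String joined = String.join("|", usableReplicaUrls);
    if (joined.isEmpty()) {
      // Assumption: the slice name is known here, so it can go into the message.
      throw new IllegalStateException("no servers hosting shard: " + sliceName);
    }
    return joined;
  }

  public static void main(String[] args) {
    System.out.println(joinOrFail("shard1", Arrays.asList("http://host1/", "http://host2/")));
    try {
      joinOrFail("shard2", Collections.<String>emptyList());
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage()); // prints "no servers hosting shard: shard2"
    }
  }
}
```

Failing at join time keeps the shard name and the error in the same place, instead of relying on the empty string that reaches submit() later.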

Can somebody tell me whether I am on the right track here? Thanks in advance.




> Search requests should return the shard name that is down
> ---------------------------------------------------------
>
>                 Key: SOLR-7566
>                 URL: https://issues.apache.org/jira/browse/SOLR-7566
>             Project: Solr
>          Issue Type: Bug
>          Components: search, SolrCloud
>    Affects Versions: 5.1
>            Reporter: Shalin Shekhar Mangar
>            Priority: Trivial
>             Fix For: Trunk, 5.2
>
>
> If no replicas of a shard are up and running, a search request gives the following response:
> {code}
> {
>   "responseHeader": {
>     "status": 503,
>     "QTime": 2,
>     "params": {
>       "q": "*:*",
>       "indent": "true",
>       "wt": "json",
>       "_": "1432048084930"
>     }
>   },
>   "error": {
>     "msg": "no servers hosting shard: ",
>     "code": 503
>   }
> }
> {code}
> The message should mention the shard which is down/unreachable.


