Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2016/02/25 20:34:18 UTC

[jira] [Commented] (SOLR-8738) invalid DBQ initially sent to a non-leader node will report success

    [ https://issues.apache.org/jira/browse/SOLR-8738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167711#comment-15167711 ] 

Hoss Man commented on SOLR-8738:
--------------------------------

The most trivial/obvious way to reproduce is...

* {{bin/solr -e cloud}}
* Pick "*3*" for number of nodes
* accept the default port numbers (8983, 7574, 8984)
* accept the default collection name (gettingstarted)
* pick "*1*" for the number of shards
* accept the default number of replicas per shard (2)
* accept the default config set (data_driven_schema_configs)

(So now you should have a single collection with a single shard, with 2 replicas on 2 different nodes; the remaining node doesn't host any cores related to the collection.)

Now try running a broken DBQ against all 3 nodes...

{noformat}
$ curl -H 'Content-Type: application/json' 'http://127.0.1.1:8983/solr/gettingstarted/update' --data-binary '{"delete":{"query" : "foo_i:yak"}}'
{"responseHeader":{"status":400,"QTime":18},"error":{"metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],"msg":"Invalid Number: yak","code":400}}
$ curl -H 'Content-Type: application/json' 'http://127.0.1.1:8984/solr/gettingstarted/update' --data-binary '{"delete":{"query" : "foo_i:yak"}}'
{"responseHeader":{"status":0,"QTime":25}}
$ curl -H 'Content-Type: application/json' 'http://127.0.1.1:7574/solr/gettingstarted/update' --data-binary '{"delete":{"query" : "foo_i:yak"}}'
{"responseHeader":{"status":0,"QTime":7}}
{noformat}

...only the node hosting the leader correctly responds with the error. Requests that initially hit nodes only hosting replicas, or not hosting any cores, incorrectly indicate that the delete succeeded.
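
The comparison above can be scripted so the mismatch between nodes is easy to spot. This is only a minimal sketch: it assumes the same host, ports, collection, and query as the curl commands above, and the helper names {{dbq_status}} and {{check_nodes}} are made up for illustration (they are not part of Solr).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Solr.

# Extract the numeric "status" value from a single-line Solr JSON response body.
dbq_status() {
  printf '%s\n' "$1" | sed -n 's/.*"status":\([0-9][0-9]*\).*/\1/p'
}

# Send the same invalid DBQ to each port given as an argument and print
# the status each node reports. Host, collection, and query are copied
# from the curl commands above.
check_nodes() {
  for port in "$@"; do
    body=$(curl -s -H 'Content-Type: application/json' \
      "http://127.0.1.1:${port}/solr/gettingstarted/update" \
      --data-binary '{"delete":{"query" : "foo_i:yak"}}')
    echo "port ${port}: status $(dbq_status "$body")"
  done
}
```

With the 3 node cluster running, {{check_nodes 8983 8984 7574}} sends the request to every node; while the bug is present, only the node hosting the leader should report a status of 400.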

----

2 important notes:

# This can also be reproduced using {{numShards > 1}}, most easily by running {{-e cloud}} and choosing *4* nodes, and accepting the default 2 shards, 2 replicas.  Then repeat the same curl commands above over all 4 ports.
#* you should see 2 nodes correctly return failures, and 2 nodes incorrectly claim success
# You can also reproduce using {{-e cloud -noprompt}}, but since that defaults to only 2 nodes, they are guaranteed to each have a leader on them, so you have to be more explicit about the requests.
#* Use the Solr UI to determine the _non-leader_ core_node_names (ex: {{gettingstarted_shard1_replica1}}) and which node they are located on, then use those in the URL instead of the simple collection name (otherwise simple collection paths will be auto-routed to a leader on each node)
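
As an alternative to the Solr UI, the Collections API {{CLUSTERSTATUS}} action also lists each replica and whether it is a leader. The following is a rough sketch, not a verified recipe: the helper names are hypothetical, and the port paired with {{gettingstarted_shard1_replica1}} in the example comment is an assumption that has to be replaced with whatever your cluster actually reports.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Solr.

# URL for the Collections API CLUSTERSTATUS action on one collection.
# $1 = port, $2 = collection name
cluster_status_url() {
  printf 'http://127.0.1.1:%s/solr/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json\n' "$1" "$2"
}

# Update URL that addresses one core directly instead of the collection,
# so the request is not auto-routed to a leader.
# $1 = port, $2 = core name
core_update_url() {
  printf 'http://127.0.1.1:%s/solr/%s/update\n' "$1" "$2"
}

# Example (requires the running cluster; port and core name are assumptions):
#   curl -s "$(cluster_status_url 8983 gettingstarted)"
#   curl -H 'Content-Type: application/json' \
#     "$(core_update_url 8983 gettingstarted_shard1_replica1)" \
#     --data-binary '{"delete":{"query" : "foo_i:yak"}}'
```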




> invalid DBQ initially sent to a non-leader node will report success
> -------------------------------------------------------------------
>
>                 Key: SOLR-8738
>                 URL: https://issues.apache.org/jira/browse/SOLR-8738
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> Discovered this while working on SOLR-445.
> If a Delete By Query gets sent to a node which is not hosting a leader (ie: only hosts replicas, or doesn't host any cores related to the specified collection) then a success will be returned, even if the DBQ is completely malformed and actually failed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
