Posted to dev@lucene.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2019/03/07 09:58:00 UTC

[jira] [Commented] (SOLR-13245) Status checking of streaming "daemon"-s is buggy and misleading

    [ https://issues.apache.org/jira/browse/SOLR-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786577#comment-16786577 ] 

Andrzej Bialecki  commented on SOLR-13245:
------------------------------------------

bq. create a worker collection specifically to run daemons 
I don't think this is the right solution. A separate collection carries overhead (you need to create at least one SolrCore) and has a life-cycle of its own (who is responsible for deleting that collection afterwards?). It also prevents daemons from executing on different nodes, whereas starting one on an existing multi-replica collection at least has a chance of being load-balanced against other running daemons.

Instead, I think that either daemons should create ephemeral znodes that tell StreamHandler where to look for them, or the handler should iterate over all replicas and collect the status of running daemons.
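
As a rough client-side illustration of the second option (this is not a patch to StreamHandler, only a sketch of the same idea done from outside Solr): ask the Collections API for the replicas of a collection, send {{action=list}} to each core URL, and merge the answers. The helper names ({{fetch_json}}, {{replica_cores}}, {{list_daemons}}) and the added {{replica_core}} field are hypothetical. The sketch assumes that CLUSTERSTATUS reports {{base_url}}, {{core}} and {{state}} for every replica, and that the stream response ends with an {{EOF}} marker document as in the response quoted in the issue description.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body (default fetcher, does real HTTP)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def replica_cores(cluster_status, collection):
    """Return (base_url, core) pairs for the active replicas of a collection.

    Assumes the CLUSTERSTATUS layout:
    cluster -> collections -> <name> -> shards -> <shard> -> replicas.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    cores = []
    for shard in shards.values():
        for replica in shard["replicas"].values():
            if replica.get("state") == "active":
                cores.append((replica["base_url"], replica["core"]))
    return cores


def list_daemons(solr_url, collection, fetch=fetch_json):
    """Ask every active replica for its daemons and merge the results."""
    status = fetch(
        f"{solr_url}/admin/collections"
        f"?action=CLUSTERSTATUS&collection={quote(collection)}&wt=json"
    )
    daemons = []
    for base_url, core in replica_cores(status, collection):
        # Core-level URL, so the request is not re-routed to a random replica.
        reply = fetch(f"{base_url}/{quote(core)}/stream?action=list")
        for doc in reply["result-set"]["docs"]:
            if "EOF" not in doc:  # skip the end-of-stream marker
                daemons.append({"replica_core": core, **doc})
    return daemons
```

With the {{cloud}} example running, something like {{list_daemons("http://localhost:8983/solr", "gettingstarted")}} should then report a daemon regardless of which replica it landed on; the host and port here are assumptions about a local setup. The {{fetch}} parameter exists only so that the HTTP layer can be swapped out.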

> Status checking of streaming "daemon"-s is buggy and misleading
> ---------------------------------------------------------------
>
>                 Key: SOLR-13245
>                 URL: https://issues.apache.org/jira/browse/SOLR-13245
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>    Affects Versions: 7.7, 8.0, master (9.0)
>            Reporter: Andrzej Bialecki 
>            Priority: Major
>
> When a {{daemon}} is started, Solr randomly picks a replica to process that request, and the daemon then executes in the context of that particular replica. The response to the request mentions this specifically:
> {code:java}
> {
>   "result-set":{
>     "docs":[{
>         "DaemonOp":"Deamon:testD1 started on gettingstarted_shard2_replica_n6"}
>       ,{
>         "EOF":true}]}}
> {code}
> Subsequent requests to {{/solr/gettingstarted/stream?action=list}} only sometimes return the status of this daemon: specifically, only when the request happens to be routed to the replica that the daemon is actually running on. In all other cases the response doesn't show the running daemon.
> This is very easy to reproduce using the {{cloud}} example with 2 (local) nodes and a source collection with multiple shards and multiple replicas.
> Currently the only workaround is to request the status using a non-cloud core URL - in this case a request to {{/solr/gettingstarted_shard2_replica_n6/stream?action=list}} always returns correct status.


