Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2017/05/20 07:16:04 UTC

[jira] [Commented] (SOLR-10515) Persist intermediate trigger state in ZK to continue tracking information across overseer restarts

    [ https://issues.apache.org/jira/browse/SOLR-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018350#comment-16018350 ] 

Shalin Shekhar Mangar commented on SOLR-10515:
----------------------------------------------

Thanks Andrzej. I reviewed the changes at jira/SOLR-10515 branch. A few comments:
# The ScheduledTrigger.run method does not take the return value of triggerFired() into account. So when an event is being replayed and there are pending actions, the event is lost because it has already been polled from the queue. We should peek at the queue, attempt to fire the listener, and poll only if the fire was successful.
# If an event is not in the replaying state, then we both enqueue it and try to execute it immediately. It looks like this can lead to duplicate execution of the same event: once by the trigger actually firing and again from the trigger.run() method. I think we can simplify this in two ways:
## ScheduledTrigger.run() tries to fire queued events only on its first run, i.e. when the overseer first starts up. This is akin to how the work queue is used in the overseer: tasks are queued before execution and removed after execution, and the work queue is only ever read/polled on overseer startup.
## The event listener queues items to ZK but does not submit them to the action executor. ScheduledTrigger.run peeks at the event queue, submits events to the action executor, and polls only if submission succeeds. Basically, this splits the complex logic of queuing and submitting, currently done by the single event listener set in the ScheduledTriggers.add method, into two places.
# The TriggerIntegrationTest.testEventQueue method has a {{await = actionStarted.await(600, TimeUnit.SECONDS);}}. That's way too much time. I ran the test only once and it timed out.
# This design ensures that events, once generated by a trigger, aren't lost, but it doesn't protect the trigger from losing the tracking information in the first place. For example, a nodeAdded trigger might detect the addition of a new node and put that node into its tracking map, and then the overseer node goes down. The new overseer node will never detect that a new node was added, and the event queue will of course be empty because no event was ever generated. The same applies to nodeLost events. I think we need to do something fundamentally different for cluster events such as these. Perhaps a node registering itself with /live_nodes should do a multi write that also adds an entry to the nodeAdded trigger queue? For nodeLost, what if every node (or a subset, say the top 3 in the overseer election queue) tried to add the lost node's name to a fixed znode, say /autoscaling/events/nodeLost/actual_node_name:port_ctx? In the future, for metrics, we should think about adding rolled-up metrics to the .system collection so we don't lose them, or perhaps they can be re-calculated from the leader?
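To make the peek-then-poll suggestion in points 1 and 2 concrete, here is a minimal standalone sketch. An in-memory {{Deque}} stands in for the ZK-backed event queue, and a {{Predicate}} stands in for the trigger listener returning whether the fire succeeded; the names ({{EventQueueSketch}}, {{replay}}) are hypothetical and not actual Solr API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

public class EventQueueSketch {

    /**
     * Replays queued events: peek at the head, attempt to fire the
     * listener, and remove the event only after a successful fire.
     * Returns the number of events fired.
     */
    static <E> int replay(Deque<E> queue, Predicate<E> listener) {
        int fired = 0;
        E event;
        while ((event = queue.peek()) != null) {
            if (!listener.test(event)) {
                // Fire failed (e.g. pending actions): leave the event at
                // the head of the queue so it is not lost, and retry later.
                break;
            }
            queue.poll(); // safe to remove only after a successful fire
            fired++;
        }
        return fired;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>();
        queue.add("nodeAdded:n1");
        queue.add("nodeAdded:n2");

        // A listener with pending actions rejects the fire: nothing is dropped.
        int fired = replay(queue, e -> false);
        System.out.println(fired + " fired, " + queue.size() + " still queued");

        // Once the listener can accept events, the queue drains in order.
        fired = replay(queue, e -> true);
        System.out.println(fired + " fired, " + queue.size() + " still queued");
    }
}
```

Contrast this with polling first: if triggerFired() then fails, the event is already gone from the queue, which is exactly the loss described in point 1.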

What do you think?

> Persist intermediate trigger state in ZK to continue tracking information across overseer restarts
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10515
>                 URL: https://issues.apache.org/jira/browse/SOLR-10515
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Andrzej Bialecki 
>              Labels: autoscaling
>             Fix For: master (7.0)
>
>
> The current trigger design is simplistic and keeps all the intermediate state in memory. But this presents two problems when the overseer itself fails:
> # We lose tracking state such as which node was added before the overseer restarted
> # A nodeLost trigger can never really fire for the overseer node itself
> So we need a way, preferably in the trigger API itself, to save intermediate state or checkpoints so that triggers can seamlessly continue across overseer restarts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org