Posted to issues@storm.apache.org by "Robert Joseph Evans (JIRA)" <ji...@apache.org> on 2018/04/10 22:02:00 UTC

[jira] [Created] (STORM-3024) Allow scheduling for RAS to happen in the background

Robert Joseph Evans created STORM-3024:
------------------------------------------

             Summary: Allow scheduling for RAS to happen in the background
                 Key: STORM-3024
                 URL: https://issues.apache.org/jira/browse/STORM-3024
             Project: Apache Storm
          Issue Type: New Feature
          Components: storm-server
    Affects Versions: 2.0.0
            Reporter: Robert Joseph Evans
            Assignee: Robert Joseph Evans


We have recently run into issues where a scheduling strategy on a very large cluster occasionally takes an unusually long time to finish.  This slowness cascades into other problems, such as topologies that cannot be killed because the timer thread is still busy running scheduling.

The plan is to run scheduling in a thread pool.  The main thread will wait up to a configurable amount of time for the topology to be scheduled; if scheduling does not complete in that time, it will be left running in the background thread in the hope that it finishes and the topology is scheduled later.
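
As a rough illustration of the wait-then-leave-running idea, here is a minimal Java sketch.  It is not Storm code: the class BackgroundScheduling, the schedule method, the maxWaitMs parameter, and the use of a String as a stand-in for a scheduling result are all assumptions made for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; none of these names come from Storm. */
class BackgroundScheduling {
    // Daemon threads so a long-running strategy never keeps the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "background-scheduling");
        t.setDaemon(true);
        return t;
    });
    // Attempts that are still running, keyed by topology id.
    private final Map<String, Future<String>> pending = new ConcurrentHashMap<>();
    private final long maxWaitMs;

    BackgroundScheduling(long maxWaitMs) {
        this.maxWaitMs = maxWaitMs;
    }

    /**
     * Starts (or re-joins) a scheduling attempt and waits at most maxWaitMs.
     * Returns the result if it finished in time, otherwise an empty Optional
     * while the attempt keeps running in the pool.
     */
    Optional<String> schedule(String topologyId, Callable<String> strategy)
            throws InterruptedException, ExecutionException {
        Future<String> attempt =
            pending.computeIfAbsent(topologyId, id -> pool.submit(strategy));
        try {
            String result = attempt.get(maxWaitMs, TimeUnit.MILLISECONDS);
            pending.remove(topologyId);
            return Optional.of(result);
        } catch (TimeoutException e) {
            // Not done yet: leave it running; a later call picks it up.
            return Optional.empty();
        } catch (ExecutionException e) {
            // The strategy failed; forget the attempt so it can be retried.
            pending.remove(topologyId);
            throw e;
        }
    }
}
```

In this sketch the caller never blocks for longer than maxWaitMs, and a later call for the same topology id picks up the attempt that is already in flight instead of starting a new one.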

If the state of the cluster changes while scheduling is happening in the background, we will cancel that scheduling, because any schedule it produces may no longer fit on the cluster.  The next time the scheduler runs it will restart the scheduling, which should allow the cluster to reach a steady state, even if that takes a while, without blocking kills and other critical operations.
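
One possible way to express the cancel-and-restart behavior is sketched below in Java.  This is only an illustration under assumptions: the class CancellableScheduling, the round method, and the idea of representing cluster state as a version number that changes whenever the cluster changes are all invented for this example and are not taken from Storm.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; none of these names come from Storm. */
class CancellableScheduling {
    // Daemon threads so a long-running strategy never keeps the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "background-scheduling");
        t.setDaemon(true);
        return t;
    });
    private Future<String> inFlight;       // current attempt, if any
    private long startedAtVersion = -1L;   // cluster version it was started against

    /**
     * One scheduling round. clusterVersion is an assumed counter that changes
     * whenever the cluster state changes. A stale in-flight attempt is
     * cancelled and restarted against the current state.
     */
    synchronized Optional<String> round(long clusterVersion,
                                        Callable<String> strategy,
                                        long maxWaitMs)
            throws InterruptedException, ExecutionException {
        if (inFlight != null && startedAtVersion != clusterVersion) {
            // The cluster changed underneath the attempt: its result may no
            // longer fit, so interrupt it and drop it.
            inFlight.cancel(true);
            inFlight = null;
        }
        if (inFlight == null) {
            inFlight = pool.submit(strategy);
            startedAtVersion = clusterVersion;
        }
        try {
            String result = inFlight.get(maxWaitMs, TimeUnit.MILLISECONDS);
            inFlight = null;
            return Optional.of(result);
        } catch (TimeoutException e) {
            // Still running in the background; try again next round.
            return Optional.empty();
        } catch (ExecutionException e) {
            inFlight = null;
            throw e;
        }
    }
}
```

Note that Future.cancel(true) only interrupts the worker thread, so a strategy has to respond to interruption for the cancellation to actually stop the work.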

Note that we are also working on optimizing scheduling so that these issues don't happen in the first place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)