Posted to issues@storm.apache.org by "P. Taylor Goetz (JIRA)" <ji...@apache.org> on 2018/09/21 17:28:00 UTC

[jira] [Updated] (STORM-3112) Incremental scheduling supports

     [ https://issues.apache.org/jira/browse/STORM-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

P. Taylor Goetz updated STORM-3112:
-----------------------------------
    Fix Version/s:     (was: 2.0.0)

> Incremental scheduling supports
> -------------------------------
>
>                 Key: STORM-3112
>                 URL: https://issues.apache.org/jira/browse/STORM-3112
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-server
>    Affects Versions: 2.0.0
>            Reporter: Yuzhao Chen
>            Assignee: Yuzhao Chen
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> As described in https://issues.apache.org/jira/browse/STORM-3093, each scheduling round currently performs a complete scan and computation over all the topologies on the cluster, which becomes very expensive when the number of topologies grows into the hundreds.
> So this JIRA refactors the scheduling logic to consider only the topologies that need scheduling.
> Improvements:
> 1. Cache the id to storm base mapping, which reduces the pressure on ZooKeeper.
> 2. Only schedule the topologies that need it: those with dead executors or not enough running workers.
> 3. Some schedulers still need a full scheduling pass, e.g. IsolationScheduler.
> 4. Cache the scheduling resources across multiple scheduling rounds, i.e. nodeId -> used slot, nodeId -> used resource, nodeId -> totalResource.
> Because I already cache the storm-id -> executors mapping in https://issues.apache.org/jira/browse/STORM-3093, a scheduling round will now do the following:
> 1. Scan all the active storm bases (cached) and the local storm-conf/storm-topology, then refresh the heartbeats cache, so that we know which topologies need scheduling.
> 2. Compute scheduleAssignment only for the topologies that need scheduling.
> Robustness when nimbus restarts:
> 1. The cached storm-bases are taken care of by ILocalAssignmentsBackend.
> 2. The scheduling cache is refreshed on the first scheduling round, which performs a full scheduling of all topologies.
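
The selection rule described above can be illustrated with a minimal sketch. It is not taken from the Storm code base: the class IncrementalSchedulingSketch, the TopologyState fields, and the schedulerNeedsFullPass flag are all hypothetical names chosen for illustration. The sketch assumes a topology needs scheduling when it has dead executors or fewer running workers than requested, and that a full pass is forced for schedulers that require it and for the first round after a restart, when the scheduling cache is still cold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from the Storm code base.
final class IncrementalSchedulingSketch {

    // Per-topology state, as it might be derived from cached storm bases and heartbeats.
    static final class TopologyState {
        final int deadExecutors;
        final int runningWorkers;
        final int requestedWorkers;

        TopologyState(int deadExecutors, int runningWorkers, int requestedWorkers) {
            this.deadExecutors = deadExecutors;
            this.runningWorkers = runningWorkers;
            this.requestedWorkers = requestedWorkers;
        }

        // Assumed rule: dead executors or not enough running workers.
        boolean needsScheduling() {
            return deadExecutors > 0 || runningWorkers < requestedWorkers;
        }
    }

    // False until the first round completes, e.g. right after a restart.
    private boolean cacheWarm = false;

    // Returns the ids (sorted) of the topologies to schedule in this round.
    List<String> topologiesToSchedule(Map<String, TopologyState> topologies,
                                      boolean schedulerNeedsFullPass) {
        boolean fullPass = schedulerNeedsFullPass || !cacheWarm;
        List<String> selected = new ArrayList<>();
        // TreeMap copy gives a deterministic iteration order for the example.
        for (Map.Entry<String, TopologyState> e : new TreeMap<>(topologies).entrySet()) {
            if (fullPass || e.getValue().needsScheduling()) {
                selected.add(e.getKey());
            }
        }
        cacheWarm = true; // later rounds can be incremental
        return selected;
    }

    public static void main(String[] args) {
        Map<String, TopologyState> topologies = new HashMap<>();
        topologies.put("healthy", new TopologyState(0, 4, 4));
        topologies.put("dead-exec", new TopologyState(2, 4, 4));
        topologies.put("short", new TopologyState(0, 1, 4));

        IncrementalSchedulingSketch sketch = new IncrementalSchedulingSketch();
        // First round after start: full pass over every topology.
        System.out.println(sketch.topologiesToSchedule(topologies, false));
        // Later round: only the topologies that need scheduling.
        System.out.println(sketch.topologiesToSchedule(topologies, false));
        // A scheduler that always needs a full pass ignores the filter.
        System.out.println(sketch.topologiesToSchedule(topologies, true));
    }
}
```

In this sketch the full-pass decision is a single boolean, so the incremental path and the full path share one loop. How the actual patch detects schedulers that need a full pass is not stated in the description above.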



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)