You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by danny0405 <gi...@git.apache.org> on 2017/01/13 07:46:44 UTC

[GitHub] storm pull request #1874: Strom Rebalance command should support arbitrary c...

GitHub user danny0405 opened a pull request:

    https://github.com/apache/storm/pull/1874

    Strom Rebalance command should support arbitrary component parallelism

        For legacy reasons,  config TOPOLOGY-TASKS is considered first when schedule a topology, for a component, if user don\u2019t specify TOPOLOGY-TASKS, storm just override it to be equal of component parallelism hint, and schedule based on TOPOLOGY-TASKS later on. 
        This works for the most cases, but not Rebalance command, now, when do Rebalance, the StormBase :component->executors attribute will be overridden in Zookeeper is used to partition tasks into executors for components, as we said above, the TOPOLOGY-TASKS is considered here as the real tasks number for components, something goes weird here:
        If we override a bigger executor numbers for a component when do rebalance, it just don\u2019t work because smaller TOPOLOGY-TASKS [ not changed since first submitted at all ]is partitioned to bigger number of executors which read from ZooKeeper overridden by Rebalance command, but for smaller task, it works fine.
        I see that storm support a command like this now: [storm rebalance topology-name [-w wait-time-secs] [-n new-num-workers] [-e component=parallelism]*] which indicate that user can override a component parallelism freely, i think it\u2019s more sensible to support this and it's meaningless to have a restriction like this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/danny0405/storm nimbus-promotion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1874.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1874
    
----
commit de4374fd1da4bb0a647f125b91f2f15644c0b2f7
Author: chenyuzhao <ch...@meituan.com>
Date:   2017-01-12T09:21:55Z

    refact rebalance to allow modify component parallelism freely

commit 1384071d58020d03c018d9ce98638ee330539300
Author: chenyuzhao <ch...@meituan.com>
Date:   2017-01-13T07:22:09Z

    format code

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Storm Rebalance command should support arbitra...

Posted by danny0405 <gi...@git.apache.org>.
Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @arunmahadevan thx for your nice review, i have checked the code
    
    The fact is that, now for storm, we will pass in a `:scratch-topology-id` to execute mk-assignments function for rebalance command, thus, nimbus will just ignore the old assignment for the rebalanced topology. Finally, it will get a new storm-id and totally new assignment on cluster.
    
    When supervisor feels this, it will kill all the old workers for the rebalanced topology, and launch new. In other words, the rebalanced topology and the old one are totally two different topologies to nimbus and cluster. Storm doesn't keep any state and just kill the old worker after the wait time we passed in [from rebalance command].
    
    So, the problem you suggest doesn't exist. The rebalance command is just like a composite of kill and submit command.
    
    Here is the code snippet:
    
    ```java
    (defn do-rebalance [nimbus storm-id status storm-base]
      (let [rebalance-options (:topology-action-options storm-base)]
        (.update-storm! (:storm-cluster-state nimbus)
          storm-id
            (-> {:topology-action-options nil}
              (assoc-non-nil :component->executors (:component->executors rebalance-options))
              (assoc-non-nil :num-workers (:num-workers rebalance-options))))
      (mk-assignments nimbus :scratch-topology-id storm-id))
    
    (defnk mk-assignments [nimbus :scratch-topology-id nil]
      (if (is-leader nimbus :throw-exception false)
        (let [conf (:conf nimbus)
            storm-cluster-state (:storm-cluster-state nimbus)
            ^INimbus inimbus (:inimbus nimbus)
            ;; read all the topologies
            topologies (locking (:submit-lock nimbus) (into {} (for [tid (.active-storms storm-cluster-state)]
                                  {tid (read-topology-details nimbus tid)})))
            topologies (Topologies. topologies)
            ;; read all the assignments
            assigned-topology-ids (.assignments storm-cluster-state nil)
            existing-assignments (into {} (for [tid assigned-topology-ids]
                                            ;; for the topology which wants rebalance (specified by the scratch-topology-id)
                                            ;; we exclude its assignment, meaning that all the slots occupied by its assignment
                                            ;; will be treated as free slot in the scheduler code.
                                            (when (or (nil? scratch-topology-id) (not= tid scratch-topology-id))
                                              {tid (.assignment-info storm-cluster-state tid nil)})))]`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Strom Rebalance command should support arbitra...

Posted by erikdw <gi...@git.apache.org>.
Github user erikdw commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @danny0405 : given that I personally find no value in having this separate task vs. executor distinction that Storm supports (i.e., allowing for a different ratio than 1:1 for tasks to executors), I'm probably not the best person to review this. :)    But it would help if you gave some specific examples, instead of only talking abstractly about the problem.   Also just FYI, you have a typo in the headline for this PR as well as the storm ticket (Strom --> Storm).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Storm Rebalance command should support arbitra...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @danny0405 right now rebalance to a higher number of executors is meaningful only if a component was provisioned with more number of tasks than executors during deploy time. 
    
    May be there is a reason why the number of tasks as not changed with rebalance. Say if the tasks are maintaining some local state and the inputs to the tasks are keyed on a field and the rebalance alters the number of tasks, the states will not be valid anymore since the same keys will not be mapped to the same tasks anymore.
    
    The idea of dynamically increasing the tasks with rebalance is good, but we need to think through how to handle the key based partitioning use cases.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Strom Rebalance command should support arbitra...

Posted by danny0405 <gi...@git.apache.org>.
Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @erikdw Could you please help me to review this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Storm Rebalance command should support arbitra...

Posted by erikdw <gi...@git.apache.org>.
Github user erikdw commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    I'm unable to reconcile the statements about the ratio always being 1:1 with the content of your "For example" paragraph.  Those examples sound like you have having a different number of executors than tasks.  Am I misreading them?
    
    In any case, I'm not a clojure expert, so you probably wanna get a core contributor (e.g., Bobby Evans, Jungtaek Kim, someone else) to look at this.
    
    However, another Q that comes to mind is:  does this behavior also exist in the master branch at HEAD?  i.e., the Java-based code.   If so then you'll wanna do this change in Java too, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Storm Rebalance command should support arbitra...

Posted by danny0405 <gi...@git.apache.org>.
Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @erikdw thx for your nice review, i have fixed the headline and ticket title.
    
    For this PR, the ratio is still 1:1 for tasks and executors for a component, but not for the Storm before this PR [only for rebalance command].
    
    For the old storm, The key here is that TOPOLOGY-TASKS is not changed for a component when we do rebalance, we only change the executors number. Thus, when we rebalance a smaller executors number for a component, the old bigger TOPOLOGY-TASKS is partitioned into smaller number of executors, it works [we finally reduce the executors], but some executors will have more than one tasks; When we rebalance a bigger executors number for a component, it doesn't work at all because the old smaller TOPOLOGY-TASKS is partitioned into bigger number of executors, so some executors will not have task assigned and will not start.
    
    For example, for the old storm, if we have a topology named example-topology which contains component like Spout S1[1 executors] -- Bolt B1[2 executors] -- Bolt B2[4 executors], if we do a command `storm rebalance  example-topology -e B1=1` it will work, and the only executor for B1 will have 2 tasks inside; but for command `storm rebalance  example-topology -e B1=4`, it doesn't work at all and nothing changes finally.
    
    For this PR, all the commands above work, and the ratio for tasks and executors of a component is always 1:1
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1874: STORM-2286 Storm Rebalance command should support arbitra...

Posted by danny0405 <gi...@git.apache.org>.
Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/1874
  
    @arunmahadevan There are over 2500+ topologies on our cluster, in most cases, we wanna increase component executors too when we add workers to a topology in order to make full use of these workers. And it's awful to kill and re-submit the topology just for increasing workers and executors.
    
    As you said, there may be some problem with grouping or something, i need to make sure, thx for your suggest.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1874: STORM-2286 Storm Rebalance command should support ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1874


---