You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by revans2 <gi...@git.apache.org> on 2016/11/07 22:51:09 UTC

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

GitHub user revans2 opened a pull request:

    https://github.com/apache/storm/pull/1764

    STORM-2190: reduce contention between submission and scheduling

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/incubator-storm STORM-2190

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1764.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1764
    
----
commit 43b808e7c304a9ce2c3e4aac669898640a398253
Author: Robert (Bobby) Evans <ev...@yahoo-inc.com>
Date:   2016-11-07T22:50:27Z

    STORM-2190: reduce contention between submission and scheduling

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1764#discussion_r90542812
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/nimbus.clj ---
    @@ -1008,23 +1008,24 @@
           (reset! (:id->worker-resources nimbus) {}))
         ;; tasks figure out what tasks to talk to by looking at topology at runtime
         ;; only log/set when there's been a change to the assignment
    -    (doseq [[topology-id assignment] new-assignments
    -            :let [existing-assignment (get existing-assignments topology-id)
    -                  topology-details (.getById topologies topology-id)]]
    -      (if (= existing-assignment assignment)
    -        (log-debug "Assignment for " topology-id " hasn't changed")
    -        (do
    -          (log-message "Setting new assignment for topology id " topology-id ": " (pr-str assignment))
    -          (.setAssignment storm-cluster-state topology-id (thriftify-assignment assignment))
    -          )))
    -    (->> new-assignments
    -          (map (fn [[topology-id assignment]]
    -            (let [existing-assignment (get existing-assignments topology-id)]
    -              [topology-id (map to-worker-slot (newly-added-slots existing-assignment assignment))]
    -              )))
    -          (into {})
    -          (.assignSlots inimbus topologies)))
    -    (log-message "not a leader, skipping assignments")))
    +    (locking (:sched-lock nimbus)
    --- End diff --
    
    I agree and now that nimbus is in java we can look at doing some refactoring along those lines.  If you feel that we need to do it now and that this is a blocker I can spend some time looking into how to do that better.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    @HeartSaVioR I plan on doing that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    Just so we don't miss the comment from @jerrypeng 
    
    > Couldn't a wrong ordering of events happen since we are locking when calculating a scheduling then unlocking and then locking and uploading the new scheduling and unlocking
    > for example:
    > T0: submit
    > T1: rebalance
    > T2: rebalance - calculate new scheduling
    > T3: submit - calculate new scheduling
    > T4: rebalance - upload new scheduling to zk
    > T5: submit - upload new scheduling to zk
    > 
    > even though we should end up with the scheduling calculated by the rebalance but we end up with scheduling calculated from the original submit.
    
    Yes, that is correct.  We should do something here, and he suggested that perhaps as part of a refactor of Nimbus we should look at support for long running scheduling.
    
    In the short term I think I might make scheduling and writing to ZK atomic, but long term I think I will file a JIRA to look at better scheduling.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1764


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1764#discussion_r90540863
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/nimbus.clj ---
    @@ -1008,23 +1008,24 @@
           (reset! (:id->worker-resources nimbus) {}))
         ;; tasks figure out what tasks to talk to by looking at topology at runtime
         ;; only log/set when there's been a change to the assignment
    -    (doseq [[topology-id assignment] new-assignments
    -            :let [existing-assignment (get existing-assignments topology-id)
    -                  topology-details (.getById topologies topology-id)]]
    -      (if (= existing-assignment assignment)
    -        (log-debug "Assignment for " topology-id " hasn't changed")
    -        (do
    -          (log-message "Setting new assignment for topology id " topology-id ": " (pr-str assignment))
    -          (.setAssignment storm-cluster-state topology-id (thriftify-assignment assignment))
    -          )))
    -    (->> new-assignments
    -          (map (fn [[topology-id assignment]]
    -            (let [existing-assignment (get existing-assignments topology-id)]
    -              [topology-id (map to-worker-slot (newly-added-slots existing-assignment assignment))]
    -              )))
    -          (into {})
    -          (.assignSlots inimbus topologies)))
    -    (log-message "not a leader, skipping assignments")))
    +    (locking (:sched-lock nimbus)
    --- End diff --
    
    maybe its also time to start thinking about decentralized scheduling mechanism if certain scheduling strategies may take a while to compute a schedule, but that would require a major overhaul in nimbus.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1764#discussion_r90317672
  
    --- Diff: examples/storm-starter/src/jvm/org/apache/storm/starter/WordCountTopology.java ---
    @@ -90,7 +90,9 @@ public static void main(String[] args) throws Exception {
         if (args != null && args.length > 0) {
           conf.setNumWorkers(3);
     
    -      StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
    +      for (String name: args) {
    --- End diff --
    
    See comment on #1765. Is this change an artifact of manual testing?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    @ptgoetz @jerrypeng 
    
    I made a few changes to thing.  I fixed the race condition and I addressed the review comments, but I also put in some optimizations to storm submitter.  We were literally calling getClusterInfo 3+ times for each topology submission, and because the ultimate goal of STORM-2190 is to make it more scalable this helps a lot.  There is still some lock contention, but it is much better then it was before.
    
    If things look good here I will backport the changes to my other pull request.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    LGTM +1 @revans2  thanks for making the optimizations


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    +1 again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1764: STORM-2190: reduce contention between submission a...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1764#discussion_r90538639
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/nimbus.clj ---
    @@ -1008,23 +1008,24 @@
           (reset! (:id->worker-resources nimbus) {}))
         ;; tasks figure out what tasks to talk to by looking at topology at runtime
         ;; only log/set when there's been a change to the assignment
    -    (doseq [[topology-id assignment] new-assignments
    -            :let [existing-assignment (get existing-assignments topology-id)
    -                  topology-details (.getById topologies topology-id)]]
    -      (if (= existing-assignment assignment)
    -        (log-debug "Assignment for " topology-id " hasn't changed")
    -        (do
    -          (log-message "Setting new assignment for topology id " topology-id ": " (pr-str assignment))
    -          (.setAssignment storm-cluster-state topology-id (thriftify-assignment assignment))
    -          )))
    -    (->> new-assignments
    -          (map (fn [[topology-id assignment]]
    -            (let [existing-assignment (get existing-assignments topology-id)]
    -              [topology-id (map to-worker-slot (newly-added-slots existing-assignment assignment))]
    -              )))
    -          (into {})
    -          (.assignSlots inimbus topologies)))
    -    (log-message "not a leader, skipping assignments")))
    +    (locking (:sched-lock nimbus)
    --- End diff --
    
    @jerrypeng You are correct that this could happen.  I don't really think it will be that likely to happen in practice but I'll think about it and see if we can fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    One question about the for loop in WordCountTopology. I'm +1 once that's answered/addressed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1764: STORM-2190: reduce contention between submission and sche...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1764
  
    +1. Please apply the optimization to #1765 as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---