You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by abellina <gi...@git.apache.org> on 2016/08/15 19:38:07 UTC

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

GitHub user abellina opened a pull request:

    https://github.com/apache/storm/pull/1627

    STORM-2039: backpressure refactoring in worker and executor

    For 1.x branch, will submit a PR for 2.x branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/abellina/storm STORM-2039_backpressure_refactoring_in_worker_and_executor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1627.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1627
    
----
commit 75b82f3449a2bd4ae8cf3c0999ee3f5f1defad0c
Author: Alessandro Bellina <ab...@yahoo-inc.com>
Date:   2016-08-15T19:28:40Z

    STORM-2039: backpressure refactoring in worker and executor

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1627#discussion_r74990425
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/worker.clj ---
    @@ -316,7 +316,6 @@
           :load-mapping (LoadMapping.)
           :assignment-versions assignment-versions
           :backpressure (atom false) ;; whether this worker is going slow
    -      :transfer-backpressure (atom false) ;; if the transfer queue is backed-up
           :backpressure-trigger (atom false) ;; a trigger for synchronization with executors
    --- End diff --
    
    Let's just make this an Object. The fact that it's an atom is misleading.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1627


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    +1 for the changes. My comments are suggestions for a couple more.
    The travis failures look unrelated. Storm Core passes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1627#discussion_r74987767
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/worker.clj ---
    @@ -743,7 +736,7 @@
                             (fn [& args]
                               (check-credentials-changed)
                               (if ((:storm-conf worker) TOPOLOGY-BACKPRESSURE-ENABLE)
    -                            (check-throttle-changed))))
    +                            (topology-backpressure-callback))))
    --- End diff --
    
    This feels like the wrong place for this. 
    For some reason the backpressure is tacked onto the credentials timer?
    
    I know this was already the way it worked, but maybe there's a better place for it now? I don't think it's good that the value of the `task.credentials.poll.secs` config affects the frequency of backpressure polling.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by abellina <gi...@git.apache.org>.
Github user abellina commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    @zhuoliu thanks for the comment. I am taking a look at STORM-1949 to try and reproduce it, and will comment there. I think that is a separate issue from this PR so we should see a potential fix to STORM-1949 in a different PR. Are you ok with this going in as is?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by abellina <gi...@git.apache.org>.
Github user abellina commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    @knusbaum I made the changes suggested. Also removed an extra fn wrapper on that function triggered by the timer. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by abellina <gi...@git.apache.org>.
Github user abellina closed the pull request at:

    https://github.com/apache/storm/pull/1627


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by zhuoliu <gi...@git.apache.org>.
Github user zhuoliu commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    sure, Alex. Really appreciate if you can submit a separate PR for 1949.
    +1 to this one!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by zhuoliu <gi...@git.apache.org>.
Github user zhuoliu commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    @abellina looks good to me, just one extra suggestion:
    In
    https://github.com/apache/storm/blob/master/storm-core/src/clj/org/apache/storm/daemon/worker.clj#L156
    To reduce the traffic to ZK, we checked the previous flag to update to Zookeeper only when it is necessary (the flag has changed), but still there may some weird case we have inconsistency between local "prev-backpressure-flag" and the worker backpressure node at Zookeeper (see STORM-1949).
    
    So to strike a tradeoff and avoid the possible wrong stall of a topology, could you add do a random number generation here, e.g., generate a number between 0-9, if it is zero, we will call the try block no matter "prev-backpressure-flag" and "curr-backpressure-flag" is equal or not.
    
    Thanks a lot!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by abellina <gi...@git.apache.org>.
Github user abellina commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1627#discussion_r75032583
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/worker.clj ---
    @@ -316,7 +316,6 @@
           :load-mapping (LoadMapping.)
           :assignment-versions assignment-versions
           :backpressure (atom false) ;; whether this worker is going slow
    -      :transfer-backpressure (atom false) ;; if the transfer queue is backed-up
           :backpressure-trigger (atom false) ;; a trigger for synchronization with executors
    --- End diff --
    
    Nice catch. Will do.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by abellina <gi...@git.apache.org>.
GitHub user abellina reopened a pull request:

    https://github.com/apache/storm/pull/1627

    STORM-2039: backpressure refactoring in worker and executor

    For 1.x branch, will submit a PR for 2.x branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/abellina/storm STORM-2039_backpressure_refactoring_in_worker_and_executor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1627.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1627
    
----
commit 75b82f3449a2bd4ae8cf3c0999ee3f5f1defad0c
Author: Alessandro Bellina <ab...@yahoo-inc.com>
Date:   2016-08-15T19:28:40Z

    STORM-2039: backpressure refactoring in worker and executor

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by zhuoliu <gi...@git.apache.org>.
Github user zhuoliu commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    Since with the refactoring we directly access the disruptor queue to get the backpressure status rather than having an additional flag in executor data which is error-prone when flipping (which is very possibly the root cause of the halt in STORM-1949), we may NOT need to add the ADDITIONAL ZK pressure which I mentioned in the above.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1627: STORM-2039: backpressure refactoring in worker and execut...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1627
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1627: STORM-2039: backpressure refactoring in worker and...

Posted by abellina <gi...@git.apache.org>.
Github user abellina commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1627#discussion_r75030675
  
    --- Diff: storm-core/src/clj/org/apache/storm/daemon/worker.clj ---
    @@ -743,7 +736,7 @@
                             (fn [& args]
                               (check-credentials-changed)
                               (if ((:storm-conf worker) TOPOLOGY-BACKPRESSURE-ENABLE)
    -                            (check-throttle-changed))))
    +                            (topology-backpressure-callback))))
    --- End diff --
    
    yeah, agreed. I'll add another timer for this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---