Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/03/23 03:46:36 UTC

[GitHub] [beam] arunpandianp opened a new pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

arunpandianp opened a new pull request #17162:
URL: https://github.com/apache/beam/pull/17162


   `requestObserver.onNext()`, which `send` calls, must not be called after `requestObserver.onCompleted()` has been called.
   `requestObserver.onCompleted()` is called when the stream is closed.
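
The call-site guard from the diff in this PR can be sketched roughly as follows. This is a minimal illustration, not the Beam code: `StrictObserver` is a hypothetical stand-in for the real request observer, and `GuardedSender` and `sendIfOpen` are invented names. Only `clientClosed`, `send`, `onNext` and `onCompleted` come from the thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class HalfCloseContractSketch {
  /** Hypothetical stand-in for the request observer: rejects onNext after onCompleted. */
  static final class StrictObserver {
    private boolean completed;
    final AtomicInteger delivered = new AtomicInteger();

    synchronized void onNext(String message) {
      if (completed) {
        throw new IllegalStateException("onNext called after onCompleted");
      }
      delivered.incrementAndGet();
    }

    synchronized void onCompleted() {
      completed = true;
    }
  }

  /** Hypothetical sender that skips the send once the client side has been closed. */
  static final class GuardedSender {
    private final StrictObserver requestObserver = new StrictObserver();
    private final AtomicBoolean clientClosed = new AtomicBoolean(false);

    /** Returns true if the message was sent, false if it was skipped because the stream is closed. */
    synchronized boolean sendIfOpen(String message) {
      if (clientClosed.get()) {
        return false;
      }
      requestObserver.onNext(message);
      return true;
    }

    /** Marks the stream closed and completes the observer exactly once. */
    synchronized void close() {
      if (clientClosed.compareAndSet(false, true)) {
        requestObserver.onCompleted();
      }
    }

    int delivered() {
      return requestObserver.delivered.get();
    }
  }
}
```

Because `sendIfOpen` and `close` synchronize on the same object, a send cannot slip in between the closed check and `onCompleted()`.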
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on a change in pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r834455046



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
##########
@@ -848,7 +848,11 @@ public void run() {
           new TimerTask() {
             @Override
             public void run() {
-              refreshActiveWork();
+              try {
+                refreshActiveWork();
+              } catch (RuntimeException e) {
+                LOG.warn("Failed to refresh active work: ", e);

Review comment:
       Sorry I'm not that familiar with this code, could you help me understand the change? 
   
   I'm assuming that this `refreshActiveWork()` call ultimately calls `GrpcWindmillServer.send()`, so we're catching the error here and logging it. This is an improvement because previously that situation caused stalled GetWork streams?
   
   Are there other places where we should be catching the `IllegalStateException`?
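
One possible reading of the `try`/`catch` in this hunk, offered as an assumption rather than something stated in the thread: if the `TimerTask` is scheduled on a `java.util.Timer`, an exception escaping `run()` terminates the timer thread, so no later refresh would fire. The sketch below compares a guarded and an unguarded task. `failingRefresh` is a hypothetical stand-in for `refreshActiveWork()` that always throws.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedTimerSketch {
  /** Hypothetical stand-in for refreshActiveWork(): always fails. */
  static void failingRefresh() {
    throw new IllegalStateException("stream closed");
  }

  /**
   * Schedules a periodic task that always fails and reports how many times it was invoked,
   * waiting at most waitMillis for `wanted` invocations.
   */
  static int runs(boolean guard, int wanted, long waitMillis) {
    AtomicInteger calls = new AtomicInteger();
    CountDownLatch latch = new CountDownLatch(wanted);
    Timer timer = new Timer("refresh-sketch", true);
    timer.schedule(
        new TimerTask() {
          @Override
          public void run() {
            calls.incrementAndGet();
            latch.countDown();
            if (guard) {
              try {
                failingRefresh();
              } catch (RuntimeException e) {
                // The real change logs a warning here; the timer thread survives.
              }
            } else {
              failingRefresh(); // Escapes run() and ends the timer thread.
            }
          }
        },
        0,
        10);
    try {
      latch.await(waitMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    timer.cancel();
    return calls.get();
  }
}
```

With the guard the task keeps being rescheduled. Without it the task runs once and the timer goes quiet.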







[GitHub] [beam] scwhittle commented on a change in pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

Posted by GitBox <gi...@apache.org>.
scwhittle commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r833684035



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java
##########
@@ -936,7 +936,11 @@ protected void onResponse(StreamingGetWorkResponseChunk chunk) {
               .execute(
                   () -> {
                     try {
-                      send(extension);
+                      synchronized (this) {
+                        if (!clientClosed.get()) {

Review comment:
       To keep the contract the same for other call sites, you could throw `IllegalStateException` yourself when the client is closed, instead of relying on gRPC to do so (which, it sounds like, adds delay).
   
   LGTM though if you want to put a comment/TODO for that to try to get this in the next release. You'll have to find someone else to merge though.  
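
The suggestion above, a closed check inside `send` that throws `IllegalStateException` itself, might look roughly like this. It is a sketch under assumptions: `RequestObserver`, `Stream`, `RecordingObserver` and the exception message are hypothetical, and the real `send` in `GrpcWindmillServer` may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class ClosableStreamSketch {
  /** Hypothetical stand-in for a gRPC-style request observer. */
  interface RequestObserver<T> {
    void onNext(T value);

    void onCompleted();
  }

  /** Hypothetical stream wrapper whose send() fails fast once the client has closed it. */
  static final class Stream<T> {
    private final RequestObserver<T> requestObserver;
    private final AtomicBoolean clientClosed = new AtomicBoolean(false);

    Stream(RequestObserver<T> requestObserver) {
      this.requestObserver = requestObserver;
    }

    synchronized void send(T request) {
      // Throw the same exception type callers already catch, without touching the observer.
      if (clientClosed.get()) {
        throw new IllegalStateException("Send called on a client closed stream.");
      }
      requestObserver.onNext(request);
    }

    synchronized void close() {
      if (clientClosed.compareAndSet(false, true)) {
        requestObserver.onCompleted();
      }
    }
  }

  /** Records observer calls so the ordering can be inspected. */
  static final class RecordingObserver implements RequestObserver<String> {
    final List<String> events = new ArrayList<>();

    @Override
    public void onNext(String value) {
      events.add("next:" + value);
    }

    @Override
    public void onCompleted() {
      events.add("completed");
    }
  }
}
```

Call sites that already wrap `send` in `catch (IllegalStateException)` keep working unchanged, and the observer never sees an `onNext` after `onCompleted`.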







[GitHub] [beam] TheNeuralBit commented on a change in pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r834580237



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingDataflowWorker.java
##########
@@ -848,7 +848,11 @@ public void run() {
           new TimerTask() {
             @Override
             public void run() {
-              refreshActiveWork();
+              try {
+                refreshActiveWork();
+              } catch (RuntimeException e) {
+                LOG.warn("Failed to refresh active work: ", e);

Review comment:
       Got it, thank you!







[GitHub] [beam] arunpandianp commented on pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

Posted by GitBox <gi...@apache.org>.
arunpandianp commented on pull request #17162:
URL: https://github.com/apache/beam/pull/17162#issuecomment-1075885167


   R: @scwhittle 





[GitHub] [beam] TheNeuralBit merged pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #17162:
URL: https://github.com/apache/beam/pull/17162


   





[GitHub] [beam] arunpandianp commented on a change in pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
arunpandianp commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r834131793



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java
##########
@@ -936,7 +936,11 @@ protected void onResponse(StreamingGetWorkResponseChunk chunk) {
               .execute(
                   () -> {
                     try {
-                      send(extension);
+                      synchronized (this) {
+                        if (!clientClosed.get()) {

Review comment:
       Moved the closed check inside send.







[GitHub] [beam] arunpandianp commented on a change in pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

Posted by GitBox <gi...@apache.org>.
arunpandianp commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r833483603



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java
##########
@@ -936,7 +936,11 @@ protected void onResponse(StreamingGetWorkResponseChunk chunk) {
               .execute(
                   () -> {
                     try {
-                      send(extension);
+                      synchronized (this) {
+                        if (!clientClosed.get()) {

Review comment:
       > Was this causing issues other than unnecessary IllegalStateExceptions?
   
   I am investigating stalled GetWork streams with logs like `Output channel stalled for 31s, outbound thread`. I think sending messages on a closed stream has something to do with it. In test runs with this change, I'm yet to see a stalled GetWorkStream.
   
   > Should send itself be changed to check this?
   
   A few `send` calls lack the `catch (IllegalStateException)`; they need to be looked at individually. Can we get this in first?







[GitHub] [beam] scwhittle commented on pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
scwhittle commented on pull request #17162:
URL: https://github.com/apache/beam/pull/17162#issuecomment-1077529086









[GitHub] [beam] scwhittle commented on a change in pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

Posted by GitBox <gi...@apache.org>.
scwhittle commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r832974995



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java
##########
@@ -936,7 +936,11 @@ protected void onResponse(StreamingGetWorkResponseChunk chunk) {
               .execute(
                   () -> {
                     try {
-                      send(extension);
+                      synchronized (this) {
+                        if (!clientClosed.get()) {

Review comment:
       Should send itself be changed to check this?
   
   Most sends are wrapped with `catch (IllegalStateException)` but then ignore the exception or just log it.
   
   Was this causing issues other than unnecessary IllegalStateExceptions?







[GitHub] [beam] arunpandianp commented on pull request #17162: [BEAM-14157] Don't call requestObserver.onNext on closed windmill streams

Posted by GitBox <gi...@apache.org>.
arunpandianp commented on pull request #17162:
URL: https://github.com/apache/beam/pull/17162#issuecomment-1077459761


   R: @TheNeuralBit 
   
   @TheNeuralBit could you help review this?





[GitHub] [beam] ayberk commented on a change in pull request #17162: [BEAM-14157] Don't request work on a closed windmill GetWorkStream

Posted by GitBox <gi...@apache.org>.
ayberk commented on a change in pull request #17162:
URL: https://github.com/apache/beam/pull/17162#discussion_r833692179



##########
File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java
##########
@@ -936,7 +936,11 @@ protected void onResponse(StreamingGetWorkResponseChunk chunk) {
               .execute(
                   () -> {
                     try {
-                      send(extension);
+                      synchronized (this) {
+                        if (!clientClosed.get()) {

Review comment:
       +1 to throwing `IllegalStateException` if the stream is closed. Otherwise we risk having regressions by changing the behavior. It's safer to throw the exception for the purposes of your change.




