You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/29 19:04:26 UTC

[GitHub] [beam] apilloud opened a new issue, #22533: [Bug]: Python expansion service hanging in tests

apilloud opened a new issue, #22533:
URL: https://github.com/apache/beam/issues/22533

   ### What happened?
   
   It appears the expansion service is hanging, causing a bunch of test suites to time out.
   https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/2949/
   https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/5697/
   https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJavaSQL_Dataflow/715/
   https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/4014/
   https://ci-beam.apache.org/job/beam_PostCommit_Python37/5528/
   https://ci-beam.apache.org/job/beam_PostCommit_Python38/2939/
   https://ci-beam.apache.org/job/beam_PostCommit_Python39/661/
   https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/5931/
   
   ```
   22:32:02 apache_beam/transforms/sql_test.py::SqlTransformTest::test_filter 
   22:32:02 -------------------------------- live log call ---------------------------------
   22:32:02 INFO     apache_beam.utils.subprocess_server:subprocess_server.py:239 Using pre-built snapshot at /home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_Direct/src/sdks/java/extensions/sql/expansion-service/build/libs/beam-sdks-java-extensions-sql-expansion-service-2.42.0-SNAPSHOT.jar
   22:32:02 INFO     root:external.py:773 Starting a JAR-based expansion service from JAR /home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_Direct/src/sdks/java/extensions/sql/expansion-service/build/libs/beam-sdks-java-extensions-sql-expansion-service-2.42.0-SNAPSHOT.jar 
   22:32:02 INFO     apache_beam.utils.subprocess_server:subprocess_server.py:116 Starting service with ['java' '-jar' '/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_Direct/src/sdks/java/extensions/sql/expansion-service/build/libs/beam-sdks-java-extensions-sql-expansion-service-2.42.0-SNAPSHOT.jar' '56083' '--filesToStage=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_Direct/src/sdks/java/extensions/sql/expansion-service/build/libs/beam-sdks-java-extensions-sql-expansion-service-2.42.0-SNAPSHOT.jar']
   22:32:02 Build timed out (after 100 minutes). Marking the build as aborted.
   23:50:24 Build was aborted
   ```
   
   ### Issue Priority
   
   Priority: 1
   
   ### Issue Component
   
   Component: cross-language


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199976299

   Are there any additional details as to what step is failing ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] aaltay commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1213648967

   @kileys - Is this not a release blocker? We haven an RC but this issue is still open?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199954401

   I can reproduce this locally. With grpcio 1.48.0 the "subprocess.Popen" invocation below gets stuck.
   
   https://github.com/apache/beam/blob/e5e3cb25ca4fc2e31c10eb3dbda8289c6bfc7140/sdks/python/apache_beam/utils/subprocess_server.py#L117
   
   cmd is: ['java' '-jar' '<path>/beam/sdks/java/io/expansion-service/build/libs/beam-sdks-java-io-expansion-service-2.42.0-SNAPSHOT.jar' '55010' '--filesToStage=<path>/beam/sdks/java/io/expansion-service/build/libs/beam-sdks-java-io-expansion-service-2.42.0-SNAPSHOT.jar']
   
   It works fine with grpcio 1.47.0.
   
   @tvalentyn any idea how to proceed here ? Sounds to me like a bug in new grpcio. Should we try to pin the version ?
   (that will still break old SDKs though).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200064082

   bisection points to https://github.com/grpc/grpc/commit/977ebbef0925e70867bb627d8cde50f4fcee45aa
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199975741

   I think we need to reach out to the maintainers of grpcio to clarify.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200059939

   Update: we could reproduce locally. @tvalentyn is running a bisect on the grpcio repo to to identify the culprit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1208733009

   @chamikaramj Thanks for the feedback. We'll keep digging and come back to you when we have another candidate to test.
   
   I'll also keep on trying to create our own in-repo reproduction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209898650

   This will unfortunately result in a breakage for our released SDKs. 
   
   Also, we can very reliably reproduce this for gRPC 1.48.0 and all released Beam SDKs will automatically pick up the change when it's released. Is there a way to update https://github.com/grpc/grpc/pull/30473 (suspected PR that introduced the change in behavior) so that it does not result in a breakage for existing users ?
   
   Also, I suspect other gRPC users might run into similar issues when the change is released.
   
   cc: @aaltay @tvalentyn 
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1208528753

   @chamikaramj We have a prospective fix [here](https://github.com/grpc/grpc/pull/30473). Any chance you'd be willing to pull it down and test it out. We _believe_ we have an automated test that has been fixed by this change, but better safe than sorry.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] joar commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
joar commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1231749470

   I realized it might be too late for that stacktrace, it might help alternative timelines of me find this issue in the future if the timelines happen to merge. Thanks for the fix!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209888595

   > more stdout/stderr than there was in v1.47.
   
   Agreed. We've seen this many times in our codebase. We tend to use [TemporaryFiles](https://github.com/grpc/grpc/blob/9cbedfe66fe266ca767cfc8a534076b96bd87c0c/examples/python/multiprocessing/test/_multiprocessing_example_test.py#L53) to avoid the issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1210018832

   Update:
   
   @drfloob confirmed that he's not able to confirm the fix anynmore as well. So possibly something was not setup correctly when the fix was tried the first time ?
   
   Also, after reading subprocess docs.
   
   Seems like the deadlock scenario is following.
   * Do subprocess.Popen() to start a process while using stdout=PIPE or stderr=PIPE.
   * Do a wait() on that process.
   * This could deadlock if the child process generates enough output to block waiting for the OS buffer to clear.
   
   But that's not what Beam is doing. Beam does following.
   * Do a subprocess.Popen() to start a process while using stdout=subprocess.PIPE and stderr=subprocess.STDOUT.
   * Start a separate thread to read lines from the child process's stdout and print those lines.
   
   From documentation:
   subprocess.PIPE:
     Special value that can be used as the stdin, stdout or stderr argument to [Popen] 
    (https://docs.python.org/3.8/library/subprocess.html#subprocess.Popen) and indicates that a pipe to the standard stream should be opened. Most useful with [Popen.communicate()](https://docs.python.org/3.8/library/subprocess.html#subprocess.Popen.communicate).
   
   subprocess.STDOUT:
     Special value that can be used as the stderr argument to [Popen] 
    (https://docs.python.org/3.8/library/subprocess.html#subprocess.Popen) and indicates that standard error should go into the same handle as standard output.
   
   Relavent code is here: https://github.com/apache/beam/blob/4799828f75309fd494e14eab5af40559e9a140a3/sdks/python/apache_beam/utils/subprocess_server.py#L129
   
   So I think Beam's implementation is correct and should not result in the deadlock scenario mentioned in the documentation.
   
   @drfloob and @gnossen do you agree ?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1201473386

   Detailed instructions to reproduce using an existing Apache Beam release.
   
   # Install Apache Beam 2.40.0
   
   $ mkdir test_env
   $ python -m venv test_env/
   $ . ./test_env/bin/activate
   $ pip install apache-beam[gcp]==2.40.0
   
   # pip freeze
   # (if gRPC version is not 1.48.0 uninstall and install version 1.48.0 to test or build from source).
   
   # Run the test
   
   $ export BOOTSTRAP_SERVER="dummy_broker"
   $ export PROJECT=dummyproject
   $ export TEMP_LOCATION=gs://dummybucket/tmp
   $ export REGION="us-central1"
   $ export JOB_NAME="kafka-taxi-`date +%Y%m%d-%H%M%S`"
   $ export NUM_WORKERS="5"
   
   $ python -m apache_beam.examples.kafkataxi.kafka_taxi \
     --runner DataflowRunner \
     --temp_location $TEMP_LOCATION \
     --project $PROJECT \
     --region $REGION \
     --num_workers $NUM_WORKERS \
     --job_name $JOB_NAME \
     --bootstrap_servers $BOOTSTRAP_SERVER
   
   You'll notice that the test get stuck at the "subprocess.Popen" call at below location (right after the log at the line "Starting service with...").
   https://github.com/apache/beam/blob/e5e3cb25ca4fc2e31c10eb3dbda8289c6bfc7140/sdks/python/apache_beam/utils/subprocess_server.py#L117
   
   If you try the same test with gRPC version 1.47.0 the test will proceed (but note that the test will still fail at a later location since we provided dummy parameters above).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199867123

   Will check. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] veblush commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
veblush commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200009523

   @gnossen for gRPC Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199978582

   cc: @veblush from grpcio team.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] drfloob commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
drfloob commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209881902

   @chamikaramj I believe you're running into a deadlock with Popen using `stdout=subprocess.PIPE`. I confirmed that the gRPC fork handlers aren't running at all when your example hangs, which means this isn't likely a threading issue. And by changing that one line in subprocess_server.py to `subprocess.Popen(cmd, stdout=None, stderr=None)`, Popen no longer hangs. Please try reproducing my results.
   
   I'm not sure what would have changed in python 1.48 to precipitate this, but presumably: more stdout/stderr than there was in v1.47. 
   
   See https://docs.python.org/3.8/library/subprocess.html#subprocess.Popen


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209963402

   I'm still running into this even with the fix suggested in https://github.com/apache/beam/issues/22533#issuecomment-1209881902
   
   So something is off. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1213566618

   We've been playing around in your docker dev environment and have come up with another fix to try. Can you please give this PR a shot? https://github.com/grpc/grpc/pull/30572


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] apilloud commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
apilloud commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199900995

   Per @Abacn this is due to grpcio 1.48.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1208650077

   It's ready to be tested right now. We're hoping to get verification from your end before merging the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1208723352

   Hi @gnossen, unfortunately I'm still running into the issue both on https://github.com/grpc/grpc/pull/30473 and grpcio HEAD. @tvalentyn will you be able to check as well ?
   (note, that I had to manually re-install protobuf 3.20.1)
   
   Happy to check again if needed.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] drfloob commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
drfloob commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1210912777

   > I tried both master and pull/30473/head with stdout and stderr set to None. Job get stuck in both cases.
   
   After rebuilding my beam dev environment (I lost some state when shutting down the docker container), I was unable to make the fix work again reliably. It's unclear to me how it got into a working state before. Apologies.
   
   > I don't have a stacktrace since it just hangs till I kill the process.
   
   I can confirm I'm seeing the same behavior. I added enough debug logging in gRPC to establish that the pthread pre-fork handlers are not being run on `subprocess.Popen`, so I'm not sure where exactly this is hanging yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1214463570

   @gnossen and @drfloob  yes, that seems to have fixed it. The test doesn't get stuck on pull/30572/head but gets stuck on gRPC master. Thanks a lot for investigating.
   
   @aaltay and @kileys seems like the issue is in the gRPC side. So no need to block on this. I'll resolve this Beam side issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] gnossen commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
gnossen commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1210054944

   @chamikaramj What version are you testing with? `master`? I expect you'll need both https://github.com/grpc/grpc/pull/30473 _and_ the change to remove `PIPE`.
   
   Do you have a stacktrace of where the test is currently timing out?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1201538143

   Offending release was yanked.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200001943

   I would consider trying to install grpcio from sources, and try to do a bisection via `git bisect` within grpcio repo. There are instructions on building from sources in https://pypi.org/project/grpcio/1.48.0/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199905464

   @kileys making this a release blocker in case we have to update Python or x-lang wrappers somehow for automatic expansion service startup to not hang.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1210833277

   I opened https://github.com/apache/beam/pull/22654 to remove usage of subprocess.PIPE but it doesn't fix the expansion service hanging with unreleased gRPC as mentioned above.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.
URL: https://github.com/apache/beam/issues/22533


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.
URL: https://github.com/apache/beam/issues/22533


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] joar commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
joar commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1231581928

   > Do you have a stacktrace of where the test is currently timing out?
   
   @gnossen I have a stracktrace from the hung spot with symbols:
   
   ```
   #0  0x00007ffff7da056d in syscall () from /usr/lib/libc.so.6
   #1  0x00007ffff479e496 in absl::lts_20220623::synchronization_internal::FutexImpl::WaitUntil (t=..., val=0, v=0x7ffff0db0340) at third_party/abseil-cpp/absl/synchronization/internal/futex.h:104
   #2  absl::lts_20220623::synchronization_internal::Waiter::Wait (this=this@entry=0x7ffff0db0340, t=t@entry=...) at third_party/abseil-cpp/absl/synchronization/internal/waiter.cc:93
   #3  0x00007ffff479e3d1 in AbslInternalPerThreadSemWait_lts_20220623 (t=...) at third_party/abseil-cpp/absl/synchronization/internal/per_thread_sem.cc:89
   #4  0x00007ffff42c38b8 in absl::lts_20220623::synchronization_internal::PerThreadSem::Wait (t=...) at third_party/abseil-cpp/absl/synchronization/internal/per_thread_sem.h:107
   #5  absl::lts_20220623::Mutex::DecrementSynchSem (t=..., w=<optimized out>, mu=0x555556fd7c38) at third_party/abseil-cpp/absl/synchronization/mutex.cc:579
   #6  absl::lts_20220623::CondVar::WaitCommon (this=0x555556fd7c40, mutex=0x555556fd7c38, t=...) at third_party/abseil-cpp/absl/synchronization/mutex.cc:2577
   #7  0x00007ffff45221df in gpr_cv_wait (cv=0x555556fd7c40, mu=0x555556fd7c38, abs_deadline=...) at src/core/lib/gpr/sync_abseil.cc:85
   #8  0x00007ffff452375b in grpc_core::internal::ThreadState::AwaitThreads (this=0x555556fd7c30) at src/core/lib/gprpp/fork.cc:145
   #9  grpc_core::Fork::AwaitThreads () at src/core/lib/gprpp/fork.cc:232
   #10 0x00007ffff4540016 in grpc_prefork () at src/core/lib/iomgr/fork_posix.cc:84
   #11 0x00007ffff7d72a17 in ?? () from /usr/lib/libc.so.6
   #12 0x00007ffff7d720fe in fork () from /usr/lib/libc.so.6
   #13 0x00007ffff7fb2066 in subprocess_fork_exec (self=<optimized out>, args=<optimized out>) at /tmp/python-build.20220812191842.128421/Python-3.9.13/Modules/_posixsubprocess.c:844
   ```
   
   grpcio installed via 
   1. `git checkout v1.48.0` (in https://github.com/grpc/grpc)
   2. This diff
      ```diff
      diff --git a/src/python/grpcio/commands.py b/src/python/grpcio/commands.py
      index d93b6c7039..936e4df4db 100644
      --- a/src/python/grpcio/commands.py
      +++ b/src/python/grpcio/commands.py
      @@ -194,6 +194,7 @@ def try_cythonize(extensions, linetracing=False, mandatory=True):
                   include_dir for extension in extensions
                   for include_dir in extension.include_dirs
               ] + [CYTHON_STEM],
      +        gdb_debug=True,
               compiler_directives=cython_compiler_directives)
       
       
      diff --git a/src/python/grpcio/grpc/_grpcio_metadata.py b/src/python/grpcio/grpc/_grpcio_metadata.py
      index a1e53d7e81..d5d9fe287a 100644
      --- a/src/python/grpcio/grpc/_grpcio_metadata.py
      +++ b/src/python/grpcio/grpc/_grpcio_metadata.py
      @@ -1,17 +1 @@
      -# Copyright 2017 gRPC authors.
      -#
      -# Licensed under the Apache License, Version 2.0 (the "License");
      -# you may not use this file except in compliance with the License.
      -# You may obtain a copy of the License at
      -#
      -#     http://www.apache.org/licenses/LICENSE-2.0
      -#
      -# Unless required by applicable law or agreed to in writing, software
      -# distributed under the License is distributed on an "AS IS" BASIS,
      -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      -# See the License for the specific language governing permissions and
      -# limitations under the License.
      -
      -# AUTO-GENERATED FROM `$REPO_ROOT/templates/src/python/grpcio/grpc/_grpcio_metadata.py.template`!!!
      -
      -__version__ = """1.48.0"""
      +__version__ = """1.48.0"""
      \ No newline at end of file
      ```
   3. `GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install --ignore-installed .`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1210111770

   I tried both master and pull/30473/head with stdout and stderr set to None. Job get stuck in both cases.
   
   I don't have a stacktrace since it just hangs till I kill the process.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200062320

   Bisection notes. To rebuild grpcio between bisection iterations:
    - deactivate and recreate a venv
    - install apache-beam[gcp] 
    - uninstall grpcio
    - cleanup leftovers in grpcio clone from previous build iteration, prepare for new build. ` git clean -dfx; git submodule update --init ; pip install -r requirements.txt ;` 
    - install grpcio  `GRPC_PYTHON_BUILD_WITH_CYTHON=1 pip install . `
    - stash local changes w/ `git stash`
    - try to repro the issue
    - continue bisection
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] apilloud commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
apilloud commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199862599

   @chamikaramj @ihji Who would be a good person to investigate this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] veblush commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
veblush commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200009937

   @tvalentyn bisecting would be helpful and callstack when it's hanging would be also helpful if possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1199903131

   Thanks for looking @Abacn.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1201443231

   reopen to also merge the CP to release branch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209923023

   Tentatively re-opening to fix the ongoing release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] aaltay commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1209996534

   @kileys - this still looks like a release blocker.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1208530864

   Yeah, I can try. Lemme know when it's ready to be tested (or I can do it when the PR is merged).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: Python expansion service hanging in tests

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200064112

   cc: @ctiller


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200068171

   passing back to @chamikaramj in case grpcio team needs help / questions about with xlang.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22533:
URL: https://github.com/apache/beam/issues/22533#issuecomment-1200067698

   filed https://github.com/grpc/grpc/issues/30446


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.

Posted by GitBox <gi...@apache.org>.
tvalentyn closed issue #22533: [Bug]: grpcio==1.48.0 causes Java expansion service to hang on startup in Python XLang pipelines.
URL: https://github.com/apache/beam/issues/22533


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org