Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/10/05 10:22:29 UTC

[GitHub] [beam] calvinleungyk commented on pull request #15105: [BEAM-11275] Support remote package download from remote filesystems in Stager

calvinleungyk commented on pull request #15105:
URL: https://github.com/apache/beam/pull/15105#issuecomment-934274755


   I updated the PR and also tested the code with the following command (private info redacted):
   ```
   python3 sdks/python/apache_beam/examples/wordcount.py \
     --input gs://dataflow-samples/shakespeare/kinglear.txt \
     --output gs://<scratch-bucket>/counts \
     --runner DataflowRunner \
     --project <project> \
     --region <region> \
     --temp_location gs://<scratch-bucket>/tmp_beam \
     --extra_package="gs://<gcs-bucket>/extra.whl" \
     --sdk_location=container \
     --no_use_public_ips \
     --service_account_email=email.com \
     --network=network \
     --subnetwork=https://www.googleapis.com/compute/v1/projects/... \
     --experiment=shuffle_mode=service
   ```
   
   The job launched successfully with the following logs:
   ```
   INFO:apache_beam.runners.portability.stager:Downloading extra package: gs://beam-dataflow-it/wheel/tfx_twitter.whl locally before staging
   INFO:apache_beam.runners.portability.stager:Copied remote file from gs://beam-dataflow-it/wheel/tfx_twitter.whl to /var/folders/jl/3vwrt5kd6vg9vrhjpyy6b8dh0000gp/T/tmpm_bqr7q1/tmp9n_at3dp/tfx_twitter.whl.
   ...
   INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to gs://scratch-user.calvinl.dp.gcp.twttr.net/tmp_beam/beamapp-calvinl-1005095238-436424.1633427558.436689/tfx_twitter.whl in 0 seconds.
   ```
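
For readers following the logs above, the "download locally before staging" step can be sketched roughly as below. This is a minimal illustration, not the PR's actual code: `is_remote_path`, `stage_locally`, and the `open_remote` parameter are hypothetical names, and treating any path containing `://` as remote is an assumption. With Beam installed, `FileSystems.open` from `apache_beam.io.filesystems` could plausibly be passed as `open_remote`.

```python
import os
import posixpath
import shutil


def is_remote_path(path):
    """Assumption: a path containing a URL scheme (gs://, s3://, ...) is remote."""
    return "://" in path


def stage_locally(package, temp_dir, open_remote):
    """Return a local path for `package`, copying it into temp_dir if remote.

    `open_remote` is a hypothetical callable that takes a remote path and
    returns a readable binary file object.
    """
    if not is_remote_path(package):
        # Local packages are already where the stager can read them.
        return package

    # Keep the original file name so the staged artifact stays recognizable.
    local_path = os.path.join(temp_dir, posixpath.basename(package))
    with open_remote(package) as src, open(local_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return local_path
```

Passing the opener in as a parameter keeps the sketch independent of any particular filesystem client, so it can be exercised with an in-memory stand-in.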
   
   However, the job did not finish because of an internal GCP issue that causes `Error syncing pod` errors on Dataflow. If you'd like to see a finished job, I can add the logs once we resolve that internally.

