Posted to user@beam.apache.org by Илья Соин <il...@gmail.com> on 2022/02/17 19:26:43 UTC

[Question] How to provide S3 endpoint-url for --artifacts_dir="s3://..."

Hello,

I’m new to Beam and am trying to submit the Python WordCount example job to a standalone Flink cluster. I submit the job from the same node where my Flink JobManager is running:

python -m apache_beam.examples.wordcount --input /input.txt --output count --runner FlinkRunner --flink_master=localhost:8081 --environment_type=PROCESS --environment_config='{"command": "/opt/flink/bin/boot"}' --flink_job_server_jar="/beam-runners-flink-1.13-job-server-2.36.0.jar" --artifacts_dir="s3://my_project"

If I understand correctly, I can’t use the local file system for artifacts_dir because the artifacts need to be accessible from every TaskManager node, so I want to use S3 instead. However, I need to use a local S3 installation, and I couldn’t figure out how to specify its endpoint URL.


__
Best, Ilya