Posted to issues@beam.apache.org by "Elias Djurfeldt (Jira)" <ji...@apache.org> on 2020/01/31 09:26:00 UTC
[jira] [Comment Edited] (BEAM-9218) Template staging broken on Beam 2.18.0
[ https://issues.apache.org/jira/browse/BEAM-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027310#comment-17027310 ]
Elias Djurfeldt edited comment on BEAM-9218 at 1/31/20 9:25 AM:
----------------------------------------------------------------
I've patched the fix in [https://github.com/apache/beam/pull/10728] and can confirm that it works.
One thing to note, which might be a separate issue: when the `requirements_file` argument is supplied to the command that creates the template, staging time increases considerably, to around 3-4 minutes instead of a few seconds without it. [~robertwb] Is this a bug or expected behaviour when working with templates? As far as I know, requirements shouldn't be resolved at template staging time, but rather at execution time.
The example below takes 3-4 minutes to stage:
{code:java}
python -m run_pipeline \
--project=$PROJECT \
--runner=$RUNNER \
--staging_location=$STAGING_LOCATION \
--temp_location=$TEMP_LOCATION \
--template_location=$TEMPLATE_LOCATION \
--requirements_file=requirements.txt \
--region=$REGION
{code}
Whereas this example takes only 4 seconds or so:
{code:java}
python -m run_pipeline \
--project=$PROJECT \
--runner=$RUNNER \
--staging_location=$STAGING_LOCATION \
--temp_location=$TEMP_LOCATION \
--template_location=$TEMPLATE_LOCATION \
--region=$REGION
{code}
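To make the comparison repeatable, the two invocations can be timed from a small script. This is only a rough sketch, not part of the reported fix: it assumes the same `run_pipeline` module and the same environment variables (`PROJECT`, `RUNNER`, `STAGING_LOCATION`, `TEMP_LOCATION`, `TEMPLATE_LOCATION`, `REGION`) as the commands in this comment, and the helper names `build_staging_command`, `time_command`, and `main` are hypothetical.

```python
import os
import subprocess
import sys
import time

# Options shared by both commands in the comment; each value is read from the
# environment variable of the same name in upper case (an assumption that
# mirrors the shell examples).
BASE_OPTIONS = (
    "project",
    "runner",
    "staging_location",
    "temp_location",
    "template_location",
    "region",
)


def build_staging_command(env, with_requirements):
    """Return the argv list for one template-staging run."""
    argv = [sys.executable, "-m", "run_pipeline"]
    for name in BASE_OPTIONS:
        argv.append("--{}={}".format(name, env.get(name.upper(), "")))
    if with_requirements:
        argv.append("--requirements_file=requirements.txt")
    return argv


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    return time.perf_counter() - start


def main():
    """Stage the template with and without requirements.txt and report timings."""
    for with_requirements in (True, False):
        elapsed = time_command(build_staging_command(os.environ, with_requirements))
        label = "with" if with_requirements else "without"
        print("{} requirements_file: {:.1f}s".format(label, elapsed))
```

Calling `main()` from the directory that contains `run_pipeline` and `requirements.txt` prints one timing per variant, which makes it easier to attach concrete numbers to the issue.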
> Template staging broken on Beam 2.18.0
> --------------------------------------
>
> Key: BEAM-9218
> URL: https://issues.apache.org/jira/browse/BEAM-9218
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.18.0
> Reporter: Michael Charkin
> Priority: Major
>
> Beam 2.18.0 cannot stage Cloud Dataflow templates.
>
> It looks like it tries to access the RuntimeValueProvider during staging, which causes a 'not accessible' error.
>
> Repo with code to reproduce the issue: [https://github.com/firemuzzy/dataflow-templates-bug]
>
> With the help of Stack Overflow, I narrowed the issue down to the latest Beam release rather than the Python version:
> [https://stackoverflow.com/questions/59940069/how-do-you-create-a-google-cloud-dataflow-template-with-python-3?noredirect=1#59940069]
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)