Posted to issues@beam.apache.org by "Elias Djurfeldt (Jira)" <ji...@apache.org> on 2020/01/31 09:26:00 UTC

[jira] [Comment Edited] (BEAM-9218) Template staging broken on Beam 2.18.0

    [ https://issues.apache.org/jira/browse/BEAM-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027310#comment-17027310 ] 

Elias Djurfeldt edited comment on BEAM-9218 at 1/31/20 9:25 AM:
----------------------------------------------------------------

I've patched the fix in [https://github.com/apache/beam/pull/10728] and can confirm that it works.

One thing to note, which might be a separate issue: when the `requirements_file` argument is supplied to the command that creates the template, the execution time increases considerably, to around 3-4 minutes instead of a few seconds without it. [~robertwb] Is this a bug or expected behaviour when working with templates? As far as I know, requirements shouldn't be resolved at template staging time, but rather at execution time.

The example below takes 3-4 minutes to stage:
{code:bash}
python -m run_pipeline \
  --project=$PROJECT \
  --runner=$RUNNER \
  --staging_location=$STAGING_LOCATION \
  --temp_location=$TEMP_LOCATION \
  --template_location=$TEMPLATE_LOCATION \
  --requirements_file=requirements.txt \
  --region=$REGION
{code}
Whereas this example takes only 4 seconds or so:
{code:bash}
python -m run_pipeline \
  --project=$PROJECT \
  --runner=$RUNNER \
  --staging_location=$STAGING_LOCATION \
  --temp_location=$TEMP_LOCATION \
  --template_location=$TEMPLATE_LOCATION \
  --region=$REGION
{code}
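
To reproduce the timing comparison, a minimal sketch along the following lines might help. It is not part of the original report: the helper names `time_command`, `staging_command` and `compare` are hypothetical, and it assumes the same `run_pipeline` module and the same environment variables (`PROJECT`, `RUNNER`, `STAGING_LOCATION`, `TEMP_LOCATION`, `TEMPLATE_LOCATION`, `REGION`) used in the commands above.

```python
import os
import subprocess
import sys
import time


def time_command(args):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(args, check=False)
    return completed.returncode, time.perf_counter() - start


def staging_command(extra_args=()):
    """Build the template-staging command from environment variables.

    Assumption: run_pipeline is the module from the commands above and the
    variables below are set in the environment; unset ones become empty.
    """
    env = os.environ
    base = [
        sys.executable, "-m", "run_pipeline",
        "--project=" + env.get("PROJECT", ""),
        "--runner=" + env.get("RUNNER", ""),
        "--staging_location=" + env.get("STAGING_LOCATION", ""),
        "--temp_location=" + env.get("TEMP_LOCATION", ""),
        "--template_location=" + env.get("TEMPLATE_LOCATION", ""),
        "--region=" + env.get("REGION", ""),
    ]
    return base + list(extra_args)


def compare():
    """Stage the template with and without --requirements_file and print timings."""
    variants = [
        ("without requirements_file", []),
        ("with requirements_file", ["--requirements_file=requirements.txt"]),
    ]
    for label, extra in variants:
        code, elapsed = time_command(staging_command(extra))
        print(f"{label}: exit code {code}, {elapsed:.1f} s")
```

Calling `compare()` from the directory that contains `run_pipeline` and `requirements.txt` runs both variants back to back and prints the wall-clock time of each.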
 


> Template staging broken on Beam 2.18.0
> --------------------------------------
>
>                 Key: BEAM-9218
>                 URL: https://issues.apache.org/jira/browse/BEAM-9218
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.18.0
>            Reporter: Michael Charkin
>            Priority: Major
>
> Beam 2.18.0 cannot stage Cloud Dataflow templates.
>  
> It looks like it is trying to access the RuntimeValueProvider during staging, which causes a 'not accessible' error.
>  
> Repo with code to reproduce the issue: [https://github.com/firemuzzy/dataflow-templates-bug]
>  
> With the help of Stack Overflow, I narrowed the issue down to the latest Beam release rather than the Python version:
> [https://stackoverflow.com/questions/59940069/how-do-you-create-a-google-cloud-dataflow-template-with-python-3?noredirect=1#59940069]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)