Posted to commits@airflow.apache.org by "Iuliia Volkova (JIRA)" <ji...@apache.org> on 2018/09/12 18:33:00 UTC
[jira] [Commented] (AIRFLOW-2469) example task in documentation causes dataflow operator to fail
[ https://issues.apache.org/jira/browse/AIRFLOW-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612566#comment-16612566 ]
Iuliia Volkova commented on AIRFLOW-2469:
-----------------------------------------
Link to the documentation that describes the pattern-matching condition: [https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/options/PipelineOptions.JobNameFactory.html]
I don't think it is correct to add validation for the task_id name alone. Instead, I will implement removal of the characters that do not match the pattern before the task_id is used as the Apache Beam job name:
[https://github.com/apache/incubator-airflow/blob/c3939c8e721870d263997e7aeaebc28e678d544b/airflow/contrib/hooks/gcp_dataflow_hook.py#L213]
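To illustrate the idea, here is a minimal sketch of such a cleanup step. It is not the code in gcp_dataflow_hook.py; the names is_valid_job_name and sanitize_job_name are hypothetical, and the only thing taken from the issue is the regex [a-z]([-a-z0-9]*[a-z0-9])?. The sketch also assumes that replacing a non-matching character with '-' is acceptable, rather than deleting it outright.

```python
import re

# Pattern quoted in the issue description for valid job names.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def is_valid_job_name(name):
    """Return True if the whole string matches the job name pattern."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


def sanitize_job_name(task_id):
    """Hypothetical helper: derive a pattern-conforming job name from a task_id."""
    name = task_id.lower()
    # Assumption: replace every character outside [-a-z0-9] with a hyphen.
    name = re.sub(r"[^-a-z0-9]", "-", name)
    # The name must start with a letter, so drop leading digits and hyphens.
    name = re.sub(r"^[^a-z]+", "", name)
    # The name must not end with a hyphen.
    name = name.rstrip("-")
    if not is_valid_job_name(name):
        raise ValueError("cannot derive a valid job name from %r" % task_id)
    return name


print(is_valid_job_name("datapflow_example"))
print(sanitize_job_name("datapflow_example"))
```

With this approach the task_id from the documentation example stays unchanged in the DAG, and only the name passed to Dataflow is adjusted. A task_id with no usable characters (for example, only underscores) raises an error instead of producing an empty job name.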
> example task in documentation causes dataflow operator to fail
> --------------------------------------------------------------
>
> Key: AIRFLOW-2469
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2469
> Project: Apache Airflow
> Issue Type: Bug
> Components: Dataflow
> Affects Versions: 1.9.0
> Reporter: Chris Chow
> Assignee: Iuliia Volkova
> Priority: Critical
>
> https://github.com/apache/incubator-airflow/blob/c7a472ed6b0d8a4720f57ba1140c8cf665757167/airflow/contrib/operators/dataflow_operator.py#L176
> {noformat}
> t1 = DataflowTemplateOperator(
>     task_id='datapflow_example',
>     template='{{var.value.gcp_dataflow_base}}',
>     parameters={
>         'inputFile': "gs://bucket/input/my_input.txt",
>         'outputFile': "gs://bucket/output/my_output.txt"
>     },
>     gcp_conn_id='gcp-airflow-service-account',
>     dag=my-dag){noformat}
> If you actually name a Dataflow task 'datapflow_example', the Google Dataflow service will not accept the job because the name is invalid: Dataflow job names cannot contain '_'. Strictly speaking, Apache Beam job names must match the regex
> [a-z]([-a-z0-9]*[a-z0-9])?.