Posted to commits@airflow.apache.org by "Kamil Bregula (Jira)" <ji...@apache.org> on 2020/02/26 05:50:00 UTC
[jira] [Resolved] (AIRFLOW-6824) EMRAddStepsOperator does not work well with multi-step XCom
[ https://issues.apache.org/jira/browse/AIRFLOW-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kamil Bregula resolved AIRFLOW-6824.
------------------------------------
Fix Version/s: 2.0.0
Resolution: Fixed
> EMRAddStepsOperator does not work well with multi-step XCom
> -----------------------------------------------------------
>
> Key: AIRFLOW-6824
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6824
> Project: Apache Airflow
> Issue Type: Bug
> Components: aws
> Affects Versions: 1.10.9
> Reporter: Bjorn Olsen
> Assignee: Bjorn Olsen
> Priority: Minor
> Fix For: 2.0.0
>
>
> EmrAddStepsOperator allows you to add several steps to EMR for processing - the steps must be supplied as a list.
> This works well when passing an actual Python list as the 'steps' value, but we want to be able to generate the list of steps from a previous task - using an XCom.
> We must use the operator as follows, for the templating to work correctly and for it to resolve the XCom:
>
> {code:java}
> add_steps_task = EmrAddStepsOperator(
>     task_id='add_steps',
>     job_flow_id=job_flow_id,
>     aws_conn_id='aws_default',
>     provide_context=True,
>     steps="{{task_instance.xcom_pull(task_ids='generate_steps')}}"
> ){code}
>
> The value in XCom from the 'generate_steps' task looks like (simplified):
> {code:java}
> [{'Name':'Step1'}, {'Name':'Step2'}]
> {code}
> However, the template renders to a string, so the operator receives a string rather than a list. That string cannot be passed to the underlying boto3 library, which expects a list object.
> The following won't work either:
> {code:java}
> add_steps_task = EmrAddStepsOperator(
>     task_id='add_steps',
>     job_flow_id=job_flow_id,
>     aws_conn_id='aws_default',
>     provide_context=True,
>     steps={{task_instance.xcom_pull(task_ids='generate_steps')}}
> ){code}
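> This does not work because, outside a string, the double braces are parsed as a Python set literal rather than as a Jinja template, and task_instance is not defined when the DAG file is parsed.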
> We therefore have to pass the steps to the operator as a string and convert that string into a list after render_template_fields has run, which happens immediately before execute. The only place to do the conversion from string to list is the operator's execute method.
>
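The string-to-list conversion described above could be sketched roughly as follows. This is only an illustration and not necessarily the change that resolved the issue: it assumes the rendered XCom value is the Python repr of a list of dicts, it uses ast.literal_eval from the standard library for parsing, and the helper name coerce_steps is hypothetical. Inside an operator, a call like this would sit at the start of execute, before the steps are handed to boto3.

```python
import ast
from typing import Any, Dict, List, Union


def coerce_steps(steps: Union[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Return steps as a list, parsing it first if templating rendered it to a string.

    Hypothetical helper for illustration; not part of Airflow.
    """
    if isinstance(steps, str):
        # literal_eval only accepts Python literals (lists, dicts, strings,
        # numbers, ...), so it does not execute arbitrary code.
        steps = ast.literal_eval(steps)
    if not isinstance(steps, list):
        raise TypeError(f"steps must be a list, got {type(steps).__name__}")
    return steps


# Simulates what the operator receives after the template is rendered:
# the XCom list has been turned into its string representation.
rendered = str([{'Name': 'Step1'}, {'Name': 'Step2'}])
steps = coerce_steps(rendered)
print(type(rendered).__name__, type(steps).__name__, len(steps))  # prints: str list 2
```

Passing a real list through the same helper leaves it unchanged, so a conversion of this kind would not affect DAGs that already supply 'steps' as a Python list.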
--
This message was sent by Atlassian Jira
(v8.3.4#803005)