Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/11 01:26:49 UTC

[GitHub] [airflow] mik-laj opened a new issue #10285: Update example DAGs to latest syntax

mik-laj opened a new issue #10285:
URL: https://github.com/apache/airflow/issues/10285


   Hello,
   
   When we stop releasing backport packages, we will be able to update all example DAGs to use the new syntax for handling outputs introduced by AIP-31.
   
   This is an example change.
   ```diff
    product_set_update = CloudVisionUpdateProductSetOperator(
        location=GCP_VISION_LOCATION,
   -    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
   +    product_set_id=product_set_create.output,
        product_set=ProductSet(display_name='My Product Set 2'),
        task_id='product_set_update',
    )
   ```
   Best regards,
   Kamil Bregułła


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-867759831


   I think this could be done as an extra task afterwards.





[GitHub] [airflow] potiuk commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-865669729


   Sure. Take a look please :)





[GitHub] [airflow] josh-fell edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-870582068


   Going through these example DAGs, there are a number of `xcom_pull()` uses in which values from the `XCom` are accessed directly via an index and/or key, or iterated over with a `for` loop in Jinja. Trying to rewrite these instances using the new syntax is awfully cumbersome IMO. For example:
   ```python
   # Access by dict key
   job_id="{{task_instance.xcom_pull('start_python_job_dataflow_runner_async')['dataflow_job_id']}}"
   
   # Access by index
   object_name="{{ task_instance.xcom_pull('upload_sheet_to_gcs')[0] }}"
   
   # Using a for loop and accessing by dict key
   echo_cmd = """
   {% for m in task_instance.xcom_pull('pull_messages') %}
       echo "AckID: {{ m.get('ackId') }}, Base64-Encoded: {{ m.get('message') }}"
   {% endfor %}
   """
   ```
   The `output` property of operators doesn't seem to elegantly handle this type of access currently.  I can use a workaround with ugly hacks like these:
   ```python
   # Dict key access workaround (index access would be written similarly).
   # str() of the output is a Jinja expression; strip the outer "{{ }}" so it can be embedded.
   start_python_job_dataflow_runner_async_output = str(start_python_job_dataflow_runner_async.output).strip("{ }")
   
   wait_for_python_job_dataflow_runner_async_done = DataflowJobStatusSensor(
       task_id="wait-for-python-job-async-done",
       # Must be an f-string: quadrupled braces render as the literal "{{" and "}}"
       job_id=f"{{{{ {start_python_job_dataflow_runner_async_output}['dataflow_job_id'] }}}}",
       expected_statuses={DataflowJobStatus.JOB_STATE_DONE},
       project_id=GCP_PROJECT_ID,
       location='us-central1',
   )
   
   # For loop workaround (assumes `pull_messages` is the operator with task_id 'pull_messages')
   pull_messages_output = str(pull_messages.output).strip("{ }")
   echo_cmd = f"""
       {{% for m in {pull_messages_output} %}}
           echo "AckID: {{{{ m.get('ackId') }}}}, Base64-Encoded: {{{{ m.get('message') }}}}"
       {{% endfor %}}
       """
   ```
   I logged #16618 as a feature request to enhance the use of `XComArgs` to handle such access.  I was wondering if you all thought it was worth implementing these workarounds or to leave them as-is for now.  Perhaps there is a cleaner way to mimic the same result between `XComArg` and `xcom_pull()`? What do you think?
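
The brace handling in the workaround above is easy to get wrong, so here is a minimal, Airflow-free sketch of just the string manipulation. The `rendered` value is a stand-in for what `str(some_operator.output)` might produce; its exact text and the task id `upstream_task` are assumptions for illustration, not taken from the thread.

```python
# Stand-in for str(some_operator.output); the exact rendered text is an assumption.
rendered = "{{ task_instance.xcom_pull(task_ids='upstream_task', key='return_value') }}"

# Remove the outer "{{ " and " }}" so the expression can be embedded in a larger template.
inner = rendered.strip("{ }")

# In an f-string, "{{" and "}}" each collapse to a single literal brace, so four
# braces are needed to emit Jinja's "{{" / "}}" and two to emit "{%" / "%}".
key_access = f"{{{{ {inner}['dataflow_job_id'] }}}}"
loop = f"{{% for m in {inner} %}}{{{{ m.get('ackId') }}}}{{% endfor %}}"

print(key_access)
print(loop)
```

Without the `f` prefix the quadrupled braces would be passed to Jinja literally, which is why the workaround only makes sense as an f-string.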


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-671837374


   Or maybe we can do automated refactoring for that one? Sounds doable.





[GitHub] [airflow] mik-laj closed issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
mik-laj closed issue #10285:
URL: https://github.com/apache/airflow/issues/10285


   





[GitHub] [airflow] turbaszek commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-671839207


   This should probably be part of #9041.





[GitHub] [airflow] josh-fell edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-867651847


   @mik-laj @potiuk @turbaszek Would you all also like to see `PythonOperator` usage converted to the `@task` decorator, or would that kind of update be considered out of scope for this enhancement?





[GitHub] [airflow] hsnprsd commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
hsnprsd commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-727273937


   Hey, I am new here :). Can I contribute to this issue as my first contribution? Has someone already addressed it?





[GitHub] [airflow] mik-laj commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-911944188


   @josh-fell thanks. Closed!





[GitHub] [airflow] josh-fell commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-867651847


   @mik-laj @potiuk @turbaszek Would you all also like to see `PythonOperator` usage converted to use the `@task` decorator, or would that kind of update be considered out of scope for this enhancement?





[GitHub] [airflow] ashb edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-870649900


   @josh-fell Can't you do `start_python_job_dataflow_runner_async.output['dataflow_job_id']` or similar? Looking at the docs that looks like it should work.
   
   Oh, no, I'm misreading what/where the key is used.
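
One possible reading of that correction, sketched without Airflow: if indexing the `output` object selects a different XCom *key* rather than indexing into the stored return value, the two spellings are not equivalent. The `FakeXComArg` class below is a hypothetical stand-in written for this illustration; its behavior and rendered text are assumptions, not Airflow's actual implementation.

```python
class FakeXComArg:
    """Hypothetical stand-in for an XComArg-like object (not the Airflow class)."""

    def __init__(self, task_id, key="return_value"):
        self.task_id = task_id
        self.key = key

    def __getitem__(self, item):
        # Assumed behavior: indexing picks another XCom key; it does not
        # look inside the value stored under 'return_value'.
        return FakeXComArg(self.task_id, key=item)

    def __str__(self):
        return (
            f"{{{{ task_instance.xcom_pull("
            f"task_ids='{self.task_id}', key='{self.key}') }}}}"
        )


output = FakeXComArg("start_python_job_dataflow_runner_async")

# Template produced by indexing: pulls the XCom stored under key 'dataflow_job_id'.
by_key = str(output["dataflow_job_id"])

# Template from the thread: pulls the default XCom, then indexes into that dict.
original = (
    "{{ task_instance.xcom_pull('start_python_job_dataflow_runner_async')"
    "['dataflow_job_id'] }}"
)

print(by_key)
print(original)
```

Under that assumption, the indexed form only matches the original template when the upstream task pushes `dataflow_job_id` as its own XCom key.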





[GitHub] [airflow] ashb commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-870649900


   @josh-fell Can't you do `xcom['dataflow_job_id']` or similar? Looking at the docs that looks like it should work.





[GitHub] [airflow] josh-fell commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-865413377


   @mik-laj @potiuk @turbaszek Is this enhancement still desired?  I could take a crack at it if it is.





[GitHub] [airflow] potiuk commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-911947263


   Fantastic! Thanks @josh-fell !





[GitHub] [airflow] potiuk commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-671837772


   CC: @turbaszek 





[GitHub] [airflow] turbaszek edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
turbaszek edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-727275456


   @hsnprsd feel free to take this one 🚀 I'm happy to help with any questions. I think, as a good start, we should refactor `airflow/example_dags/*` to use `.output` instead of Jinja wherever possible (those DAGs are not part of providers).





[GitHub] [airflow] ashb edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-870649900


   @josh-fell Can't you do `start_python_job_dataflow_runner_async.output['dataflow_job_id']` or similar? Looking at the docs that looks like it should work.





[GitHub] [airflow] josh-fell edited a comment on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell edited a comment on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-870582068









[GitHub] [airflow] josh-fell commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
josh-fell commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-911902545


   @turbaszek @mik-laj @potiuk All of the PRs to update the example DAGs have now been merged. This issue can be closed if you feel they have satisfied the scope of this issue.





[GitHub] [airflow] turbaszek commented on issue #10285: Update example DAG files to latest syntax

Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #10285:
URL: https://github.com/apache/airflow/issues/10285#issuecomment-727275456


   @hsnprsd feel free to take this one 🚀 I'm happy to help with any questions. I think, as a good start, we should refactor `airflow/example_dags/*` to use `.output` instead of Jinja wherever possible.




