Posted to commits@airflow.apache.org by "jack (JIRA)" <ji...@apache.org> on 2018/12/11 12:00:00 UTC

[jira] [Updated] (AIRFLOW-3499) Add flag to Operators to write the Render to the log.

     [ https://issues.apache.org/jira/browse/AIRFLOW-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jack updated AIRFLOW-3499:
--------------------------
    Description: 
*Motivation:*

I have a few operators that use a Variable. The variable is updated constantly and gets overwritten with more recent data.

Say I have this operator:

 
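{code:python}
from airflow.contrib.operators.mysql_to_gcs import MySqlToGoogleCloudStorageOperator
from airflow.models import Variable

# Read at DAG-parse time, so this reflects the Variable's current value.
NEXT_ORDER_ID = Variable.get("next_order_id_to_import")

import_orders = MySqlToGoogleCloudStorageOperator(
    task_id='import',
    mysql_conn_id='c_production',
    google_cloud_storage_conn_id='gcp_m',
    approx_max_file_size_bytes=100000000,
    # Double quotes outside so the single quotes inside the template stay valid.
    sql="Select … from … where orders_id between {{ params.next_order_id }} and {{ ti.xcom_pull('max_order_id') }}",
    # The key must match the name used in the template (params.next_order_id).
    params={'next_order_id': NEXT_ORDER_ID},
    bucket=GCS_BUCKET_ID,
    filename=file_name_orders_products,
    dag=dag)
{code}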
 

 

 

The problem is that I cannot see the parameters that were actually used for this query on the render page.

In fact, +there is no way of knowing what the actual query that the task executed was+.

 

To be exact:
{code}
{{ ti.xcom_pull('max_order_id') }}
{code}
 - Saved in the database, so the render page always shows the correct value for each DAG run.

 
{code}
{{ params.next_order_id }}
{code}
 - Always shows the most recent value, because it is not a DAG parameter and is not saved to the database. When the render page is opened, it reads the Variable and takes its current value, regardless of whether that was the value during the run.
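To illustrate the difference, here is a minimal sketch in plain Python. It does not use Airflow: the in-memory {{variables}} dict and the {{parse_dag}} and {{render}} helpers are hypothetical stand-ins for the Variable store, DAG-file parsing and template rendering, and {{str.format}} stands in for Jinja.

```python
# Stand-in for Airflow's Variable store (hypothetical, in-memory).
variables = {"next_order_id_to_import": 100}


def parse_dag():
    """Mimic DAG-file parsing: the Variable is read at parse time."""
    return {"next_order_id": variables["next_order_id_to_import"]}


def render(params):
    """Mimic template rendering with the params captured at parse time."""
    return "orders_id >= {next_order_id}".format(**params)


# The task runs: the DAG file is parsed and the query is rendered.
executed_query = render(parse_dag())

# Later the Variable is overwritten with newer data.
variables["next_order_id_to_import"] = 250

# Opening the render page parses the DAG file again.
displayed_query = render(parse_dag())
```

Under these assumptions {{executed_query}} and {{displayed_query}} differ, which is the mismatch described above: the page shows a query built from the current Variable value, not the one that ran.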

 

 

*Suggested Solution:*

Since it's unlikely that the Render tab will be changed (my use case may differ from how others use it), the best solution is simply to allow writing the Render, as it was during the execution of the task, to the task log. This will help to trace back issues.

 

Basically, add a flag to all operators (BaseOperator?):
{code}
write_render_to_log
{code}
which defaults to False. If this flag is set to True, the rendered content will be written to the task log.
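A rough sketch of how such a flag might behave, written in plain Python so it runs without Airflow. Everything here is an assumption for illustration: {{RenderLoggingMixin}}, {{DemoOperator}} and {{log_rendered_fields}} are hypothetical names, not existing Airflow API. Only the {{template_fields}} convention is borrowed from Airflow, where it lists the attributes that get rendered before execution.

```python
import logging


class RenderLoggingMixin:
    """Hypothetical mixin: logs rendered template fields when asked to."""

    # Names of the attributes that hold templated values (mirrors
    # Airflow's ``template_fields`` convention).
    template_fields = ()

    def __init__(self, write_render_to_log=False):
        # The flag proposed in this issue; off by default.
        self.write_render_to_log = write_render_to_log
        self.log = logging.getLogger(type(self).__name__)

    def log_rendered_fields(self):
        """Write each already-rendered template field to the log.

        Returns the logged lines so that callers can inspect them.
        """
        if not self.write_render_to_log:
            return []
        lines = []
        for name in self.template_fields:
            line = "Rendered %s: %s" % (name, getattr(self, name))
            self.log.info(line)
            lines.append(line)
        return lines


class DemoOperator(RenderLoggingMixin):
    """Stand-in operator (not a real Airflow class)."""

    template_fields = ("sql",)

    def __init__(self, sql, **kwargs):
        super().__init__(**kwargs)
        self.sql = sql


# After rendering, ``sql`` holds the concrete query that will run.
op = DemoOperator(
    sql="SELECT 1 WHERE orders_id BETWEEN 100 AND 200",
    write_render_to_log=True,
)
logged = op.log_rendered_fields()
```

In a real implementation the logging call would presumably happen once, right after the templates are rendered and before the operator executes, so that the log records the values used in that specific run.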

 


> Add flag to Operators to write the Render to the log.
> -----------------------------------------------------
>
>                 Key: AIRFLOW-3499
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3499
>             Project: Apache Airflow
>          Issue Type: Wish
>    Affects Versions: 2.0.0
>            Reporter: jack
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)