Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/04 12:46:25 UTC

[GitHub] [airflow] ozw1z5rd commented on issue #10155: Airflow 1.10.10 + DAG SERIALIZATION = fails to start manually the DAG's operators

ozw1z5rd commented on issue #10155:
URL: https://github.com/apache/airflow/issues/10155#issuecomment-668574862


   
   
You must enable DAG serialization to reproduce my issue; with serialization disabled, the issue does not occur on my company's system.
   
   These are my settings (from the pilot installation):
   
   ```
   min_serialized_dag_update_interval = 15
   store_dag_code = True
   max_num_rendered_ti_fields_per_task = 0 # this avoids the deadlock problem, which seems to affect only the MySQL engine
   ```
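
   For completeness, serialization itself is turned on with store_serialized_dags under [core]; a sketch of the full block, assuming the standard 1.10.10 option names:

   ```
   [core]
   # turns on DAG serialization (required to reproduce the issue)
   store_serialized_dags = True
   # my settings from above
   min_serialized_dag_update_interval = 15
   store_dag_code = True
   max_num_rendered_ti_fields_per_task = 0
   ```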
   Any DAG is affected; my tests were on this specific one:
   
   ```
   from datetime import timedelta
   
   from airflow.models import DAG
   from airflow.operators.bash_operator import BashOperator
   from airflow.operators.dummy_operator import DummyOperator
   from airflow.utils.dates import days_ago
   
   args = {
       'owner': 'Airflow',
       'start_date': days_ago(2),
   }
   
   dag = DAG(
       dag_id='example_sequence_restart', 
       default_args=args,
       schedule_interval='0 0 * * *',
       dagrun_timeout=timedelta(minutes=60),
       tags=['example']
   )
   
   run_this_last = DummyOperator(
       task_id='run_this_last',
       dag=dag,
   )
   
   # [START howto_operator_bash]
   run_this = BashOperator(
       task_id='run_after_loop',
       bash_command='echo 1',
       dag=dag,
   )
   # [END howto_operator_bash]
   
   run_this >> run_this_last
   
   task = BashOperator(
       task_id='start',
       bash_command='echo "{{ task_instance_key_str }}" && sleep 1',
       dag=dag,
   )
   task >> run_this
   
   # [START howto_operator_bash_template]
   also_run_this = BashOperator(
       task_id='also_run_this',
       bash_command='echo "run_id={{ run_id }} | dag_run={{ dag_run }}"',
       dag=dag,
   )
   # [END howto_operator_bash_template]
   
   also_run_this >> run_this_last
   ```
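
   To reproduce, I trigger the DAG manually; with the 1.10.x CLI that is something like:

   ```
   airflow trigger_dag example_sequence_restart
   # trigger again right away to get two runs close together
   airflow trigger_dag example_sequence_restart
   ```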
   
   I have to say that after the database migration I changed the database a bit (a SQL sketch follows the list):
   
   * dag_tag: changed the foreign key constraint to CONSTRAINT `dag_tag_ibfk_1` FOREIGN KEY (`dag_id`) REFERENCES `dag` (`dag_id`) ON DELETE CASCADE
   
   * rendered_task_instance_fields: changed execution_date from TIMESTAMP to TIMESTAMP(6)
   
   * task_fail: changed execution_date to TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
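
   In MySQL terms these changes were roughly the following (a sketch reconstructed from the descriptions above; the constraint name may differ on other installations):

   ```
   ALTER TABLE dag_tag DROP FOREIGN KEY dag_tag_ibfk_1;
   ALTER TABLE dag_tag
     ADD CONSTRAINT dag_tag_ibfk_1 FOREIGN KEY (dag_id)
     REFERENCES dag (dag_id) ON DELETE CASCADE;

   ALTER TABLE rendered_task_instance_fields
     MODIFY execution_date TIMESTAMP(6) NOT NULL;

   ALTER TABLE task_fail
     MODIFY execution_date TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6);
   ```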
   
   Before these changes (mostly the one on rendered_task_instance_fields) I was unable to manually trigger the same task twice and have both executions complete without errors: one completed, while the other failed on the insert into rendered_task_instance_fields:
   
   ```
   IntegrityError: (_mysql_exceptions.IntegrityError) (1062, "Duplicate entry 'PARTITIONADD-partition_add-2020-07-28 17:17:13' for key 'PRIMARY'")
   [SQL: INSERT INTO rendered_task_instance_fields (dag_id, task_id, execution_date, rendered_fields) VALUES (%s, %s, %s, %s)]
   [parameters: ('PARTITIONADD', 'partition_add', datetime.datetime(2020, 7, 28, 17, 17, 13, 315192), '{"hql": "\\n             ALTER TABLE unifieddata_cat.transient_ww_eventsjson\\n             ADD IF NOT EXISTS PARTITION( country = \'{country}\',year ... (158 characters truncated) ... e_url": "http://httpfs-preprod.hd.docomodigital.com:14000", "hdfs_path_pattern": "/Vault/Docomodigital/Preproduction/rawEvents/{country}/2020/07/28"}')]
   (Background on this error at: http://sqlalche.me/e/gkpj)
   
   ```
   After the change on execution_date everything worked fine.
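
   The precision is the point: a plain MySQL TIMESTAMP keeps only whole seconds, so the microseconds in execution_date get truncated and two triggers landing in the same second collapse onto the same primary key. A minimal sketch of the collision (the second datetime is hypothetical):

   ```
   from datetime import datetime

   # execution_date from the traceback above
   first = datetime(2020, 7, 28, 17, 17, 13, 315192)
   # hypothetical second trigger later within the same second
   second = datetime(2020, 7, 28, 17, 17, 13, 901441)

   # a TIMESTAMP column (no fractional seconds) stores both truncated:
   print(first.replace(microsecond=0) == second.replace(microsecond=0))  # True -> duplicate PRIMARY key

   # TIMESTAMP(6) keeps the microseconds, so the rows stay distinct:
   print(first == second)  # False
   ```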

