Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/12/03 15:10:21 UTC

[GitHub] [airflow] neptune19821220 opened a new issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

neptune19821220 opened a new issue #20016:
URL: https://github.com/apache/airflow/issues/20016


   ### Apache Airflow version
   
   2.2.2 (latest released)
   
   ### Operating System
   
   docker image apache/airflow:2.2.2-python3.9
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   I deployed Airflow in an AWS ECS Fargate cluster.
   I use Aurora Serverless (MySQL 5.7) as the Airflow db backend and an AWS Redis cluster as AIRFLOW__CELERY__BROKER_URL.
   I was also aware of the db backend charset requirements, so I configured the db parameters as follows:
   character_set_client: utf8mb4
   character_set_connection: utf8mb4
   character_set_database: utf8mb4
   character_set_server: utf8mb4
   collation_connection: utf8mb4_unicode_ci
   collation_server: utf8mb4_unicode_ci
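
Parameter-group values and the values a live session actually sees can differ, so it may be worth comparing the two. The following is a minimal sketch, not part of the original report: `charset_mismatches` is a hypothetical helper, and the `actual` mapping is assumed to be built from the rows returned by `SHOW VARIABLES` on the running server.

```python
# Expected values are the ones listed in the report above.
EXPECTED = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_server": "utf8mb4",
    "collation_connection": "utf8mb4_unicode_ci",
    "collation_server": "utf8mb4_unicode_ci",
}


def charset_mismatches(actual):
    """Return {variable: (expected, actual)} for every setting that differs.

    `actual` is a hypothetical mapping of variable name to value, built from
    the rows returned by SHOW VARIABLES on the live server.
    """
    return {
        name: (want, actual.get(name))
        for name, want in EXPECTED.items()
        if actual.get(name) != want
    }
```

An empty result would suggest the session settings match the configured parameters; any entry shows the expected value next to the one actually in effect.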
   
   ### What happened
   
   I was using the apache/airflow:2.1.3-python3.8 image and everything worked well.
   Yesterday I used apache/airflow:2.2.2-python3.9 to upgrade my cluster to the latest version.
   
   When the init db service ran `airflow db upgrade`, it threw this exception:
   sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1032, "Can't find record in 'task_instance'")
   [SQL: UPDATE task_instance, dag_run SET task_instance.run_id=dag_run.run_id WHERE dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date]
   
   It appears to stop at line 187 of the db migration file: https://github.com/apache/airflow/blob/5dd690b57a20ca944deb8d96e89ec6ae6161afeb/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py
   
   I also tried running this SQL directly from the mysql client, and it threw the same error.
   
   Because the upgrade process had already completed some steps (for example, dropping dag_id and dag_id_2), it is impossible to re-run `airflow db upgrade` to try again.
   How can I get out of this state other than by restoring the db from my backup?
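
One diagnostic that might help narrow this down is checking whether any `task_instance` rows have no matching `dag_run` row for the join used by the failing UPDATE. The sketch below uses Python's built-in `sqlite3` as a stand-in database with a simplified two-table schema (an assumption: the real Airflow tables have many more columns). It is not known whether such rows are related to error 1032; the same `LEFT JOIN` query could be run against MySQL to find out.

```python
import sqlite3

# Simplified stand-in schema: only the columns used by the failing UPDATE.
SCHEMA = """
CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, run_id TEXT);
CREATE TABLE task_instance (
    task_id TEXT, dag_id TEXT, execution_date TEXT, run_id TEXT
);
"""

# Task instances whose (dag_id, execution_date) has no matching dag_run row.
ORPHAN_QUERY = """
SELECT ti.dag_id, ti.task_id, ti.execution_date
FROM task_instance AS ti
LEFT JOIN dag_run AS dr
  ON dr.dag_id = ti.dag_id AND dr.execution_date = ti.execution_date
WHERE dr.dag_id IS NULL
ORDER BY ti.dag_id, ti.task_id
"""


def find_orphans(conn):
    """Return task_instance rows that the UPDATE's join would not match."""
    return conn.execute(ORPHAN_QUERY).fetchall()


def demo():
    """Build a tiny in-memory example with one matched and one orphan row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO dag_run VALUES ('d1', '2021-12-01', 'scheduled__1')"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, NULL)",
        [("t1", "d1", "2021-12-01"), ("t2", "d1", "2021-12-02")],
    )
    return find_orphans(conn)
```

In the demo data, only `t2` lacks a `dag_run` with the same `dag_id` and `execution_date`, so it is the only row returned.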
   
   ### What you expected to happen
   
   The db upgrade process should be successful.
   
   ### How to reproduce
   
   Create the ECS Fargate cluster with apache/airflow:2.1.3-python3.8 according to the file
   https://airflow.apache.org/docs/apache-airflow/2.1.3/docker-compose.yaml.
   Use Aurora Serverless (MySQL 5.7) as the Airflow db backend and an AWS Redis cluster as AIRFLOW__CELERY__BROKER_URL.
   Then replace the Airflow image with apache/airflow:2.2.2-python3.9 and update the template.
   
   I am not sure whether this issue can be reproduced.
   I have three environments created from the same CloudFormation template, so the infrastructure should be the same.
   I am not sure whether the problem is related to the number of records in the tables.
   In the develop environment, there are about 3k records in the task_instance table and about 300 records in the dag_run table.
   In the staging environment, there are about 13k records in the task_instance table and about 1300 records in the dag_run table.
   In the production environment, there are about 13k records in the task_instance table and about 1300 records in the dag_run table.
   
   The upgrade process finished successfully in the develop environment,
   but it always failed in the staging and production environments.
   
   ### Anything else
   
   1. If the db upgrade process fails, is it possible to roll back all the changes already made during the upgrade?
   2. Is it possible to provide a SQL file that does the same thing as the db migration file?
       When I get stuck at some step, I could then fix the db manually and run the remaining SQL directly.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org






[GitHub] [airflow] potiuk edited a comment on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-986838313


   Would it be possible for you to send more logs of what happens, possibly including server-side logs from around the time of the upgrade?
   
   I have a feeling that this might be a similar class of error to https://github.com/apache/airflow/issues/19988 (already solved in `main`), but we are just missing a crucial error message, and the failing UPDATE is only the result of an earlier error or maybe some replication delay.





[GitHub] [airflow] neptune19821220 commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
neptune19821220 commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-989940873


   @potiuk @ashb
   According to the reply from AWS support:
   1. the RDS MySQL 5.7 instance runs version 5.7.22, and the upgrade finished.
   2. the Aurora Serverless MySQL 5.7 instance runs version 5.7.12, and the upgrade failed.
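
Given the version difference reported here, a pre-flight check before running the upgrade could compare the server version against the one that is known to have worked. This is a minimal sketch under assumptions: the version string is assumed to come from `SELECT VERSION()`, the helper names are hypothetical, and 5.7.22 is only the version on which the upgrade was observed to succeed, not a documented minimum.

```python
import re


def parse_mysql_version(version_string):
    """Turn e.g. '5.7.22-log' into (5, 7, 22); raise ValueError if malformed."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised MySQL version: {version_string!r}")
    return tuple(int(part) for part in match.groups())


# Hypothetical threshold: simply the version on which the upgrade was
# reported to succeed in this thread, not an official requirement.
KNOWN_GOOD = (5, 7, 22)


def older_than_known_good(version_string):
    """True if the server is older than the version that worked here."""
    return parse_mysql_version(version_string) < KNOWN_GOOD
```

Comparing integer tuples rather than strings avoids mistakes such as treating "5.7.9" as newer than "5.7.12".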





[GitHub] [airflow] neptune19821220 commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
neptune19821220 commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-986844030


   @potiuk @ashb 
   Thank you for your reply.
   
   Although I didn't stop the Airflow services during the upgrade, there were no running tasks at that time.
   I chose an idle time window to do the upgrade.








[GitHub] [airflow] github-actions[bot] commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-1019599212


   This issue has been closed because it has not received response from the issue author.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-985596971


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] neptune19821220 edited a comment on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
neptune19821220 edited a comment on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-986828774


   Hi All,
   
   As a further test, I used mysqldump to dump the Airflow database from the staging Aurora Serverless cluster.
   1.  I restored the data to a new AWS RDS MySQL 5.7 instance (no Multi-AZ, no replication) with the same charset, and `airflow db upgrade` finished successfully.
   2.  I restored the data to a new AWS Aurora Serverless MySQL 5.7 cluster with the same charset, and `airflow db upgrade` failed at the same step.
   
   It seems there is some implicit difference between AWS Aurora Serverless and RDS, but I am sorry, I don't have much experience with MySQL.





[GitHub] [airflow] kaxil commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-986736514


   ### How to downgrade?
   
   Apply this diff, then run `alembic downgrade ccde3e26fe78` from the directory of the installed `airflow` package:
   
   ```
   cd $(python -c "import airflow; import os; print(os.path.abspath(os.path.join(airflow.__file__, os.pardir)))")
   alembic downgrade ccde3e26fe78
   ```
   
   The Diff:
   
   ```diff
   diff --git a/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py b/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py
   index 25c9aa67a..775d23947 100644
   --- a/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py
   +++ b/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py
   @@ -347,12 +347,12 @@ def downgrade():
            batch_op.drop_index('idx_task_reschedule_dag_task_run')
   
        with op.batch_alter_table('task_instance', schema=None) as batch_op:
   +        batch_op.drop_constraint('task_instance_pkey', type_='primary')
            batch_op.alter_column('execution_date', existing_type=dt_type, existing_nullable=True, nullable=False)
            batch_op.alter_column(
                'dag_id', existing_type=string_id_col_type, existing_nullable=True, nullable=True
            )
   
   -        batch_op.drop_constraint('task_instance_pkey', type_='primary')
            batch_op.create_primary_key('task_instance_pkey', ['dag_id', 'task_id', 'execution_date'])
   
            batch_op.drop_constraint('task_instance_dag_run_fkey', type_='foreignkey')
   @@ -416,11 +416,11 @@ def downgrade():
            )
        else:
            with op.batch_alter_table('dag_run', schema=None) as batch_op:
   -            batch_op.drop_index('dag_id_state', table_name='dag_run')
   +            batch_op.drop_index('dag_id_state')
                batch_op.alter_column('run_id', existing_type=sa.VARCHAR(length=250), nullable=True)
                batch_op.alter_column('execution_date', existing_type=dt_type, nullable=True)
                batch_op.alter_column('dag_id', existing_type=sa.VARCHAR(length=250), nullable=True)
   -            batch_op.create_index('dag_id_state', 'dag_run', ['dag_id', 'state'], unique=False)
   +            batch_op.create_index('dag_id_state', ['dag_id', 'state'], unique=False)
   
   
    def _multi_table_update(dialect_name, target, column):
   diff --git a/airflow/migrations/versions/e9304a3141f0_make_xcom_pkey_columns_non_nullable.py b/airflow/migrations/versions/e9304a3141f0_make_xcom_pkey_columns_non_nullable.py
   index bde065b3e..1906fe76c 100644
   --- a/airflow/migrations/versions/e9304a3141f0_make_xcom_pkey_columns_non_nullable.py
   +++ b/airflow/migrations/versions/e9304a3141f0_make_xcom_pkey_columns_non_nullable.py
   @@ -70,7 +70,10 @@ def downgrade():
        """Unapply make xcom pkey columns non-nullable"""
        conn = op.get_bind()
        with op.batch_alter_table('xcom') as bop:
   -        if conn.dialect.name == 'mssql':
   +        try:
                bop.drop_constraint('pk_xcom', 'primary')
   +        except Exception:
   +            pass
            bop.alter_column("key", type_=sa.String(length=512, **COLLATION_ARGS), nullable=True)
            bop.alter_column("execution_date", type_=_get_timestamp(conn), nullable=True)
   +        bop.create_primary_key('pk_xcom', ['dag_id', 'task_id', 'key', 'execution_date'])
   ```
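
The directory lookup in the downgrade commands above imports `airflow` to find where it is installed. As an alternative sketch, the same directory can be located without importing the package, using only the standard library. `package_dir` is a hypothetical helper name, and calling it with `"airflow"` assumes Airflow is installed in the active environment.

```python
import importlib.util
from pathlib import Path


def package_dir(package_name):
    """Return the directory containing an installed package's __init__.py."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"package not found: {package_name}")
    return Path(spec.origin).resolve().parent
```

For example, `package_dir("airflow")` should give the directory that the `cd` command above changes into.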
   
   ### What's causing the issue?
   
   I am not sure about this yet. Any thoughts, @jedcunningham @ashb @uranusjr?





[GitHub] [airflow] github-actions[bot] commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-1013779157


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] ashb commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-986793360


   Could it be because Airflow was still running while you were trying to upgrade?





[GitHub] [airflow] github-actions[bot] closed issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #20016:
URL: https://github.com/apache/airflow/issues/20016


   





[GitHub] [airflow] ashb commented on issue #20016: Upgrading Apache Airflow from version 2.1.3 to 2.2.2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #20016:
URL: https://github.com/apache/airflow/issues/20016#issuecomment-989947364


   Sounds like an odd problem with Aurora then 😢 

