Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/21 16:47:29 UTC

[GitHub] [airflow] howardyoo opened a new issue #21023: Running airflow dags test results in error when run twice.

howardyoo opened a new issue #21023:
URL: https://github.com/apache/airflow/issues/21023


   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   # Product and Version
   Airflow Version: v2.3.0.dev0 (Git Version: .release:2.3.0.dev0+7a9ab1d7170567b1d53938b2f7345dae2026c6ea). I am using it to test and learn its functionality. I installed it by cloning the Git repository and building Airflow on my macOS environment, using Python 3.9.
   
   # Problem Statement
   When testing my DAG, I wanted to run
   `airflow dags test <dag_id> <execution_dt>` so that I don't have to use the UI to trigger DAG runs each time. Running the test from the command line and inspecting the result is more productive for rapid iteration on a DAG.
   
   The test runs perfectly the first time, but when I re-run it, the following error is raised:
   ```
   [2022-01-21 10:30:33,530] {migration.py:201} INFO - Context impl SQLiteImpl.
   [2022-01-21 10:30:33,530] {migration.py:204} INFO - Will assume non-transactional DDL.
   [2022-01-21 10:30:33,568] {dagbag.py:498} INFO - Filling up the DagBag from /Users/howardyoo/airflow/dags
   [2022-01-21 10:30:33,588] {example_python_operator.py:67} WARNING - The virtalenv_python example task requires virtualenv, please install it.
   [2022-01-21 10:30:33,594] {tutorial_taskflow_api_etl_virtualenv.py:29} WARNING - The tutorial_taskflow_api_etl_virtualenv example DAG requires virtualenv, please install it.
   Traceback (most recent call last):
     File "/Users/howardyoo/python3/bin/airflow", line 33, in <module>
       sys.exit(load_entry_point('apache-airflow==2.3.0.dev0', 'console_scripts', 'airflow')())
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/__main__.py", line 48, in main
       args.func(args)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 50, in command
       return func(*args, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/cli.py", line 98, in wrapper
       return f(*args, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/cli/commands/dag_command.py", line 429, in dag_test
       dag.clear(start_date=args.execution_date, end_date=args.execution_date, dag_run_state=State.NONE)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dag.py", line 1906, in clear
       clear_task_instances(
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 286, in clear_task_instances
       dr.state = dag_run_state
     File "<string>", line 1, in __set__
     File "/Users/howardyoo/python3/lib/python3.9/site-packages/airflow/models/dagrun.py", line 207, in set_state
       raise ValueError(f"invalid DagRun state: {state}")
   ValueError: invalid DagRun state: None
   ```
   When going through the DAG runs in my UI, I noticed the following entry on my dag test run.
   ![Screen Shot 2022-01-21 at 10 31 52 AM](https://user-images.githubusercontent.com/32691630/150564356-f8b95b11-794a-451e-b5ad-ab9b59f3b52b.png)
   It looks like running a DAG in `test` mode submits the DAG run with the `backfill` type. I am not sure why `airflow dags test` succeeds only once, but a step that clears out the previous test run may be missing (just my theory).
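
The traceback suggests one possible failure path: on a re-run, the clear step assigns `None` to the state of the DAG run left over from the first run, and the state setter rejects it. The sketch below is a toy model of that path, not Airflow's actual code. `FakeDagRun`, `clear_runs`, `simulate_test_command`, and the list of valid states are all hypothetical names chosen for illustration; only the error message format is taken from the traceback.

```python
# Toy model of the failure path seen in the traceback.
# All names here are hypothetical; this is not Airflow's implementation.

# Assumed set of valid run states, for illustration only.
VALID_STATES = frozenset({"queued", "running", "success", "failed"})


class FakeDagRun:
    """Stand-in for a DAG run whose state setter validates its input."""

    def __init__(self):
        self._state = "queued"

    @property
    def state(self):
        return self._state

    @state.setter
    def state(self, value):
        # Same message format as the ValueError in the traceback:
        # anything outside the allowed set, including None, is rejected.
        if value not in VALID_STATES:
            raise ValueError(f"invalid DagRun state: {value}")
        self._state = value


def clear_runs(existing_runs, dag_run_state):
    """Reset the state of every previously recorded run."""
    for dr in existing_runs:
        dr.state = dag_run_state


def simulate_test_command(recorded_runs):
    """Model one test invocation: clear old runs, then record a new one."""
    # Passing None models the dag_run_state=State.NONE argument
    # shown in the traceback.
    clear_runs(recorded_runs, None)
    recorded_runs.append(FakeDagRun())


if __name__ == "__main__":
    runs = []
    simulate_test_command(runs)      # first run: nothing to clear
    try:
        simulate_test_command(runs)  # second run: a recorded run exists
    except ValueError as exc:
        print(f"second run failed: {exc}")
```

In this model the first invocation succeeds because there is nothing to clear, and the second fails because one recorded run exists. Removing the recorded run from the list lets the next invocation succeed again.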
   
   # Workaround
   A viable workaround is to find and delete the DAG run instance. Once the DAG run entry shown above is deleted, `airflow dags test` runs successfully again.
   
   
   ### What you expected to happen
   
   According to the documentation (https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html#id2), it is stated that:
   
   > The same applies to airflow dags test [dag_id] [logical_date], but on a DAG level. It performs a single DAG run of the given DAG id. While it does take task dependencies into account, no state is registered in the database. It is convenient for locally testing a full run of your DAG, given that e.g. if one of your tasks expects data at some location, it is available.
   
   The documentation does not say that the DAG run instance must be deleted before re-running the test, so I would expect `airflow dags test` to succeed on the first run and on every subsequent run without errors.
   
   ### How to reproduce
   
   - Install the reported version of Airflow.
   - Start Airflow with the `airflow standalone` command. It should start the basic webserver, scheduler, and triggerer.
   - Pick any DAG from the list of available DAGs and run `airflow dags test <dag_id> <start_dt>` to start the test.
   - Once the test finishes, re-run the same command and observe the error.
   - Go to the DAG runs view, delete the DAG run that the first run produced, and run the test again. The test should succeed.
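
The two consecutive invocations in the steps above could be scripted roughly as follows. This is only a sketch: `build_test_command` and `run_twice` are hypothetical helper names, the DAG id and date passed in are placeholders, and it assumes the `airflow` executable is on the `PATH`.

```python
import subprocess


def build_test_command(dag_id, logical_date):
    """Build the argv list for `airflow dags test <dag_id> <date>`."""
    return ["airflow", "dags", "test", dag_id, logical_date]


def run_twice(dag_id, logical_date, runner=subprocess.run):
    """Run the test command twice and return both exit codes.

    `runner` is injectable so the helper can be exercised without
    Airflow installed; by default it is subprocess.run.
    """
    cmd = build_test_command(dag_id, logical_date)
    return [runner(cmd).returncode for _ in range(2)]


if __name__ == "__main__":
    # Placeholder DAG id and date; substitute values from your own setup.
    print(run_twice("example_dag", "2022-01-21"))
```

On an affected build, the second exit code would be expected to be non-zero, matching the traceback reported above.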
   
   ### Operating System
   
   MacOS Monterey (Version 12.1)
   
   ### Versions of Apache Airflow Providers
   
   No providers were used
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   This Airflow instance runs in `standalone` mode on my local macOS environment. I set up a dev environment by cloning the GitHub repository and building Airflow to run locally. It uses SQLite as its backend database and the SequentialExecutor to execute tasks sequentially.
   
   ### Anything else
   
   Nothing much. I would like this issue resolved so that I can run my DAG tests easily without 'actually' running the DAG or relying on the UI. Also, there seems to be little information on what `test` means and how it differs from normal runs, so clarifying that in the documentation would be nice.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #21023:
URL: https://github.com/apache/airflow/issues/21023


   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1018680119


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] potiuk commented on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1019954209


   > In my case, the same problem occurs when `airflow dags backfill` in version 2.2.3.
   > Is it the same cause? And will the bug be solved in the 2.2.x version?
   
   Yeah. I already cherry-picked it to 2.2.4, which is a few weeks away.





[GitHub] [airflow] uplsh580 edited a comment on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
uplsh580 edited a comment on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1019858152


   In my case, the same problem occurs when `airflow dags backfill` in version 2.2.3.
   Is it the same cause? And will the bug be solved in the 2.2.x version?





[GitHub] [airflow] uplsh580 edited a comment on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
uplsh580 edited a comment on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1019858152


   In my case, the same problem occurs when `airflow dags backfill` in version 2.2.3. Will the bug be solved in the 2.2.x version?





[GitHub] [airflow] uplsh580 commented on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
uplsh580 commented on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1019858152


   In my case, the same problem occurs when backfill in version 2.2.3. Will the bug be solved in the 2.2.x version?





[GitHub] [airflow] howardyoo commented on issue #21023: Running airflow dags test results in error when run twice.

Posted by GitBox <gi...@apache.org>.
howardyoo commented on issue #21023:
URL: https://github.com/apache/airflow/issues/21023#issuecomment-1019698660


   Thank you, @chenglongyan , for such a quick turnaround on this fix! And thank you @potiuk for approving it!

