Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/16 13:42:23 UTC

[GitHub] [airflow] ashb opened a new pull request #8405: Fix timing-based flakey test in TestLocalTaskJob

ashb opened a new pull request #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405
 
 
   This test suffered from timing-based failures: if the "main" process
   took even fractionally too long, the task process would already have
   cleaned up its subprocess, so the expected callback in the main/test
   process would never run.
   
   This change ensures that the callback _will always be called_ in the test
   process if it is called at all.
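
A minimal sketch of the idea described above, not the code from this PR. It assumes a hypothetical supervisor `run_job` that runs in the caller and a hypothetical `task` that flips its own state before blocking; threads stand in for the real processes. Because the task itself triggers the state change and the supervisor runs in the caller, the callback either fires in the caller or not at all, and no polling loop in the test has to win a race.

```python
import threading
import time


def task(state, stop_requested):
    """Hypothetical task body: flips its own state, then blocks for a long time."""
    state["value"] = "failed"  # the state change the supervisor must notice
    # Stand-in for a long sleep; wait() returns True only if we were stopped early.
    state["killed_early"] = stop_requested.wait(timeout=5.0)


def run_job(task_fn, on_failure, poll_interval=0.01):
    """Hypothetical supervisor: runs in the caller, so callbacks fire there too."""
    state = {"value": "running", "killed_early": None}
    stop_requested = threading.Event()
    worker = threading.Thread(
        target=task_fn, args=(state, stop_requested), daemon=True
    )
    worker.start()
    while worker.is_alive():
        if state["value"] == "failed":
            stop_requested.set()  # ask the task to stop
            on_failure(state)     # invoked in the caller, never in the worker
            break
        time.sleep(poll_interval)
    worker.join()
    return state


callback_threads = []
started = time.monotonic()
final_state = run_job(
    task,
    lambda state: callback_threads.append(threading.current_thread().name),
)
elapsed = time.monotonic() - started
```

Here `callback_threads` records which thread ran the callback, and `elapsed` stays well under the task's 5 second wait because the supervisor stops the task as soon as it sees the state change.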
   
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] kaxil commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405#discussion_r409616777
 
 

 ##########
 File path: tests/jobs/test_local_task_job.py
 ##########
 @@ -325,28 +336,20 @@ def check_failure(context):
                           session=session)
         ti = TaskInstance(task=task, execution_date=DEFAULT_DATE)
         ti.refresh_from_db()
+
         job1 = LocalTaskJob(task_instance=ti,
                             ignore_ti_state=True,
                             executor=SequentialExecutor())
-        from airflow.task.task_runner.standard_task_runner import StandardTaskRunner
-        job1.task_runner = StandardTaskRunner(job1)
-        process = multiprocessing.Process(target=job1.run)
-        process.start()
-        ti.refresh_from_db()
-        for _ in range(0, 50):
-            if ti.state == State.RUNNING:
-                break
-            time.sleep(0.1)
-            ti.refresh_from_db()
-        self.assertEqual(State.RUNNING, ti.state)
-        ti.state = State.FAILED
-        session.merge(ti)
-        session.commit()
+        with timeout(30):
+            # This should be _much_ shorter to run.
+            # If you change this limit, make the timeout in the callbable above bigger
 
 Review comment:
   ```suggestion
               # If you change this limit, make the timeout in the callable above bigger
   ```


[GitHub] [airflow] ashb commented on issue #8405: Fix timing-based flakey test in TestLocalTaskJob

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405#issuecomment-614715400
 
 
   There was an odd timeout/hang that caused one test job to fail _after_ pytest had finished running its tests.
   
   ```
   2020-04-16T14:32:00.8362178Z = 5500 passed, 224 skipped, 1 xfailed, 2 xpassed, 13 warnings in 1660.97s (0:27:40) =
   2020-04-16T14:32:00.8362298Z No metrics to flush. Continuing.
   2020-04-16T14:32:00.8362409Z No distributions to flush. Continuing.
   2020-04-16T14:32:00.8362498Z No events to flush. Continuing.
   2020-04-16T14:32:00.9181541Z EOF in transport thread
   2020-04-16T14:32:00.9181725Z EOF in transport thread
   2020-04-16T14:32:00.9187980Z EOF in transport thread
   2020-04-16T15:09:29.6183593Z ##[error]The operation was canceled.
   ```


[GitHub] [airflow] kaxil commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405#discussion_r409619288
 
 

 ##########
 File path: tests/jobs/test_local_task_job.py
 ##########
 
 Review comment:
   Where is the other timeout in the callable? Sorry, the comment is a bit unclear. Are you talking about `time.sleep(60)` above?


[GitHub] [airflow] ashb commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405#discussion_r409636383
 
 

 ##########
 File path: tests/jobs/test_local_task_job.py
 ##########
 
 Review comment:
   Yes, that. Sorry, I missed this comment before I hit merge.
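
The relationship between the two limits discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not Airflow's `timeout` helper: `time_limit` and `run_guarded` are hypothetical names, the guard relies on `SIGALRM` (Unix only, main thread only), and the durations are scaled down from 30 and 60 seconds. If nothing interrupts the sleeping task, the outer guard only fails loudly when it is shorter than the inner sleep; if the guard is raised above the sleep, the task simply runs to completion.

```python
import signal
import time
from contextlib import contextmanager


@contextmanager
def time_limit(seconds):
    """Hypothetical helper: raise TimeoutError if the body runs longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise TimeoutError(f"block exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def run_guarded(outer_limit, inner_sleep):
    """Return 'timed out' if the guard fired, 'ran to completion' otherwise."""
    try:
        with time_limit(outer_limit):
            time.sleep(inner_sleep)  # stand-in for a task that nothing interrupts
        return "ran to completion"
    except TimeoutError:
        return "timed out"


limit_below_sleep = run_guarded(outer_limit=0.2, inner_sleep=1.0)
limit_above_sleep = run_guarded(outer_limit=1.0, inner_sleep=0.2)
```

This is why the code comment asks for the sleep in the callable to grow whenever the outer limit does: the sleep has to outlast the guard for a missed kill to surface as a timeout.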


[GitHub] [airflow] ashb merged pull request #8405: Fix timing-based flakey test in TestLocalTaskJob

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #8405: Fix timing-based flakey test in TestLocalTaskJob
URL: https://github.com/apache/airflow/pull/8405
 
 
   
