You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/04/23 21:14:36 UTC
[GitHub] [airflow] BasPH commented on issue #5159: [AIRFLOW-XXX] Attempt to
remove flakeyness of LocalExecutor
BasPH commented on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#issuecomment-485977458
I have very limited time this week but tried digging into it. At least I managed to reproduce locally:
I set an optional breakpoint before `self.assertEqual(len(executor.running), 0)`:
```python
if len(executor.running) != 0:
import ipdb
ipdb.set_trace()
self.assertEqual(len(executor.running), 0)
```
Let this run a few times and after just a few attempts I hit the breakpoint:
```bash
for n in {1..100}; do nosetests tests/executors/test_local_executor.py:LocalExecutorTest.test_execution_limited_parallelism -s; done
```
For the record, at this point:
- `len(executor.running)` is 1 at this point in time and `executor.running` contains `{'fail': True}`
- `executor.queue.empty()` returns `True`
- `executor.result_queue.get()` returns `('fail', 'failed')` (I would expect this thing to be empty because `sync()` is called which loops over the result_queue, clearing out all items...)
Thinking out loud: my understanding is there's a race condition where the asserts are already being called before `executor.end()` finishes. I imagine there's something iffy about multiple workers and passing around the JoinableQueue to all workers, but unsure exactly what's causing the issue.
Hope this helps so far :-)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services