You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/02/22 15:08:32 UTC

[GitHub] [airflow] amichai07 opened a new pull request #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

amichai07 opened a new pull request #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503
 
 
   This change fixes the bug identified by @houqp (described in #7402), but without losing the performance improvement which was introduced in refactor #4751. 
   
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] amichai07 commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
amichai07 commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607924259
 
 
   @houqp it results in a performance improvement because it passes 'finished_tasks' as an argument in dep_context so no more db queries are made for getting those finished_tasks afterwards.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607437919
 
 
   sorry, my github mention is overwhelmed, so I missed them. taking a look at this now.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-590372059
 
 
   WDYT @houqp ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-589968886
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=h1) Report
   > Merging [#7503](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/61a8bb65818521ccbb846e647103535b3e36b26d?src=pr&el=desc) will **decrease** coverage by `0.28%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7503/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #7503      +/-   ##
   ==========================================
   - Coverage   86.82%   86.54%   -0.29%     
   ==========================================
     Files         891      891              
     Lines       42095    42091       -4     
   ==========================================
   - Hits        36549    36426     -123     
   - Misses       5546     5665     +119
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/models/dagrun.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFncnVuLnB5) | `96.5% <100%> (-0.06%)` | :arrow_down: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.38% <0%> (-25.52%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/7503/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0%> (-23.53%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=footer). Last update [61a8bb6...342515e](https://codecov.io/gh/apache/airflow/pull/7503?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607971751
 
 
   @amichai07 ha, i see, thanks for the explanation.
   
   The performance regression test setup by @mik-laj looks great. You might want to add for your optimization to guard against regressions from future changes :)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
houqp commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607457730
 
 
   This change looks correct to me. In my original fix, I also added unit tests to prevent regressions and looks like those tests are still passing, so we should be good 👍 
   
   `_get_ready_tis` also has a stricter depContext, so it should yield a more correct scheduling behavior.
   
   That said, @amichai07 are you sure your patch actually results in performance improvement? Looking at the code, `_get_ready_tis` actually does more work than the else branch being replaced. I don't think this is a blocker, but just curious if I missed anything in the code.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] amichai07 commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
amichai07 commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-614664212
 
 
   @ashb WDYT? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607485712
 
 
   @houqp In Q2 I want to add more tests that allow you to count the number of queries. 
   We have tests fo DagFilepProcessor: https://github.com/apache/airflow/blob/6bd2e59/tests/jobs/test_scheduler_job.py#L1130-L1233

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-593540891
 
 
   @houqp What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] houqp edited a comment on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…

Posted by GitBox <gi...@apache.org>.
houqp edited a comment on issue #7503: [AIRFLOW-3607] fixed the bug with picking just special cases while maintaining the p…
URL: https://github.com/apache/airflow/pull/7503#issuecomment-607457730
 
 
   This change looks correct to me. In my original fix, I also added unit tests to prevent regressions and looks like those tests are still passing, so we should be good 👍 
   
   `_get_ready_tis` also has a stricter depContext, so it should yield a more correct scheduling behavior.
   
   That said, @amichai07 are you sure your patch actually results in performance improvement? Looking at the code, `_get_ready_tis` actually does more work than the else branch being replaced. I don't think this is a blocker, but just curious if I missed anything in the code. If it's a refactor without performance gain, we should change the commit message accordingly.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services