Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/29 17:09:15 UTC

[GitHub] [airflow] dene14 opened a new pull request, #25404: Removed interfering force of index.

dene14 opened a new pull request, #25404:
URL: https://github.com/apache/airflow/pull/25404

   Forcing an index in the MySQL dialect adds unnecessary complexity to query processing and adds a lot of load when a DAG is executed frequently (hourly instead of daily) and has a lot of tasks in it (every additional task increases query time significantly).
   
   Below are a few EXPLAIN plans collected on MySQL 8.0.28.
   
   - daily execution, 4 tasks in the DAG, index forcing is set, but the optimizer decides to use a range scan, execution time: 1.5s
   ![image](https://user-images.githubusercontent.com/7289205/181808830-22451dfe-fc92-47bc-b7de-a8164fd4222b.png)
   - daily execution, 4 tasks in the DAG, index forcing is **NOT** set, the optimizer decides to use a range scan, execution time: 1.5s
   ![image](https://user-images.githubusercontent.com/7289205/181808711-2f4efcce-614b-478e-854b-9949e0c2e708.png)
   
   Summary: when the range fits in [range_optimizer_max_mem_size](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_range_optimizer_max_mem_size), the optimizer chooses a range scan instead of the optimization requested by the hint, and the query executes fast enough (less than one second in both cases).
   
   - hourly execution, 96 tasks in the DAG, index forcing is set, the optimizer follows the hint, execution time: 15s
   ![image](https://user-images.githubusercontent.com/7289205/181808586-29b25c04-fdc6-4ab1-8390-99fe00f79c73.png)
   - hourly execution, 96 tasks in the DAG, index forcing is **NOT** set, the optimizer picks a better plan, execution time: 8s
   ![image](https://user-images.githubusercontent.com/7289205/181808438-013327ea-57c7-49b5-87d5-51052a29b10d.png)
   
   Summary: when the range **DOESN'T** fit in [range_optimizer_max_mem_size](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_range_optimizer_max_mem_size), the optimizer follows the hint, which produces a much more expensive plan than the one it chooses without the hint.
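
The comparison described above (same query, once with a forced index and once without, then inspect the plan) can be sketched in a self-contained way. This is only an analogy under stated assumptions: it uses SQLite from the Python standard library, where `INDEXED BY` stands in for a MySQL index hint, and the table `task_instance` and the indexes `ti_state` and `ti_dag_run` are hypothetical names chosen for illustration. It is not Airflow's actual schema or query, and SQLite's planner behaves differently from MySQL's.

```python
import sqlite3

# In-memory database with a hypothetical, simplified table (not Airflow's schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT, state TEXT);
    CREATE INDEX ti_state ON task_instance (state);
    CREATE INDEX ti_dag_run ON task_instance (dag_id, run_id);
    """
)


def query_plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# The same query, with and without a forced index.
# INDEXED BY is SQLite syntax; it plays the role of the MySQL index hint here.
template = (
    "SELECT task_id FROM task_instance {hint} "
    "WHERE dag_id = ? AND run_id = ? AND state = ?"
)
params = ("example_dag", "example_run", "queued")

forced_plan = query_plan(template.format(hint="INDEXED BY ti_state"), params)
free_plan = query_plan(template.format(hint=""), params)

print("forced:", forced_plan)
print("free:  ", free_plan)
```

With the hint, the planner has to use `ti_state`; without it, the planner is free to pick whichever index it estimates to be cheaper. On a real MySQL instance, the equivalent check is to run `EXPLAIN` on both variants of the query, as in the screenshots above.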


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #25404:
URL: https://github.com/apache/airflow/pull/25404#issuecomment-1200721802

   Hmm, this line was actually introduced in #2021 to fix a performance issue in the first place. Unfortunately context is sparse on things this old, but I wonder what changed between then and now so that the “normal” case is no longer slower. We did change TaskInstance’s primary key in 2.2 to use `run_id` instead of `execution_date`, and also added an explicit foreign key to DagRun (which I think is relevant because this query joins that table). Do you think any of these would explain the difference?




[GitHub] [airflow] potiuk merged pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #25404:
URL: https://github.com/apache/airflow/pull/25404




[GitHub] [airflow] dene14 commented on pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
dene14 commented on PR #25404:
URL: https://github.com/apache/airflow/pull/25404#issuecomment-1201332399

   @uranusjr that very well could be...
   
   Also, I've just opened #25448 with a bit more detail; that adjustment should have a much bigger impact on performance.




[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25404:
URL: https://github.com/apache/airflow/pull/25404#issuecomment-1202346857

   Awesome work, congrats on your first merged pull request!
   




[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25404:
URL: https://github.com/apache/airflow/pull/25404#issuecomment-1199765616

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using the [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally. It's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   




[GitHub] [airflow] potiuk commented on pull request #25404: Removed interfering force of index.

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25404:
URL: https://github.com/apache/airflow/pull/25404#issuecomment-1202346492

   Yeah. I am all for removing such hints, especially since they only really work for specific cases, specific problems, and a specific DB structure, which seems to have changed a lot since the time the hint was needed.

