Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/02 19:21:50 UTC

[GitHub] [airflow] potiuk commented on issue #25448: Schduler polling is extremely heavy for big DAGs

potiuk commented on issue #25448:
URL: https://github.com/apache/airflow/issues/25448#issuecomment-1203120078

I think this should start with optimising the ORM query itself; @ashb might want to chime in here. But you have to remember that we support multiple databases and the SQLAlchemy mapping should work for all of them. Not all queries can be optimised, and - almost by definition - an ORM is not there to run the "most optimized" queries, but to run "portable" queries and to make DB access easier to manage and maintain. This means that not all cases and not all queries can (or should) be optimized.

However, if you find a way to optimize certain ORM access using ORM features, that's what should be done.
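
To illustrate what "using ORM features" could look like, here is a minimal sketch, not taken from the Airflow codebase. It assumes SQLAlchemy 1.4 or newer and uses a hypothetical `TaskRow` table and `states_for_dag` helper invented for this example. Rather than hand-writing database-specific SQL, it asks the ORM for only the columns it needs, which keeps the query portable across backends while avoiding the cost of loading full entities:

```python
from sqlalchemy import Column, Integer, String, Text, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class TaskRow(Base):
    """Hypothetical table for illustration only; not Airflow's real schema."""

    __tablename__ = "task_row"

    id = Column(Integer, primary_key=True)
    dag_id = Column(String(250), nullable=False)
    state = Column(String(20))
    payload = Column(Text)  # stands in for a wide column the caller does not need


def states_for_dag(session, dag_id):
    """Return (id, state) pairs for one DAG without loading full entities."""
    stmt = (
        select(TaskRow.id, TaskRow.state)  # only two columns, no `payload`
        .where(TaskRow.dag_id == dag_id)
        .order_by(TaskRow.id)
    )
    return [(row.id, row.state) for row in session.execute(stmt)]


# In-memory SQLite is used here only to keep the sketch self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [
            TaskRow(id=1, dag_id="big_dag", state="queued", payload="x" * 1000),
            TaskRow(id=2, dag_id="big_dag", state="running", payload="y" * 1000),
            TaskRow(id=3, dag_id="other_dag", state="success", payload="z" * 1000),
        ]
    )
    session.commit()
    result = states_for_dag(session, "big_dag")
```

The same statement is compiled by SQLAlchemy for whichever dialect the engine uses, so the optimisation does not tie the code to one database.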

Also, it is quite normal that some cases are better optimised than others, and there are often trade-offs and compromises involved.

I do not want to comment on particular queries at this stage. But I would encourage you to take a look at how the ORM part can be improved and, working from that, try to follow the logic. If we cannot map a super-optimized query to the ORM, it just might not be worth doing.

Some optimisations are not worth implementing.

But if we can both show an improvement and get there in an ORM way, then it's great.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
