Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/15 18:57:15 UTC
[GitHub] [arrow-datafusion] Dandandan opened a new issue, #3843: Implement nested join optimization
Dandandan opened a new issue, #3843:
URL: https://github.com/apache/arrow-datafusion/issues/3843
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
For complex queries, like those in TPC-H and TPC-DS, it is essential to find a good join order.
`HashBuildProbeOrder` implements a rule to optimize the probe / build side of joins, but this is only a `local` optimization.
We should implement an algorithm that tries to find a more global optimum based on the total estimated cost of the joins.
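To illustrate what optimizing the total estimated cost could look like, here is a minimal sketch of exhaustive dynamic programming over subsets of relations. None of this is DataFusion API: `Relation`, `JoinEdge`, `Plan` and `best_join_order` are hypothetical names, and the cost model (the sum of estimated intermediate result sizes, with cross products allowed and no build/probe distinction) is an assumption chosen for brevity. The enumeration is exponential in the number of relations, so it only suits small join graphs.

```rust
/// Hypothetical input: a base relation with an estimated row count.
#[derive(Debug, Clone)]
pub struct Relation {
    pub name: String,
    pub rows: f64,
}

/// Hypothetical join predicate between two relations (indices into the
/// relation slice) with an estimated selectivity in (0, 1].
#[derive(Debug, Clone)]
pub struct JoinEdge {
    pub left: usize,
    pub right: usize,
    pub selectivity: f64,
}

/// Cheapest plan found for some set of relations.
#[derive(Debug, Clone)]
pub struct Plan {
    pub order: String,
    pub rows: f64,
    pub cost: f64,
}

/// Estimated output size of joining every relation in `set` (a bitmask):
/// product of base row counts times the selectivity of each edge whose
/// two endpoints are both inside the set.
fn cardinality(set: usize, rels: &[Relation], edges: &[JoinEdge]) -> f64 {
    let mut rows = 1.0;
    for (i, r) in rels.iter().enumerate() {
        if set & (1 << i) != 0 {
            rows *= r.rows;
        }
    }
    for e in edges {
        if set & (1 << e.left) != 0 && set & (1 << e.right) != 0 {
            rows *= e.selectivity;
        }
    }
    rows
}

/// Returns the plan with the lowest total cost, where the cost of a join
/// is its estimated output size plus the cost of both inputs.
/// Returns `None` for an empty input or more than 16 relations.
pub fn best_join_order(rels: &[Relation], edges: &[JoinEdge]) -> Option<Plan> {
    let n = rels.len();
    if n == 0 || n > 16 {
        return None;
    }
    let full = (1usize << n) - 1;
    let mut best: Vec<Option<Plan>> = vec![None; full + 1];
    for (i, r) in rels.iter().enumerate() {
        // Scanning a base relation is treated as free in this cost model.
        best[1 << i] = Some(Plan { order: r.name.clone(), rows: r.rows, cost: 0.0 });
    }
    // Every proper subset of `set` is numerically smaller than `set`,
    // so its best plan is already known when `set` is processed.
    for set in 1..=full {
        if set.count_ones() < 2 {
            continue;
        }
        let rows = cardinality(set, rels, edges);
        let mut best_for_set: Option<Plan> = None;
        // Enumerate all non-empty proper subsets of `set` as the left input.
        let mut left = (set - 1) & set;
        while left > 0 {
            let right = set & !left;
            if let (Some(l), Some(r)) = (&best[left], &best[right]) {
                let cost = l.cost + r.cost + rows;
                if best_for_set.as_ref().map_or(true, |p| cost < p.cost) {
                    let order = format!("({} JOIN {})", l.order, r.order);
                    best_for_set = Some(Plan { order, rows, cost });
                }
            }
            left = (left - 1) & set;
        }
        best[set] = best_for_set;
    }
    best[full].clone()
}
```

With three relations A (1000 rows), B (100 rows) and C (10 rows), and predicates A-B (selectivity 0.01) and B-C (selectivity 0.1), the sketch joins B and C first because that intermediate result is the smallest. A rule that only looks at one join at a time cannot make that choice.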
**Describe the solution you'd like**
Implement an efficient algorithm for optimizing the join order.
I'm not sure what the SOTA is on this. Some material I found with some Googling:
https://db.in.tum.de/teaching/ws1415/queryopt/chapter3.pdf
https://db.in.tum.de/~radke/papers/hugejoins.pdf
https://www.cockroachlabs.com/blog/join-ordering-pt1/
https://www.cockroachlabs.com/blog/join-ordering-ii-the-ikkbz-algorithm/
http://mlwiki.org/index.php/Join_Ordering
**Describe alternatives you've considered**
**Additional context**
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] cristian-ilies-vasile commented on issue #3843: Implement nested join optimization
Posted by "cristian-ilies-vasile (via GitHub)" <gi...@apache.org>.
cristian-ilies-vasile commented on issue #3843:
URL: https://github.com/apache/arrow-datafusion/issues/3843#issuecomment-1517378497
Other resources:
Simplicity Done Right for Join Ordering - https://www.cidrdb.org/cidr2021/papers/cidr2021_paper01.pdf
The MonetDB Architecture Martin Kersten CWI - https://homepages.cwi.nl/~manegold/teaching/adt/lectures/lecture2.pdf
[GitHub] [arrow-datafusion] andygrove commented on issue #3843: Implement nested join optimization
Posted by GitBox <gi...@apache.org>.
andygrove commented on issue #3843:
URL: https://github.com/apache/arrow-datafusion/issues/3843#issuecomment-1293834929
We could also look at DuckDB join reordering: https://www.youtube.com/watch?v=aNRoR0Z3SzU
I filed a duplicate issue before I saw this one, although mine is specifically for the logical plan. https://github.com/apache/arrow-datafusion/issues/3984