Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/17 12:49:15 UTC

[GitHub] [spark] peter-toth commented on pull request #32298: [SPARK-34079][SQL] Merge non-correlated scalar subqueries

peter-toth commented on pull request #32298:
URL: https://github.com/apache/spark/pull/32298#issuecomment-1070887317


   @tgravescs, thanks for your interest in this PR. I reran the tests, and they seem to be fine now.
   This change adds basic query merging logic and affects only `q9` from TPCDS, but it brings considerable performance improvement to it.
   This PR has been ready for review for a while, but it hasn't got much attention. A review from you and others would be much appreciated.
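
To illustrate the general idea of merging non-correlated scalar subqueries (this is only a toy sketch and not the PR's implementation): when two scalar subqueries aggregate over the same relation, a single aggregate can compute both values in one scan. The `sales` table, its columns and the use of `sqlite3` below are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical toy table standing in for a large fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quantity INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)],
)

# Unmerged form: two independent non-correlated scalar subqueries,
# each written as its own aggregate over `sales`.
unmerged = conn.execute(
    "SELECT (SELECT avg(price) FROM sales), (SELECT sum(quantity) FROM sales)"
).fetchone()

# Merged form: a single aggregate over `sales` computes both values at once.
merged = conn.execute("SELECT avg(price), sum(quantity) FROM sales").fetchone()

# Both forms must return the same pair of values.
assert unmerged == merged
print(merged)
```

The two forms return the same values; the benefit of the merged form is that the shared relation only needs to be read once.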
   
   I also have some follow-up PRs/plans:
   - add more advanced logic to support merging aggregate subqueries with different `Filter` nodes
   - support DSv2
   - try merging correlated subqueries
   
   but this PR is already complex enough, so I would add these in subsequent PRs only.
   
   We can target 3.3, but AFAIK the branch has been cut, so we probably need some kind of approval on the dev list?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


