Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/31 08:53:45 UTC

[GitHub] [spark] cloud-fan opened a new pull request #26341: [SPARK-29277][SQL] Add early DSv2 filter and projection pushdown

cloud-fan opened a new pull request #26341: [SPARK-29277][SQL] Add early DSv2 filter and projection pushdown
URL: https://github.com/apache/spark/pull/26341
 
 
   Bring back https://github.com/apache/spark/pull/25955
   
   ### What changes were proposed in this pull request?
   
   This adds a new rule, `V2ScanRelationPushDown`, that pushes filters and projections into a new `DataSourceV2ScanRelation` in the optimizer. That scan is then used when converting to a physical scan node. The new relation correctly reports stats based on the scan.
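
The rewrite described above can be sketched with a toy plan model. This is only an illustration of the idea, not the code in this PR: `Relation`, `Filter`, `Project`, `ScanRelation`, `pushDown`, and the 10% per-filter selectivity are all hypothetical stand-ins, and predicates are plain strings rather than Catalyst expressions.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object PushDownSketch {
  sealed trait Plan
  // A table reference that has not been turned into a scan yet.
  final case class Relation(columns: Seq[String], rowCount: Long) extends Plan
  final case class Filter(predicate: String, child: Plan) extends Plan
  final case class Project(columns: Seq[String], child: Plan) extends Plan

  // A scan that carries the pushed-down state, so its stats can reflect it.
  final case class ScanRelation(
      columns: Seq[String],
      pushedFilters: Seq[String],
      rowCount: Long) extends Plan {
    // Assumption for illustration: each pushed filter keeps 10% of the rows.
    def estimatedRows: Long =
      pushedFilters.foldLeft(rowCount)((rows, _) => math.max(1L, rows / 10))
  }

  // Collapse Project/Filter nodes sitting on a Relation into one ScanRelation.
  def pushDown(plan: Plan): Plan = plan match {
    case Relation(cols, rows) => ScanRelation(cols, Nil, rows)
    case s: ScanRelation => s
    case Filter(pred, child) =>
      pushDown(child) match {
        case s: ScanRelation => s.copy(pushedFilters = s.pushedFilters :+ pred)
        case other => Filter(pred, other)
      }
    case Project(cols, child) =>
      pushDown(child) match {
        // Prune columns only when the scan can actually produce all of them.
        case s: ScanRelation if cols.forall(c => s.columns.contains(c)) =>
          s.copy(columns = cols)
        case other => Project(cols, other)
      }
  }

  def main(args: Array[String]): Unit = {
    val plan = Project(Seq("id"),
      Filter("age > 30", Relation(Seq("id", "age", "name"), 1000L)))
    println(pushDown(plan))
  }
}
```

Because the pushed filters and pruned columns live on the scan node itself, any later rule that asks for row estimates sees numbers that already account for them.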
   
   To run scan pushdown before the rules that use stats, this adds a new optimizer override, `earlyScanPushDownRules`, and a batch for early pushdown in the optimizer that runs before cost-based join reordering. The other early pushdown rule, `PruneFileSourcePartitions`, is moved into the early pushdown rule set.
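
A minimal sketch of the override-plus-batch arrangement, assuming a simplified optimizer in which a rule is just its name. `ToyOptimizer`, `ToySparkOptimizer`, `Batch`, the batch names, and the order of rules inside the override are hypothetical; only `earlyScanPushDownRules`, `V2ScanRelationPushDown`, and `PruneFileSourcePartitions` are names taken from the description.

```scala
// Toy model only: rules are plain names and batches are never executed.
object EarlyPushDownBatchSketch {
  type Rule = String

  final case class Batch(name: String, rules: Seq[Rule])

  class ToyOptimizer {
    // Extension hook: empty by default, subclasses supply the pushdown rules.
    def earlyScanPushDownRules: Seq[Rule] = Nil

    // The early pushdown batch is placed ahead of the stats-consuming batch.
    def batches: Seq[Batch] = Seq(
      Batch("Operator Optimization", Seq("SomeLogicalRule")),
      Batch("Early Pushdown", earlyScanPushDownRules),
      Batch("Join Reorder", Seq("CostBasedJoinReorder"))
    )
  }

  class ToySparkOptimizer extends ToyOptimizer {
    // Rule order here is an assumption made for the sketch.
    override def earlyScanPushDownRules: Seq[Rule] =
      Seq("V2ScanRelationPushDown", "PruneFileSourcePartitions")
  }

  def main(args: Array[String]): Unit =
    new ToySparkOptimizer().batches.foreach(println)
}
```

Keeping the hook on the base class means the batch order is fixed in one place, while the concrete rule set can differ between optimizers.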
   
   This also moves pushdown helper methods from `DataSourceV2Strategy` into a util class.
   
   ### Why are the changes needed?
   
   This is needed for DSv2 sources to supply stats for cost-based rules in the optimizer.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   This updates the implementation of stats in `DataSourceV2Relation` so that tests fail if stats are accessed before early pushdown for v2 relations.
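
One way such a guard could look, sketched with toy types. The description only says that tests will fail; throwing an `IllegalStateException` is an assumption made here, and `V2Relation`, `V2ScanRelation`, and `Stats` are hypothetical stand-ins rather than Spark's classes.

```scala
// Toy model only: hypothetical stand-ins for the v2 relation and scan relation.
object StatsGuardSketch {
  final case class Stats(sizeInBytes: Long)

  sealed trait Plan { def stats: Stats }

  // Before pushdown: asking for stats is treated as a bug and fails loudly.
  final case class V2Relation(name: String) extends Plan {
    def stats: Stats = throw new IllegalStateException(
      s"stats requested for $name before early pushdown")
  }

  // After pushdown: stats come from the scan that was actually built.
  final case class V2ScanRelation(name: String, scanSizeInBytes: Long) extends Plan {
    def stats: Stats = Stats(scanSizeInBytes)
  }

  def main(args: Array[String]): Unit = {
    println(V2ScanRelation("t", 42L).stats)
    println(scala.util.Try(V2Relation("t").stats).isFailure)
  }
}
```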
