
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/20 03:55:09 UTC

[GitHub] [spark] Gabriel39 commented on pull request #31205: [SPARK-34119][SQL] Keep necessary stats after partition pruning

Gabriel39 commented on pull request #31205:
URL: https://github.com/apache/spark/pull/31205#issuecomment-947302027


   Hi @wangyum @cloud-fan, I noticed this PR recently and found that partition stats are supported by stats estimation in this PR. As you know, for a partitioned table, stats in Hive are partition-based, while stats in Spark are whole-table based. In our internal Spark fork, when the data source is a Hive table that has already been analyzed, we implement partition stats by reading the Hive partition stats directly from the Hive metastore. I would like to know your opinions on this approach. If you are interested, I will submit a PR soon.
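
   To make the idea concrete, here is a minimal sketch (not the fork's actual code) of how per-partition stats could be combined after pruning. It assumes each surviving partition exposes its metastore parameters as a string map with the Hive keys `numRows` and `totalSize`; the function name `pruned_stats` and the fallback behaviour are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Parameter keys Hive writes on an analyzed partition (values are strings).
NUM_ROWS = "numRows"
TOTAL_SIZE = "totalSize"


def pruned_stats(partitions: Iterable[Dict[str, str]]) -> Optional[Tuple[int, int]]:
    """Sum (size_in_bytes, row_count) over the partitions left after pruning.

    Returns None as soon as one partition has missing or invalid stats, so
    the caller can fall back to a whole-table estimate (assumed behaviour).
    """
    total_size = 0
    total_rows = 0
    for params in partitions:
        try:
            size = int(params[TOTAL_SIZE])
            rows = int(params[NUM_ROWS])
        except (KeyError, ValueError):
            return None  # partition was never analyzed or value is malformed
        if size < 0 or rows < 0:
            return None  # negative values are treated as "unknown" here
        total_size += size
        total_rows += rows
    return total_size, total_rows


if __name__ == "__main__":
    surviving = [
        {"numRows": "10", "totalSize": "100"},
        {"numRows": "5", "totalSize": "40"},
    ]
    print(pruned_stats(surviving))
```

   The point of summing only the surviving partitions is that the estimate reflects the data actually scanned, rather than scaling a whole-table figure.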


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


