
Posted to issues@spark.apache.org by "Maria (JIRA)" <ji...@apache.org> on 2017/06/20 16:00:00 UTC

[jira] [Commented] (SPARK-17129) Support statistics collection and cardinality estimation for partitioned tables

    [ https://issues.apache.org/jira/browse/SPARK-17129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056008#comment-16056008 ] 

Maria commented on SPARK-17129:
-------------------------------

[~ZenWzh], I work with a large number of partitioned tables, where each table has many partitions and each partition is itself very large. At the same time, most queries read data from only a limited number of partitions. Hence, I'm interested in collecting statistics on a per-partition basis. For example, I need to be able to issue an ANALYZE TABLE table PARTITION xxx COMPUTE STATISTICS statement. I have made a change to support that in Spark and would like to submit a PR. Would that be OK?
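
A minimal sketch of what such a per-partition statement could look like, assuming the Hive-style partition spec syntax PARTITION (col=value, ...). The helper name analyze_partition_sql, the table name, and the partition columns are hypothetical and used only for illustration; this is not the change proposed in the comment.

```python
def analyze_partition_sql(table, partition_spec, noscan=False):
    """Build an ANALYZE TABLE ... PARTITION (...) COMPUTE STATISTICS statement.

    partition_spec maps partition column names to values (hypothetical helper;
    assumes Hive-style partition spec syntax).
    """
    if not partition_spec:
        raise ValueError("partition_spec must name at least one partition column")

    parts = []
    for col, value in partition_spec.items():
        if isinstance(value, str):
            # Keep the sketch simple: refuse values that would need escaping.
            if "'" in value:
                raise ValueError("quotes in partition values are not handled here")
            parts.append(f"{col}='{value}'")
        else:
            parts.append(f"{col}={value}")

    sql = f"ANALYZE TABLE {table} PARTITION ({', '.join(parts)}) COMPUTE STATISTICS"
    if noscan:
        # NOSCAN asks for statistics that do not require scanning the data.
        sql += " NOSCAN"
    return sql


if __name__ == "__main__":
    # Hypothetical table partitioned by date (ds) and hour (hr).
    print(analyze_partition_sql("sales", {"ds": "2017-06-20", "hr": 11}))
```

The resulting string could then be passed to a SQL entry point such as spark.sql(...), provided the Spark version in use accepts a partition spec in ANALYZE TABLE.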

> Support statistics collection and cardinality estimation for partitioned tables
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-17129
>                 URL: https://issues.apache.org/jira/browse/SPARK-17129
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Zhenhua Wang
>
> I upgraded this JIRA because many tasks were found that need to be done here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
