Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/10/19 03:33:58 UTC

[jira] [Commented] (SPARK-17997) Aggregation function for counting distinct values for multiple intervals

    [ https://issues.apache.org/jira/browse/SPARK-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15587548#comment-15587548 ] 

Apache Spark commented on SPARK-17997:
--------------------------------------

User 'wzhfy' has created a pull request for this issue:
https://github.com/apache/spark/pull/15544

> Aggregation function for counting distinct values for multiple intervals
> ------------------------------------------------------------------------
>
>                 Key: SPARK-17997
>                 URL: https://issues.apache.org/jira/browse/SPARK-17997
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Zhenhua Wang
>
> This is for computing the number of distinct values (NDV) for each bin of an equi-height histogram. A bin consists of two endpoints, which form an interval of values, and the NDV in that interval. To compute histogram statistics, once the endpoints are known, we need an aggregate function that counts the distinct values in each interval.
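
For illustration only, here is a minimal Python sketch of what such an aggregate computes. It is not the code from the pull request: the function name count_distinct_per_interval is hypothetical, the counts are exact (sets) rather than approximate, and the interval boundaries are an assumption (each interval is open on the left and closed on the right, except the first, which also includes its lower endpoint).

```python
from bisect import bisect_left


def count_distinct_per_interval(values, endpoints):
    """Return the exact number of distinct values in each interval.

    Assumptions (not stated in the issue): `endpoints` is sorted in
    ascending order and has at least two entries; interval i covers
    (endpoints[i], endpoints[i + 1]], and the first interval also
    includes endpoints[0]. Values outside the range and None are ignored.
    """
    if len(endpoints) < 2:
        raise ValueError("need at least two endpoints")

    # One set of seen values per interval.
    seen = [set() for _ in range(len(endpoints) - 1)]

    for v in values:
        if v is None or v < endpoints[0] or v > endpoints[-1]:
            continue
        # Index of the first endpoint that is >= v.
        idx = bisect_left(endpoints, v)
        # A value equal to endpoints[0] gives idx == 0 and goes to bin 0.
        seen[max(idx - 1, 0)].add(v)

    return [len(s) for s in seen]


values = [1, 1, 2, 3, 3, 4, 5, 5, 6, 10]
endpoints = [1, 3, 6, 10]
ndvs = count_distinct_per_interval(values, endpoints)
print(ndvs)  # [3, 3, 1]
```

With endpoints 1, 3, 6, 10 the three bins are [1, 3], (3, 6] and (6, 10], so the duplicates of 1, 3 and 5 are each counted once. An aggregate inside a query engine would more likely keep a mergeable sketch per interval instead of a full set, but the binning step would look much the same.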


