You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Yuming Wang (JIRA)" <ji...@apache.org> on 2019/08/03 14:21:00 UTC
[jira] [Commented] (SPARK-28611) Histogram's height is diffrent
[ https://issues.apache.org/jira/browse/SPARK-28611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899470#comment-16899470 ]
Yuming Wang commented on SPARK-28611:
-------------------------------------
cc [~mgaido]
> Histogram's height is diffrent
> ------------------------------
>
> Key: SPARK-28611
> URL: https://issues.apache.org/jira/browse/SPARK-28611
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.3
> Reporter: Yuming Wang
> Priority: Major
>
> {code:sql}
> CREATE TABLE desc_col_table (key int COMMENT 'column_comment') USING PARQUET;
> -- Test output for histogram statistics
> SET spark.sql.statistics.histogram.enabled=true;
> SET spark.sql.statistics.histogram.numBins=2;
> INSERT INTO desc_col_table values 1, 2, 3, 4;
> ANALYZE TABLE desc_col_table COMPUTE STATISTICS FOR COLUMNS key;
> DESC EXTENDED desc_col_table key;
> {code}
> {noformat}
> spark-sql> DESC EXTENDED desc_col_table key;
> col_name key
> data_type int
> comment column_comment
> min 1
> max 4
> num_nulls 0
> distinct_count 4
> avg_col_len 4
> max_col_len 4
> histogram height: 4.0, num_of_bins: 2
> bin_0 lower_bound: 1.0, upper_bound: 2.0, distinct_count: 2
> bin_1 lower_bound: 2.0, upper_bound: 4.0, distinct_count: 2
> {noformat}
> But our result is:
> https://github.com/apache/spark/blob/v2.4.3/sql/core/src/test/resources/sql-tests/results/describe-table-column.sql.out#L231-L242
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org