Posted to issues@spark.apache.org by "Yuming Wang (JIRA)" <ji...@apache.org> on 2019/08/05 12:13:00 UTC
[jira] [Updated] (SPARK-28619) List all cases where the golden result file is different from spark-sql
[ https://issues.apache.org/jira/browse/SPARK-28619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-28619:
--------------------------------
Description:
List all cases where the golden result file is different from {{spark-sql}}.
Case 1 from {{pgSQL/aggregates_part1.sql}}:
{code:sql}
SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
{code}
{noformat}
-- spark-sql
spark-sql> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
1.00000005E8 2.5000000049670534
-- Our golden result file
SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x)
-- !query 33 schema
struct<avg(CAST(x AS DOUBLE)):double,var_pop(CAST(x AS DOUBLE)):double>
-- !query 33 output
1.00000005E8 2.5
{noformat}
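The gap between 2.5 and 2.5000000049670534 is about the size of error that floating-point variance accumulation can pick up, depending on the order in which values and partial states are combined. The Python sketch below is only an illustration of that effect and is not Spark's aggregate implementation; the helper names ({{two_pass_var_pop}}, {{update}}, {{merge}}, {{var_pop}}) are hypothetical. It computes the population variance of the four values from Case 1 three ways: a two-pass formula, a streaming (Welford-style) update, and a merge of two partial states.

```python
# Illustrative sketch only: hypothetical helpers, not Spark's aggregate code.
from functools import reduce

EMPTY = (0, 0.0, 0.0)  # (count, mean, sum of squared deviations)


def two_pass_var_pop(xs):
    """Population variance: compute the mean first, then the deviations."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def update(state, x):
    """Fold one value into a running (count, mean, m2) state."""
    n, mean, m2 = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return n, mean, m2


def merge(a, b):
    """Combine two partial states, as a partitioned aggregation might."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return EMPTY
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def var_pop(state):
    n, _, m2 = state
    return m2 / n


values = [100000003.0, 100000004.0, 100000006.0, 100000007.0]

exact = two_pass_var_pop(values)                   # exactly 2.5 for these inputs
streamed = var_pop(reduce(update, values, EMPTY))  # one value at a time
left = reduce(update, values[:2], EMPTY)           # first "partition"
right = reduce(update, values[2:], EMPTY)          # second "partition"
merged = var_pop(merge(left, right))

print(exact, streamed, merged)
```

All three routes are algebraically equivalent, so any difference between the printed values comes from rounding alone. Whether this is what separates {{spark-sql}} from the golden file here is an assumption that would need to be checked against the actual execution plans.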
Case 2 from {{group-by.sql}}:
{code:sql}
CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2), (null, 1), (3, null), (null, null)
AS testData(a, b);
SELECT SKEWNESS(a), KURTOSIS(a), MIN(a), MAX(a), AVG(a), VARIANCE(a), STDDEV(a), SUM(a), COUNT(a)
FROM testData;
{code}
{noformat}
-- spark-sql
spark-sql> SELECT SKEWNESS(a), KURTOSIS(a), MIN(a), MAX(a), AVG(a), VARIANCE(a), STDDEV(a), SUM(a), COUNT(a)
> FROM testData;
-0.2723801058145728 -1.5069204152249136 1 3 2.142857142857143 0.8095238095238094 0.8997354108424372 15 7
-- Our golden result file
SELECT SKEWNESS(a), KURTOSIS(a), MIN(a), MAX(a), AVG(a), VARIANCE(a), STDDEV(a), SUM(a), COUNT(a)
FROM testData
-- !query 13 schema
struct<skewness(CAST(a AS DOUBLE)):double,kurtosis(CAST(a AS DOUBLE)):double,min(a):int,max(a):int,avg(a):double,var_samp(CAST(a AS DOUBLE)):double,stddev_samp(CAST(a AS DOUBLE)):double,sum(a):bigint,count(a):bigint>
-- !query 13 output
-0.2723801058145729 -1.5069204152249134 1 3 2.142857142857143 0.8095238095238094 0.8997354108424372 15 7
{noformat}
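In Case 2 the two rows differ only in the last printed digit of SKEWNESS and KURTOSIS. One way to make that explicit is to parse both rows and compare the numeric fields with a relative tolerance. The Python sketch below does this; the function name {{rows_match}} and the tolerance {{1e-12}} are assumptions chosen for illustration and are not part of Spark's test harness.

```python
# Illustrative sketch only: rows_match and its tolerance are hypothetical.
import math


def rows_match(actual, expected, rel_tol=1e-12):
    """Compare two whitespace-separated rows of numbers field by field."""
    a_fields = actual.split()
    e_fields = expected.split()
    if len(a_fields) != len(e_fields):
        return False
    return all(
        math.isclose(float(a), float(e), rel_tol=rel_tol)
        for a, e in zip(a_fields, e_fields)
    )


# Rows copied from the spark-sql output and the golden result file above.
spark_sql_row = (
    "-0.2723801058145728 -1.5069204152249136 1 3 2.142857142857143 "
    "0.8095238095238094 0.8997354108424372 15 7"
)
golden_row = (
    "-0.2723801058145729 -1.5069204152249134 1 3 2.142857142857143 "
    "0.8095238095238094 0.8997354108424372 15 7"
)

print(spark_sql_row == golden_row)            # strict string comparison
print(rows_match(spark_sql_row, golden_row))  # tolerance-based comparison
```

Under this tolerance the Case 2 rows compare as equal even though the strings differ. The Case 1 rows ({{1.00000005E8 2.5000000049670534}} versus {{1.00000005E8 2.5}}) would still be reported as different, since their relative gap is on the order of 1e-9.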
was:
List all cases where the golden result file is different from {{spark-sql}}.
Case 1 from {{pgSQL/aggregates_part1.sql}}:
{code:sql}
SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
FROM (VALUES (7000000000005), (7000000000007)) v(x);
{code}
{noformat}
-- spark-sql
spark-sql> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
1.00000005E8 2.5000000049670534
-- Our golden result file
SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x)
-- !query 33 schema
struct<avg(CAST(x AS DOUBLE)):double,var_pop(CAST(x AS DOUBLE)):double>
-- !query 33 output
1.00000005E8 2.5
{noformat}
> List all cases where the golden result file is different from spark-sql
> -----------------------------------------------------------------------
>
> Key: SPARK-28619
> URL: https://issues.apache.org/jira/browse/SPARK-28619
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)