Posted to issues@spark.apache.org by "Takeshi Yamamuro (Jira)" <ji...@apache.org> on 2019/11/01 05:23:00 UTC

[jira] [Created] (SPARK-29708) Different answers in aggregates of multiple grouping sets

Takeshi Yamamuro created SPARK-29708:
----------------------------------------

             Summary: Different answers in aggregates of multiple grouping sets
                 Key: SPARK-29708
                 URL: https://issues.apache.org/jira/browse/SPARK-29708
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Takeshi Yamamuro


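The query below, which lists the same grouping set twice, returns different results in PostgreSQL and Spark. PostgreSQL emits one set of rows per grouping set (4 rows), while Spark appears to deduplicate the grouping sets and returns only 2 rows: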
{code:java}
postgres=# create table gstest4(id integer, v integer, unhashable_col bit(4), unsortable_col xid);

postgres=# insert into gstest4
postgres-# values (1,1,b'0000','1'), (2,2,b'0001','1'),
postgres-#        (3,4,b'0010','2'), (4,8,b'0011','2'),
postgres-#        (5,16,b'0000','2'), (6,32,b'0001','2'),
postgres-#        (7,64,b'0010','1'), (8,128,b'0011','1');
INSERT 0 8

postgres=# select unsortable_col, count(*)
postgres-#   from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
postgres-#   order by text(unsortable_col);
 unsortable_col | count 
----------------+-------
              1 |     8
              1 |     8
              2 |     8
              2 |     8
(4 rows)
{code}
{code:java}
scala> sql("""create table gstest4(id integer, v integer, unhashable_col /* bit(4) */ byte, unsortable_col /* xid */ integer) using parquet""")

scala> sql("""
     | insert into gstest4
     | values (1,1,tinyint('0'),1), (2,2,tinyint('1'),1),
     |        (3,4,tinyint('2'),2), (4,8,tinyint('3'),2),
     |        (5,16,tinyint('0'),2), (6,32,tinyint('1'),2),
     |        (7,64,tinyint('2'),1), (8,128,tinyint('3'),1)
     | """)
res21: org.apache.spark.sql.DataFrame = []

scala> sql("""
     | select unsortable_col, count(*)
     |   from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
     |   order by string(unsortable_col)
     | """).show
+--------------+--------+
|unsortable_col|count(1)|
+--------------+--------+
|             1|       8|
|             2|       8|
+--------------+--------+
{code}
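
The PostgreSQL result above is consistent with treating GROUPING SETS as a UNION ALL of one GROUP BY per listed set, without removing duplicate sets. The following is a minimal Python sketch of that reading, not of either engine's implementation. The helper `count_by_grouping_sets` and the column index `UNSORTABLE_COL` are hypothetical names introduced only for illustration, and the rows are copied from the insert statement in the report. The sketch looks only at how many rows come back per group key.

```python
from collections import Counter

# Rows from the report's insert statement: (id, v, unhashable_col, unsortable_col).
ROWS = [
    (1, 1, 0, 1), (2, 2, 1, 1),
    (3, 4, 2, 2), (4, 8, 3, 2),
    (5, 16, 0, 2), (6, 32, 1, 2),
    (7, 64, 2, 1), (8, 128, 3, 1),
]

UNSORTABLE_COL = 3  # hypothetical: position of unsortable_col in each tuple


def count_by_grouping_sets(rows, grouping_sets):
    """Compute count(*) once per grouping set and concatenate the results.

    This models GROUPING SETS as a UNION ALL of separate GROUP BY queries,
    so a grouping set listed twice contributes its groups twice.
    """
    result = []
    for columns in grouping_sets:
        counts = Counter(tuple(row[c] for c in columns) for row in rows)
        result.extend(counts.items())
    return result


# The same grouping set listed twice, as in the query from the report,
# ordered by the text form of the key like the ORDER BY clause.
result = sorted(
    count_by_grouping_sets(ROWS, [(UNSORTABLE_COL,), (UNSORTABLE_COL,)]),
    key=lambda item: str(item[0][0]),
)

for key, n in result:
    print(key[0], n)
```

Under this model each distinct value of `unsortable_col` appears once per grouping set, so the duplicated set yields four output rows rather than two, which matches the shape of the PostgreSQL output and differs from the Spark output.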


