You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2017/06/01 19:31:04 UTC
[jira] [Created] (CALCITE-1824) GROUP_ID returns wrong result
Julian Hyde created CALCITE-1824:
------------------------------------
Summary: GROUP_ID returns wrong result
Key: CALCITE-1824
URL: https://issues.apache.org/jira/browse/CALCITE-1824
Project: Calcite
Issue Type: Bug
Reporter: Julian Hyde
Assignee: Julian Hyde
We implemented the {{GROUP_ID()}} function in CALCITE-512 but we got the specification wrong, and it returns the wrong result.
{{GROUP_ID}} is not in the SQL standard. It is implemented only by Oracle.
I mistakenly believed that {{GROUP_ID()}} is equivalent to {{GROUPING_ID(g1, ..., gn)}} (in a query with {{GROUP BY g1, ..., gn}}). In fact, {{GROUP_ID}} is useful only if you have duplicate grouping sets. If grouping sets are distinct, {{GROUP_ID()}} will always return zero.
Example 1
{code}SELECT deptno, job, GROUP_ID() AS g
FROM Emp
GROUP BY ROLLUP(deptno, job)
DEPTNO JOB G
---------- --------- ----------
10 CLERK 0
10 MANAGER 0
10 PRESIDENT 0
10 0
20 CLERK 0
20 ANALYST 0
20 MANAGER 0
20 0
30 CLERK 0
30 MANAGER 0
30 SALESMAN 0
30 0
0
{code} produces grouping sets (deptno, job), (deptno), (). These are distinct, so GROUP_ID() is 0 for all rows.
Example 2
{code}SELECT deptno, GROUP_ID() AS g
FROM Emp
GROUP BY GROUPING SETS (deptno, (), ());
DEPTNO G
---------- ----------
10 0
20 0
30 0
0
1
{code}
As you can see, the grouping set () occurs twice. So there is one row in the result for each occurrence: the first occurrence has g = 0; the second has g = 1.
In my fix for CALCITE-1069, I will change GROUP_ID() to always return 0. This is wrong, but nevertheless closer to the required behavior.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)