You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by JiaTao Tao <ta...@gmail.com> on 2020/12/08 08:30:11 UTC

grouping() function occurs value large than 1

Hi
After AggregateExpandDistinctAggregatesRule, I got a plan like this:
The $10 in the project node is  $g=[GROUPING($0, $1, $2, $3, $4, $5, $6,
$7, $8)]) and we can see it is compared with value 1/2/3, but I check the
def of grouping(), both pg and oracle, the value of grouping is 0 or 1.

pg:https://www.postgresqltutorial.com/postgresql-grouping-sets/
oracle:
https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647

```
EnumerableProject(xx=[$0], xx=[$1], xx=[$2], xx=[$3], xx=[$4], xx=[$5],
xx=[$6], $f7=[$7], $f8=[$8], gid=[$9], $g_1=[=($10, 1)], $g_2=[=($10, 2)],
$g_3=[=($10, 3)])
     EnumerableHashAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8}],
groups=[[{0, 1, 2, 3, 4, 5, 6, 7}, {0, 1, 2, 3, 4, 5, 6, 8}, {0, 1, 2, 3,
4, 5, 6}]], dim_type=[grouping_id()], $g=[GROUPING($0, $1, $2, $3, $4, $5,
$6, $7, $8)])
```


Regards!

Aron Tao

Re: grouping() function occurs value large than 1

Posted by JiaTao Tao <ta...@gmail.com>.
Got it, thanks Julian!

Regards!

Aron Tao


Julian Hyde <jh...@gmail.com> 于2020年12月9日周三 下午2:26写道:

> GROUPING_ID is problematic for both optimizers and humans, because if the
> columns are permuted the value changes, and that causes problems. I think
> GROUPING is working well for our purposes.
>
> Of course you can use whichever you like in your queries.
>
> Julian
>
> > On Dec 8, 2020, at 19:00, JiaTao Tao <ta...@gmail.com> wrote:
> >
> > Hi Julian
> > I see, thanks, maybe use grouping id is better? Cuz seems not every
> engine
> > has this grouping behavior, in the doc of oracle[ref1]:
> >
> > The expr in the GROUPING function must match one of the expressions in
> the
> > GROUP BY clause. The function returns a value of 1 if the value of expr
> in
> > the row is a null representing the set of all values. Otherwise, it
> returns
> > zero.
> >
> >
> >
> > ref1:
> >
> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > Julian Hyde <jh...@apache.org> 于2020年12月9日周三 上午9:28写道:
> >
> >> GROUPING is defined in the SQL standard. If it has N arguments, it
> >> returns an integer bitmask with N bits.
> >>
> >> PostgreSQL has the same behavior: see example in
> >> https://www.postgresql.org/docs/9.5/functions-aggregate.html.
> >>
> >> Julian
> >>
> >>> On Tue, Dec 8, 2020 at 12:30 AM JiaTao Tao <ta...@gmail.com>
> wrote:
> >>>
> >>> Hi
> >>> After AggregateExpandDistinctAggregatesRule, I got a plan like this:
> >>> The $10 in the project node is  $g=[GROUPING($0, $1, $2, $3, $4, $5,
> $6,
> >>> $7, $8)]) and we can see it is compared with value 1/2/3, but I check
> the
> >>> def of grouping(), both pg and oracle, the value of grouping is 0 or 1.
> >>>
> >>> pg:https://www.postgresqltutorial.com/postgresql-grouping-sets/
> >>> oracle:
> >>>
> >>
> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
> >>>
> >>> ```
> >>> EnumerableProject(xx=[$0], xx=[$1], xx=[$2], xx=[$3], xx=[$4], xx=[$5],
> >>> xx=[$6], $f7=[$7], $f8=[$8], gid=[$9], $g_1=[=($10, 1)], $g_2=[=($10,
> >> 2)],
> >>> $g_3=[=($10, 3)])
> >>>     EnumerableHashAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8}],
> >>> groups=[[{0, 1, 2, 3, 4, 5, 6, 7}, {0, 1, 2, 3, 4, 5, 6, 8}, {0, 1, 2,
> 3,
> >>> 4, 5, 6}]], dim_type=[grouping_id()], $g=[GROUPING($0, $1, $2, $3, $4,
> >> $5,
> >>> $6, $7, $8)])
> >>> ```
> >>>
> >>>
> >>> Regards!
> >>>
> >>> Aron Tao
> >>
>

Re: grouping() function occurs value large than 1

Posted by Julian Hyde <jh...@gmail.com>.
GROUPING_ID is problematic for both optimizers and humans, because if the columns are permuted the value changes, and that causes problems. I think GROUPING is working well for our purposes.

Of course you can use whichever you like in your queries.

Julian

> On Dec 8, 2020, at 19:00, JiaTao Tao <ta...@gmail.com> wrote:
> 
> Hi Julian
> I see, thanks, maybe use grouping id is better? Cuz seems not every engine
> has this grouping behavior, in the doc of oracle[ref1]:
> 
> The expr in the GROUPING function must match one of the expressions in the
> GROUP BY clause. The function returns a value of 1 if the value of expr in
> the row is a null representing the set of all values. Otherwise, it returns
> zero.
> 
> 
> 
> ref1:
> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
> 
> Regards!
> 
> Aron Tao
> 
> 
> Julian Hyde <jh...@apache.org> 于2020年12月9日周三 上午9:28写道:
> 
>> GROUPING is defined in the SQL standard. If it has N arguments, it
>> returns an integer bitmask with N bits.
>> 
>> PostgreSQL has the same behavior: see example in
>> https://www.postgresql.org/docs/9.5/functions-aggregate.html.
>> 
>> Julian
>> 
>>> On Tue, Dec 8, 2020 at 12:30 AM JiaTao Tao <ta...@gmail.com> wrote:
>>> 
>>> Hi
>>> After AggregateExpandDistinctAggregatesRule, I got a plan like this:
>>> The $10 in the project node is  $g=[GROUPING($0, $1, $2, $3, $4, $5, $6,
>>> $7, $8)]) and we can see it is compared with value 1/2/3, but I check the
>>> def of grouping(), both pg and oracle, the value of grouping is 0 or 1.
>>> 
>>> pg:https://www.postgresqltutorial.com/postgresql-grouping-sets/
>>> oracle:
>>> 
>> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
>>> 
>>> ```
>>> EnumerableProject(xx=[$0], xx=[$1], xx=[$2], xx=[$3], xx=[$4], xx=[$5],
>>> xx=[$6], $f7=[$7], $f8=[$8], gid=[$9], $g_1=[=($10, 1)], $g_2=[=($10,
>> 2)],
>>> $g_3=[=($10, 3)])
>>>     EnumerableHashAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8}],
>>> groups=[[{0, 1, 2, 3, 4, 5, 6, 7}, {0, 1, 2, 3, 4, 5, 6, 8}, {0, 1, 2, 3,
>>> 4, 5, 6}]], dim_type=[grouping_id()], $g=[GROUPING($0, $1, $2, $3, $4,
>> $5,
>>> $6, $7, $8)])
>>> ```
>>> 
>>> 
>>> Regards!
>>> 
>>> Aron Tao
>> 

Re: grouping() function occurs value large than 1

Posted by JiaTao Tao <ta...@gmail.com>.
Hi Julian
I see, thanks, maybe use grouping id is better? Cuz seems not every engine
has this grouping behavior, in the doc of oracle[ref1]:

The expr in the GROUPING function must match one of the expressions in the
GROUP BY clause. The function returns a value of 1 if the value of expr in
the row is a null representing the set of all values. Otherwise, it returns
zero.



ref1:
https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647

Regards!

Aron Tao


Julian Hyde <jh...@apache.org> 于2020年12月9日周三 上午9:28写道:

> GROUPING is defined in the SQL standard. If it has N arguments, it
> returns an integer bitmask with N bits.
>
> PostgreSQL has the same behavior: see example in
> https://www.postgresql.org/docs/9.5/functions-aggregate.html.
>
> Julian
>
> On Tue, Dec 8, 2020 at 12:30 AM JiaTao Tao <ta...@gmail.com> wrote:
> >
> > Hi
> > After AggregateExpandDistinctAggregatesRule, I got a plan like this:
> > The $10 in the project node is  $g=[GROUPING($0, $1, $2, $3, $4, $5, $6,
> > $7, $8)]) and we can see it is compared with value 1/2/3, but I check the
> > def of grouping(), both pg and oracle, the value of grouping is 0 or 1.
> >
> > pg:https://www.postgresqltutorial.com/postgresql-grouping-sets/
> > oracle:
> >
> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
> >
> > ```
> > EnumerableProject(xx=[$0], xx=[$1], xx=[$2], xx=[$3], xx=[$4], xx=[$5],
> > xx=[$6], $f7=[$7], $f8=[$8], gid=[$9], $g_1=[=($10, 1)], $g_2=[=($10,
> 2)],
> > $g_3=[=($10, 3)])
> >      EnumerableHashAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8}],
> > groups=[[{0, 1, 2, 3, 4, 5, 6, 7}, {0, 1, 2, 3, 4, 5, 6, 8}, {0, 1, 2, 3,
> > 4, 5, 6}]], dim_type=[grouping_id()], $g=[GROUPING($0, $1, $2, $3, $4,
> $5,
> > $6, $7, $8)])
> > ```
> >
> >
> > Regards!
> >
> > Aron Tao
>

Re: grouping() function occurs value large than 1

Posted by Julian Hyde <jh...@apache.org>.
GROUPING is defined in the SQL standard. If it has N arguments, it
returns an integer bitmask with N bits.

PostgreSQL has the same behavior: see example in
https://www.postgresql.org/docs/9.5/functions-aggregate.html.

Julian

On Tue, Dec 8, 2020 at 12:30 AM JiaTao Tao <ta...@gmail.com> wrote:
>
> Hi
> After AggregateExpandDistinctAggregatesRule, I got a plan like this:
> The $10 in the project node is  $g=[GROUPING($0, $1, $2, $3, $4, $5, $6,
> $7, $8)]) and we can see it is compared with value 1/2/3, but I check the
> def of grouping(), both pg and oracle, the value of grouping is 0 or 1.
>
> pg:https://www.postgresqltutorial.com/postgresql-grouping-sets/
> oracle:
> https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions064.htm#SQLRF00647
>
> ```
> EnumerableProject(xx=[$0], xx=[$1], xx=[$2], xx=[$3], xx=[$4], xx=[$5],
> xx=[$6], $f7=[$7], $f8=[$8], gid=[$9], $g_1=[=($10, 1)], $g_2=[=($10, 2)],
> $g_3=[=($10, 3)])
>      EnumerableHashAggregate(group=[{0, 1, 2, 3, 4, 5, 6, 7, 8}],
> groups=[[{0, 1, 2, 3, 4, 5, 6, 7}, {0, 1, 2, 3, 4, 5, 6, 8}, {0, 1, 2, 3,
> 4, 5, 6}]], dim_type=[grouping_id()], $g=[GROUPING($0, $1, $2, $3, $4, $5,
> $6, $7, $8)])
> ```
>
>
> Regards!
>
> Aron Tao