You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "Chris Rice (Jira)" <ji...@apache.org> on 2023/06/26 21:30:00 UTC

[jira] [Created] (CALCITE-5801) Document that Aggregate grouping keys are emitted from lowest to highest index

Chris Rice created CALCITE-5801:
-----------------------------------

             Summary: Document that Aggregate grouping keys are emitted from lowest to highest index
                 Key: CALCITE-5801
                 URL: https://issues.apache.org/jira/browse/CALCITE-5801
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.34.0
            Reporter: Chris Rice


Consider query:

{code}
SELECT b, a FROM t GROUP BY b, a
{code}

where the schema for table {{t}} emits {{a}} in column 0 and {{b}} in column 1.

You might be tempted to build this relation using:

{code}
relBuilder.scan("t");
var groupKey = relBuilder.groupKey(1, 0);
var aggregate = relBuilder.aggregate(groupKey).build();
{code}

but this will emit a plan which emits `a, b` rather than `b, a`:

{code}
LogicalAggregate(group=[{0, 1}])
  LogicalTableScan(table=[[t]])
{code}

The reason being that {{Aggregate}} represents its {{groupSet}} as a {{BitSet}} of columns which are implicitly always ordered from lowest to highest _index_.  Or, in other words, the line {{relBuilder.groupKey(1, 0)}} always creates the same grouping key as {{relBuilder.groupKey(0, 1)}}.

It should be documented somewhere that Aggregates expect their incoming grouping columns to be sorted in order of lowest to highest index in the incoming schema.  Possible candidates for places for this documentation include:
* On {{RelBuilder#groupKey()}}
* On {{RelBuilder#aggregate()}}
* On {{Aggregate}} itself

This is not just a hypothetical problem - the substrait-java project in fact does not properly handle converting aggregates whose grouping key order is not already sorted from lowest to highest index - See [SubstraitRelNodeConverter#visit(Aggregate)|https://github.com/substrait-io/substrait-java/blob/main/isthmus/src/main/java/io/substrait/isthmus/SubstraitRelNodeConverter.java#L208].  Of course, this is a separate bug with substrait-java, but perhaps it could have been avoided (and future such problems could still be avoided) with slightly more documentation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)