Posted to dev@calcite.apache.org by "Julian Hyde (Jira)" <ji...@apache.org> on 2021/08/05 19:18:00 UTC
[jira] [Created] (CALCITE-4720) Obsolete the Collect relational
operator, using Aggregate and ARRAY_AGG (and new aggregate functions
MULTISET_AGG and MAP_AGG) instead
Julian Hyde created CALCITE-4720:
------------------------------------
Summary: Obsolete the Collect relational operator, using Aggregate and ARRAY_AGG (and new aggregate functions MULTISET_AGG and MAP_AGG) instead
Key: CALCITE-4720
URL: https://issues.apache.org/jira/browse/CALCITE-4720
Project: Calcite
Issue Type: Bug
Reporter: Julian Hyde
The {{Collect}} relational operator converts a multi-row relation into a relation with a single row and a column whose type is {{MULTISET}}.
But it is difficult to generalize it; we would like to:
* Generate multiple rows, one for each group key, rather than a single row for the whole relation;
* Generate an {{ARRAY}} or {{MAP}} rather than a {{MULTISET}};
* Generate a collection of scalars rather than a collection of records if the input is a single column (e.g. {{INTEGER MULTISET}} rather than {{ROW(INTEGER i) MULTISET}}).
It is also difficult to maintain: it is a minor RelNode with only 2 implementations (that I know of), and I'm sure that there are bugs and missing support in SqlToRelConverter and the RelOptRule library.
We can achieve the same using the {{Aggregate}} operator and the {{ARRAY_AGG}} aggregate function. We would need new aggregate functions (let's call them {{MULTISET_AGG}} and {{MAP_AGG}}) for the {{MULTISET}} and {{MAP}} types.
Then we can obsolete {{Collect}}, and make current code paths use {{Aggregate}} instead.
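To make the intended equivalence concrete, here is a rough model of the semantics in plain Python. It is only a sketch and not Calcite code: the {{EMP}} rows, the helper names, and the choice to have {{map_agg}} consume (key, value) pairs are all assumptions made for illustration, and empty inputs and NULL handling are ignored.

```python
from collections import Counter, defaultdict

# Hypothetical input relation: rows of (deptno, ename).
EMP = [(10, "Alice"), (20, "Bob"), (10, "Carol"), (20, "Bob")]


def collect(rows):
    """Model of Collect: the whole relation becomes one row with a multiset of records."""
    return [(Counter(rows),)]


def aggregate(rows, key, value, agg):
    """Model of Aggregate: one output row per group key, with agg applied to that group's values."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(value(row))
    return [(k, agg(vs)) for k, vs in groups.items()]


def array_agg(values):
    """ARRAY_AGG: keep the values in encounter order."""
    return list(values)


def multiset_agg(values):
    """Assumed MULTISET_AGG: an unordered bag, modeled as value -> count."""
    return Counter(values)


def map_agg(pairs):
    """Assumed MAP_AGG: build a map from (key, value) pairs."""
    return dict(pairs)


# One row per group key, holding an array of scalars rather than records.
names_by_dept = aggregate(EMP, lambda r: r[0], lambda r: r[1], array_agg)

# The same grouping, producing a map instead of an array.
name_lengths = aggregate(EMP, lambda r: r[0], lambda r: (r[1], len(r[1])), map_agg)

# Collect is the special case: empty group key, whole records, multiset result.
as_collect = aggregate(EMP, lambda r: (), lambda r: r, multiset_agg)
```

In this model the single row in {{as_collect}} carries the same multiset that {{collect(EMP)}} produces, which is the sense in which {{Aggregate}} with a suitable aggregate function would subsume {{Collect}}.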