Posted to issues@flink.apache.org by "Shuyi Chen (JIRA)" <ji...@apache.org> on 2017/08/25 22:16:00 UTC
[jira] [Comment Edited] (FLINK-7491) Support COLLECT Aggregate function in Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142277#comment-16142277 ]
Shuyi Chen edited comment on FLINK-7491 at 8/25/17 10:15 PM:
-------------------------------------------------------------
Thanks for reviewing the PR. [~jark]
Multiset and Array are different, and they support different sets of operators (please see http://farrago.sourceforge.net/design/CollectionTypes.html). Also, the Calcite definition of the COLLECT SqlAggFunction explicitly requires the return type to be a MultisetSqlType (see below):
{code:java}
/**
 * The COLLECT operator. Multiset aggregator function.
 */
public static final SqlAggFunction COLLECT =
    new SqlAggFunction("COLLECT",
        null,
        SqlKind.COLLECT,
        ReturnTypes.TO_MULTISET,
        null,
        OperandTypes.ANY,
        SqlFunctionCategory.SYSTEM, false, false) {
    };
{code}
I am worried that if we use an Array to emulate a Multiset, we might run into performance problems for large multisets further down the road, and potentially Calcite integration issues related to MultisetSqlType. What do you think?
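To make the semantic difference concrete, here is a minimal sketch in plain Java. It is not Flink or Calcite code: the class MultisetSketch and the helper toMultiset are hypothetical names used only for illustration, and representing a multiset as an element-to-count map is just one possible choice, assumed here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultisetSketch {

    /**
     * Hypothetical helper: represents a multiset as element -> multiplicity.
     * This is an illustrative assumption, not how Flink or Calcite store multisets.
     */
    public static <T> Map<T, Integer> toMultiset(List<T> values) {
        Map<T, Integer> counts = new HashMap<>();
        for (T value : values) {
            // Increment the count for this element, starting at 1 if absent.
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("x", "y", "x");
        List<String> b = Arrays.asList("y", "x", "x");

        // Compared as ordered lists (array semantics), the two inputs differ.
        System.out.println("equal as arrays:    " + a.equals(b));

        // Compared as multisets, order is irrelevant and multiplicity is kept.
        System.out.println("equal as multisets: " + toMultiset(a).equals(toMultiset(b)));

        // The map holds one entry per distinct element, not one per collected value.
        System.out.println("entries in map:     " + toMultiset(a).size());
    }
}
```

With this representation, the two inputs compare as different lists but as equal multisets, and the map grows with the number of distinct elements rather than with the number of collected values. An Array-based emulation would have to normalize or count on every comparison, which is the kind of cost I am concerned about.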
was (Author: suez1224):
Thanks for reviewing the PR. [~jark]
I think Multiset and Array are different, and they support different sets of operators (please see http://farrago.sourceforge.net/design/CollectionTypes.html). Also, the Calcite definition of the COLLECT SqlAggFunction explicitly requires the return type to be a Multiset (see below):
{code:java}
/**
 * The COLLECT operator. Multiset aggregator function.
 */
public static final SqlAggFunction COLLECT =
    new SqlAggFunction("COLLECT",
        null,
        SqlKind.COLLECT,
        ReturnTypes.TO_MULTISET,
        null,
        OperandTypes.ANY,
        SqlFunctionCategory.SYSTEM, false, false) {
    };
{code}
I am worried that if we use an Array to emulate a Multiset, we might run into performance problems for large multisets further down the road, and potentially Calcite integration issues related to multisets. What do you think?
> Support COLLECT Aggregate function in Flink SQL
> -----------------------------------------------
>
> Key: FLINK-7491
> URL: https://issues.apache.org/jira/browse/FLINK-7491
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: Shuyi Chen
> Assignee: Shuyi Chen
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)