You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "luoyuxia (Jira)" <ji...@apache.org> on 2022/12/08 08:30:00 UTC

[jira] [Commented] (FLINK-28343) Hive dialect fails using union map type

    [ https://issues.apache.org/jira/browse/FLINK-28343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644661#comment-17644661 ] 

luoyuxia commented on FLINK-28343:
----------------------------------

Update: map type is still un-hashable.

> Hive dialect fails using union map type
> ---------------------------------------
>
>                 Key: FLINK-28343
>                 URL: https://issues.apache.org/jira/browse/FLINK-28343
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>            Reporter: tartarus
>            Assignee: luoyuxia
>            Priority: Major
>
> We can reproduce it with the following example
> {code:java}
> @Test
> public void testUnionMapType() {
>     // automatically load hive module in hive-compatible mode
>     HiveModule hiveModule = new HiveModule(hiveCatalog.getHiveVersion());
>     CoreModule coreModule = CoreModule.INSTANCE;
>     for (String loaded : tableEnv.listModules()) {
>         tableEnv.unloadModule(loaded);
>     }
>     tableEnv.loadModule("hive", hiveModule);
>     tableEnv.loadModule("core", coreModule);
>     tableEnv.executeSql(
>             "CREATE TABLE test_map_table (params string) PARTITIONED BY (`p_date` string)");
>     tableEnv.executeSql("select map(\"\",\"\") as params from test_map_table union select map(\"\",\"\") as params from test_map_table");
> } {code}
> Because union semantics need to be de-duplicated, So flink will introduce an Aggregate,
> An exception will be thrown
> {code:java}
> Unsupported type(MAP) to generate hash code, the type(MAP) is not supported as a GROUP_BY/PARTITION_BY/JOIN_EQUAL/UNION field {code}
> We can see the Aggregate operator in the execution plan
> {code:java}
> optimize subquery_rewrite cost 33 ms.
> optimize result: 
> LogicalSink(table=[*anonymous_collect$1*], fields=[params])
> +- LogicalProject(inputs=[0])
>    +- LogicalAggregate(group=[{0}])
>       +- LogicalProject(inputs=[0])
>          +- LogicalUnion(all=[true])
>             :- LogicalProject(exprs=[[map(_UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")]])
>             :  +- LogicalTableScan(table=[[test-catalog, default, test_map_table]])
>             +- LogicalProject(exprs=[[map(_UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")]])
>                +- LogicalTableScan(table=[[test-catalog, default, test_map_table]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)