You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "tartarus (Jira)" <ji...@apache.org> on 2022/06/21 03:46:00 UTC

[jira] [Created] (FLINK-28158) Flink supports all modes of Hive UDAF (PARTIAL1, PARTIAL2, FINAL, COMPLETE)

tartarus created FLINK-28158:
--------------------------------

             Summary: Flink supports all modes of Hive UDAF (PARTIAL1, PARTIAL2, FINAL, COMPLETE)
                 Key: FLINK-28158
                 URL: https://issues.apache.org/jira/browse/FLINK-28158
             Project: Flink
          Issue Type: Sub-task
          Components: Connectors / Hive
    Affects Versions: 1.15.0
            Reporter: tartarus


Currently Flink UDAF only supports Hive UDAF's PARTIAL_1 and FINAL mode.

When Flink uses Hive's UDAF percent_rank, it fails with the following exception message
{code:java}
org.apache.flink.table.api.TableException: Unexpected error in type inference logic of function 'percent_rank'. This is a bug.    at org.apache.flink.table.types.inference.TypeInferenceUtil.createUnexpectedException(TypeInferenceUtil.java:206)
    at org.apache.flink.table.planner.functions.inference.TypeInferenceReturnInference.inferReturnType(TypeInferenceReturnInference.java:80)
    at org.apache.calcite.sql.SqlOperator.inferReturnType(SqlOperator.java:482)
    at org.apache.calcite.rex.RexBuilder.deriveReturnType(RexBuilder.java:283)
    at org.apache.calcite.rex.RexBuilder.makeCall(RexBuilder.java:257)
    at org.apache.flink.table.planner.delegation.hive.SqlFunctionConverter.visitOver(SqlFunctionConverter.java:121)
    at org.apache.flink.table.planner.delegation.hive.SqlFunctionConverter.visitOver(SqlFunctionConverter.java:56)
    at org.apache.calcite.rex.RexOver.accept(RexOver.java:121)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.getWindowRexAndType(HiveParserCalcitePlanner.java:1859)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.genSelectForWindowing(HiveParserCalcitePlanner.java:1913)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.genSelectLogicalPlan(HiveParserCalcitePlanner.java:2002)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.genLogicalPlan(HiveParserCalcitePlanner.java:2751)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.logicalPlan(HiveParserCalcitePlanner.java:284)
    at org.apache.flink.table.planner.delegation.hive.HiveParserCalcitePlanner.genLogicalPlan(HiveParserCalcitePlanner.java:272)
    at org.apache.flink.table.planner.delegation.hive.HiveParser.analyzeSql(HiveParser.java:303)
    at org.apache.flink.table.planner.delegation.hive.HiveParser.processCmd(HiveParser.java:251)
    at org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:211)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:695)
    at org.apache.flink.connectors.hive.HiveDialectITCase.testPercent_rank(HiveDialectITCase.java:800)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: org.apache.flink.table.functions.hive.FlinkHiveUDFException: Failed to get Hive result type from org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentRank
    at org.apache.flink.table.functions.hive.HiveGenericUDAF.inferReturnType(HiveGenericUDAF.java:249)
    at org.apache.flink.table.functions.hive.HiveFunction$HiveFunctionOutputStrategy.inferType(HiveFunction.java:122)
    at org.apache.flink.table.types.inference.TypeInferenceUtil.inferOutputType(TypeInferenceUtil.java:151)
    at org.apache.flink.table.planner.functions.inference.TypeInferenceReturnInference.inferReturnTypeOrError(TypeInferenceReturnInference.java:99)
    at org.apache.flink.table.planner.functions.inference.TypeInferenceReturnInference.inferReturnType(TypeInferenceReturnInference.java:76)
    ... 44 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Only COMPLETE mode supported for Rank function
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFRank$GenericUDAFAbstractRankEvaluator.init(GenericUDAFRank.java:124)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentRank$GenericUDAFPercentRankEvaluator.init(GenericUDAFPercentRank.java:59)
    at org.apache.flink.table.functions.hive.HiveGenericUDAF.init(HiveGenericUDAF.java:99)
    at org.apache.flink.table.functions.hive.HiveGenericUDAF.inferReturnType(HiveGenericUDAF.java:243)
    ... 48 more {code}
According to the exception message, we can see that it is because the percentage_rank function requires COMPLETE Mode.

We can reproduce it with ITCase:
{code:java}
@Test
public void testPercent_rank() throws Exception {
    // automatically load hive module in hive-compatible mode
    HiveModule hiveModule = new HiveModule(hiveCatalog.getHiveVersion());
    CoreModule coreModule = CoreModule.INSTANCE;
    for (String loaded : tableEnv.listModules()) {
        tableEnv.unloadModule(loaded);
    }
    tableEnv.loadModule("hive", hiveModule);
    tableEnv.loadModule("core", coreModule);
    // Flink UDAF only supports Hive UDAF's PARTIAL_1 and FINAL mode.
    tableEnv.executeSql(
            "create temporary function percent_rank as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentRank'");

    tableEnv.executeSql(
            "create table cbo_t1(key string, value string, c_int int, c_float float, c_boolean boolean)");
    List<Row> results =
            CollectionUtil.iteratorToList(
                    tableEnv.executeSql(
                                    "select percent_rank() over(partition by c_float order by key) from cbo_t1")
                            .collect());
} {code}
 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)