Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/03/06 14:19:00 UTC

[jira] [Commented] (IMPALA-11911) Incorrect handling of NULL arguments in Hive GenericUDFs

    [ https://issues.apache.org/jira/browse/IMPALA-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696976#comment-17696976 ] 

ASF subversion and git services commented on IMPALA-11911:
----------------------------------------------------------

Commit 67bb870aa302b3509fa4a0f8d846efedc04e1514 in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=67bb870aa ]

IMPALA-11911: Fix NULL argument handling in Hive GenericUDFs

Before this patch, if an argument of a GenericUDF was NULL, Impala
passed a Java null in its place instead of a DeferredObject. This was
incorrect: Hive expects a DeferredObject whose get() function returns
null. See the Jira for more details and GenericUDF examples in Hive.
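
The difference between the two conventions can be sketched in plain Java.
This is only an illustration, not Impala's or Hive's code: the nested
DeferredObject interface is a simplified stand-in for Hive's
GenericUDF.DeferredObject, and the class name and the wrapBroken, wrapFixed,
and identity helpers are hypothetical names introduced for this example.

```java
public class DeferredNullSketch {
    // Simplified stand-in (assumption) for Hive's GenericUDF.DeferredObject.
    interface DeferredObject {
        Object get();
    }

    // A UDF body in the style of the Hive examples: it dereferences the
    // argument slot unconditionally and only sees null after calling get().
    static Object identity(DeferredObject[] args) {
        Object value = args[0].get();
        return value; // null here stands for SQL NULL
    }

    // Pre-patch convention as described above: a NULL argument becomes a
    // null slot in the argument array.
    static DeferredObject[] wrapBroken(Object[] values) {
        DeferredObject[] out = new DeferredObject[values.length];
        for (int i = 0; i < values.length; i++) {
            final Object v = values[i];
            out[i] = (v == null) ? null : () -> v;
        }
        return out;
    }

    // Post-patch convention: every slot holds a DeferredObject, and a NULL
    // argument is represented by get() returning null.
    static DeferredObject[] wrapFixed(Object[] values) {
        DeferredObject[] out = new DeferredObject[values.length];
        for (int i = 0; i < values.length; i++) {
            final Object v = values[i];
            out[i] = () -> v;
        }
        return out;
    }

    public static void main(String[] args) {
        Object[] row = { null };
        try {
            identity(wrapBroken(row));
            System.out.println("broken: no exception");
        } catch (NullPointerException e) {
            System.out.println("broken: NullPointerException");
        }
        System.out.println("fixed: " + identity(wrapFixed(row)));
    }
}
```

With the null slot, the UDF fails before it can inspect the value; with the
wrapped null, it returns NULL as a caller would expect.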

TestGenericUdf's NULL handling was further broken in IMPALA-11549, so
it threw a NullPointerException whenever the UDF's result was NULL.
This test bug went undetected because the Hive UDF tests were running
with the default abort_java_udf_on_exception=false, under which
exceptions from Hive UDFs only produce a warning and a NULL result, and
NULL was the expected result in all affected test queries.
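
How such a bug can hide behind the option is easier to see in a small model.
The sketch below is an assumption-laden illustration, not Impala's executor:
the Udf interface, the call helper, and its abortOnException parameter are
hypothetical and only mimic the behavior the message describes for
abort_java_udf_on_exception.

```java
public class AbortFlagSketch {
    // Hypothetical single-argument UDF interface, for illustration only.
    interface Udf {
        Object evaluate(Object arg) throws Exception;
    }

    // Models the described option: when abortOnException is false, an
    // exception becomes a warning plus a NULL result; when true, it is
    // surfaced as an error.
    static Object call(Udf udf, Object arg, boolean abortOnException) {
        try {
            return udf.evaluate(arg);
        } catch (Exception e) {
            if (abortOnException) {
                throw new RuntimeException("UDF failed", e);
            }
            System.out.println("WARNING: UDF failed due to: " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        // A buggy test UDF: it throws on NULL input where it should
        // return NULL.
        Udf buggy = arg -> arg.toString();

        // Lenient mode: the result is NULL, which is also what a correct
        // UDF would return, so a test comparing results still passes.
        Object lenient = call(buggy, null, false);
        System.out.println("lenient result is null: " + (lenient == null));

        // Strict mode: the same bug is reported as an error.
        try {
            call(buggy, null, true);
            System.out.println("strict: no error");
        } catch (RuntimeException e) {
            System.out.println("strict: error surfaced");
        }
    }
}
```

In the lenient case the buggy and the correct UDF are indistinguishable by
their results alone, which is why running the tests in strict mode exposes
the problem.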

This patch fixes the behavior in HiveUdfExecutorGeneric and improves
the FE/EE tests so that they catch NULL-handling issues. After this
patch, most Hive UDF tests run with abort_java_udf_on_exception=true,
so that exceptions in UDFs are treated as errors. The tests that check
that NULL is returned when an exception is thrown while
abort_java_udf_on_exception is false are moved to new .test files.
TestGenericUdf is also fixed (and simplified) to handle NULL return
values correctly.

Change-Id: I53238612f4037572abb6d2cc913dd74ee830a9c9
Reviewed-on: http://gerrit.cloudera.org:8080/19499
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Incorrect handling of NULL arguments in Hive GenericUDFs
> --------------------------------------------------------
>
>                 Key: IMPALA-11911
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11911
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.2.0
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> If an argument of a GenericUDF is NULL then Impala passes a null instead of a deferred object:
> https://github.com/apache/impala/blob/5abbb9bd17373c8aafe6d213d328e16934cdca07/fe/src/main/java/org/apache/impala/hive/executor/HiveUdfExecutorGeneric.java#L74
> This seems to be wrong, as the example GenericUDFs I checked in Hive assume that the argument is not null, but the DeferredObject's get() function can return null:
> https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java#L165
> This also makes sense, as one of the goals of DeferredObject is lazy evaluation, so we may not know before calling get() whether the argument is null:
> https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java#L92
> Even Impala's test UDFs throw an exception for NULL:
> {code}
> create function generic_identity(int) returns int
> location '/test-warehouse/impala-hive-udfs.jar'
> symbol='org.apache.impala.TestGenericUdf';
> select generic_identity(cast(NULL as int));
> WARNINGS: UDF WARNING: Hive UDF path=hdfs://localhost:20500/test-warehouse/impala-hive-udfs.jar class=org.apache.impala.TestGenericUdf failed due to: NullPointerException: null
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)