Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/08/07 21:03:00 UTC

[jira] [Commented] (IMPALA-9478) Runtime profiles should indicate if custom UDFs are being used

    [ https://issues.apache.org/jira/browse/IMPALA-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173469#comment-17173469 ] 

ASF subversion and git services commented on IMPALA-9478:
---------------------------------------------------------

Commit a0057788c5c2300f58b6615a27116b8331171e06 in impala's branch refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a005778 ]

IMPALA-9478: Profiles should indicate if custom UDFs are being used

Adds a marker to runtime profiles and explain plans indicating whether
custom (i.e. non-built-in) user-defined functions are being used. For
explain plans, a SQL-style comment is added after each such function
call. For runtime profiles, a new Frontend entry called
"User Defined Functions (UDFs)" lists all UDFs analyzed during planning.

Take the following example:

  create function hive_lower(string) returns string location
  '/test-warehouse/hive-exec.jar'
  symbol='org.apache.hadoop.hive.ql.udf.UDFLower';
  set explain_level=3;
  explain select * from functional.alltypes order by hive_lower(string_col);
  ...
  01:SORT
    order by: default.hive_lower(string_col) /* JAVA UDF */ ASC
    materialized: default.hive_lower(string_col) /* JAVA UDF */
  ...

The UDF shows up in the runtime profile as well. When the above query is
actually run, the profile includes the following entry:

  Frontend
    User Defined Functions (UDFs): default.hive_lower
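
As a rough illustration of how tooling might consume this entry, the
following Python sketch pulls the UDF names out of a text profile. The
helper name extract_udfs is hypothetical, and the assumption that
multiple UDFs are separated by commas is not confirmed by this commit
message; only the single-UDF form shown above is.

```python
# Hypothetical helper, not part of Impala: scans a text runtime profile
# for the "User Defined Functions (UDFs)" entry described above.
PROFILE_KEY = "User Defined Functions (UDFs):"


def extract_udfs(profile_text):
    """Return the UDF names listed in the profile, or [] if none are listed."""
    for line in profile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(PROFILE_KEY):
            value = stripped[len(PROFILE_KEY):]
            # ASSUMPTION: several UDFs would be comma-separated.
            return [name.strip() for name in value.split(",") if name.strip()]
    return []


sample_profile = """\
Frontend
  User Defined Functions (UDFs): default.hive_lower
"""
udfs = extract_udfs(sample_profile)
print(udfs)
```

A profile without the entry yields an empty list, which lets a caller
quickly separate queries that use custom UDFs from those that do not.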

Error messages will also include SQL-style comments about any UDFs used.
For example:

  select aggfn(int_col) over (partition by int_col) from
  functional.alltypesagg

Throws:

  Aggregate function 'default.aggfn(int_col) /* NATIVE UDF */' not
  supported with OVER clause.
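
Since the same comment marker appears in explain plans and in error
messages, a script can look for it with a regular expression. The sketch
below is illustrative only: the pattern and the name find_udf_calls are
assumptions, it covers just the JAVA and NATIVE markers shown in this
message, and it does not handle nested function calls in the arguments.

```python
import re

# Matches "<name>(<args>) /* <KIND> UDF */", e.g.
# "default.hive_lower(string_col) /* JAVA UDF */".
# ASSUMPTION: arguments contain no nested parentheses.
UDF_MARKER = re.compile(r"([\w.]+)\(([^()]*)\)\s*/\*\s*(\w+) UDF\s*\*/")


def find_udf_calls(text):
    """Return (function_name, udf_kind) pairs for each marked call in text."""
    return [(m.group(1), m.group(3)) for m in UDF_MARKER.finditer(text)]


plan_line = "order by: default.hive_lower(string_col) /* JAVA UDF */ ASC"
error_msg = ("Aggregate function 'default.aggfn(int_col) /* NATIVE UDF */' "
             "not supported with OVER clause.")

plan_calls = find_udf_calls(plan_line)
error_calls = find_udf_calls(error_msg)
print(plan_calls, error_calls)
```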

Testing:
* Added tests to test_udfs.py
* Ran core tests

Change-Id: I79122e6cc74fd5a62c76962289a1615fbac2f345
Reviewed-on: http://gerrit.cloudera.org:8080/16188
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Runtime profiles should indicate if custom UDFs are being used
> --------------------------------------------------------------
>
>                 Key: IMPALA-9478
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9478
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Sahil Takiar
>            Priority: Major
>
> Custom UDFs can include arbitrary user code that can cause query slowdowns. To better diagnose queries with UDF issues, it is first important to know when a query is using a UDF at all.
> Runtime profiles should list out any custom UDFs used by the query, as well as the library the UDF is loaded from.
> For Java UDFs, the full classname of the UDF would be good as well.
> Any other metadata associated with the UDF might be useful as well, such as the few things that are printed by {{show functions}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
