Posted to user@impala.apache.org by Piyush Narang <p....@criteo.com> on 2018/03/07 10:57:42 UTC

Debugging ClassNotFound errors while trying to use Hive UDFs

Hi folks,

I’m running into a strange ClassNotFound error while trying to use one of my Hive UDFs in Impala.
I’ve defined the UDF:
> create function device_type(string, string, string) returns string location 'my/path/to/my-udf-SNAPSHOT.jar' symbol='com.criteo.hadoop.hive.udf.UDFUserAgentToDeviceType';

I try to use it in simple queries and it works fine:
> select device_type("a", "b", "c");
Query: select device_type("a", "b", "c")
…
+----------------------------------------+
| device_type('a', 'b', 'c')             |
+----------------------------------------+
| Mobile - Other                         |
+----------------------------------------+
Fetched 1 row(s) in 0.01s
> select ua_device_family, ua_browser_family, ua_os_family, device_type(ua_device_family, ua_browser_family, ua_os_family) from bi_arbitrage_full limit 1;
Query: select ua_device_family, ua_browser_family, ua_os_family, device_type(ua_device_family, ua_browser_family, ua_os_family) from bi_arbitrage_full limit 1
…
+------------------+-------------------+--------------+----------------------------------------------------------------------------+
| ua_device_family | ua_browser_family | ua_os_family | device_type(ua_device_family, ua_browser_family, ua_os_family)             |
+------------------+-------------------+--------------+----------------------------------------------------------------------------+
| iPhone           | mobile safari     | iOS          | iPhone                                                                     |
+------------------+-------------------+--------------+----------------------------------------------------------------------------+
Fetched 1 row(s) in 0.73s

Now when I try to run a more complex query (same database), I get an error:
WARNINGS: ImpalaRuntimeException: Unable to find class.
CAUSED BY: ClassNotFoundException: com.criteo.hadoop.hive.udf.UDFUserAgentToDeviceType

I turned up the log level on the coordinator to debug. I see the calls being made to load the UDF and the use of the UDF in the query but no details on the ClassNotFound.
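
When the logs give no detail on a ClassNotFoundException, one basic sanity check is to confirm that the jar really contains an entry for the class named in the error. The sketch below is only an illustration of that check, not a fix for the error above. It assumes a local copy of the jar is available (for example, downloaded from the location given in the CREATE FUNCTION statement); the helper names class_entry_name and jar_contains_class are hypothetical.

```python
import zipfile


def class_entry_name(class_name: str) -> str:
    """Map a fully qualified Java class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for class_name."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry_name(class_name) in jar.namelist()


# Example call, assuming a local copy of the jar (the path is hypothetical):
# jar_contains_class(
#     "my-udf-SNAPSHOT.jar",
#     "com.criteo.hadoop.hive.udf.UDFUserAgentToDeviceType",
# )
```

A True result only shows that the class is packaged in that copy of the jar. It says nothing about whether every Impala daemon can load the class at query time.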

Has anyone run into a similar issue? We’re running Impala 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690), so I think we should be clear of this really old bug: https://issues.apache.org/jira/browse/IMPALA-695.
Thanks,

-- Piyush


Re: Debugging ClassNotFound errors while trying to use Hive UDFs

Posted by Piyush Narang <p....@criteo.com>.
Thanks for getting back, Vuk.

The CNF in our case seems to be very deterministic for the complex query (it has always failed so far). I was able to strip it down a bit.
This query works:
SELECT ua_device_family, ua_browser_family, ua_os_family, device_type(ua_device_family, ua_browser_family, ua_os_family) as device FROM bi_arbitrage_full WHERE day='2017-10-04';

This one fails:
> SELECT device_type(ua_device_family, ua_browser_family, ua_os_family) as device FROM bi_arbitrage_full WHERE day='2017-10-04' group by device_type(ua_device_family, ua_browser_family, ua_os_family);
Query: select device_type(ua_device_family, ua_browser_family, ua_os_family) as device FROM bi_arbitrage_full WHERE day='2017-10-04' group by device_type(ua_device_family, ua_browser_family, ua_os_family)
…
WARNINGS: ImpalaRuntimeException: Unable to find class.
CAUSED BY: ClassNotFoundException: com.criteo.hadoop.hive.udf.UDFUserAgentToDeviceType

If you want me to send across the full complex query I can do so as well.

Any temporary workaround you can suggest for us to try out?

Thanks,

-- Piyush


From: Vuk Ercegovac <ve...@cloudera.com>
Reply-To: "user@impala.apache.org" <us...@impala.apache.org>
Date: Wednesday, March 7, 2018 at 6:35 PM
To: "user@impala.apache.org" <us...@impala.apache.org>
Subject: Re: Debugging ClassNotFound errors while trying to use Hive UDFs

An issue that we're working on is described here: https://issues.apache.org/jira/browse/IMPALA-6215.

For your scenario, does the ClassNotFoundException come up deterministically for the complex query or just
once in a while? When you test this query, is it the only query running on the system?

If you don't mind sharing the query so that I can develop a repro, that would be appreciated.



Re: Debugging ClassNotFound errors while trying to use Hive UDFs

Posted by Vuk Ercegovac <ve...@cloudera.com>.
An issue that we're working on is described here:
https://issues.apache.org/jira/browse/IMPALA-6215.

For your scenario, does the ClassNotFoundException come up
deterministically for the complex query or just
once in a while? When you test this query, is it the only query running on
the system?

If you don't mind sharing the query so that I can develop a repro, that
would be appreciated.
