Posted to issues-all@impala.apache.org by "Fang-Yu Rao (Jira)" <ji...@apache.org> on 2021/09/07 18:01:00 UTC

[jira] [Updated] (IMPALA-10906) Adjust dependencies after HIVE-24852

     [ https://issues.apache.org/jira/browse/IMPALA-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fang-Yu Rao updated IMPALA-10906:
---------------------------------
    Description: 
HIVE-24852 added a dependency on the artifact {{hadoop-hdfs}} to the {{hive-service}} module, which is itself a dependency of {{hive-jdbc}}, an artifact Impala relies on. Because {{hadoop-hdfs}} transitively pulls in {{jersey-server}}, which is a banned dependency in Impala, we had to explicitly exclude {{jersey-server}} when adding {{hive-jdbc}} as a dependency so that the Impala frontend can still be compiled the next time we bump {{CDP_BUILD_NUMBER}} to a build that includes this Hive patch.

Moreover, after HIVE-24852, creating a partitioned Iceberg table requires the class {{org.apache.hadoop.hdfs.protocol.SnapshotException}} to be on Impala's classpath at runtime, so we had to explicitly add a dependency on the artifact {{hadoop-hdfs}} to keep such an operation from failing with a {{NoClassDefFoundError}}. We also had to explicitly exclude some banned artifacts that {{hadoop-hdfs}} pulls in transitively so that the Impala frontend can be compiled.

For easy reference, a query that reproduces the {{NoClassDefFoundError}} is provided below. A very similar query will be executed while loading Impala's test data, so once the dependencies are properly revised, we expect Impala to load the test data correctly.
{code:sql}
[localhost:21050] default> CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
                         > PARTITION BY SPEC (i identity, j identity)
                         > STORED AS ICEBERG;
Query: CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
PARTITION BY SPEC (i identity, j identity)
STORED AS ICEBERG
ERROR: NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/SnapshotException
CAUSED BY: ClassNotFoundException: org.apache.hadoop.hdfs.protocol.SnapshotException
{code}
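
As a rough diagnostic sketch (not part of the actual patch), the class named in the error above can be probed for directly on a given classpath. The class name {{ClasspathProbe}} and the helper {{isOnClasspath}} are hypothetical names introduced only for illustration; the sketch assumes it is run with the same classpath as the process being checked.

```java
// Hypothetical helper for illustration only; not part of Impala or Hive.
final class ClasspathProbe {

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it links against, is missing or unloadable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class reported as missing in the reproduction above.
        String name = "org.apache.hadoop.hdfs.protocol.SnapshotException";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

Running this with and without {{hadoop-hdfs}} on the classpath should show whether the dependency change makes the class visible. It does not check the banned-dependency rules, which are enforced at build time.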

  was:
HIVE-24852 added a dependency on the artifact {{hadoop-hdfs}} to the {{hive-service}} module, which is itself a dependency of {{hive-jdbc}}, an artifact Impala relies on. Because {{hadoop-hdfs}} transitively pulls in {{jersey-server}}, which is a banned dependency in Impala, we had to explicitly exclude {{jersey-server}} when adding {{hive-jdbc}} as a dependency so that the Impala frontend can still be compiled the next time we bump {{CDP_BUILD_NUMBER}} to a build that includes this Hive patch.

Moreover, after HIVE-24852, creating a partitioned Iceberg table requires the class {{org.apache.hadoop.hdfs.protocol.SnapshotException}} to be on Impala's classpath at runtime, so we had to explicitly add a dependency on the artifact {{hadoop-hdfs}} to keep such an operation from failing with a {{NoClassDefFoundError}}. We also had to explicitly exclude some banned artifacts that {{hadoop-hdfs}} pulls in transitively so that the Impala frontend can be compiled.

For easy reference, a query that reproduces the {{NoClassDefFoundError}} is provided below. A very similar query has been executed while loading Impala's test data, so once the dependencies are properly revised, we expect Impala to load the test data correctly.
{code:sql}
[localhost:21050] default> CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
                         > PARTITION BY SPEC (i identity, j identity)
                         > STORED AS ICEBERG;
Query: CREATE TABLE IF NOT EXISTS default.iceberg_int_partitioned_tbl (i INT, j INT, k INT)
PARTITION BY SPEC (i identity, j identity)
STORED AS ICEBERG
ERROR: NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/SnapshotException
CAUSED BY: ClassNotFoundException: org.apache.hadoop.hdfs.protocol.SnapshotException
{code}



> Adjust dependencies after HIVE-24852
> ------------------------------------
>
>                 Key: IMPALA-10906
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10906
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Major
>


