Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2017/09/01 23:35:00 UTC

[jira] [Resolved] (KUDU-2127) Undesired classes in kudu-client-tools and kudu-spark2-tools_2.11 jars

     [ https://issues.apache.org/jira/browse/KUDU-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adar Dembo resolved KUDU-2127.
------------------------------
       Resolution: Not A Bug
    Fix Version/s: NA

After some reflection and testing, I think the inclusion of parquet-hadoop in kudu-client-tools and spark-avro in kudu-spark-tools should remain. I've found that few platform providers supply these artifacts on the MR/spark-submit classpath, so bundling them in our JARs is a convenience for our users.

> Undesired classes in kudu-client-tools and kudu-spark2-tools_2.11 jars
> ----------------------------------------------------------------------
>
>                 Key: KUDU-2127
>                 URL: https://issues.apache.org/jira/browse/KUDU-2127
>             Project: Kudu
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>            Reporter: Adar Dembo
>             Fix For: NA
>
>
> Saw this while examining a 1.5.0 RC. It's probably not important enough to warrant a new RC, but it's something we should fix anyway.
> The kudu-client-tools JAR has Apache Commons and Parquet classes in it. The kudu-spark2-tools_2.11 JAR has Spark Avro, Apache Avro, Apache Commons, and other classes in it. These are all extraneous.
> I believe these class inclusions were introduced in commit 5d53a3b, namely in these pom.xml changes:
> {noformat}
> diff --git a/java/kudu-client-tools/pom.xml b/java/kudu-client-tools/pom.xml
> index d4908fa..65ac4e3 100644
> --- a/java/kudu-client-tools/pom.xml
> +++ b/java/kudu-client-tools/pom.xml
> @@ -86,6 +86,11 @@
>              <version>${slf4j.version}</version>
>              <scope>test</scope>
>          </dependency>
> +        <dependency>
> +            <groupId>org.apache.parquet</groupId>
> +            <artifactId>parquet-hadoop</artifactId>
> +            <version>${parquet.version}</version>
> +        </dependency>
>      </dependencies>
>  
> diff --git a/java/kudu-spark-tools/pom.xml b/java/kudu-spark-tools/pom.xml
> index c2eb57f..98ffe28 100644
> --- a/java/kudu-spark-tools/pom.xml
> +++ b/java/kudu-spark-tools/pom.xml
> @@ -98,6 +99,11 @@
>              <scope>test</scope>
>          </dependency>
>          <dependency>
> +            <groupId>com.databricks</groupId>
> +            <artifactId>spark-avro_2.10</artifactId>
> +            <version>${sparkavro.version}</version>
> +        </dependency>
> +        <dependency>
> {noformat}
> Both of these new dependencies should probably be of scope 'provided'.
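
For reference, the 'provided' scoping proposed in the description would look roughly like the sketch below. It is only an illustration of that proposal, built from the coordinates in the diff above; it assumes the tools JARs are built so that 'provided' dependencies are left out of the packaged artifact, which is the usual Maven behavior but is not confirmed here for Kudu's build. The issue was resolved as Not A Bug, so this change was not made.

{noformat}
<!-- java/kudu-client-tools/pom.xml (sketch, not an applied change) -->
<dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-hadoop</artifactId>
    <version>${parquet.version}</version>
    <!-- 'provided': available at compile time, expected to be supplied
         by the runtime environment (e.g. the MR classpath) -->
    <scope>provided</scope>
</dependency>

<!-- java/kudu-spark-tools/pom.xml (sketch, not an applied change) -->
<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-avro_2.10</artifactId>
    <version>${sparkavro.version}</version>
    <!-- 'provided': expected on the spark-submit classpath at runtime -->
    <scope>provided</scope>
</dependency>
{noformat}

With this scoping, users on platforms that do not ship these artifacts would have to add them to the classpath themselves, which is the inconvenience the resolution comment cites as the reason to keep them bundled.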


