Posted to issues@spark.apache.org by "Patrick Wendell (JIRA)" <ji...@apache.org> on 2014/05/12 08:32:14 UTC

[jira] [Updated] (SPARK-1802) Audit dependency graph when Spark is built with -Phive

     [ https://issues.apache.org/jira/browse/SPARK-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-1802:
-----------------------------------

    Description: 
I'd like the binary release for 1.0 to include Hive support. Since Hive isn't enabled by default in the build, I don't think it's as well tested, so we should dig around a bit and decide whether we need to, e.g., add any excludes.

{code}
# Jar names on the assembly classpath without the Hive profile
$ mvn install -Phive -DskipTests && mvn dependency:build-classpath -pl assembly | grep -v INFO | tr ":" "\n" | awk -F/ '{ print $NF }' | sort > without_hive.txt

# Jar names on the assembly classpath with the Hive profile
$ mvn install -Phive -DskipTests && mvn dependency:build-classpath -Phive -pl assembly | grep -v INFO | tr ":" "\n" | awk -F/ '{ print $NF }' | sort > with_hive.txt

# Lines marked "<" are jars that appear only when -Phive is enabled
$ diff with_hive.txt without_hive.txt
< antlr-2.7.7.jar
< antlr-3.4.jar
< antlr-runtime-3.4.jar
10,14d6
< avro-1.7.4.jar
< avro-ipc-1.7.4.jar
< avro-ipc-1.7.4-tests.jar
< avro-mapred-1.7.4.jar
< bonecp-0.7.1.RELEASE.jar
22d13
< commons-cli-1.2.jar
25d15
< commons-compress-1.4.1.jar
33,34d22
< commons-logging-1.1.1.jar
< commons-logging-api-1.0.4.jar
38d25
< commons-pool-1.5.4.jar
46,49d32
< datanucleus-api-jdo-3.2.1.jar
< datanucleus-core-3.2.2.jar
< datanucleus-rdbms-3.2.1.jar
< derby-10.4.2.0.jar
53,57d35
< hive-common-0.12.0.jar
< hive-exec-0.12.0.jar
< hive-metastore-0.12.0.jar
< hive-serde-0.12.0.jar
< hive-shims-0.12.0.jar
60,61d37
< httpclient-4.1.3.jar
< httpcore-4.1.3.jar
68d43
< JavaEWAH-0.3.2.jar
73d47
< javolution-5.5.1.jar
76d49
< jdo-api-3.0.1.jar
78d50
< jetty-6.1.26.jar
87d58
< jetty-util-6.1.26.jar
93d63
< json-20090211.jar
98d67
< jta-1.1.jar
103,104d71
< libfb303-0.9.0.jar
< libthrift-0.9.0.jar
112d78
< mockito-all-1.8.5.jar
136d101
< servlet-api-2.5-20081211.jar
139d103
< snappy-0.2.jar
144d107
< spark-hive_2.10-1.0.0.jar
151d113
< ST4-4.0.4.jar
153d114
< stringtemplate-3.2.1.jar
156d116
< velocity-1.7.jar
158d117
< xz-1.0.jar
{code}
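
The diff above mixes hunk headers with jar names. If only the list of jars added by the Hive profile is wanted, something like the following may be easier to read. This is a minimal sketch: the function name added_jars is made up for illustration, and it assumes the two listings hold one jar name per line.

```shell
# Hypothetical helper: print the lines that appear only in the second listing.
added_jars() {
  tmp=$(mktemp -d) || return 1
  # comm needs both inputs sorted in the same collation, so re-sort under LC_ALL=C.
  LC_ALL=C sort "$1" > "$tmp/first"
  LC_ALL=C sort "$2" > "$tmp/second"
  # -1 drops lines unique to the first file, -3 drops lines common to both,
  # leaving only the lines unique to the second file.
  LC_ALL=C comm -13 "$tmp/first" "$tmp/second"
  rm -rf "$tmp"
}
```

With the files produced above, it would be called as: added_jars without_hive.txt with_hive.txt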

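Some initial investigation suggests we may need to take precautions around (a) jetty and (b) servlet-api: the Hive profile pulls in jetty-6.1.26, jetty-util-6.1.26 and servlet-api-2.5, which may clash with the versions already on the classpath.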
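
To see at a glance every jetty and servlet jar that ends up on the Hive-enabled classpath, a filter along these lines could be used. It is only a sketch: the function name suspect_jars and the match pattern are assumptions for illustration, and a name match alone does not prove a conflict.

```shell
# Hypothetical helper: list jar names in a classpath listing that mention
# jetty or servlet (case-insensitive), so overlapping versions stand out.
suspect_jars() {
  grep -i -E 'jetty|servlet' "$1"
}
```

Called as suspect_jars with_hive.txt, it prints the matching jar names one per line and exits non-zero when nothing matches.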


> Audit dependency graph when Spark is built with -Phive
> ------------------------------------------------------
>
>                 Key: SPARK-1802
>                 URL: https://issues.apache.org/jira/browse/SPARK-1802
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Patrick Wendell
>            Priority: Blocker
>             Fix For: 1.0.0
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)