Posted to dev@spark.apache.org by Richard Hillegas <rh...@us.ibm.com> on 2015/09/22 22:28:29 UTC

Derby version in Spark


I see that lib_managed/jars holds these old Derby versions:

  lib_managed/jars/derby-10.10.1.1.jar
  lib_managed/jars/derby-10.10.2.0.jar

The Derby 10.10 release family supports some ancient JVMs: Java SE 5 and
Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone running
Spark on the resource-constrained Java ME platform. Is Spark really
deployed on Java SE 5? Is there some other reason that Spark uses the 10.10
Derby family?

If no one needs those ancient JVMs, maybe we could consider changing the
Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1 release (both
run on Java 6 and up).

Thanks,
-Rick
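
To make the version comparison above concrete, here is a minimal sketch in Python. The table of minimum Java versions is an assumption built only from the claims in this message (10.10 supports Java SE 5; 10.11 and 10.12 run on Java 6 and up), not from the Derby documentation, and the helper names are hypothetical.

```python
# Minimum Java version per Derby release family. These values restate
# the claims made in this thread; they are an assumption, not Derby docs.
MIN_JAVA_BY_FAMILY = {
    (10, 10): 5,  # "supports ... Java SE 5"
    (10, 11): 6,  # "run on Java 6 and up"
    (10, 12): 6,
}


def parse_version(text):
    """Turn '10.10.2.0' into (10, 10, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def family(text):
    """Return the release family, i.e. the first two version components."""
    return parse_version(text)[:2]


def min_java(text):
    """Look up the minimum Java version for a Derby version, or None if unknown."""
    return MIN_JAVA_BY_FAMILY.get(family(text))


bundled = ["10.10.1.1", "10.10.2.0"]   # jars seen under lib_managed/jars
proposed = ["10.11.1.1", "10.12.1.1"]  # versions suggested in this message

for version in bundled + proposed:
    print(version, "family", family(version), "min Java", min_java(version))
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "10.9" would sort after "10.10".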

Re: Derby version in Spark

Posted by Richard Hillegas <rh...@us.ibm.com>.
Thanks, Ted. I'll follow up with the Hive folks.

Cheers,
-Rick

Ted Yu <yu...@gmail.com> wrote on 09/22/2015 03:41:12 PM:

> From: Ted Yu <yu...@gmail.com>
> To: Richard Hillegas/San Francisco/IBM@IBMUS
> Cc: Dev <de...@spark.apache.org>
> Date: 09/22/2015 03:41 PM
> Subject: Re: Derby version in Spark
>
> I cloned Hive 1.2 code base and saw:
>
>     <derby.version>10.10.2.0</derby.version>
>
> So the version used by Spark is quite close to what Hive uses.
>
> On Tue, Sep 22, 2015 at 3:29 PM, Ted Yu <yu...@gmail.com> wrote:
> I see.
> I use maven to build so I observe different contents under
> lib_managed directory.
>
> Here is snippet of dependency tree:
>
> [INFO] |  +- org.spark-project.hive:hive-metastore:jar:1.2.1.spark:compile
> [INFO] |  |  +- com.jolbox:bonecp:jar:0.8.0.RELEASE:compile
> [INFO] |  |  +- org.apache.derby:derby:jar:10.10.1.1:compile
>
> On Tue, Sep 22, 2015 at 3:21 PM, Richard Hillegas <rh...@us.ibm.com>
> wrote:
> Thanks, Ted. I'm working on my master branch. The lib_managed/jars
> directory has a lot of jarballs, including hadoop and hive. Maybe
> these were faulted in when I built with the following command?
>
>   sbt/sbt -Phive assembly/assembly
>
> The Derby jars seem to be used in order to manage the metastore_db
> database. Maybe my question should be directed to the Hive community?
>
> Thanks,
> -Rick
>
> Ted Yu <yu...@gmail.com> wrote on 09/22/2015 01:32:39 PM:
>
> > From: Ted Yu <yu...@gmail.com>
> > To: Richard Hillegas/San Francisco/IBM@IBMUS
> > Cc: Dev <de...@spark.apache.org>
> > Date: 09/22/2015 01:33 PM
> > Subject: Re: Derby version in Spark
>
> >
> > Which Spark release are you building ?
> >
> > For master branch, I get the following:
> >
> > lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
> > lib_managed/jars/datanucleus-core-3.2.10.jar
> > lib_managed/jars/datanucleus-rdbms-3.2.9.jar
> >
> > FYI
> >
> > On Tue, Sep 22, 2015 at 1:28 PM, Richard Hillegas <rh...@us.ibm.com>
> > wrote:
> > I see that lib_managed/jars holds these old Derby versions:
> >
> >   lib_managed/jars/derby-10.10.1.1.jar
> >   lib_managed/jars/derby-10.10.2.0.jar
> >
> > The Derby 10.10 release family supports some ancient JVMs: Java SE 5
> > and Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone
> > running Spark on the resource-constrained Java ME platform. Is Spark
> > really deployed on Java SE 5? Is there some other reason that Spark
> > uses the 10.10 Derby family?
> >
> > If no-one needs those ancient JVMs, maybe we could consider changing
> > the Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1
> > release (both run on Java 6 and up).
> >
> > Thanks,
> > -Rick

Re: Derby version in Spark

Posted by Ted Yu <yu...@gmail.com>.
I cloned Hive 1.2 code base and saw:

    <derby.version>10.10.2.0</derby.version>

So the version used by Spark is quite close to what Hive uses.
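
As a rough illustration of that comparison, the sketch below pulls the property value out of the POM fragment quoted above and checks whether it falls in the same release family as the Derby version reported for Spark in this thread. The helper name `property_value` is hypothetical, and a regular expression is used only because the input here is a one-line fragment; a full pom.xml would be better handled with an XML parser.

```python
import re


def property_value(pom_text, name):
    """Return the text of <name>...</name> in a POM fragment, or None."""
    tag = re.escape(name)
    match = re.search(r"<{0}>\s*([^<]+?)\s*</{0}>".format(tag), pom_text)
    return match.group(1) if match else None


# Fragment quoted from the Hive 1.2 code base in this message.
hive_pom_fragment = "<derby.version>10.10.2.0</derby.version>"

hive_derby = property_value(hive_pom_fragment, "derby.version")
spark_derby = "10.10.1.1"  # from the dependency tree reported in this thread

# Same release family means the first two components match.
same_family = hive_derby.split(".")[:2] == spark_derby.split(".")[:2]
print(hive_derby, spark_derby, same_family)
```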

On Tue, Sep 22, 2015 at 3:29 PM, Ted Yu <yu...@gmail.com> wrote:

> I see.
> I use maven to build so I observe different contents under lib_managed
> directory.
>
> Here is snippet of dependency tree:
>
> [INFO] |  +- org.spark-project.hive:hive-metastore:jar:1.2.1.spark:compile
> [INFO] |  |  +- com.jolbox:bonecp:jar:0.8.0.RELEASE:compile
> [INFO] |  |  +- org.apache.derby:derby:jar:10.10.1.1:compile
>
> On Tue, Sep 22, 2015 at 3:21 PM, Richard Hillegas <rh...@us.ibm.com>
> wrote:
>
>> Thanks, Ted. I'm working on my master branch. The lib_managed/jars
>> directory has a lot of jarballs, including hadoop and hive. Maybe these
>> were faulted in when I built with the following command?
>>
>>   sbt/sbt -Phive assembly/assembly
>>
>> The Derby jars seem to be used in order to manage the metastore_db
>> database. Maybe my question should be directed to the Hive community?
>>
>> Thanks,
>> -Rick
>>
>> Ted Yu <yu...@gmail.com> wrote on 09/22/2015 01:32:39 PM:
>>
>> > From: Ted Yu <yu...@gmail.com>
>> > To: Richard Hillegas/San Francisco/IBM@IBMUS
>> > Cc: Dev <de...@spark.apache.org>
>> > Date: 09/22/2015 01:33 PM
>> > Subject: Re: Derby version in Spark
>>
>> >
>> > Which Spark release are you building ?
>> >
>> > For master branch, I get the following:
>> >
>> > lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
>> > lib_managed/jars/datanucleus-core-3.2.10.jar
>> > lib_managed/jars/datanucleus-rdbms-3.2.9.jar
>> >
>> > FYI
>> >
>> > On Tue, Sep 22, 2015 at 1:28 PM, Richard Hillegas <rh...@us.ibm.com>
>> > wrote:
>> > I see that lib_managed/jars holds these old Derby versions:
>> >
>> >   lib_managed/jars/derby-10.10.1.1.jar
>> >   lib_managed/jars/derby-10.10.2.0.jar
>> >
>> > The Derby 10.10 release family supports some ancient JVMs: Java SE 5
>> > and Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone
>> > running Spark on the resource-constrained Java ME platform. Is Spark
>> > really deployed on Java SE 5? Is there some other reason that Spark
>> > uses the 10.10 Derby family?
>> >
>> > If no-one needs those ancient JVMs, maybe we could consider changing
>> > the Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1
>> > release (both run on Java 6 and up).
>> >
>> > Thanks,
>> > -Rick
>>
>>
>

Re: Derby version in Spark

Posted by Ted Yu <yu...@gmail.com>.
I see.
I use Maven to build, so I see different contents under the lib_managed
directory.

Here is a snippet of the dependency tree:

[INFO] |  +- org.spark-project.hive:hive-metastore:jar:1.2.1.spark:compile
[INFO] |  |  +- com.jolbox:bonecp:jar:0.8.0.RELEASE:compile
[INFO] |  |  +- org.apache.derby:derby:jar:10.10.1.1:compile
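
For anyone who wants to trace this automatically, here is a minimal sketch in Python that reads lines in the format shown above (as printed by `mvn dependency:tree`) and reports the chain of artifacts leading to Derby. The function names are hypothetical, and the parser assumes each nesting level is indented by three characters, as in the snippet.

```python
def parse_tree(lines):
    """Yield (depth, coordinate) for each dependency-tree line."""
    for line in lines:
        text = line.replace("[INFO] ", "", 1)
        for marker in ("+- ", "\\- "):
            pos = text.find(marker)
            if pos != -1:
                # Each nesting level adds three characters of indentation.
                yield pos // 3, text[pos + len(marker):].strip()
                break
        else:
            # No branch marker: treat the line as the root of the tree.
            yield 0, text.strip()


def path_to(lines, needle):
    """Return the chain of coordinates from the outermost parent to the
    first coordinate containing `needle`, or None if it is not found."""
    stack = []  # (depth, coordinate) pairs for the current branch
    for depth, coord in parse_tree(lines):
        # Drop siblings and deeper entries left over from earlier branches.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        stack.append((depth, coord))
        if needle in coord:
            return [c for _, c in stack]
    return None


sample = [
    "[INFO] |  +- org.spark-project.hive:hive-metastore:jar:1.2.1.spark:compile",
    "[INFO] |  |  +- com.jolbox:bonecp:jar:0.8.0.RELEASE:compile",
    "[INFO] |  |  +- org.apache.derby:derby:jar:10.10.1.1:compile",
]

for coord in path_to(sample, "org.apache.derby:derby:"):
    print(coord)
```

On the three lines above this prints the hive-metastore coordinate followed by the Derby coordinate, which matches the reading that Derby arrives through hive-metastore.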

On Tue, Sep 22, 2015 at 3:21 PM, Richard Hillegas <rh...@us.ibm.com>
wrote:

> Thanks, Ted. I'm working on my master branch. The lib_managed/jars
> directory has a lot of jarballs, including hadoop and hive. Maybe these
> were faulted in when I built with the following command?
>
>   sbt/sbt -Phive assembly/assembly
>
> The Derby jars seem to be used in order to manage the metastore_db
> database. Maybe my question should be directed to the Hive community?
>
> Thanks,
> -Rick
>
> Ted Yu <yu...@gmail.com> wrote on 09/22/2015 01:32:39 PM:
>
> > From: Ted Yu <yu...@gmail.com>
> > To: Richard Hillegas/San Francisco/IBM@IBMUS
> > Cc: Dev <de...@spark.apache.org>
> > Date: 09/22/2015 01:33 PM
> > Subject: Re: Derby version in Spark
>
> >
> > Which Spark release are you building ?
> >
> > For master branch, I get the following:
> >
> > lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
> > lib_managed/jars/datanucleus-core-3.2.10.jar
> > lib_managed/jars/datanucleus-rdbms-3.2.9.jar
> >
> > FYI
> >
> > On Tue, Sep 22, 2015 at 1:28 PM, Richard Hillegas <rh...@us.ibm.com>
> > wrote:
> > I see that lib_managed/jars holds these old Derby versions:
> >
> >   lib_managed/jars/derby-10.10.1.1.jar
> >   lib_managed/jars/derby-10.10.2.0.jar
> >
> > The Derby 10.10 release family supports some ancient JVMs: Java SE 5
> > and Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone
> > running Spark on the resource-constrained Java ME platform. Is Spark
> > really deployed on Java SE 5? Is there some other reason that Spark
> > uses the 10.10 Derby family?
> >
> > If no-one needs those ancient JVMs, maybe we could consider changing
> > the Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1
> > release (both run on Java 6 and up).
> >
> > Thanks,
> > -Rick
>
>

Re: Derby version in Spark

Posted by Richard Hillegas <rh...@us.ibm.com>.
Thanks, Ted. I'm working on my master branch. The lib_managed/jars
directory has a lot of jarballs, including hadoop and hive. Maybe these
were faulted in when I built with the following command?

  sbt/sbt -Phive assembly/assembly

The Derby jars seem to be used in order to manage the metastore_db
database. Maybe my question should be directed to the Hive community?

Thanks,
-Rick

Here are the gory details:

bash-3.2$ ls lib_managed/jars
FastInfoset-1.2.12.jar  curator-test-2.4.0.jar  jersey-test-framework-grizzly2-1.9.jar  parquet-format-2.3.0-incubating.jar
JavaEWAH-0.3.2.jar  datanucleus-api-jdo-3.2.6.jar  jets3t-0.7.1.jar  parquet-generator-1.7.0.jar
ST4-4.0.4.jar  datanucleus-core-3.2.10.jar  jetty-continuation-8.1.14.v20131031.jar  parquet-hadoop-1.7.0.jar
activation-1.1.jar  datanucleus-rdbms-3.2.9.jar  jetty-http-8.1.14.v20131031.jar  parquet-hadoop-bundle-1.6.0.jar
akka-actor_2.10-2.3.11.jar  derby-10.10.1.1.jar  jetty-io-8.1.14.v20131031.jar  parquet-jackson-1.7.0.jar
akka-remote_2.10-2.3.11.jar  derby-10.10.2.0.jar  jetty-jndi-8.1.14.v20131031.jar  platform-3.4.0.jar
akka-slf4j_2.10-2.3.11.jar  genjavadoc-plugin_2.10.4-0.9-spark0.jar  jetty-plus-8.1.14.v20131031.jar  pmml-agent-1.1.15.jar
akka-testkit_2.10-2.3.11.jar  groovy-all-2.1.6.jar  jetty-security-8.1.14.v20131031.jar  pmml-model-1.1.15.jar
antlr-2.7.7.jar  guava-11.0.2.jar  jetty-server-8.1.14.v20131031.jar  pmml-schema-1.1.15.jar
antlr-runtime-3.4.jar  guice-3.0.jar  jetty-servlet-8.1.14.v20131031.jar  postgresql-9.3-1102-jdbc41.jar
aopalliance-1.0.jar  h2-1.4.183.jar  jetty-util-6.1.26.jar  py4j-0.8.2.1.jar
arpack_combined_all-0.1-javadoc.jar  hadoop-annotations-2.2.0.jar  jetty-util-8.1.14.v20131031.jar  pyrolite-4.4.jar
arpack_combined_all-0.1.jar  hadoop-auth-2.2.0.jar  jetty-webapp-8.1.14.v20131031.jar  quasiquotes_2.10-2.0.0.jar
asm-3.2.jar  hadoop-client-2.2.0.jar  jetty-websocket-8.1.14.v20131031.jar  reflectasm-1.07-shaded.jar
avro-1.7.4.jar  hadoop-common-2.2.0.jar  jetty-xml-8.1.14.v20131031.jar  sac-1.3.jar
avro-1.7.7.jar  hadoop-hdfs-2.2.0.jar  jline-0.9.94.jar  scala-compiler-2.10.0.jar
avro-ipc-1.7.7-tests.jar  hadoop-mapreduce-client-app-2.2.0.jar  jline-2.10.4.jar  scala-compiler-2.10.4.jar
avro-ipc-1.7.7.jar  hadoop-mapreduce-client-common-2.2.0.jar  jline-2.12.jar  scala-library-2.10.4.jar
avro-mapred-1.7.7-hadoop2.jar  hadoop-mapreduce-client-core-2.2.0.jar  jna-3.4.0.jar  scala-reflect-2.10.4.jar
breeze-macros_2.10-0.11.2.jar  hadoop-mapreduce-client-jobclient-2.2.0.jar  joda-time-2.5.jar  scalacheck_2.10-1.11.3.jar
breeze_2.10-0.11.2.jar  hadoop-mapreduce-client-shuffle-2.2.0.jar  jodd-core-3.5.2.jar  scalap-2.10.0.jar
calcite-avatica-1.2.0-incubating.jar  hadoop-yarn-api-2.2.0.jar  json-20080701.jar  selenium-api-2.42.2.jar
calcite-core-1.2.0-incubating.jar  hadoop-yarn-client-2.2.0.jar  json-20090211.jar  selenium-chrome-driver-2.42.2.jar
calcite-linq4j-1.2.0-incubating.jar  hadoop-yarn-common-2.2.0.jar  json4s-ast_2.10-3.2.10.jar  selenium-firefox-driver-2.42.2.jar
cglib-2.2.1-v20090111.jar  hadoop-yarn-server-common-2.2.0.jar  json4s-core_2.10-3.2.10.jar  selenium-htmlunit-driver-2.42.2.jar
cglib-nodep-2.1_3.jar  hadoop-yarn-server-nodemanager-2.2.0.jar  json4s-jackson_2.10-3.2.10.jar  selenium-ie-driver-2.42.2.jar
chill-java-0.5.0.jar  hamcrest-core-1.1.jar  jsr173_api-1.0.jar  selenium-java-2.42.2.jar
chill_2.10-0.5.0.jar  hamcrest-core-1.3.jar  jsr305-1.3.9.jar  selenium-remote-driver-2.42.2.jar
commons-beanutils-1.7.0.jar  hamcrest-library-1.3.jar  jsr305-2.0.1.jar  selenium-safari-driver-2.42.2.jar
commons-beanutils-core-1.8.0.jar  hive-exec-1.2.1.spark.jar  jta-1.1.jar  selenium-support-2.42.2.jar
commons-cli-1.2.jar  hive-metastore-1.2.1.spark.jar  jtransforms-2.4.0.jar  serializer-2.7.1.jar
commons-codec-1.10.jar  htmlunit-2.14.jar  jul-to-slf4j-1.7.10.jar  slf4j-api-1.7.10.jar
commons-codec-1.4.jar  htmlunit-core-js-2.14.jar  junit-4.10.jar  slf4j-log4j12-1.7.10.jar
commons-codec-1.5.jar  httpclient-4.3.2.jar  junit-dep-4.10.jar  snappy-0.2.jar
commons-codec-1.9.jar  httpcore-4.3.1.jar  junit-dep-4.8.2.jar  spire-macros_2.10-0.7.4.jar
commons-collections-3.2.1.jar  httpmime-4.3.2.jar  junit-interface-0.10.jar  spire_2.10-0.7.4.jar
commons-compiler-2.7.8.jar  istack-commons-runtime-2.16.jar  junit-interface-0.9.jar  stax-api-1.0.1.jar
commons-compress-1.4.1.jar  ivy-2.4.0.jar  libfb303-0.9.2.jar  stream-2.7.0.jar
commons-configuration-1.6.jar  jackson-core-asl-1.8.8.jar  libthrift-0.9.2.jar  stringtemplate-3.2.1.jar
commons-dbcp-1.4.jar  jackson-core-asl-1.9.13.jar  lz4-1.3.0.jar  tachyon-client-0.7.1.jar
commons-digester-1.8.jar  jackson-jaxrs-1.8.8.jar  mesos-0.21.1-shaded-protobuf.jar  tachyon-underfs-hdfs-0.7.1.jar
commons-exec-1.1.jar  jackson-mapper-asl-1.9.13.jar  minlog-1.2.jar  tachyon-underfs-local-0.7.1.jar
commons-httpclient-3.1.jar  jackson-xc-1.8.8.jar  mockito-core-1.9.5.jar  test-interface-0.5.jar
commons-io-2.1.jar  janino-2.7.8.jar  mysql-connector-java-5.1.34.jar  test-interface-1.0.jar
commons-io-2.4.jar  jansi-1.4.jar  nekohtml-1.9.20.jar  uncommons-maths-1.2.2a.jar
commons-lang-2.5.jar  javassist-3.15.0-GA.jar  netty-all-4.0.29.Final.jar  unused-1.0.0.jar
commons-lang-2.6.jar  javax.inject-1.jar  objenesis-1.0.jar  webbit-0.4.14.jar
commons-lang3-3.3.2.jar  jaxb-api-2.2.2.jar  objenesis-1.2.jar  xalan-2.7.1.jar
commons-logging-1.1.3.jar  jaxb-api-2.2.7.jar  opencsv-2.3.jar  xercesImpl-2.11.0.jar
commons-math-2.1.jar  jaxb-core-2.2.7.jar  oro-2.0.8.jar  xml-apis-1.4.01.jar
commons-math-2.2.jar  jaxb-impl-2.2.3-1.jar  paranamer-2.3.jar  xmlenc-0.52.jar
commons-math3-3.4.1.jar  jaxb-impl-2.2.7.jar  paranamer-2.6.jar  xz-1.0.jar
commons-net-3.1.jar  jblas-1.2.4.jar  parquet-avro-1.7.0.jar  zookeeper-3.4.5.jar
commons-pool-1.5.4.jar  jcl-over-slf4j-1.7.10.jar  parquet-column-1.7.0.jar
core-1.1.2.jar  jdo-api-3.0.1.jar  parquet-common-1.7.0.jar
cssparser-0.9.13.jar  jersey-guice-1.9.jar  parquet-encoding-1.7.0.jar
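
A listing this long is easier to scan with a small script. The sketch below, in Python, groups jar file names by artifact and reports the ones present in more than one version, which is how the two Derby jars stand out. The regular expression is an assumption about the naming scheme (artifact name, a hyphen, then a version starting with a digit), and the input is a short excerpt from the listing above rather than the full directory.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <artifact>-<version>.jar, where the version
# starts with a digit. The lazy name match stops at the first such hyphen.
JAR_NAME = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def multi_version_artifacts(jar_names):
    """Map each artifact that appears in several versions to those versions."""
    versions = defaultdict(list)
    for jar in jar_names:
        match = JAR_NAME.match(jar)
        if match:
            versions[match.group("name")].append(match.group("version"))
    return {name: found for name, found in versions.items() if len(found) > 1}


# Excerpt from the lib_managed/jars listing above.
excerpt = [
    "akka-actor_2.10-2.3.11.jar",
    "commons-codec-1.10.jar",
    "commons-codec-1.4.jar",
    "commons-codec-1.5.jar",
    "commons-codec-1.9.jar",
    "derby-10.10.1.1.jar",
    "derby-10.10.2.0.jar",
    "guava-11.0.2.jar",
]

for name, found in multi_version_artifacts(excerpt).items():
    print(name, found)
```

To run it against a real checkout, the excerpt could be replaced with `os.listdir("lib_managed/jars")`.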

Ted Yu <yu...@gmail.com> wrote on 09/22/2015 01:32:39 PM:

> From: Ted Yu <yu...@gmail.com>
> To: Richard Hillegas/San Francisco/IBM@IBMUS
> Cc: Dev <de...@spark.apache.org>
> Date: 09/22/2015 01:33 PM
> Subject: Re: Derby version in Spark
>
> Which Spark release are you building ?
>
> For master branch, I get the following:
>
> lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
> lib_managed/jars/datanucleus-core-3.2.10.jar
> lib_managed/jars/datanucleus-rdbms-3.2.9.jar
>
> FYI
>
> On Tue, Sep 22, 2015 at 1:28 PM, Richard Hillegas <rh...@us.ibm.com>
> wrote:
> I see that lib_managed/jars holds these old Derby versions:
>
>   lib_managed/jars/derby-10.10.1.1.jar
>   lib_managed/jars/derby-10.10.2.0.jar
>
> The Derby 10.10 release family supports some ancient JVMs: Java SE 5
> and Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone
> running Spark on the resource-constrained Java ME platform. Is Spark
> really deployed on Java SE 5? Is there some other reason that Spark
> uses the 10.10 Derby family?
>
> If no-one needs those ancient JVMs, maybe we could consider changing
> the Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1
> release (both run on Java 6 and up).
>
> Thanks,
> -Rick

Re: Derby version in Spark

Posted by Ted Yu <yu...@gmail.com>.
Which Spark release are you building?

For master branch, I get the following:

lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
lib_managed/jars/datanucleus-core-3.2.10.jar
lib_managed/jars/datanucleus-rdbms-3.2.9.jar

FYI

On Tue, Sep 22, 2015 at 1:28 PM, Richard Hillegas <rh...@us.ibm.com>
wrote:

> I see that lib_managed/jars holds these old Derby versions:
>
>   lib_managed/jars/derby-10.10.1.1.jar
>   lib_managed/jars/derby-10.10.2.0.jar
>
> The Derby 10.10 release family supports some ancient JVMs: Java SE 5 and
> Java ME CDC/Foundation Profile 1.1. It's hard to imagine anyone running
> Spark on the resource-constrained Java ME platform. Is Spark really
> deployed on Java SE 5? Is there some other reason that Spark uses the 10.10
> Derby family?
>
> If no-one needs those ancient JVMs, maybe we could consider changing the
> Derby version to 10.11.1.1 or even to the upcoming 10.12.1.1 release (both
> run on Java 6 and up).
>
> Thanks,
> -Rick
>