Posted to user@spark.apache.org by Jianshi Huang <ji...@gmail.com> on 2014/06/16 13:37:12 UTC

Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Hi,

I'm trying to use Accumulo with Spark by writing to AccumuloOutputFormat.
It all went well on my laptop (Accumulo MockInstance + Spark local mode).

But when I try to submit it to the YARN cluster, the YARN logs show the
following error message:

14/06/16 02:01:44 INFO cluster.YarnClientClusterScheduler:
YarnClientClusterScheduler.postStartHook done
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.commons.codec.binary.Base64.encodeBase64String([B)Ljava/lang/String;
        at
org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.setConnectorInfo(ConfiguratorBase.java:127)
        at
org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat.setConnectorInfo(AccumuloOutputFormat.java:92)
        at
com.paypal.rtgraph.demo.MapReduceWriter$.main(MapReduceWriter.scala:44)
        at
com.paypal.rtgraph.demo.MapReduceWriter.main(MapReduceWriter.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at
org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


It looks like Accumulo's dependencies have a problem.

Does anyone know what's wrong with my code or settings? I've added all the
needed jars to Spark's classpath, and I confirmed that commons-codec-1.7.jar
has been uploaded to HDFS:

14/06/16 04:36:02 INFO yarn.Client: Uploading
file:/x/home/jianshuang/tmp/lib/commons-codec-1.7.jar to
hdfs://manny-lvs/user/jianshuang/.sparkStaging/application_1401752249873_12662/commons-codec-1.7.jar



And here's my spark-submit command (all the needed JARs are concatenated
after --jars):

~/spark/spark-1.0.0-bin-hadoop2/bin/spark-submit --name 'rtgraph' --class
com.paypal.rtgraph.demo.Tables --master yarn --deploy-mode cluster --jars
`find lib -type f | tr '\n' ','` --driver-memory 4G --driver-cores 4
--executor-memory 20G --executor-cores 8 --num-executors 2 rtgraph.jar
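As an aside, `find lib -type f | tr '\n' ','` leaves a trailing comma on the
--jars value, which can confuse argument parsing. A sketch of a join that
avoids it (the lib directory name is just the one used above):

```shell
# Build the --jars value by joining jar paths with commas.
# Unlike `tr '\n' ','`, `paste -sd,` leaves no trailing comma.
LIB_DIR=${LIB_DIR:-lib}
JARS=$(find "$LIB_DIR" -type f -name '*.jar' | paste -sd, -)
echo "$JARS"
```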

I've tried both cluster mode and client mode and neither worked.


BTW, I tried to use sbt-assembly to create a bundled jar; however, I always
got the following error:

[error] (*:assembly) deduplicate: different file contents found in the
following:
[error]
/Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.transaction/orbits/javax.transaction-1.1.1.v201105210645.jar:META-INF/ECLIPSEF.RSA
[error]
/Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.servlet/orbits/javax.servlet-3.0.0.v201112011016.jar:META-INF/ECLIPSEF.RSA
[error]
/Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.mail.glassfish/orbits/javax.mail.glassfish-1.4.1.v201005082020.jar:META-INF/ECLIPSEF.RSA
[error]
/Users/jianshuang/.ivy2/cache/org.eclipse.jetty.orbit/javax.activation/orbits/javax.activation-1.1.0.v201105071233.jar:META-INF/ECLIPSEF.RSA

I googled it, and it looks like I need to exclude some JARs. Has anyone done
that? Your help is really appreciated.
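For what it's worth, the usual fix for these deduplicate errors is an
assembly merge strategy that discards the conflicting META-INF signature
files. A sketch using the sbt-assembly 0.x syntax of that era (the key was
later renamed assemblyMergeStrategy, so adjust to your plugin version):

```scala
// build.sbt sketch -- discard jar signature files so the Jetty orbit
// jars no longer collide during assembly.
import sbtassembly.Plugin._
import AssemblyKeys._

assemblySettings

mergeStrategy in assembly := {
  // Drop *.RSA / *.SF / *.DSA entries under META-INF.
  case PathList("META-INF", xs @ _*)
      if xs.lastOption.exists(n =>
        n.endsWith(".RSA") || n.endsWith(".SF") || n.endsWith(".DSA")) =>
    MergeStrategy.discard
  // Blunt but common default: keep the first copy of everything else.
  case _ => MergeStrategy.first
}
```

MergeStrategy.first as the catch-all is a blunt default; if a real content
conflict matters, handle that path explicitly instead.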



Cheers,

-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by anoldbrain <an...@gmail.com>.
Assuming "this should not happen," I don't want to keep building a custom
version of Spark for every new release, so I prefer the workaround.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-Spark-Accumulo-Error-java-lang-NoSuchMethodError-org-apache-commons-codec-binary-Base64-eng-tp7667p8140.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by Jianshi Huang <ji...@gmail.com>.
Thanks, I solved it by recompiling Spark (I think that's the preferred way).
But I agree that the official Spark builds for hadoop2 need to be compiled
with newer libs.

Jianshi


On Mon, Jun 23, 2014 at 7:41 PM, anoldbrain <an...@gmail.com> wrote:

> found a workaround by adding "SPARK_CLASSPATH=.../commons-codec-xxx.jar" to
> spark-env.sh



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by anoldbrain <an...@gmail.com>.
found a workaround by adding "SPARK_CLASSPATH=.../commons-codec-xxx.jar" to
spark-env.sh 
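Spelled out, the workaround might look like this in conf/spark-env.sh (the
path is illustrative; note that later Spark releases deprecate
SPARK_CLASSPATH in favor of spark.driver.extraClassPath and
spark.executor.extraClassPath):

```shell
# conf/spark-env.sh -- put the newer commons-codec on the classpath so
# its Base64 methods are visible at runtime (path is illustrative).
export SPARK_CLASSPATH=/opt/libs/commons-codec-1.7.jar:$SPARK_CLASSPATH
```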



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-Spark-Accumulo-Error-java-lang-NoSuchMethodError-org-apache-commons-codec-binary-Base64-eng-tp7667p8117.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by anoldbrain <an...@gmail.com>.
I used a Java decompiler to check the included
"org.apache.commons.codec.binary.Base64" .class file (in the spark-assembly
jar): for both "encodeBase64" and "decodeBase64" there is only the (byte[])
version, and no encodeBase64String/decodeBase64(String).

I have encountered the reported issue. This conflicts with many libraries
that use commons-codec >= 1.4. Since spark-assembly takes precedence over
the Spark app jar, I have yet to find a way to work around this.
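Without a decompiler, the same check can be done at runtime with reflection.
A small sketch (the class/method names passed in main are just defaults; on
the cluster you would pass "org.apache.commons.codec.binary.Base64" and
"encodeBase64String"):

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathCheck {
    // True if the class exposes a public method with the given name.
    static boolean hasMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)) return true;
        }
        return false;
    }

    // The jar/location a class was loaded from, or null for JDK classes.
    static String loadedFrom(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        Class<?> cls = Class.forName(args.length > 0 ? args[0] : "java.lang.String");
        String method = args.length > 1 ? args[1] : "substring";
        System.out.println("loaded from: " + loadedFrom(cls));
        System.out.println("has " + method + ": " + hasMethod(cls, method));
    }
}
```

Run with the Base64 class name on the same classpath as the failing job to
see which jar wins and whether the 1.4+ methods are present.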

Any advice, please?

Thank you. 



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-Spark-Accumulo-Error-java-lang-NoSuchMethodError-org-apache-commons-codec-binary-Base64-eng-tp7667p8111.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by Sean Owen <so...@cloudera.com>.
No, this is just standard Maven informational license info in
META-INF. It is not going to affect runtime behavior or how classes
are loaded.

On Mon, Jun 23, 2014 at 6:30 AM, anoldbrain <an...@gmail.com> wrote:
> I checked the META-INF/DEPENDENCIES file in the spark-assembly jar from
> official 1.0.0 binary release for CDH4, and found one "commons-codec" entry
>
> From: 'The Apache Software Foundation' (http://jakarta.apache.org)
>   - Codec (http://jakarta.apache.org/commons/codec/)
> commons-codec:commons-codec:jar:1.3
>     License: The Apache Software License, Version 2.0  (/LICENSE.txt)
>   - Digester (http://jakarta.apache.org/commons/digester/)
> commons-digester:commons-digester:jar:1.8
>     License: The Apache Software License, Version 2.0  (/LICENSE.txt)
>   - EL (http://jakarta.apache.org/commons/el/) commons-el:commons-el:jar:1.0
>     License: The Apache Software License, Version 2.0  (/LICENSE.txt)
>
> Should this be filed as a bug?

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by anoldbrain <an...@gmail.com>.
I checked the META-INF/DEPENDENCIES file in the spark-assembly jar from
official 1.0.0 binary release for CDH4, and found one "commons-codec" entry

From: 'The Apache Software Foundation' (http://jakarta.apache.org)
  - Codec (http://jakarta.apache.org/commons/codec/)
commons-codec:commons-codec:jar:1.3
    License: The Apache Software License, Version 2.0  (/LICENSE.txt)
  - Digester (http://jakarta.apache.org/commons/digester/)
commons-digester:commons-digester:jar:1.8
    License: The Apache Software License, Version 2.0  (/LICENSE.txt)
  - EL (http://jakarta.apache.org/commons/el/) commons-el:commons-el:jar:1.0
    License: The Apache Software License, Version 2.0  (/LICENSE.txt)

Should this be filed as a bug?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-Spark-Accumulo-Error-java-lang-NoSuchMethodError-org-apache-commons-codec-binary-Base64-eng-tp7667p8102.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by Jianshi Huang <ji...@gmail.com>.
With help from the Accumulo guys, I probably know why.

I'm using the binary distro of Spark, so Base64 is loaded from
spark-assembly.jar, which probably bundles an older version of
commons-codec.

I'll need to rebuild Spark from source.

Jianshi


On Mon, Jun 16, 2014 at 9:18 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Hi
>
> Check in your driver programs Environment, (eg:
> http://192.168.1.39:4040/environment/). If you don't see this
> commons-codec-1.7.jar jar then that's the issue.
>
> Thanks
> Best Regards
>


-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Re: Need help. Spark + Accumulo => Error: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Hi

Check your driver program's Environment tab (e.g.
http://192.168.1.39:4040/environment/). If you don't see the
commons-codec-1.7.jar there, that's the issue.

Thanks
Best Regards

