Posted to dev@zeppelin.apache.org by "Eugene Sapozhnikov (JIRA)" <ji...@apache.org> on 2015/08/21 21:11:46 UTC
[jira] [Created] (ZEPPELIN-253) EMR Spark deployment: Class com.hadoop.compression.lzo.LzoCodec not found
Eugene Sapozhnikov created ZEPPELIN-253:
-------------------------------------------
Summary: EMR Spark deployment: Class com.hadoop.compression.lzo.LzoCodec not found
Key: ZEPPELIN-253
URL: https://issues.apache.org/jira/browse/ZEPPELIN-253
Project: Zeppelin
Issue Type: Bug
Components: Core
Affects Versions: 0.6.0
Environment: Amazon EMR cluster:
AMI version: 3.8.0
Hadoop distribution: Amazon 2.4.0
Applications: Hive 0.13.1, Pig 0.12.0, Spark 1.3.1
Zeppelin: current clone from git master (0.6.0-incubating-SNAPSHOT)
Contents of zeppelin-env.sh:
export MASTER=yarn-client
export HADOOP_CONF_DIR=/home/hadoop/conf
export ZEPPELIN_SPARK_USEHIVECONTEXT=false
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.instances=2 -Dspark.executor.cores=2 -Dspark.executor.memory=1547M -Dspark.default.parallelism=4"
Reporter: Eugene Sapozhnikov
Priority: Blocker
Hi,
I am trying to set up Zeppelin on an EMR cluster with Spark, with no luck so far.
I followed the recommendations from https://gist.github.com/andershammar/224e1077021d0ea376dd; everything appears to be set up correctly, and I checked the .sh file line by line.
On the host, 'spark-shell' works fine and my test code executes without problems.
However, when I open Zeppelin and try to run Scala code in a notebook, I get the error below.
Could you tell me what is wrong with how Zeppelin connects to the existing Spark cluster, or point me to documentation on it? The proper configuration is still unclear to me.
CODE AND OUTPUT:
val people = sc.textFile("s3://mybucket/storage-archive/run=2015-08-15*")
people.take(10)
people: org.apache.spark.rdd.RDD[String] = s3://mybucket/storage-archive/run=2015-08-15* MapPartitionsRDD[3] at textFile at <console>:23
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:186)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
...
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
... 59 more
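A possible workaround for this class of error (not a confirmed fix) is to put the hadoop-lzo jar on the classpath that Zeppelin's Spark interpreter uses, via zeppelin-env.sh. The jar path below is an assumption; the actual location varies between EMR AMI versions and must be located on the cluster first:

```shell
# Hypothetical zeppelin-env.sh additions (sketch, assumed jar location).
# Find the real jar path on your cluster first, e.g.:
#   find / -name 'hadoop-lzo*.jar' 2>/dev/null
export HADOOP_LZO_JAR=/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar

# Make the codec visible to the executors (spark.jars) and to the
# Zeppelin/driver JVM (CLASSPATH):
export ZEPPELIN_JAVA_OPTS="$ZEPPELIN_JAVA_OPTS -Dspark.jars=$HADOOP_LZO_JAR"
export CLASSPATH="$CLASSPATH:$HADOOP_LZO_JAR"
```

Restarting the Zeppelin daemon after editing zeppelin-env.sh would be required for the change to take effect.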
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)