Posted to users@zeppelin.apache.org by "Verma, Rishi (398M)" <Ri...@jpl.nasa.gov> on 2015/10/19 21:32:47 UTC

Zeppelin on Spark with Mesos, getting JsonMappingException

Hello,

I’m getting JsonMappingException errors when trying to run any cluster map/reduce operations on Zeppelin with Apache Spark running on Mesos. Could somebody provide guidance? My configuration looks correct according to the documentation I’ve read, and I can’t figure out why the reduce commands are failing.

Here are three (Scala) commands I tried in a fresh Zeppelin notebook; the first two work fine, but the third fails:

> sc.getConf.getAll
res0: Array[(String, String)] = Array((spark.submit.pyArchives,pyspark.zip:py4j-0.8.2.1-src.zip), (spark.home,/cluster/spark), (spark.executor.memory,512m), (spark.files,file:/cluster/spark/python/lib/pyspark.zip,file:/cluster/spark/python/lib/py4j-0.8.2.1-src.zip), (spark.repl.class.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>), (args,""), (zeppelin.spark.concurrentSQL,false), (spark.fileserver.uri,http://<IP_HIDDEN>:<PORT_HIDDEN>), (zeppelin.pyspark.python,python), (spark.scheduler.mode,FAIR), (zeppelin.spark.maxResult,1000), (spark.executor.id,driver), (spark.driver.port,<PORT_HIDDEN>), (zeppelin.dep.localrepo,local-repo), (spark.app.id,20151007-143704-2255525248-5050-29909-0013), (spark.externalBlockStore.folderName,spark-e7edd394-1618-4c18-b76d-9...

> (1 to 10).reduce(_ + _)
res5: Int = 55

> sc.parallelize(1 to 10).reduce(_ + _)
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope) at [Source: {"id":"1","name":"parallelize"}; line: 1, column: 1]
  at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
  at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265)
  at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245)
  at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143)
  at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439)
  at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666)
  at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578)
  at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82)
  at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
  at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1490)
  at scala.Option.map(Option.scala:145)
  at org.apache.spark.rdd.RDD.<init>(RDD.scala:1490)
  at org.apache.spark.rdd.ParallelCollectionRDD.<init>(ParallelCollectionRDD.scala:85)
  at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:697)
  at org.apache.spark.SparkContext$$anonfun$parallelize$1.apply(SparkContext.scala:695)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
  at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
  at org.apache.spark.SparkContext.parallelize(SparkContext.scala:695)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
  at $iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
  at $iwC$$iwC$$iwC.<init>(<console>:37)
  at $iwC$$iwC.<init>(<console>:39)
  at $iwC.<init>(<console>:41)
  at <init>(<console>:43)
  at .<init>(<console>:47)
  at .<clinit>(<console>)
  at .<init>(<console>:7)
  at .<clinit>(<console>)
  at $print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:610)
  at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:586)
  at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:579)
  at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
  at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

My configuration is as below:
--------------------------------
Spark 1.4.1
Mesos 0.21.0
Zeppelin 0.5.0-incubating
Hadoop 2.4.0
Cluster: 4 nodes, CentOS 6.7

zeppelin-env.sh:
-----------------
export MASTER=mesos://master:5050
export SPARK_MASTER=mesos://master:5050
export MESOS_NATIVE_JAVA_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export MESOS_NATIVE_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export SPARK_EXECUTOR_URI=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.uri=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz"
export SPARK_PID_DIR=/tmp
export SPARK_LOCAL_DIRS=/cluster/spark/spark_tmp
export HADOOP_CONF_DIR=/cluster/hadoop-2.4.0/etc/hadoop

Note: Zeppelin was built from source using:
mvn clean package -Pspark-1.4 -Dhadoop.version=2.4.0 -Phadoop-2.4 -DskipTests


Thanks very much,
---
Rishi Verma
NASA Jet Propulsion Laboratory
California Institute of Technology


Re: Zeppelin on Spark with Mesos, getting JsonMappingException

Posted by moon soo Lee <mo...@apache.org>.
Hi,

Thanks for sharing the problem.
The error can happen when there are multiple versions of the Jackson
libraries in the classpath.
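
For example, one quick way to check which Jackson jar the interpreter actually
loaded is to run a paragraph like this in the notebook (just a sketch; the
variable name is arbitrary, and the printed location and version will depend
on your setup):

// Print where jackson-databind was loaded from and its reported version,
// to spot a second copy shadowing the one bundled with Spark.
val mapperClass = classOf[com.fasterxml.jackson.databind.ObjectMapper]
println(mapperClass.getProtectionDomain.getCodeSource.getLocation)
println(mapperClass.getPackage.getImplementationVersion)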

If you're building Zeppelin from source (latest master branch), another way
of configuring Zeppelin with Spark is:

1. Configure Spark with Mesos
2. Make sure SPARK_HOME/bin/spark-shell works fine
3. Export SPARK_HOME in conf/zeppelin-env.sh to point to your configured
Spark installation (see the sketch below)
4. Start Zeppelin and set the 'master' property to 'mesos://...' in the
Interpreter menu.

This way, Zeppelin will use the configuration from your Spark installation
(because Zeppelin uses the spark-submit command to run its interpreter
process), which may simplify the configuration and possibly solve the problem.
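
For example, a minimal conf/zeppelin-env.sh for this approach might look like
the following (just a sketch; the /cluster/spark path is taken from the
spark.home value in your output and may differ on your nodes):

export SPARK_HOME=/cluster/spark
export MESOS_NATIVE_JAVA_LIBRARY=/cluster/mesos/build/src/.libs/libmesos.so
export SPARK_EXECUTOR_URI=hdfs:///spark/spark-1.4.1-bin-hadoop2.4.tgz

Then set the interpreter's 'master' property to mesos://master:5050 in the
Interpreter menu instead of exporting MASTER here.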

Hope this helps.

Best,
moon


On Tue, Oct 20, 2015 at 4:33 AM Verma, Rishi (398M) <
Rishi.Verma@jpl.nasa.gov> wrote:

> [...]