Posted to users@zeppelin.apache.org by Eugene Morozov <fa...@list.ru> on 2015/04/14 11:14:54 UTC

Spark interpreter throws: NumberFormatException for ""

Hi!

I’m trying to run at least something using my Spark / Cassandra setup: prebuilt Spark 1.2.1-hadoop1, Cassandra 2.0.14.
Spark by itself works fine: I have several tests in my project and they pass, and I’m able to run my project from the Spark shell.

So, the simplest thing I’m trying now is to run Zeppelin and create a paragraph:
val rdd = sc.textFile("/Users/emorozov/tools/apache-cassandra-2.0.14/conf/cassandra.yaml")
rdd.count

Here is what I see in the log files:

Zeppelin Interpreter log file:
 INFO [2015-04-14 01:29:21,097] ({pool-1-thread-5} Logging.scala[logInfo]:59) - Successfully started service 'SparkUI' on port 4045.
 INFO [2015-04-14 01:29:21,097] ({pool-1-thread-5} Logging.scala[logInfo]:59) - Started SparkUI at http://10.59.26.123:4045
 INFO [2015-04-14 01:29:21,255] ({pool-1-thread-5} Logging.scala[logInfo]:59) - Added JAR file:/Users/emorozov/dev/analytics/analytics-jobs/target/analytics-jobs-5.2.0-SNAPSHOT-all.jar at http://10.59.26.123:53658/jars/analytics-jobs-5.2.0-SNAPSHOT-all.jar with timestamp 1429000161255
ERROR [2015-04-14 01:29:21,256] ({pool-1-thread-5} ProcessFunction.java[process]:41) - Internal error processing getProgress
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NumberFormatException: For input string: ""
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:75)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:109)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:299)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:938)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:923)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: For input string: ""
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:504)
	at java.lang.Integer.parseInt(Integer.java:527)
	at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
	at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
	at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend$$anonfun$2.apply(SparkDeploySchedulerBackend.scala:42)
	at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend$$anonfun$2.apply(SparkDeploySchedulerBackend.scala:42)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.<init>(SparkDeploySchedulerBackend.scala:42)
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1883)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:330)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:267)
	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:145)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:389)
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:73)
	... 11 more

Zeppelin log file
INFO [2015-04-14 01:29:17,551] ({pool-1-thread-5} Paragraph.java[jobRun]:194) - run paragraph 20150414-012109_673822021 using null org.apache.zeppelin.interpreter.LazyOpenInterpreter@7cfec020
 INFO [2015-04-14 01:29:17,551] ({pool-1-thread-5} Paragraph.java[jobRun]:211) - RUN : val rdd = sc.textFile("/Users/emorozov/tools/apache-cassandra-2.0.14/conf/cassandra.yaml")
rdd.count
 INFO [2015-04-14 01:29:18,262] ({Thread-32} NotebookServer.java[broadcast]:251) - SEND >> NOTE
ERROR [2015-04-14 01:29:19,273] ({pool-1-thread-5} Job.java[run]:183) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.TApplicationException: Internal error processing interpret
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:222)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:212)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:293)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.TApplicationException: Internal error processing interpret
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:190)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:175)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:205)
	... 11 more
 
zeppelin-env.sh is the following:
export MASTER="spark://emorozov.local:7077"
export ZEPPELIN_PORT=8089
export ZEPPELIN_JAVA_OPTS="-Dspark.jars=/Users/emorozov/dev/analytics/analytics-jobs/target/analytics-jobs-5.2.0-SNAPSHOT-all.jar"
export SPARK_HOME="/Users/emorozov/tools/spark-1.2.1-bin-hadoop1/"
export ZEPPELIN_HOME="/Users/emorozov/dev/zeppelin"
export ZEPPELIN_MEM="-Xmx4g"

I turned on debug in log4j.properties, hoping to see the properties passed into SparkConf (SparkInterpreter.java, line 263: logger.debug), but no properties appear in the log file.


In the same notebook, though, I’m able to run something like %sh echo blah, and it returns blah as a result.

--
Eugene Morozov
fathersson@list.ru





Re: Spark interpreter throws: NumberFormatException for "". Researched.

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi Eugene,

Nice catch! This looks like a side effect of ZEPPELIN-28. spark.cores.max
doesn't have a default value itself but follows spark.deploy.defaultCores,
whose initial value is Int.MaxValue. I think it should be fixed. Could you
file a JIRA ticket and open a GitHub PR for this issue?
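
For illustration, treating a blank value as "not set" (so Spark's own default, such as spark.deploy.defaultCores, can apply) might look like the following in plain Java; the helper below is a hypothetical sketch, not Zeppelin's actual code:

```java
import java.util.Optional;

public class BlankPropertyDemo {
    // Hypothetical helper: a blank property value is treated as absent,
    // so a downstream default can apply instead of "".toInt blowing up.
    static Optional<Integer> maxCores(String raw) {
        return Optional.ofNullable(raw)
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(Integer::parseInt);
    }

    public static void main(String[] args) {
        System.out.println(maxCores(""));   // blank -> Optional.empty
        System.out.println(maxCores("8"));  // Optional[8]
    }
}
```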

Regards,
Jongyoul Lee

On Wed, Apr 15, 2015 at 8:39 AM, Eugene Morozov <fa...@list.ru> wrote:



-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Spark interpreter throws: NumberFormatException for "". Researched.

Posted by Eugene Morozov <fa...@list.ru>.
Hi!

Yesterday I had an issue running Zeppelin on Spark 1.2.1 and wrote to this project's user list; the issue is described in detail in my previous email. Today I've debugged it a little and seem to have found the place where I believe the issue lies.

That is line 38 of SparkDeploySchedulerBackend: val maxCores = conf.getOption("spark.cores.max").map(_.toInt)

I wrote a simple program to prove that NumberFormatException comes from exactly this line:
	val conf: SparkConf = new SparkConf()
	conf.set("spark.cores.max","")
	conf.getOption("spark.cores.max").map(_.toInt)

These three lines give a NumberFormatException.
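
The underlying reason, sketched in plain Java: Scala's StringOps.toInt delegates to java.lang.Integer.parseInt, which rejects the empty string outright; that is the exact exception in the stack trace above.

```java
public class EmptyParseDemo {
    // Integer.parseInt rejects "", which is exactly what Scala's
    // StringOps.toInt calls under the hood.
    static String parseFailureMessage(String s) {
        try {
            Integer.parseInt(s);
            return null; // parsed fine
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Prints the same message as the Zeppelin log: For input string: ""
        System.out.println(parseFailureMessage(""));
    }
}
```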
So, this spark.cores.max value comes from two places:
1. The static initialiser of SparkInterpreter adds spark.cores.max with an empty string value (it used to be null, but that has been changed).
2. The file conf/interpreter.json, where spark.cores.max is also an empty string value.

The description of this field states that if it's blank, it should default to the number of cores in the system, but that doesn't happen for me for some reason. When I changed it in conf/interpreter.json, Zeppelin started working as it should.

I believe the static initialiser of SparkInterpreter should use the number of cores instead of an empty string value, but I'm not sure.
- Is there a better fix?
- What's the best approach to run it under debug? Right now I'm using remote debugging, but the interpreter starts at a particular time and it's hard to attach at the right moment; there should be a simpler way. Are there any resources for developers on how to set up a local environment?
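
A minimal sketch of this fix direction, assuming the intent is "blank means use all local cores"; the helper below is illustrative, not the actual SparkInterpreter code:

```java
public class CoresDefaultDemo {
    // Hypothetical: fall back to the machine's core count when the
    // configured spark.cores.max value is blank, instead of storing "".
    static int effectiveMaxCores(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return Runtime.getRuntime().availableProcessors();
        }
        return Integer.parseInt(configured.trim());
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxCores("4")); // 4
        System.out.println(effectiveMaxCores(""));  // this machine's core count
    }
}
```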

On 14 Apr 2015, at 02:14, Eugene Morozov <fa...@list.ru> wrote:


Eugene Morozov
fathersson@list.ru