Posted to users@zeppelin.apache.org by "Maruthi@aeverie.com" <ma...@aeverie.com> on 2015/02/26 18:58:15 UTC

java.lang.IllegalStateException: unread block data Error

Hi,
I am using Zeppelin and trying to integrate it with DeepSparkContext. I was able to build Zeppelin against a standalone Spark 1.1.1 cluster, with the Spark master URL set in conf/zeppelin-env.sh. Following the same procedure, I am now trying to integrate Zeppelin with Stratio's DeepSparkContext, which internally provides the SparkContext and creates the Spark cluster. I set that cluster's master URL in zeppelin-env.sh (see the sketch after the code below), the build succeeded, and the notebook works: if I open a notebook and type sc.version, I get 1.1.1, so Scala is working. But if I run any RDD or Spark operation, such as the following, I run into trouble:

val bankText99 = sc.textFile("/home/dev004/try/Zeppelin_dev/bank/bank-full.csv")
bankText99.count
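
(For reference, the master URL in conf/zeppelin-env.sh is set along these lines; the host and port below are placeholders, not my actual values:)

export MASTER=spark://<spark-master-host>:7077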

Here are my logs:

bankText99: org.apache.spark.rdd.RDD[String] = /home/dev004/try/Zeppelin_dev/bank/bank-full.csv MappedRDD[3] at textFile at <console>:19
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 5, averie001-edt-loc): java.lang.IllegalStateException: unread block data
    java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:160)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

I don't know what is happening. I tried replacing the SparkContext with a DeepSparkContext in the code (a sketch of that attempt is below), but I get a lot of other errors. Please give me some help with this; I have been stuck on it for a month.
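
(What I tried looked roughly like the following; the import path and constructor are from memory of the Stratio Deep examples and may not match the actual API:)

import com.stratio.deep.context.DeepSparkContext
// hypothetical sketch: wrap the existing SparkContext in a DeepSparkContext
val deepContext = new DeepSparkContext(sc)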

Looking forward to a quick response.



Maruthi Donthi
Java Developer
250 Parkway Drive Suite 150
Lincolnshire, Illinois 60069
203-218-6949(M)
maruthi@aeverie.com
http://www.aeverie.com/
________________________________
From: frank.schilder@thomsonreuters.com
Sent: Wednesday, February 25, 2015 2:20 PM
To: users@zeppelin.incubator.apache.org
Subject: Re: Zeppelin with Stratio DeepSpark

Hi,

I'm having problems building Zeppelin because of our web proxy. It fails while building the web application:


 Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.20:npm (npm install) on project zeppelin-web: Failed to run task: 'npm install --color=false --proxy=http://NA:NA@<webproxy>:80'


It seems like I need to set a user name and password for the web proxy, but those are actually not required.
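
(I suspect the NA:NA in the --proxy flag above is being built from a <proxy> entry in Maven's ~/.m2/settings.xml, something like the sketch below. This is a guess at the shape, not my actual file; the <username>/<password> lines would be the ones to drop:)

 <proxies>
   <proxy>
     <active>true</active>
     <protocol>http</protocol>
     <host>webproxy</host>
     <port>80</port>
     <username>NA</username>
     <password>NA</password>
   </proxy>
 </proxies>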


I tried setting the web proxy (without the user name and password) with npm, but had no success (see the commands below).
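
(What I ran was along these lines; the host below is a placeholder:)

npm config set proxy http://<webproxy>:80
npm config set https-proxy http://<webproxy>:80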


Any help would be highly appreciated,


Thanks,

Frank



Re: java.lang.IllegalStateException: unread block data Error

Posted by "Maruthi@aeverie.com" <ma...@aeverie.com>.
Hi Moon,

Which cluster are you running? Did you set up a Spark cluster and a Deep Spark cluster?


Thanks,

Maruthi Donthi
Java Developer
250 Parkway Drive Suite 150
Lincolnshire, Illinois 60069
203-218-6949(M)
maruthi@aeverie.com
http://www.aeverie.com/




Re: java.lang.IllegalStateException: unread block data Error

Posted by moon soo Lee <mo...@apache.org>.
Hi Maruthi,

I tried DeepSparkContext a little: I used %dep to load it from Maven Central instead of building it, and ran a simple test.

[image: Inline image 1]
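
(Roughly, the %dep paragraph looked like the lines below. The artifact coordinates are illustrative, so please check Maven Central for the exact Stratio Deep artifact and version; also note %dep has to run before the Spark interpreter starts, so restart the interpreter first if needed.)

%dep
z.load("com.stratio.deep:deep-core:0.7.0")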

It works well on Zeppelin. Can you try it this way?
Your error really looks like it comes from a version incompatibility (Spark or JDK version) between the Spark workers and the machine where you run Zeppelin.
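
(If you want to check, comparing these on the Zeppelin machine and on a Spark worker usually narrows it down:)

java -version
spark-submit --version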

Thanks,
moon



