Posted to user@spark.apache.org by "Chawla,Sumit " <su...@gmail.com> on 2016/09/20 06:15:50 UTC
Task Deserialization Error
Hi All
I am trying to test a simple Spark app using Scala.
import org.apache.spark.SparkContext

object SparkDemo {
  def main(args: Array[String]) {
    val logFile = "README.md" // Should be some file on your system

    // "local" runs Spark in local mode on a single thread
    val sc = new SparkContext("local", "Simple App",
      "PATH_OF_DIRECTORY_WHERE_COMPILED_SPARK_PROJECT_FROM_GIT")

    val logData = sc.textFile(logFile).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()

    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
When running this demo in IntelliJ, I am getting the following error:
java.lang.IllegalStateException: unread block data
at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I guess it's associated with the task not being deserializable. Any help
will be appreciated.
Regards
Sumit Chawla
Re: Task Deserialization Error
Posted by "Chawla,Sumit " <su...@gmail.com>.
Thanks, guys. It was a classloader issue. Rather than linking to
SPARK_HOME/assembly/target/scala-2.11/jars/, I was linking the individual
jars. Linking to the folder instead solved the issue for me.
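For anyone hitting the same problem: declaring Spark as a managed
dependency avoids hand-linking jars entirely. A minimal build.sbt sketch
(the Scala and Spark versions here are assumptions; adjust to your setup):

  scalaVersion := "2.11.8"

  // Pulls in spark-core plus its transitive dependencies, so the IDE
  // sees one consistent set of jars on a single classpath.
  libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"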
Regards
Sumit Chawla
On Wed, Sep 21, 2016 at 2:51 PM, Jakob Odersky <ja...@odersky.com> wrote:
> Your app is fine; I think the error has to do with the way IntelliJ
> launches applications. Is your app forked in a new JVM when you run it?
>
> On Wed, Sep 21, 2016 at 2:28 PM, Gokula Krishnan D <em...@gmail.com>
> wrote:
>
>> Hello Sumit -
>>
>> I see that no SparkConf() is specified in your
>> program, but the rest looks good.
>>
>>
>>
>> Output: [screenshot not preserved in the plain-text archive]
>>
>>
>> By the way, I have used the README.md template
>> https://gist.github.com/jxson/1784669
>>
>> Thanks & Regards,
>> Gokula Krishnan (Gokul)
>>
>> On Tue, Sep 20, 2016 at 2:15 AM, Chawla,Sumit <su...@gmail.com>
>> wrote:
>>
>>> Hi All
>>>
>>> I am trying to test a simple Spark app using Scala.
>>>
>>>
>>> import org.apache.spark.SparkContext
>>>
>>> object SparkDemo {
>>> def main(args: Array[String]) {
>>> val logFile = "README.md" // Should be some file on your system
>>>
>>> // to run in local mode
>>> val sc = new SparkContext("local", "Simple App", "PATH_OF_DIRECTORY_WHERE_COMPILED_SPARK_PROJECT_FROM_GIT")
>>>
>>> val logData = sc.textFile(logFile).cache()
>>> val numAs = logData.filter(line => line.contains("a")).count()
>>> val numBs = logData.filter(line => line.contains("b")).count()
>>>
>>>
>>> println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
>>>
>>> }
>>> }
>>>
>>>
>>> When running this demo in IntelliJ, I am getting the following error:
>>>
>>>
>>> java.lang.IllegalStateException: unread block data
>>> at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
>>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
>>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
>>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
>>> at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>>> at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> I guess it's associated with the task not being deserializable. Any help will be appreciated.
>>>
>>>
>>>
>>> Regards
>>> Sumit Chawla
>>>
>>>
>>
>
Re: Task Deserialization Error
Posted by Jakob Odersky <ja...@odersky.com>.
Your app is fine; I think the error has to do with the way IntelliJ
launches applications. Is your app forked in a new JVM when you run it?
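For reference, when launching through sbt rather than the IDE, forking can
be enabled with a one-liner in build.sbt (a sketch in sbt 0.13 syntax):

  // Run the app in a separate JVM rather than inside sbt's own JVM,
  // so it does not inherit sbt's classloaders.
  fork in run := true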
On Wed, Sep 21, 2016 at 2:28 PM, Gokula Krishnan D <em...@gmail.com>
wrote:
> Hello Sumit -
>
> I see that no SparkConf() is specified in your
> program, but the rest looks good.
>
>
>
> Output: [screenshot not preserved in the plain-text archive]
>
>
> By the way, I have used the README.md template
> https://gist.github.com/jxson/1784669
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>
> On Tue, Sep 20, 2016 at 2:15 AM, Chawla,Sumit <su...@gmail.com>
> wrote:
>
>> Hi All
>>
>> I am trying to test a simple Spark app using Scala.
>>
>>
>> import org.apache.spark.SparkContext
>>
>> object SparkDemo {
>> def main(args: Array[String]) {
>> val logFile = "README.md" // Should be some file on your system
>>
>> // to run in local mode
>> val sc = new SparkContext("local", "Simple App", "PATH_OF_DIRECTORY_WHERE_COMPILED_SPARK_PROJECT_FROM_GIT")
>>
>> val logData = sc.textFile(logFile).cache()
>> val numAs = logData.filter(line => line.contains("a")).count()
>> val numBs = logData.filter(line => line.contains("b")).count()
>>
>>
>> println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
>>
>> }
>> }
>>
>>
>> When running this demo in IntelliJ, I am getting the following error:
>>
>>
>> java.lang.IllegalStateException: unread block data
>> at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
>> at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>> at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>>
>>
>> I guess it's associated with the task not being deserializable. Any help will be appreciated.
>>
>>
>>
>> Regards
>> Sumit Chawla
>>
>>
>
Re: Task Deserialization Error
Posted by Gokula Krishnan D <em...@gmail.com>.
Hello Sumit -
I see that no SparkConf() is specified in your
program, but the rest looks good.
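For example, the context can be built from an explicit SparkConf, along
these lines (a sketch; the app name and master value are placeholders):

  import org.apache.spark.{SparkConf, SparkContext}

  // Name the application and choose the master explicitly,
  // then hand the conf to the SparkContext.
  val conf = new SparkConf()
    .setAppName("Simple App")
    .setMaster("local[*]") // local[*] uses all available cores
  val sc = new SparkContext(conf)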
Output: [screenshot not preserved in the plain-text archive]
By the way, I have used the README.md template
https://gist.github.com/jxson/1784669
Thanks & Regards,
Gokula Krishnan (Gokul)
On Tue, Sep 20, 2016 at 2:15 AM, Chawla,Sumit <su...@gmail.com>
wrote:
> Hi All
>
> I am trying to test a simple Spark app using Scala.
>
>
> import org.apache.spark.SparkContext
>
> object SparkDemo {
> def main(args: Array[String]) {
> val logFile = "README.md" // Should be some file on your system
>
> // to run in local mode
> val sc = new SparkContext("local", "Simple App", "PATH_OF_DIRECTORY_WHERE_COMPILED_SPARK_PROJECT_FROM_GIT")
>
> val logData = sc.textFile(logFile).cache()
> val numAs = logData.filter(line => line.contains("a")).count()
> val numBs = logData.filter(line => line.contains("b")).count()
>
>
> println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
>
> }
> }
>
>
> When running this demo in IntelliJ, I am getting the following error:
>
>
> java.lang.IllegalStateException: unread block data
> at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
> at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
> at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
>
> I guess it's associated with the task not being deserializable. Any help will be appreciated.
>
>
>
> Regards
> Sumit Chawla
>
>