Posted to user@spark.apache.org by Ondrej Smola <on...@gmail.com> on 2015/03/26 18:59:10 UTC

Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Hi,

I am running Spark Streaming v1.3.0 (inside Docker) on Mesos
0.21.1. The streaming app is started via Marathon: the Docker container gets
deployed and starts streaming (from a custom Actor). The Spark binary is located
on a shared GlusterFS volume. Data is streamed from Elasticsearch/Redis. When a
new batch arrives, Spark tries to replicate it but fails with the following
error:

15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840 dropped
from memory (free 278017782)
15/03/26 14:50:00 INFO BlockManager: Removing block broadcast_0_piece0
15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size 1658
dropped from memory (free 278019440)
15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
broadcast_0_piece0
15/03/26 14:50:00 ERROR TransportRequestHandler: Error while invoking
RpcHandler#receive() on RPC id 7178767328921933569
java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:88)
at
org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:65)
at
org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124)
at
org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:97)
at
org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
at
org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
at
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
at
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at java.lang.Thread.run(Thread.java:745)
15/03/26 14:50:01 ERROR TransportRequestHandler: Error while invoking
RpcHandler#receive() on RPC id 9001562482648380222
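One detail worth noting: the class name in the exception is slash-separated
("org/apache/spark/storage/StorageLevel"). Class.forName only accepts
dot-separated binary names, so a slash-separated name can never resolve, even
when the class is on the classpath. A minimal, Spark-free sketch of that
behavior (the class name ForNameDemo is just for illustration):

```java
// Class.forName requires dot-separated binary names; slash-separated
// internal (JVM-style) names always fail with ClassNotFoundException.
public class ForNameDemo {
    static String tryLoad(String name) {
        try {
            Class.forName(name);
            return "loaded";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("java.lang.String"));  // dot form resolves
        System.out.println(tryLoad("java/lang/String"));  // slash form never resolves
    }
}
```

If that is what is happening here, the receiving side may have been handed an
internal slash-form name that was never converted back to a binary name.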

From the Mesos UI I can see the unpacked Spark binary and my assembly jar in
place (on the running driver and on the replication targets). I have other Spark
batch jobs running fine from the same base Docker image. When there is no
incoming data, the exception is not thrown. Spark config:

spark.master
 mesos://zk://incomparable-brush.maas:2181,cumbersome-match.maas:2181,voluminous-toys.maas:2181/mesos
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.executor.uri      file:///master/spark/spark-1.3.0-bin-hadoop2.4.tgz
spark.local.dir         /opt/spark_tmp

spark.driver.port       41000
spark.executor.port     41016
spark.fileserver.port   41032
spark.broadcast.port    41048
spark.replClassServer.port 41064
spark.blockManager.port  41080
spark.ui.port   41096
spark.history.ui.port 41112

Thanks for any help

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Tathagata Das <td...@databricks.com>.
Seems like a bug, could you file a JIRA?

@Tim: Patrick said you look at Mesos-related issues. Could you take
a look at this? Thanks!

TD

On Fri, Mar 27, 2015 at 1:25 PM, Ondrej Smola <on...@gmail.com>
wrote:

> Yes, only when using fine-grained mode and replication
> (StorageLevel.MEMORY_ONLY_2 etc.).
>
> 2015-03-27 19:06 GMT+01:00 Tathagata Das <td...@databricks.com>:
>
>> Does it fail with just Spark jobs (using storage levels) on non-coarse
>> mode?
>>
>> TD
>>
>> On Fri, Mar 27, 2015 at 4:39 AM, Ondrej Smola <on...@gmail.com>
>> wrote:
>>
>>> More info
>>>
>>> when using *spark.mesos.coarse* everything works as expected. I think
>>> this must be a bug in spark-mesos integration.
>>>
>>>
>>> 2015-03-27 9:23 GMT+01:00 Ondrej Smola <on...@gmail.com>:
>>>
>>>> It happens only when a StorageLevel with one replica is used
>>>> (StorageLevel.MEMORY_ONLY_2, StorageLevel.MEMORY_AND_DISK_2);
>>>> StorageLevel.MEMORY_ONLY and StorageLevel.MEMORY_AND_DISK work. The
>>>> problem must clearly be somewhere between Mesos and Spark. From the
>>>> console I see that Spark is trying to replicate to nodes -> the nodes show
>>>> up in Mesos active tasks ... but they always fail with ClassNotFoundException.
>>>>
>>>> 2015-03-27 0:52 GMT+01:00 Tathagata Das <td...@databricks.com>:
>>>>
>>>>> Could you try running a simpler Spark Streaming program with a
>>>>> receiver (maybe socketStream) and see if that works?
>>>>>
>>>>> TD
>>>>>
>>>>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola <on...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, thanks for the reply,
>>>>>>
>>>>>> yes, I have a custom receiver -> but it has simple logic: pop ids from a
>>>>>> Redis queue -> load docs based on the ids from Elasticsearch and store
>>>>>> them in Spark. No classloader modifications. I am running multiple Spark
>>>>>> batch jobs (with user-supplied partitioning) and they have no problems;
>>>>>> debugging in local mode shows no errors.
>>>>>>
>>>>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das <td...@databricks.com>:
>>>>>>
>>>>>>> Here are a few steps to debug.
>>>>>>>
>>>>>>> 1. Try using replication from a plain Spark job: sc.parallelize(1 to 100,
>>>>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>>>>>> 2. If that works, then we know that there is probably nothing wrong
>>>>>>> with the Spark installation, and the problem is probably in the threads
>>>>>>> related to the receivers receiving the data. Are you writing a custom
>>>>>>> receiver? Are you somehow playing around with the class loader in the
>>>>>>> custom receiver?
>>>>>>>
>>>>>>> TD
>>>>>>>

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Ondrej Smola <on...@gmail.com>.
Yes, only when using fine-grained mode and replication
(StorageLevel.MEMORY_ONLY_2 etc.).


Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Tathagata Das <td...@databricks.com>.
Does it fail with plain Spark jobs (using storage levels) in non-coarse mode?

TD


Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Ondrej Smola <on...@gmail.com>.
More info: when using *spark.mesos.coarse* everything works as expected. I
think this must be a bug in the Spark-Mesos integration.
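For anyone hitting the same issue: the coarse-grained workaround is a
one-line configuration change (shown here for conf/spark-defaults.conf; the
same key can also be set on SparkConf or via --conf on spark-submit):

```
# conf/spark-defaults.conf
# Switch from the default fine-grained Mesos mode to coarse-grained mode,
# which avoids the replication ClassNotFoundException described above.
spark.mesos.coarse   true
```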


>>>>> at
>>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>>>>> at
>>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>>>>> at
>>>>> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>>>>> at
>>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>>>>> at
>>>>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>>>>> at
>>>>> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>>>>> at
>>>>> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>>>>> at
>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>>>> at
>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>>>> at
>>>>> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>>>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>>>> at
>>>>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>> 15/03/26 14:50:01 ERROR TransportRequestHandler: Error while invoking
>>>>> RpcHandler#receive() on RPC id 9001562482648380222
>>>>>
>>>>> From mesos UI i see unpacked spark binary and my assembly jar in place
>>>>> (on running driver and on replication targets). I have other spark BATCH
>>>>> jobs running from same base docker image OK. When there is no incoming data
>>>>> exception is not thrown. Spark config :
>>>>>
>>>>> spark.master
>>>>>  mesos://zk://incomparable-brush.maas:2181,cumbersome-match.maas:2181,voluminous-toys.maas:2181/mesos
>>>>> spark.serializer        org.apache.spark.serializer.KryoSerializer
>>>>> spark.executor.uri
>>>>>  file:///master/spark/spark-1.3.0-bin-hadoop2.4.tgz
>>>>> spark.local.dir         /opt/spark_tmp
>>>>>
>>>>> spark.driver.port       41000
>>>>> spark.executor.port     41016
>>>>> spark.fileserver.port   41032
>>>>> spark.broadcast.port    41048
>>>>> spark.replClassServer.port 41064
>>>>> spark.blockManager.port  41080
>>>>> spark.ui.port   41096
>>>>> spark.history.ui.port 41112
>>>>>
>>>>> Thanks for any help
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Ondrej Smola <on...@gmail.com>.
It happens only when a StorageLevel with replication is used (StorageLevel.
MEMORY_ONLY_2, StorageLevel.MEMORY_AND_DISK_2); StorageLevel.MEMORY_ONLY and
StorageLevel.MEMORY_AND_DISK work. The problem must clearly be somewhere
between Mesos and Spark. From the console I see that Spark is trying to
replicate to nodes -> the nodes show up in Mesos active tasks ... but they
always fail with the ClassNotFoundException.
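
The contrast above can be reproduced as a small sketch from a spark-shell
connected to the same Mesos master (assuming the shell-provided SparkContext
`sc`; whether the replicated run fails outright or only logs the error on the
receiving executor depends on the setup):

```scala
import org.apache.spark.storage.StorageLevel

val rdd = sc.parallelize(1 to 100, 100)

// Non-replicated storage: completes without error in this setup.
rdd.persist(StorageLevel.MEMORY_ONLY).count()
rdd.unpersist()

// Replicated storage: block replication is the path where the executors
// report the ClassNotFoundException in the setup described above.
rdd.persist(StorageLevel.MEMORY_ONLY_2).count()
```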

2015-03-27 0:52 GMT+01:00 Tathagata Das <td...@databricks.com>:

> Could you try running a simpler spark streaming program with receiver (may
> be socketStream) and see if that works.
>
> TD
>

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Tathagata Das <td...@databricks.com>.
Could you try running a simpler Spark Streaming program with a receiver
(maybe socketStream) and see if that works.

TD

On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola <on...@gmail.com>
wrote:

> Hi thanks for reply,
>
> yes I have custom receiver -> but it has simple logic .. pop ids from
> redis queue -> load docs based on ids from elastic and store them in spark.
> No classloader modifications. I am running multiple Spark batch jobs (with
> user supplied partitioning) and they have no problems, debug in local mode
> show no errors.

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Ondrej Smola <on...@gmail.com>.
Hi, thanks for the reply.

Yes, I have a custom receiver, but it has simple logic: pop IDs from a Redis
queue, load docs by those IDs from Elasticsearch, and store them in Spark. No
classloader modifications. I am running multiple Spark batch jobs (with
user-supplied partitioning) and they have no problems; debugging in local
mode shows no errors.
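
The receiver described above has roughly this shape (a sketch only;
RedisClient and EsClient are hypothetical stand-ins for whatever Redis and
Elasticsearch clients are actually used):

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class IdsReceiver(storageLevel: StorageLevel)
    extends Receiver[String](storageLevel) {

  def onStart(): Unit = {
    new Thread("ids-receiver") {
      override def run(): Unit = {
        while (!isStopped) {
          val id  = RedisClient.pop("ids")    // pop an id from the Redis queue
          val doc = EsClient.getDocument(id)  // load the doc from Elasticsearch
          store(doc)                          // hand the doc to Spark's block manager
        }
      }
    }.start()
  }

  def onStop(): Unit = {}
}
```

The storageLevel passed to the Receiver constructor is what the block manager
uses when storing and replicating received blocks, which matches where the
ClassNotFoundException surfaces in the trace (NettyBlockRpcServer.receive).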

2015-03-26 21:47 GMT+01:00 Tathagata Das <td...@databricks.com>:

> Here are few steps to debug.
>
> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
> 2. If one works, then we know that there is probably nothing wrong with
> the Spark installation, and probably in the threads related to the
> receivers receiving the data. Are you writing a custom receiver? Are you
> somehow playing around with the class loader in the custom receiver?
>
> TD
>

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

Posted by Tathagata Das <td...@databricks.com>.
Here are a few steps to debug.

1. Try using replication from a plain Spark job: sc.parallelize(1 to 100,
100).persist(StorageLevel.MEMORY_ONLY_2).count()
2. If step 1 works, then we know that there is probably nothing wrong with
the Spark installation, and the problem is probably in the threads related to
the receivers receiving the data. Are you writing a custom receiver? Are you
somehow playing around with the class loader in the custom receiver?
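
As an extra check (a sketch for the spark-shell), it may help to verify that
every executor can resolve the class at all, which separates a general
classpath problem from a problem specific to the replication RPC path:

```scala
// Resolve StorageLevel by name inside a task on every partition; this
// collects the class name from each executor if resolution succeeds.
sc.parallelize(1 to 100, 100).map { _ =>
  Class.forName("org.apache.spark.storage.StorageLevel").getName
}.distinct().collect()
```

If this succeeds on all partitions, the class is on the executors' classpath
and the failure is likely confined to the deserialization done by the netty
block-transfer service during replication.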

TD


On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola <on...@gmail.com>
wrote:

> Hi,
>
> I am running spark streaming v 1.3.0 (running inside Docker) on Mesos
> 0.21.1. Spark streaming is started using Marathon -> docker container gets
> deployed and starts streaming (from custom Actor). Spark binary is located
> on shared GlusterFS volume. Data is streamed from Elasticsearch/Redis. When
> new batch arrives Spark tries to replicate it but fails with following
> error :
>
> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840 dropped
> from memory (free 278017782)
> 15/03/26 14:50:00 INFO BlockManager: Removing block broadcast_0_piece0
> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size 1658
> dropped from memory (free 278019440)
> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
> broadcast_0_piece0
> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while invoking
> RpcHandler#receive() on RPC id 7178767328921933569
> java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:344)
> at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
> at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:88)
> at
> org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:65)
> at
> org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124)
> at
> org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:97)
> at
> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
> at
> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
> at
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
> at
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
> at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at java.lang.Thread.run(Thread.java:745)
> 15/03/26 14:50:01 ERROR TransportRequestHandler: Error while invoking
> RpcHandler#receive() on RPC id 9001562482648380222
>
> From the Mesos UI I see the unpacked Spark binary and my assembly jar in
> place (on the running driver and on the replication targets). I have other
> Spark batch jobs running fine from the same base docker image. When there
> is no incoming data the exception is not thrown. Spark config:
>
> spark.master
>  mesos://zk://incomparable-brush.maas:2181,cumbersome-match.maas:2181,voluminous-toys.maas:2181/mesos
> spark.serializer        org.apache.spark.serializer.KryoSerializer
> spark.executor.uri      file:///master/spark/spark-1.3.0-bin-hadoop2.4.tgz
> spark.local.dir         /opt/spark_tmp
>
> spark.driver.port       41000
> spark.executor.port     41016
> spark.fileserver.port   41032
> spark.broadcast.port    41048
> spark.replClassServer.port 41064
> spark.blockManager.port  41080
> spark.ui.port   41096
> spark.history.ui.port 41112
>
> Thanks for any help
>