Posted to user@spark.apache.org by Manoj Samel <ma...@gmail.com> on 2016/10/07 17:07:06 UTC

Executor errors out connecting to external shuffle service when using dynamic allocation

Resending with a clearer subject.

Any feedback?

On Tue, Oct 4, 2016 at 4:43 PM, Manoj Samel <ma...@gmail.com> wrote:

> Hi,
>
> On a secure Hadoop cluster, the Spark external shuffle service is enabled
> (Spark 1.6.0; the shuffle jar is spark-1.6.0-yarn-shuffle.jar). A client
> connecting with spark-assembly_2.11-1.6.1.jar gets errors when starting
> executors, with the following trace.
>
> Could this be due to a Spark version mismatch? Any thoughts?
>
> Thanks in advance,
>
> 16/10/04 16:12:53 INFO storage.BlockManager: Registering executor with local external shuffle service.
> 16/10/04 16:12:53 ERROR client.TransportClientFactory: Exception while bootstrapping client after 13 ms
> java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown message type: -22
>         at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:67)
>         at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:71)
>         at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
>         at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:745)
>
>         at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:186)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:106)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:745)
>
> 16/10/04 16:12:53 ERROR storage.BlockManager: Failed to connect to external shuffle server, will retry 2 more times after waiting 5 seconds...
> java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown message type: -22
>         at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:67)
>         at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:71)
> ... repeats ..
>
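
To make the version question in the quoted message concrete, here is a minimal sketch that pulls the Spark version out of the two jar file names mentioned there and compares them. The helper names (jar_version, versions_match) are hypothetical, and the sketch assumes the Spark version is the only three-part dotted number in each file name. It only shows that the two jars differ in version; it does not establish that the difference causes the "Unknown message type: -22" error.

```python
import re

# Assumption: the Spark version is the only three-part dotted number in the
# jar file name. A Scala suffix such as _2.11 has two parts and is skipped.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def jar_version(jar_name):
    """Return the three-part version found in a jar file name, or None."""
    match = VERSION_RE.search(jar_name)
    return match.group(0) if match else None


def versions_match(shuffle_jar, client_jar):
    """True only if both names carry a version and the versions are equal."""
    shuffle_version = jar_version(shuffle_jar)
    client_version = jar_version(client_jar)
    return shuffle_version is not None and shuffle_version == client_version


if __name__ == "__main__":
    # Jar names taken from the quoted message.
    shuffle_jar = "spark-1.6.0-yarn-shuffle.jar"
    client_jar = "spark-assembly_2.11-1.6.1.jar"
    print("shuffle service:", jar_version(shuffle_jar))
    print("client assembly:", jar_version(client_jar))
    print("versions match:", versions_match(shuffle_jar, client_jar))
```

The same check could be run against the jars actually deployed on the NodeManagers and on the client host to confirm which versions are in use.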