Posted to issues@spark.apache.org by "zzc (JIRA)" <ji...@apache.org> on 2014/11/04 09:37:36 UTC

[jira] [Comment Edited] (SPARK-2468) Netty-based block server / client module

    [ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189825#comment-14189825 ] 

zzc edited comment on SPARK-2468 at 11/4/14 8:37 AM:
-----------------------------------------------------

Hi, Reynold Xin. [SPARK-3453] Netty-based BlockTransferService, extracted from Spark core, was committed yesterday. I compiled the latest code from the GitHub master branch, and when I set spark.shuffle.blockTransferService=netty, I get this error:

2014-11-04 15:30:27,013 - ERROR - org.apache.spark.util.SignalLoggerHandler.handle(SignalLogger.scala:57) - RECEIVED SIGNAL 15: SIGTERM
2014-11-04 15:30:28,484 - ERROR - org.apache.spark.network.client.TransportResponseHandler.channelUnregistered(TransportResponseHandler.java:95) - Still have 6 requests outstanding when connection from np04/203.130.48.183:42574 is closed
2014-11-04 15:30:28,522 - WARN - org.apache.spark.network.server.TransportChannelHandler.exceptionCaught(TransportChannelHandler.java:66) - Exception in connection from /203.130.48.183:39332
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)

When I set spark.shuffle.blockTransferService=nio instead, the job runs successfully.

In addition, when will the shuffle performance improvement issue be resolved?
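For anyone trying to reproduce this, the transport is selected with the property named above; a sketch of switching it on the command line (the application class, jar, and master URL are placeholders, not from the report):

```shell
# Failing case from the report: the new Netty transport
spark-submit --master spark://master:7077 \
  --conf spark.shuffle.blockTransferService=netty \
  --class com.example.MyJob my-job.jar   # placeholder class/jar

# Workaround that runs successfully: fall back to the NIO transport
spark-submit --master spark://master:7077 \
  --conf spark.shuffle.blockTransferService=nio \
  --class com.example.MyJob my-job.jar
```

The same property can also be set in spark-defaults.conf.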


was (Author: zzcclp):
Hi, Reynold Xin. [SPARK-3453] Netty-based BlockTransferService, extracted from Spark core, was committed yesterday. I compiled the latest code from the GitHub master branch, and when I set spark.shuffle.blockTransferService=netty, I get this error:

ERROR - org.apache.spark.Logging$class.logError(Logging.scala:75) - sparkDriver-akka.actor.default-dispatcher-14 -Lost executor 17 on np05: remote Akka client disassociated 

When I set spark.shuffle.blockTransferService=nio instead, the job runs successfully.

In addition, when will the shuffle performance improvement issue be resolved?

> Netty-based block server / client module
> ----------------------------------------
>
>                 Key: SPARK-2468
>                 URL: https://issues.apache.org/jira/browse/SPARK-2468
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>            Priority: Critical
>
> Right now shuffle send goes through the block manager. This is inefficient because it requires loading a block from disk into a kernel buffer, then into a user-space buffer, and then back into a kernel send buffer before it reaches the NIC. This makes multiple copies of the data and incurs context switches between kernel and user space. It also creates unnecessary buffers in the JVM that increase GC pressure.
> Instead, we should use FileChannel.transferTo, which handles this in kernel space with zero copy. See http://www.ibm.com/developerworks/library/j-zerocopy/
> One potential solution is to use Netty.  Spark already has a Netty based network module implemented (org.apache.spark.network.netty). However, it lacks some functionality and is turned off by default. 
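To illustrate the zero-copy pattern the description refers to, here is a minimal sketch in plain Java NIO (class name and file paths are illustrative; this is not Spark's actual transport code). FileChannel.transferTo asks the kernel to move file pages directly to the target channel, so the bytes never pass through a user-space buffer in the JVM:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Move every byte of src into dst with FileChannel.transferTo.
    // transferTo may move fewer bytes than requested, so loop until done.
    static long zeroCopy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            long pos = 0;
            long size = in.size();
            while (pos < size) {
                // The kernel copies file pages straight to the target channel,
                // avoiding the extra kernel->user->kernel round trip.
                pos += in.transferTo(pos, size - pos, out);
            }
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("block", ".dat"); // stand-in for a shuffle block
        Path dst = Files.createTempFile("copy", ".dat");
        Files.write(src, "shuffle block payload".getBytes());
        System.out.println("transferred " + zeroCopy(src, dst) + " bytes");
    }
}
```

When the target is a socket channel (as in a shuffle send), the same call lets the OS hand the file contents to the NIC without materializing them in the JVM heap at all.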



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org