Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2008/04/04 06:27:25 UTC

[jira] Updated: (HADOOP-3164) Use FileChannel.transferTo() when data is read from DataNode.

     [ https://issues.apache.org/jira/browse/HADOOP-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-3164:
---------------------------------

    Attachment: HADOOP-3614.patch

'Mostly good' patch attached.

With the patch, initial tests show DataNode takes about 1/5th to 1/4th the CPU compared to trunk while reading data. That is about 10 times faster than 0.16.

With the patch, for the majority of the data the path changes from
'file --> direct buffer --> Java buffer --> direct buffer --> socket'
 to 'file --> socket',
 which mostly explains the slightly better than 4x reduction in CPU.
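
For illustration only, here is a minimal sketch of the 'file --> socket' path using FileChannel.transferTo(). It is not the code from the attached patch; the class name TransferToSketch and the method sendRange are hypothetical, and the target is any WritableByteChannel (a SocketChannel in the DataNode case).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Hadoop: streams a byte range of a file
// to a channel without copying it through a Java-side buffer.
final class TransferToSketch {

    /** Sends up to count bytes starting at offset; returns the bytes actually sent. */
    static long sendRange(Path file, long offset, long count, WritableByteChannel out)
            throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long end = Math.min(offset + count, in.size());
            long pos = offset;
            while (pos < end) {
                // transferTo() may move fewer bytes than requested, so loop.
                long n = in.transferTo(pos, end - pos, out);
                if (n <= 0) {
                    // A non-blocking target with a full send buffer can yield 0.
                    // A real caller would wait for writability instead of giving up.
                    break;
                }
                pos += n;
            }
            return pos - offset;
        }
    }
}
```

Whether the kernel can actually skip the intermediate copies depends on the platform and on the target channel type; with other targets the JDK falls back to copying through a buffer.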

The main remaining issue is with non-blocking sockets, as mentioned in the previous comment. One option is to use blocking sockets and have one thread that enforces the write timeout.
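
As a rough sketch of that option (an assumption about how it could look, not what the patch does): a single watchdog thread closes the channel if a blocking write has not finished by its deadline. Closing an interruptible channel such as SocketChannel from another thread makes the blocked writer fail with AsynchronousCloseException. The names WriteTimeoutSketch, IoAction and withTimeout are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.Channel;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop: one shared thread enforces
// write timeouts for any number of blocking channels.
final class WriteTimeoutSketch {

    /** A blocking I/O step, e.g. a transferTo() loop on a blocking socket. */
    interface IoAction {
        long run() throws IOException;
    }

    private final ScheduledExecutorService watchdog =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "write-timeout");
                t.setDaemon(true);
                return t;
            });

    /** Runs the write; the watchdog closes target if it takes longer than timeoutMillis. */
    long withTimeout(Channel target, long timeoutMillis, IoAction write) throws IOException {
        ScheduledFuture<?> killer = watchdog.schedule(() -> {
            try {
                target.close(); // unblocks a writer stuck on this channel
            } catch (IOException ignored) {
                // nothing useful to do if close itself fails
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            return write.run();
        } finally {
            // Small race: the close can still fire just after a successful write.
            killer.cancel(false);
        }
    }

    void shutdown() {
        watchdog.shutdownNow();
    }
}
```

The timeout here covers the whole action rather than each individual write call, which is a simplification.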

> Use FileChannel.transferTo() when data is read from DataNode.
> -------------------------------------------------------------
>
>                 Key: HADOOP-3164
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3164
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>         Attachments: HADOOP-3614.patch
>
>
> HADOOP-2312 talks about using FileChannel's [{{transferTo()}}|http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)] and [{{transferFrom()}}|http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)] in DataNode. 
> At the time, DataNode neither used NIO sockets nor wrote large chunks of contiguous block data to the socket. Hadoop 0.17 does both when data is served to clients (and other datanodes). I am planning to try using transferTo() in the trunk. This might reduce DataNode's CPU usage by another 50% or more.
> Once HADOOP-1702 is committed, we can look into using transferFrom().

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.