Posted to common-issues@hadoop.apache.org by "Todd Lipcon (Created) (JIRA)" <ji...@apache.org> on 2012/02/14 05:12:59 UTC

[jira] [Created] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Enable TCP_NODELAY by default for IPC
-------------------------------------

                 Key: HADOOP-8069
                 URL: https://issues.apache.org/jira/browse/HADOOP-8069
             Project: Hadoop Common
          Issue Type: Improvement
          Components: ipc
    Affects Versions: 0.23.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon


I think we should switch the default for the IPC client and server NODELAY options to true. As Wikipedia says:
{quote}
In general, since Nagle's algorithm is only a defense against careless applications, it will not benefit a carefully written application that takes proper care of buffering; the algorithm has either no effect, or negative effect on the application.
{quote}
Since our IPC layer is well contained and does its own buffering, we shouldn't be careless.
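
For readers unfamiliar with the option, here is a minimal sketch of what enabling NODELAY means at the socket level in Java. It only uses the standard `java.net.Socket.setTcpNoDelay` call; the class name `NoDelayDemo` and the loopback setup are illustrative assumptions and are not taken from Hadoop's IPC code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NoDelayDemo {
    /** Opens a loopback connection, enables TCP_NODELAY, and reports the option value read back. */
    static boolean connectWithNoDelay() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Disable Nagle's algorithm on the client side of the connection.
            client.setTcpNoDelay(true);
            return client.getTcpNoDelay();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay());
    }
}
```

A plain Java socket leaves Nagle's algorithm on unless `setTcpNoDelay(true)` is called, which is why the default chosen by the IPC layer matters.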

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8069:
--------------------------------

    Attachment: hadoop-8069.txt

Attached patch changes the defaults and removes these keys from the documentation. There isn't any good reason that the user should change them, since we do our own buffering at the IPC layer.
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Eli Collins (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207565#comment-13207565 ] 

Eli Collins commented on HADOOP-8069:
-------------------------------------

+1  nice
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209012#comment-13209012 ] 

Todd Lipcon commented on HADOOP-8069:
-------------------------------------

Hi Daryn. Your above descriptions sound right, except the nagle delay on Linux is 40ms rather than 200 (I think the dack delay is 200 though like you said).
I hacked up something like my #4 yesterday morning but didn't really like the way I did it so I threw it away. I'll try again soon :)
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209134#comment-13209134 ] 

Suresh Srinivas commented on HADOOP-8069:
-----------------------------------------

Todd, how many RPC responses go beyond 8K in size? Roughly what would be your guess on what % of total RPC calls this is?
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Aaron T. Myers (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207561#comment-13207561 ] 

Aaron T. Myers commented on HADOOP-8069:
----------------------------------------

+1, the patch looks good to me. Great analysis/benchmarking.
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207862#comment-13207862 ] 

Todd Lipcon commented on HADOOP-8069:
-------------------------------------

Before committing this, I want to double check a couple things to make sure there are no cases where we end up making more packets than before.
                

        

[jira] [Updated] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Robert Joseph Evans (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated HADOOP-8069:
----------------------------------------

    Target Version/s: 2.0.0, 3.0.0  (was: 0.23.2)
    

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207539#comment-13207539 ] 

Hadoop QA commented on HADOOP-8069:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12514452/hadoop-8069.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/594//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/594//console

This message is automatically generated.
                

        

[jira] [Updated] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8069:
--------------------------------

    Status: Open  (was: Patch Available)

One issue here with nagling off is the following:

In the Server implementation, we write with maximum 8KB write() calls, to avoid a heap malloc inside the JDK's SocketOutputStream implementation (with less than 8K it uses a stack buffer instead).
So, if we write a 10KB response, we end up doing a write(8KB) followed by write(2KB)

The problem here, when NODELAY is on, is that the TCP MSS doesn't divide neatly into the 8K buffer size. So we get the following behavior:
write(8K):
  sends 5 packets of MSS size (eg 1490 bytes)
  sends 1 packet of around half MSS (around 750 bytes)
write(2K):
  sends 1 packet of MSS
  sends 1 packet around 1/3 MSS

Although the response should have fit in 7 packets, we used 8 instead.
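
The packet arithmetic above can be checked with a short sketch. It assumes a simplified model in which, with NODELAY on, each write() is segmented on its own with no coalescing across writes; the class and method names are hypothetical, and the 1490-byte MSS is the example figure from the comment.

```java
public class PacketCount {
    /** Packets needed if every write() is segmented on its own (no coalescing across writes). */
    static int packetsPerWrite(int responseBytes, int writeChunk, int mss) {
        int packets = 0;
        for (int off = 0; off < responseBytes; off += writeChunk) {
            int len = Math.min(writeChunk, responseBytes - off);
            packets += (len + mss - 1) / mss; // ceiling division
        }
        return packets;
    }

    /** Packets needed if the whole response were segmented as one continuous stream. */
    static int packetsIdeal(int responseBytes, int mss) {
        return (responseBytes + mss - 1) / mss; // ceiling division
    }

    public static void main(String[] args) {
        int mss = 1490;             // example MSS from the comment
        int chunk = 8 * 1024;       // maximum size of a single write()
        int response = 10 * 1024;   // 10KB response
        System.out.println("per-write: " + packetsPerWrite(response, chunk, mss));
        System.out.println("ideal:     " + packetsIdeal(response, mss));
    }
}
```

With these figures the per-write count is 8 and the ideal count is 7, matching the numbers in the comment.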

The following thread about postfix discusses a similar issue:
http://tech.groups.yahoo.com/group/postfix-users/message/224183

Possible solutions:
1) accept the inefficiency - it's bounded by one extra "small" packet for every 8KB in the response
2) try to set the write buffer size to an exact multiple of MSS. This is difficult because Java doesn't let you call getsockopt(TCP_MAXSEG)
3) use TCP_CORK (set it to cork, clear it to uncork) to control the packet sending behavior. This is difficult because Java also doesn't expose that option
4) in the Server.channelIO loop, turn off NODELAY while writing all but the last buffer worth, then turn on NODELAY for the last buffer. This should act as a flush of all the remaining buffered data

Canceling patch for now to work through this
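
Solution 4 could look roughly like the following sketch. It is not the attached patch and does not touch Server.channelIO; it uses a plain blocking `java.net.Socket` for illustration, and the names `NagleToggleSketch`, `writeChunked`, and `roundTrip` are hypothetical. Whether re-enabling NODELAY really acts as a flush of the buffered data is the expectation stated above, not something this sketch verifies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NagleToggleSketch {
    /** Writes data in chunks: NODELAY off for all but the last chunk, NODELAY on for the last. */
    static void writeChunked(Socket socket, byte[] data, int chunkSize) throws IOException {
        OutputStream out = socket.getOutputStream();
        int off = 0;
        socket.setTcpNoDelay(false);           // let the kernel coalesce the leading chunks
        while (data.length - off > chunkSize) {
            out.write(data, off, chunkSize);
            off += chunkSize;
        }
        socket.setTcpNoDelay(true);            // intended to push out whatever is still buffered
        out.write(data, off, data.length - off);
        out.flush();
    }

    /** Sends size bytes over a loopback connection and returns how many bytes arrived. */
    static int roundTrip(int size, int chunkSize) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            writeChunked(client, new byte[size], chunkSize);
            client.shutdownOutput();           // signal end of stream to the reader
            InputStream in = accepted.getInputStream();
            byte[] buf = new byte[4096];
            int total = 0;
            for (int n; (n = in.read(buf)) != -1; ) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes received: " + roundTrip(10 * 1024, 8 * 1024));
    }
}
```

The sketch only shows where the option toggles would sit relative to the writes; measuring the resulting packet counts would need a capture tool such as tcpdump.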
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207491#comment-13207491 ] 

Todd Lipcon commented on HADOOP-8069:
-------------------------------------

A good reason to do this is that, when I run an IPC benchmark on an "echo" function, as soon as the message size eclipses 8K the latency goes up to 40ms due to the interaction of nagling and delayed ACK.
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209150#comment-13209150 ] 

Todd Lipcon commented on HADOOP-8069:
-------------------------------------

My hunch is that it's pretty small. I think the only RPC to the NN which would be at all frequent and cross the 8K boundary would be getListing(). On one production hbase cluster I collected metrics from a while back, getListing represented 8.3% of the RPCs. On one of our QA clusters that's been running MR workloads, it represents 2.3%. Unfortunately we don't have enough metrics to get any info on the size distribution of those responses.

Would be interested to hear if some of your production clusters show a similar mix.
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Suresh Srinivas (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209182#comment-13209182 ] 

Suresh Srinivas commented on HADOOP-8069:
-----------------------------------------

bq. ... cross the 8K boundary would be getListing()
This is what I was thinking. However, we have iterative listing now. With that, the probability of such RPCs exceeding 8K is perhaps lower. However, we should tweak DFS_LIST_LIMIT_DEFAULT, certainly based on your findings.

Additionally there are other RPCs such as Namenode#getBlocks(), ClientProtocol#listCorruptBlocks(). However these are not frequently called.


                

        

[jira] [Updated] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8069:
--------------------------------

    Status: Patch Available  (was: Open)
    

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207535#comment-13207535 ] 

Todd Lipcon commented on HADOOP-8069:
-------------------------------------

Illustrating the improvement that NODELAY makes using the benchmark from HADOOP-8070:
{code}
todd@todd-w510:~/git/hadoop-common/hadoop-common-project/hadoop-common$ /usr/lib/jvm/java-6-sun/bin/java -cp /home/todd/git/hadoop-common/hadoop-dist/target/hadoop-0.24.0-SNAPSHOT/share/hadoop/common/lib/*:target/classes:target/test-classes org.apache.hadoop.ipc.RPCCallBenchmark -Dipc.server.tcpnodelay=false -Dipc.client.tcpnodelay=false -c 1 -s 1 -t 10 -m 8300  -e protobuf
Calls per second: 21.0
Calls per second: 24.0
Calls per second: 24.0
Calls per second: 24.0
Calls per second: 23.0
Calls per second: 25.0
Calls per second: 24.0
Calls per second: 24.0
Calls per second: 23.0
Calls per second: 24.0
====== Results ======
Options:
rpcEngine=class org.apache.hadoop.ipc.ProtobufRpcEngine
serverThreads=1
serverReaderThreads=1
clientThreads=1
host=0.0.0.0
port=12345
secondsToRun=10
msgSize=8300
Total calls per second: 24.0
CPU time per call on client: 691056 ns
CPU time per call on server: 894308 ns
todd@todd-w510:~/git/hadoop-common/hadoop-common-project/hadoop-common$ /usr/lib/jvm/java-6-sun/bin/java -cp /home/todd/git/hadoop-common/hadoop-dist/target/hadoop-0.24.0-SNAPSHOT/share/hadoop/common/lib/*:target/classes:target/test-classes org.apache.hadoop.ipc.RPCCallBenchmark -Dipc.server.tcpnodelay=true -Dipc.client.tcpnodelay=true -c 1 -s 1 -t 10 -m 8300  -e protobuf
Calls per second: 642.0
Calls per second: 859.0
Calls per second: 1593.0
Calls per second: 2378.0
Calls per second: 2069.0
Calls per second: 2716.0
Calls per second: 3400.0
Calls per second: 3973.0
Calls per second: 4117.0
Calls per second: 4075.0
====== Results ======
Options:
rpcEngine=class org.apache.hadoop.ipc.ProtobufRpcEngine
serverThreads=1
serverReaderThreads=1
clientThreads=1
host=0.0.0.0
port=12345
secondsToRun=10
msgSize=8300
Total calls per second: 2582.0
CPU time per call on client: 137426 ns
CPU time per call on server: 151749 ns
{code}
Note that the 24 calls/sec corresponds to about 41.7ms/call, which is just above the 40ms delay you see with the interaction of delayed ACK and nagling on Linux.
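
The conversion from the benchmark's calls per second to per-call latency is simple arithmetic, sketched here for the two "Total calls per second" figures reported above. It assumes a single client thread issuing calls back to back (the benchmark was run with -c 1); the class and method names are hypothetical.

```java
public class CallLatency {
    /** Average wall-clock time per call, in milliseconds, for one client issuing calls back to back. */
    static double millisPerCall(double callsPerSecond) {
        return 1000.0 / callsPerSecond;
    }

    public static void main(String[] args) {
        double nagleOn = 24.0;     // total calls/sec with tcpnodelay=false
        double nagleOff = 2582.0;  // total calls/sec with tcpnodelay=true
        System.out.printf("tcpnodelay=false: %.1f ms/call%n", millisPerCall(nagleOn));
        System.out.printf("tcpnodelay=true:  %.2f ms/call%n", millisPerCall(nagleOff));
        System.out.printf("throughput ratio: %.0fx%n", nagleOff / nagleOn);
    }
}
```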
                

        

[jira] [Commented] (HADOOP-8069) Enable TCP_NODELAY by default for IPC

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208994#comment-13208994 ] 

Daryn Sharp commented on HADOOP-8069:
-------------------------------------

Each short packet should cost 40 bytes of packet overhead.  Delayed ack should coalesce the ack to prevent any overhead there.  Overall, the overhead for short packets should be ~0.5%.
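
The ~0.5% figure can be reproduced with a small calculation. It assumes one extra short packet for every 8KB written, as in the earlier analysis of 8KB writes, and takes the 40 bytes to be the per-packet header cost (20-byte IPv4 header plus 20-byte TCP header, no options); the class and method names are hypothetical.

```java
public class ShortPacketOverhead {
    /** Extra header bytes as a percentage of the payload they accompany. */
    static double overheadPercent(int extraHeaderBytes, int payloadBytes) {
        return 100.0 * extraHeaderBytes / payloadBytes;
    }

    public static void main(String[] args) {
        int headerBytes = 40;        // assumed: IPv4 (20) + TCP (20) headers, no options
        int payloadBytes = 8 * 1024; // one extra short packet per 8KB write
        System.out.printf("overhead: %.2f%%%n", overheadPercent(headerBytes, payloadBytes));
    }
}
```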

The key is avoiding 200ms bubbles in the default settings for a socket.  From memory: nagle holds data until a full packet is assembled, it receives an ack for the previous packet, or 200ms expires.  Receiver sends delayed acks for every other packet, or 200ms expires.  If the last partial packet is an odd packet, the sender and receiver are waiting for each other to send something.  The receiver's 200ms dack timer expires, it sends the ack for the next to last even packet, sender sends the last odd packet.

I _think_, but I'm rusty, that the main differences between nagle and cork are:
* nagle may send a partial packet if the receiver acks before another full packet is assembled
* cork ignores acks and just sends full packets, or 200ms expires
* uncorking flushes the socket buffer and sets the tcp push flag (causing immediate ack, not dack) on the partial packet 
* nodelay might be setting the push flag on all packets (generating 2X acks), but I think it's just the partial packets
* nodelay is portable, cork is not

All said, I think #4 is probably the best bet.  It should in effect be like cork unless the writes for a given chunk of data are written in a slow/sporadic fashion, thus causing acks to send out partial packets.  Most comparisons are straight nodelay or straight cork, so your findings will be interesting.
                