Posted to proton@qpid.apache.org by Ted Ross <tr...@redhat.com> on 2013/04/17 19:42:04 UTC

Re: nagle algorithm

https://issues.apache.org/jira/browse/PROTON-294

On 03/15/2013 08:18 AM, Ted Ross wrote:
> +1
>
> It would be good to make this configurable, but in the absence of 
> configurability, we should turn Nagle off by default.
>
> -Ted
>
> On 03/14/2013 03:25 PM, Rafael Schloming wrote:
>> I think in most of the other network code in Qpid we tend to disable
>> Nagle as you suggest, so I expect it would make sense to do so for
>> Proton as well. I'm not fussed about it not being configurable at this
>> point. We can always add configuration later.
>>
>> --Rafael
>>
>> On Wed, Mar 13, 2013 at 6:20 PM, Bozo Dragojevic <bo...@digiverse.si> 
>> wrote:
>>
>>> Hi,
>>>
>>> I've migrated to Proton 0.4 now, so I can ask relevant questions again :)
>>>
>>> I'm doing basic latency analysis in a really trivial setup:
>>>       (publisher -> broker -> subscriber)
>>>
>>> The publisher emits a message every 10 milliseconds; each message
>>> contains an array of timestamps. The broker forwards the messages,
>>> and the subscriber measures the propagation delay.
>>>
>>> All participants add timestamps to the message as it traverses the
>>> chain.
>>>
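>>> For reference, adding a timestamp is essentially just an append into
>>> the message body; something along these lines (a sketch, not the
>>> exact code; error handling omitted, now_ms() is a helper assumed
>>> here):
>>>
>>> #include <stdbool.h>
>>> #include <sys/time.h>
>>> #include <proton/message.h>
>>> #include <proton/codec.h>
>>>
>>> static pn_timestamp_t now_ms(void)
>>> {
>>>   struct timeval tv;
>>>   gettimeofday(&tv, NULL);
>>>   /* pn_timestamp_t is milliseconds since the epoch */
>>>   return (pn_timestamp_t) tv.tv_sec * 1000 + tv.tv_usec / 1000;
>>> }
>>>
>>> /* publisher side: start the timeline with the PUBLISH timestamp */
>>> static void start_timeline(pn_message_t *msg)
>>> {
>>>   pn_data_t *body = pn_message_body(msg);
>>>   pn_data_put_array(body, false, PN_TIMESTAMP);
>>>   pn_data_enter(body);
>>>   pn_data_put_timestamp(body, now_ms());
>>>   pn_data_exit(body);
>>> }
>>>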
>>> General operating parameters:
>>>
>>> I have the incoming and outgoing windows set to 10.
>>>
>>> After each pn_messenger_get() I do a pn_messenger_accept() for the
>>> last message.
>>> After each pn_messenger_put() I do a pn_messenger_settle() for the
>>> last message.
>>>
>>> The only synchronous call used is pn_messenger_recv(100), nominally
>>> one for each get() and/or put().
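>>>
>>> In code, those parameters map onto the Messenger API roughly like
>>> this (a minimal sketch; subscription setup and error handling
>>> omitted, msg is a pn_message_t *):
>>>
>>> pn_messenger_t *m = pn_messenger(NULL);
>>> pn_messenger_set_outgoing_window(m, 10);
>>> pn_messenger_set_incoming_window(m, 10);
>>> pn_messenger_start(m);
>>>
>>> /* send side */
>>> pn_messenger_put(m, msg);
>>> pn_messenger_settle(m, pn_messenger_outgoing_tracker(m), 0);
>>> pn_messenger_recv(m, 100);  /* the only blocking call */
>>>
>>> /* receive side */
>>> pn_messenger_recv(m, 100);
>>> pn_messenger_get(m, msg);
>>> pn_messenger_accept(m, pn_messenger_incoming_tracker(m), 0);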
>>>
>>> All three processes run on the same VM; no real network, loopback only.
>>>
>>> During the run, the number of unsent messages (pn_messenger_outgoing())
>>> fluctuates between 0 and 3, typically in a seesaw fashion, but not
>>> really predictably. Because of that, the perceived propagation delay
>>> oscillates between 2 and 40+ milliseconds.
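>>>
>>> (The unsent count here is simply pn_messenger_outgoing() polled after
>>> each put()/settle(), along the lines of:
>>>
>>>   printf("unsent: %d\n", pn_messenger_outgoing(m));
>>>
>>> with m being the messenger from the sketch above.)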
>>>
>>> [408] P00:00:07.611882 <message.cpp:67 trace dump_timeline>
>>> +                   e2e:   41.858 TIMELINE CREATE: P00:00:07.569998
>>> +               PUBLISH:    0.000
>>> +             SERIALIZE:    0.197
>>> +               RECEIVE:    1.113
>>> +           DESERIALIZE:    0.079
>>> +             REPUBLISH:    0.773
>>> +             SERIALIZE:    1.164
>>> +               RECEIVE:   38.010
>>> +           DESERIALIZE:    0.070
>>> +           QUEUE_EVENT:    0.126
>>> +            NEXT_EVENT:    0.047
>>>
>>> After adding TCP_NODELAY to pn_configure_sock() this is no longer the
>>> case, and
>>> all messages are sent immediately.
>>>
>>> With this change the end-to-end propagation delay over loopback
>>> hovers around 2 milliseconds most of the time. There are occasional
>>> hiccups, but nowhere near the values observed previously; 7 ms is now
>>> considered a long delay :).
>>>
>>> A sample trace for one event:
>>> [408] P00:00:17.497270 <message.cpp:67 trace dump_timeline>
>>> +                   e2e:    2.015 TIMELINE CREATE: P00:00:17.495230
>>> +               PUBLISH:    0.000
>>> +             SERIALIZE:    0.185
>>> +               RECEIVE:    0.359
>>> +           DESERIALIZE:    0.052
>>> +             REPUBLISH:    0.209
>>> +             SERIALIZE:    0.165
>>> +               RECEIVE:    0.333
>>> +           DESERIALIZE:    0.047
>>> +           QUEUE_EVENT:    0.158
>>> +            NEXT_EVENT:    0.011
>>>
>>> The time from shortly before the sender calls pn_messenger_put() to
>>> shortly after the receiver gets the message from pn_messenger_get()
>>> is attributed to 'RECEIVE'. There are two RECEIVE entries, one for
>>> each hop.
>>>
>>> With the change, our own processing became a significant contributing
>>> factor to the delay, which is a good spot to be in :)
>>>
>>>
>>> A further idea on the topic of analysing performance is adding the
>>> following method to the data API:
>>>
>>> typedef pn_timestamp_t (*pn_timestamp_cb_t)(void);
>>> int pn_data_put_lazy_timestamp(pn_data_t *, pn_timestamp_cb_t cb);
>>>
>>> This would allow one to embed a pn_timestamp_t anywhere in the
>>> pn_message_t, and proton would evaluate it as close to the wire as
>>> possible; for example, it could remember the byte offset into the
>>> binary buffer and then do a sneaky layering violation :).
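>>>
>>> Usage would look something like this (hypothetical, since the API is
>>> only proposed; now_ms() is the helper sketched earlier):
>>>
>>> pn_data_t *body = pn_message_body(msg);
>>> /* proton would invoke the callback just before the bytes hit the wire */
>>> pn_data_put_lazy_timestamp(body, now_ms);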
>>>
>>> A similar mechanism would come in handy on the receive side, although
>>> I suspect the latency there is not really an issue.
>>>
>>>
>>> Here's the simple version of the patch that I'm running with.
>>> I can also extend it so it's optionally settable, if you guys feel
>>> that disabling Nagle is not always a good idea.
>>>
>>> -- 
>>> Bozzo
>>>
>>> -- snip --
>>>
>>> Author: Bozo Dragojevic <bo...@gmail.com>
>>> Date:   Wed Mar 13 16:53:44 2013 -0400
>>>
>>>      Disable nagle algorithm
>>>
>>> diff --git a/proton-c/src/posix/driver.c b/proton-c/src/posix/driver.c
>>> index 680ce6c..805a6ac 100644
>>> --- a/proton-c/src/posix/driver.c
>>> +++ b/proton-c/src/posix/driver.c
>>> @@ -26,6 +26,7 @@
>>>   #include <sys/types.h>
>>>   #include <sys/socket.h>
>>>   #include <netinet/in.h>
>>> +#include <netinet/tcp.h>
>>>   #include <netdb.h>
>>>   #include <unistd.h>
>>>   #include <fcntl.h>
>>> @@ -252,6 +253,14 @@ static void pn_configure_sock(int sock) {
>>>     };
>>>     */
>>>
>>> +  // disable nagle
>>> +  int flag = 1;
>>> +  int result = setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
>>> +                          (char *) &flag, sizeof(int));
>>> +  if (result < 0) {
>>> +    perror("setsockopt");
>>> +  }
>>> +
>>>       int flags = fcntl(sock, F_GETFL);
>>>       flags |= O_NONBLOCK;
>>>
>>>
>