Posted to proton@qpid.apache.org by Ted Ross <tr...@redhat.com> on 2013/04/17 19:42:04 UTC
Re: nagle algorithm
https://issues.apache.org/jira/browse/PROTON-294
On 03/15/2013 08:18 AM, Ted Ross wrote:
> +1
>
> It would be good to make this configurable, but in the absence of
> configurability, we should turn Nagle off by default.
>
> -Ted
>
> On 03/14/2013 03:25 PM, Rafael Schloming wrote:
>> I think in most of the other network code at qpid we tend to disable
>> nagle as you suggest, so I expect it would make sense to do so for proton
>> as well. I'm not fussed about not having it be configurable at this
>> point. We can always add configuration later.
>>
>> --Rafael
>>
>> On Wed, Mar 13, 2013 at 6:20 PM, Bozo Dragojevic <bo...@digiverse.si>
>> wrote:
>>
>>> Hi,
>>>
>>> migrated to proton 0.4 now, so I can ask relevant questions again :)
>>>
>>> I'm doing basic latency analysis in a really trivial setup:
>>> (publisher -> broker -> subscriber)
>>>
>>> publisher is emitting a message every 10 milliseconds; each message
>>> contains an array of timestamps.
>>> broker is forwarding the messages
>>> subscriber is looking at propagation delay
>>>
>>> all participants are adding timestamps to the message as it traverses
>>> the chain.
>>>
>>> general operating parameters:
>>>
>>> I have incoming and outgoing window set to 10
>>>
>>> after each pn_messenger_get() I do a pn_messenger_accept() for the
>>> last message
>>> after each pn_messenger_put() I do a pn_messenger_settle() for the
>>> last message
>>>
>>> the only synchronous call used is pn_messenger_recv(100), nominally
>>> for each get() and/or put()
>>>
>>> running all three processes on the same VM, no real network,
>>> loopback only.
>>>
>>> During the run, the number of unsent messages (pn_messenger_outgoing())
>>> fluctuates between 1, 2, 3, and 0, typically in a seesaw fashion, but not
>>> really predictably. Because of that the perceived propagation delay
>>> oscillated between 2 and 40+ milliseconds.
>>>
>>> [408] P00:00:07.611882 <message.cpp:67 trace dump_timeline>
>>> + e2e: 41.858 TIMELINE CREATE: P00:00:07.569998
>>> + PUBLISH: 0.000
>>> + SERIALIZE: 0.197
>>> + RECEIVE: 1.113
>>> + DESERIALIZE: 0.079
>>> + REPUBLISH: 0.773
>>> + SERIALIZE: 1.164
>>> + RECEIVE: 38.010
>>> + DESERIALIZE: 0.070
>>> + QUEUE_EVENT: 0.126
>>> + NEXT_EVENT: 0.047
>>>
>>> After adding TCP_NODELAY to pn_configure_sock() this is no longer the
>>> case, and all messages are sent immediately.
>>>
>>> With this change the end-to-end propagation delay over loopback
>>> hovers around 2 milliseconds most of the time. There are occasional
>>> hiccups, but nowhere near the values observed previously; 7 ms is now
>>> considered a long delay :).
>>>
>>> A sample trace for one event:
>>> [408] P00:00:17.497270 <message.cpp:67 trace dump_timeline>
>>> + e2e: 2.015 TIMELINE CREATE: P00:00:17.495230
>>> + PUBLISH: 0.000
>>> + SERIALIZE: 0.185
>>> + RECEIVE: 0.359
>>> + DESERIALIZE: 0.052
>>> + REPUBLISH: 0.209
>>> + SERIALIZE: 0.165
>>> + RECEIVE: 0.333
>>> + DESERIALIZE: 0.047
>>> + QUEUE_EVENT: 0.158
>>> + NEXT_EVENT: 0.011
>>>
>>> The time from shortly before the sender calls pn_messenger_put() to
>>> shortly after the receiver gets the message from pn_messenger_get()
>>> is attributed to 'RECEIVE'. There are two such entries, one for each hop.
>>>
>>> With the change, our processing became a significant contributing
>>> factor to the delay, a good spot to be in :)
>>>
>>>
>>> One further idea on the topic of analysing performance is adding the
>>> following method to the data API:
>>>
>>> typedef pn_timestamp_t (*pn_timestamp_cb_t)();
>>> int pn_data_put_lazy_timestamp(pn_data_t *, pn_timestamp_cb_t cb)
>>>
>>> This would allow one to embed a pn_timestamp_t anywhere in the
>>> pn_message_t *, and proton would evaluate it as close to the wire as
>>> possible: for example, remember the byte offset into the binary buffer,
>>> and then do a sneaky layering violation :).
>>>
>>> A similar mechanism would come in handy on the receive side, although I
>>> suspect the latency there is not really an issue.
>>>
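The lazy-timestamp idea above (remember the byte offset, fill in the value
just before the write) might look roughly like the sketch below. It is not
proton code: every ex_* name is hypothetical, the fixed 256-byte buffer and
the 8-byte big-endian layout are assumptions made for illustration, and
ex_fake_clock stands in for a real clock so the result is predictable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for pn_timestamp_t and the proposed callback type. */
typedef int64_t ex_timestamp_t;
typedef ex_timestamp_t (*ex_timestamp_cb_t)(void);

#define EX_MAX_LAZY 8

/* Hypothetical encode buffer that remembers where lazy timestamps live. */
typedef struct {
    uint8_t bytes[256];
    size_t size;                            /* bytes encoded so far */
    size_t lazy_offset[EX_MAX_LAZY];        /* offsets of reserved slots */
    ex_timestamp_cb_t lazy_cb[EX_MAX_LAZY]; /* callback for each slot */
    size_t lazy_count;
} ex_buffer_t;

static void ex_buffer_init(ex_buffer_t *b)
{
    memset(b, 0, sizeof(*b));
}

/* Reserve 8 zeroed bytes and record the offset; the value comes later.
 * Returns 0 on success, -1 if there is no room left. */
static int ex_put_lazy_timestamp(ex_buffer_t *b, ex_timestamp_cb_t cb)
{
    if (b->lazy_count == EX_MAX_LAZY || sizeof(b->bytes) - b->size < 8) {
        return -1;
    }
    b->lazy_offset[b->lazy_count] = b->size;
    b->lazy_cb[b->lazy_count] = cb;
    b->lazy_count++;
    memset(b->bytes + b->size, 0, 8);
    b->size += 8;
    return 0;
}

/* Meant to be called right before the buffer is handed to the socket:
 * evaluate each callback and patch the value in, most significant byte first. */
static void ex_resolve_lazy(ex_buffer_t *b)
{
    for (size_t i = 0; i < b->lazy_count; i++) {
        uint64_t v = (uint64_t) b->lazy_cb[i]();
        uint8_t *p = b->bytes + b->lazy_offset[i];
        for (int k = 7; k >= 0; k--) {
            p[k] = (uint8_t) (v & 0xff);
            v >>= 8;
        }
    }
    b->lazy_count = 0;
}

/* Deterministic clock for demonstration only; real code would read a wall clock. */
static ex_timestamp_t ex_fake_clock(void)
{
    return 0x0102030405060708LL;
}
```

The caller would use ex_put_lazy_timestamp() while building the message and
ex_resolve_lazy() in the I/O path, which is exactly where the layering
violation mentioned above comes in.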
>>>
>>> Here's the simple version of the patch that I'm running with.
>>> I can also extend it so it's optionally settable, if you guys feel that
>>> disabling nagle is not always a good idea.
>>>
>>> --
>>> Bozzo
>>>
>>> -- snip --
>>>
>>> Author: Bozo Dragojevic <bo...@gmail.com>
>>> Date: Wed Mar 13 16:53:44 2013 -0400
>>>
>>> Disable nagle algorithm
>>>
>>> diff --git a/proton-c/src/posix/driver.c b/proton-c/src/posix/driver.c
>>> index 680ce6c..805a6ac 100644
>>> --- a/proton-c/src/posix/driver.c
>>> +++ b/proton-c/src/posix/driver.c
>>> @@ -26,6 +26,7 @@
>>> #include <sys/types.h>
>>> #include <sys/socket.h>
>>> #include <netinet/in.h>
>>> +#include <netinet/tcp.h>
>>> #include <netdb.h>
>>> #include <unistd.h>
>>> #include <fcntl.h>
>>> @@ -252,6 +253,14 @@ static void pn_configure_sock(int sock) {
>>> };
>>> */
>>>
>>> + // disable nagle
>>> + int flag = 1;
>>> + int result = setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
>>> + (char *) &flag, sizeof(int));
>>> + if (result < 0) {
>>> + perror("setsockopt");
>>> + }
>>> +
>>> int flags = fcntl(sock, F_GETFL);
>>> flags |= O_NONBLOCK;
>>>
>>>
>
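
Since configurability came up a couple of times in the thread, here is a
minimal sketch of what an optionally settable version could look like. The
example_* helpers are hypothetical and not part of proton; how a caller would
pass the setting down to pn_configure_sock() is left open. Only the standard
setsockopt()/getsockopt() calls with IPPROTO_TCP and TCP_NODELAY are used.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper: a non-zero 'enable' sets TCP_NODELAY (Nagle off),
 * zero clears it (Nagle on). Returns 0 on success, -1 on failure. */
static int example_set_nodelay(int sock, int enable)
{
    int flag = enable ? 1 : 0;
    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) < 0) {
        perror("setsockopt(TCP_NODELAY)");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: read the option back.
 * Returns 1 if TCP_NODELAY is set, 0 if it is clear, -1 on error. */
static int example_get_nodelay(int sock)
{
    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0) {
        perror("getsockopt(TCP_NODELAY)");
        return -1;
    }
    return flag != 0;
}
```

With something like this, the default could stay "Nagle off" as in the patch,
and a caller that prefers fewer, larger segments could call
example_set_nodelay(sock, 0) on its own sockets.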