Posted to dev@couchdb.apache.org by Adam Kocoloski <ko...@apache.org> on 2009/11/04 22:42:41 UTC

mochiweb socket options discriminating against slow connections

Hi, Zachary Zolton, Joe Williams and I stumbled on an awkward part of  
CouchDB's (actually MochiWeb's) configuration that makes working over  
a slow connection problematic.  When receiving a large unchunked  
upload, MochiWeb calls

gen_tcp:recv(Socket, Length = 1048576, Timeout = 10000)

which says "pull 1MB off the socket in 10 seconds, or time out".  This
is a bit crazy, as we'd definitely like to support e.g. replication
with CouchDB servers over links slower than 100KB/s.
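
As a rough check of the 100KB/s figure, the sketch below computes the sustained rate a sender needs to fill one recv call before its timeout fires. The module and function names are made up for illustration and are not part of MochiWeb or CouchDB.

```erlang
-module(recv_rate).
-export([min_rate/2]).

%% Hypothetical helper: minimum sustained rate, in bytes per second,
%% needed to deliver Length bytes before a timeout of TimeoutMs
%% milliseconds expires.
min_rate(Length, TimeoutMs) when Length >= 0, TimeoutMs > 0 ->
    Length * 1000 / TimeoutMs.
```

For the values in the call above, recv_rate:min_rate(1048576, 10000) evaluates to 104857.6 bytes per second, i.e. roughly 100KB/s.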

I tried to do a little digging to see what recourse we have.  We can
of course play the tuning game, adjusting the length and timeout until
we're happy, but I think it'd be nicer to never time out a connection
that is still sending data.  From the gen_tcp man page:

> The Length argument is only meaningful when the socket is in raw
> mode and denotes the number of bytes to read. If Length = 0, all
> available bytes are returned. If  Length  >  0,  exactly  Length
> bytes  are  returned, or an error; possibly discarding less than
> Length bytes of data when the socket gets closed from the  other
> side.
>
> The  optional Timeout parameter specifies a timeout in millisec-
> onds. The default value is infinity.

I don't know what the behavior is when there are 0 bytes available on
the socket -- does gen_tcp:recv just keep returning 0 and leave it to
us to implement the timeout, or can we do something like the following?

gen_tcp:recv(Socket, 0, 10000)

Best, Adam

Re: mochiweb socket options discriminating against slow connections

Posted by Adam Kocoloski <ko...@apache.org>.
Hi Oliver, it's entirely possible, especially if couchapp is using  
_bulk_docs for the upload (I don't know if that's the case).  Best,

Adam

On Nov 6, 2009, at 5:35 AM, Oliver Oli wrote:

> sometimes i'm getting connection errors from couchapp over a slow DSL
> connection. is this related?
>
> On Wed, Nov 4, 2009 at 10:42 PM, Adam Kocoloski  
> <ko...@apache.org> wrote:
>> Hi, Zachary Zolton, Joe Williams and I stumbled on an awkward part of
>> CouchDB's (actually MochiWeb's) configuration that makes working  
>> over a slow
>> connection problematic.  When receiving a large unchunked upload,  
>> MochiWeb
>> calls
>>
>> gen_tcp:recv(Socket, Length = 1048576, Timeout = 10000)
>>
>> which says "pull 1MB off the socket in 10 seconds, or timeout".   
>> This is a
>> bit crazy, as we'd definitely like to support e.g. replication with  
>> CouchDB
>> servers over slower links than 100KB/s.
>>
>> I tried to do a little digging to see what recourse we have.  We  
>> can of
>> course play the tuning game adjusting the length and timeout until  
>> we're
>> happy, but I think it'd be nicer to never timeout a connection that  
>> is still
>> sending data.  From the gen_tcp man page
>>
>>> The Length argument is only meaningful when the socket is in raw
>>> mode and denotes the number of bytes to read. If Length = 0, all
>>> available bytes are returned. If  Length  >  0,  exactly  Length
>>> bytes  are  returned, or an error; possibly discarding less than
>>> Length bytes of data when the socket gets closed from the  other
>>> side.
>>>
>>> The  optional Timeout parameter specifies a timeout in millisec-
>>> onds. The default value is infinity.
>>
>> I don't know what the behavior is when there are 0 bytes available  
>> on the
>> socket -- does gen_tcp:recv just keep returning 0 and leave it to  
>> us to
>> implement the timeout, or can we do something like
>>
>> gen_tcp:recv(Socket, 0, 10000)
>>
>> Best, Adam
>>


Re: mochiweb socket options discriminating against slow connections

Posted by Oliver Oli <ol...@gmail.com>.
Sometimes I'm getting connection errors from couchapp over a slow DSL
connection. Is this related?

On Wed, Nov 4, 2009 at 10:42 PM, Adam Kocoloski <ko...@apache.org> wrote:
> Hi, Zachary Zolton, Joe Williams and I stumbled on an awkward part of
> CouchDB's (actually MochiWeb's) configuration that makes working over a slow
> connection problematic.  When receiving a large unchunked upload, MochiWeb
> calls
>
> gen_tcp:recv(Socket, Length = 1048576, Timeout = 10000)
>
> which says "pull 1MB off the socket in 10 seconds, or timeout".  This is a
> bit crazy, as we'd definitely like to support e.g. replication with CouchDB
> servers over slower links than 100KB/s.
>
> I tried to do a little digging to see what recourse we have.  We can of
> course play the tuning game adjusting the length and timeout until we're
> happy, but I think it'd be nicer to never timeout a connection that is still
> sending data.  From the gen_tcp man page
>
>> The Length argument is only meaningful when the socket is in raw
>> mode and denotes the number of bytes to read. If Length = 0, all
>> available bytes are returned. If  Length  >  0,  exactly  Length
>> bytes  are  returned, or an error; possibly discarding less than
>> Length bytes of data when the socket gets closed from the  other
>> side.
>>
>> The  optional Timeout parameter specifies a timeout in millisec-
>> onds. The default value is infinity.
>
> I don't know what the behavior is when there are 0 bytes available on the
> socket -- does gen_tcp:recv just keep returning 0 and leave it to us to
> implement the timeout, or can we do something like
>
> gen_tcp:recv(Socket, 0, 10000)
>
> Best, Adam
>

Re: mochiweb socket options discriminating against slow connections

Posted by Brian Candler <B....@pobox.com>.
On Fri, Nov 06, 2009 at 11:01:18AM +0000, Brian Candler wrote:
> On Wed, Nov 04, 2009 at 05:37:55PM -0500, Adam Kocoloski wrote:
> > Ok, confirmed -- this call blocks for 10 seconds when no data is  
> > available on the socket.  I think that's the behavior we're looking for: 
> > detect stale connections, but let slow connections keep working.
> 
> 10 seconds still seems like an extremely low timeout to me. Drop a few
> packets on a busy link and TCP could easily backoff for that long.
> 
> Why not, say, 2 minutes? The only resource issue would be the RAM used by
> the idle erlang process.
> 
> As a reference, perhaps it would be worth looking at how Apache httpd deals
> with this. I doubt it kicks off clients which have stalled for only 10
> seconds, but I haven't got the unpacked source to hand.

It appears that (a) this is configurable, and (b) it defaults to 300
seconds.

[docs/conf/extra/httpd-default.conf.in]
# Timeout: The number of seconds before receives and sends time out.
#
Timeout 300


Re: mochiweb socket options discriminating against slow connections

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 7, 2009, at 7:02 PM, Adam Kocoloski wrote:

> On Nov 6, 2009, at 6:01 AM, Brian Candler wrote:
>
>> On Wed, Nov 04, 2009 at 05:37:55PM -0500, Adam Kocoloski wrote:
>>> Ok, confirmed -- this call blocks for 10 seconds when no data is
>>> available on the socket.  I think that's the behavior we're  
>>> looking for:
>>> detect stale connections, but let slow connections keep working.
>>
>> 10 seconds still seems like an extremely low timeout to me. Drop a  
>> few
>> packets on a busy link and TCP could easily backoff for that long.
>>
>> Why not, say, 2 minutes? The only resource issue would be the RAM  
>> used by
>> the idle erlang process.
>>
>> As a reference, perhaps it would be worth looking at how Apache  
>> httpd deals
>> with this. I doubt it kicks off clients which have stalled for only  
>> 10
>> seconds, but I haven't got the unpacked source to hand.
>
> Hi Brian, I agree that What Would Apache Do? is a good mantra here.   
> Looks like the answer is "wait 5 minutes":
>
> http://httpd.apache.org/docs/2.2/mod/core.html#timeout
>
> I'm comfortable making that switch if no one objects.  Thanks, Adam

For the archives, we switched to this 5-minute timeout in r834212.

Adam

Re: mochiweb socket options discriminating against slow connections

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 6, 2009, at 6:01 AM, Brian Candler wrote:

> On Wed, Nov 04, 2009 at 05:37:55PM -0500, Adam Kocoloski wrote:
>> Ok, confirmed -- this call blocks for 10 seconds when no data is
>> available on the socket.  I think that's the behavior we're looking  
>> for:
>> detect stale connections, but let slow connections keep working.
>
> 10 seconds still seems like an extremely low timeout to me. Drop a few
> packets on a busy link and TCP could easily backoff for that long.
>
> Why not, say, 2 minutes? The only resource issue would be the RAM  
> used by
> the idle erlang process.
>
> As a reference, perhaps it would be worth looking at how Apache  
> httpd deals
> with this. I doubt it kicks off clients which have stalled for only 10
> seconds, but I haven't got the unpacked source to hand.

Hi Brian, I agree that What Would Apache Do? is a good mantra here.   
Looks like the answer is "wait 5 minutes":

http://httpd.apache.org/docs/2.2/mod/core.html#timeout

I'm comfortable making that switch if no one objects.  Thanks, Adam


Re: mochiweb socket options discriminating against slow connections

Posted by Brian Candler <B....@pobox.com>.
On Wed, Nov 04, 2009 at 05:37:55PM -0500, Adam Kocoloski wrote:
> Ok, confirmed -- this call blocks for 10 seconds when no data is  
> available on the socket.  I think that's the behavior we're looking for: 
> detect stale connections, but let slow connections keep working.

10 seconds still seems like an extremely low timeout to me. Drop a few
packets on a busy link and TCP could easily back off for that long.

Why not, say, 2 minutes? The only resource issue would be the RAM used by
the idle Erlang process.

As a reference, perhaps it would be worth looking at how Apache httpd deals
with this. I doubt it kicks off clients which have stalled for only 10
seconds, but I haven't got the unpacked source to hand.

Re: mochiweb socket options discriminating against slow connections

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 4, 2009, at 5:37 PM, Adam Kocoloski wrote:

> On Nov 4, 2009, at 4:42 PM, Adam Kocoloski wrote:
>
>> I don't know what the behavior is when there are 0 bytes available  
>> on the socket -- does gen_tcp:recv just keep returning 0 and leave  
>> it to us to implement the timeout, or can we do something like
>>
>> gen_tcp:recv(Socket, 0, 10000)
>>
>> Best, Adam
>
> Ok, confirmed -- this call blocks for 10 seconds when no data is  
> available on the socket.  I think that's the behavior we're looking  
> for: detect stale connections, but let slow connections keep  
> working.  Not sure what the greater performance implications of  
> switching this configuration might be, though (the chunks of data  
> we'd be dealing with would be much smaller).  Best,
>
> Adam

So, it looks like we already "do the right thing" when dealing with
attachment uploads -- we only time out if no data is received for 10
seconds.  It looks like the reason we take a different code path with
document uploads is so that we can enforce the max_document_size
config setting (which is insanely high, BTW -- who really has a 4GB
JSON document lying around?)

The following patch to MochiWeb gives it the desired timeout property  
while still limiting the total number of bytes in the received body.   
Test suite looks the same, and I didn't see a significant difference  
in performance in either baracus or the built-in JS bench tests.  Best,

Adam

Index: src/mochiweb/mochiweb_request.erl
===================================================================
--- src/mochiweb/mochiweb_request.erl	(revision 832988)
+++ src/mochiweb/mochiweb_request.erl	(working copy)
@@ -451,14 +451,11 @@

  stream_unchunked_body(0, _MaxChunkSize, Fun, FunState) ->
      Fun({0, <<>>}, FunState);
-stream_unchunked_body(Length, MaxChunkSize, Fun, FunState) when Length > MaxChunkSize ->
-    Bin = recv(MaxChunkSize),
-    NewState = Fun({MaxChunkSize, Bin}, FunState),
-    stream_unchunked_body(Length - MaxChunkSize, MaxChunkSize, Fun, NewState);
-stream_unchunked_body(Length, MaxChunkSize, Fun, FunState) ->
-    Bin = recv(Length),
-    NewState = Fun({Length, Bin}, FunState),
-    stream_unchunked_body(0, MaxChunkSize, Fun, NewState).
+stream_unchunked_body(Length, _, Fun, FunState) when Length > 0 ->
+    Bin = recv(0),
+    BinSize = size(Bin),
+    NewState = Fun({BinSize, Bin}, FunState),
+    stream_unchunked_body(Length - BinSize, 0, Fun, NewState).


  %% @spec read_chunk_length() -> integer()
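
The idea behind the patch can be tried outside MochiWeb with a minimal standalone sketch. It assumes a passive-mode socket and calls gen_tcp:recv/3 directly rather than MochiWeb's recv/1 wrapper. The module name, the demo, and the clamping of each chunk to the remaining byte count are additions for illustration and are not part of the patch.

```erlang
-module(idle_recv).
-export([stream_body/5, demo/0]).

%% Read Length bytes from a passive-mode socket, handing each chunk to Fun.
%% IdleTimeout (ms) applies to each recv call, so it only fires when no
%% data at all arrives for that long, however slow the overall transfer is.
stream_body(_Socket, 0, _IdleTimeout, _Fun, Acc) ->
    {ok, Acc};
stream_body(Socket, Length, IdleTimeout, Fun, Acc) when Length > 0 ->
    case gen_tcp:recv(Socket, 0, IdleTimeout) of
        {ok, Bin} ->
            %% Assumption of this sketch: bytes beyond Length are dropped.
            Take = min(byte_size(Bin), Length),
            <<Chunk:Take/binary, _Extra/binary>> = Bin,
            stream_body(Socket, Length - Take, IdleTimeout, Fun,
                        Fun(Chunk, Acc));
        {error, Reason} ->
            {error, Reason}
    end.

%% Loopback demo: a sender trickles 15 bytes with 50 ms pauses while the
%% receiver uses a 200 ms idle timeout.
demo() ->
    Loopback = {127, 0, 0, 1},
    {ok, LSock} = gen_tcp:listen(0, [binary, {active, false},
                                     {packet, raw}, {ip, Loopback}]),
    {ok, Port} = inet:port(LSock),
    spawn(fun() ->
        {ok, C} = gen_tcp:connect(Loopback, Port, [binary, {active, false}]),
        lists:foreach(fun(Part) ->
                          ok = gen_tcp:send(C, Part),
                          timer:sleep(50)
                      end,
                      [<<"slow ">>, <<"but ">>, <<"steady">>]),
        gen_tcp:close(C)
    end),
    {ok, S} = gen_tcp:accept(LSock, 5000),
    Append = fun(Chunk, Acc) -> <<Acc/binary, Chunk/binary>> end,
    Result = stream_body(S, 15, 200, Append, <<>>),
    gen_tcp:close(S),
    gen_tcp:close(LSock),
    Result.
```

Calling idle_recv:demo() should return the reassembled body even though it arrives in pieces, because each piece arrives well inside the idle timeout.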


Re: mochiweb socket options discriminating against slow connections

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 4, 2009, at 4:42 PM, Adam Kocoloski wrote:

> I don't know what the behavior is when there are 0 bytes available  
> on the socket -- does gen_tcp:recv just keep returning 0 and leave  
> it to us to implement the timeout, or can we do something like
>
> gen_tcp:recv(Socket, 0, 10000)
>
> Best, Adam

Ok, confirmed -- this call blocks for 10 seconds when no data is  
available on the socket.  I think that's the behavior we're looking  
for: detect stale connections, but let slow connections keep working.   
Not sure what the greater performance implications of switching this  
configuration might be, though (the chunks of data we'd be dealing  
with would be much smaller).  Best,

Adam
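
The blocking behavior described above can be reproduced on a loopback connection with a small sketch. The module name is made up for illustration, and a 100 ms timeout stands in for the 10 seconds discussed so the demo finishes quickly.

```erlang
-module(recv_idle_demo).
-export([run/0]).

%% Shows that gen_tcp:recv(Socket, 0, Timeout) on a passive-mode socket
%% blocks and returns {error, timeout} when nothing arrives, and returns
%% whatever bytes are available once the peer sends something.
run() ->
    Loopback = {127, 0, 0, 1},
    {ok, LSock} = gen_tcp:listen(0, [binary, {active, false},
                                     {ip, Loopback}]),
    {ok, Port} = inet:port(LSock),
    {ok, Client} = gen_tcp:connect(Loopback, Port,
                                   [binary, {active, false}]),
    {ok, Server} = gen_tcp:accept(LSock, 1000),
    %% Nothing has been sent yet, so this call waits 100 ms and gives up.
    Idle = gen_tcp:recv(Server, 0, 100),
    %% The socket is still usable after a recv timeout.
    ok = gen_tcp:send(Client, <<"hello">>),
    Data = gen_tcp:recv(Server, 0, 1000),
    gen_tcp:close(Client),
    gen_tcp:close(Server),
    gen_tcp:close(LSock),
    {Idle, Data}.
```

On an otherwise idle machine, recv_idle_demo:run() is expected to return {{error,timeout},{ok,<<"hello">>}}.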