Posted to dev@httpd.apache.org by Lars Eilebrecht <La...@unix-ag.org> on 1998/01/03 15:41:46 UTC

[BUG?] Send body lost connection...

Hi,

please take a look at the following PRs:

- PR#1119
- PR#1555
- PR#1596

All PRs are about the following two error log entries (1.2.3-1.3b3):

  send body lost connection to: foo.bar: Broken pipe
  send body lost connection to client foobar
 

I know that these messages are normal up to a point, but the
submitters of PR#1555 and PR#1596 report an 'excessive' number of
them, and the submitter of PR#1119 says that these errors are in most
cases (if not all) accompanied by clients seeing broken images (that
was the original reason why he submitted the PR).

- all PRs are from Solaris users using 2.4-2.6 (maybe a Solaris problem?)
- the standard Solaris /dev/tcp tunings haven't solved the problem
- a larger Timeout value seems to have no impact on the problem
  (AFAIR I had a private conversation with the submitter of PR#1119 about
   this.)
- The submitter of PR#1555 notes that the errors are NOT caused by
  'slow' clients (e.g. clients accessing the server via a slow connection).
- Turning off keep-alive seems to reduce the frequency of the errors
  (PR#1555).


What do you think? Is this a bug (Solaris, broken clients or Apache?)
or just normal error log entries (but what about PR#1119 and the broken
images?)?
  
P.S.: One of my servers is a Solaris box and I'm seeing a lot of those
      'lost connection' errors too, but I always thought that they were
      caused by 'slow' clients and I never worried about them.
      How many of these errors can be considered normal, and how many
      count as 'excessive'?


ciao...
-- 
Lars Eilebrecht                         - COBOL programmers never die;
sfx@unix-ag.org                             - they're already dead.
http://www.si.unix-ag.org/~sfx/

Re: [BUG?] Send body lost connection...

Posted by Dean Gaudet <dg...@arctic.org>.

Interesting... it happens with 1.3b3 as well so it's not the
lack-of-a-bflush() problem I thought it might be. 

It would be so great if we could get a tcpdump of a session.  To do so
would require perseverance though.  The error message would have to be
changed to list the client IP address (unresolved) and port number.  And
the server would have to run "tcpdump -p -s 1514 -w dump.out tcp port 80"
(-p because there is no need for promiscuous mode).
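
To make the logging change concrete, here is a minimal sketch of how the
unresolved client address and port could be turned into a string for the
error message.  The helper names format_peer and format_peer_of_fd are
made up for illustration and are not Apache functions; the sketch assumes
IPv4 and a POSIX system with inet_ntop and getpeername.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Apache): write "a.b.c.d:port" for an
 * IPv4 peer into buf.  Returns 0 on success, -1 on failure or if buf is
 * too small. */
int format_peer(const struct sockaddr_in *sa, char *buf, size_t len)
{
    char ip[INET_ADDRSTRLEN];
    int n;

    /* Numeric conversion only: no DNS lookup, so the logged value matches
     * what shows up in a packet capture. */
    if (inet_ntop(AF_INET, &sa->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;
    n = snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(sa->sin_port));
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: look up the peer of a connected socket and
 * format it.  Returns -1 if the socket is invalid or not IPv4. */
int format_peer_of_fd(int fd, char *buf, size_t len)
{
    struct sockaddr_in sa;
    socklen_t salen = sizeof(sa);

    memset(&sa, 0, sizeof(sa));
    if (getpeername(fd, (struct sockaddr *)&sa, &salen) != 0)
        return -1;
    if (sa.sin_family != AF_INET)
        return -1;
    return format_peer(&sa, buf, len);
}
```

The resulting string could then be appended to the "send body lost
connection" message, which gives an address and port to match against
the dump.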

Then once a few hours have passed and your disk is almost full and you're
certain that some of these events have happened, stop the tcpdump.  Then
use the error logs to extract the relevant portions of the dump.  (You can
use "tcpdump -r dump.out -w dump.smaller host client.ip.addr" to create a
smaller dump.) 
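
If the error log carries both the address and the port, the filter can be
narrowed to a single connection.  The following is a rough sketch of
building such a filter expression; build_client_filter is a hypothetical
helper, and the address is validated as a dotted-quad so that nothing
unexpected ends up on a command line.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/* Hypothetical helper: build a tcpdump filter expression for one client
 * connection.  Returns 0 on success, -1 if ip is not a numeric IPv4
 * address, the port is out of range, or buf is too small. */
int build_client_filter(const char *ip, int port, char *buf, size_t len)
{
    struct in_addr addr;
    int n;

    /* Accept only numeric IPv4 addresses, as logged unresolved. */
    if (inet_pton(AF_INET, ip, &addr) != 1)
        return -1;
    if (port < 1 || port > 65535)
        return -1;
    n = snprintf(buf, len, "host %s and tcp port %d", ip, port);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The string it produces would take the place of "host client.ip.addr" in
the "tcpdump -r dump.out -w dump.smaller" command above.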

Given that info we could figure out if Solaris or Apache is to blame. 

Other random thoughts include dinking with the socket options that are set
... like maybe the send buffer size.  I doubt Nagle is the cause, but you
never know. 
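
For anyone who wants to experiment, the two options mentioned above map
onto SO_SNDBUF and TCP_NODELAY.  This is only a sketch of the standard
setsockopt calls, not a proposed patch; the helper names are invented
here, and the right buffer size is an open question.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for a send buffer of `bytes` bytes.  Returns
 * the size the kernel reports afterwards (which may differ from the
 * request), or -1 on error. */
int set_send_buffer(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) != 0)
        return -1;
    return actual;
}

/* Hypothetical helper: turn the Nagle algorithm off (on != 0) or back
 * on (on == 0) for a TCP socket.  Returns 0 on success, -1 on error. */
int set_nodelay(int fd, int on)
{
    int flag = on ? 1 : 0;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}
```

Comparing error counts with and without each change, one at a time,
would show whether either option has any effect.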

Dean
