Posted to dev@httpd.apache.org by Amol Dev <de...@yahoo.com> on 2007/03/18 02:53:20 UTC

mod_cgid and accept() loop

After running the Apache 2.0.58 server with mod_cgid on HP-UX B.11.23 PA for 3-4 days, all of a sudden I see the following errors in error_log.

"[Fri Mar 16 07:23:53 2007] [error] (231)Software caused connection abort: Error accepting on cgid socket" 

There were 18 million such entries in 30 minutes, which means the cgid daemon was in an infinite loop. Error 231 is ECONNABORTED, which is not handled by mod_cgid and puts the accept() call into an infinite loop. I'm not sure why this socket would be shutdown() by anything. But if it does get ECONNABORTED, how should mod_cgid handle it? Should we handle this error by setting daemon_should_exit++? Does that respawn a new daemon without interruption?

Thank you,
Amol Dev


 

Re: mod_cgid and accept() loop

Posted by Jeff Trawick <tr...@gmail.com>.
On 3/17/07, Amol Dev <de...@yahoo.com> wrote:
> After running the Apache 2.0.58 server with mod_cgid on HP-UX B.11.23 PA for 3-4 days, all of a sudden I see the following errors in error_log.
>
> "[Fri Mar 16 07:23:53 2007] [error] (231)Software caused connection abort: Error accepting on cgid socket"
>
> There were 18 million such entries in 30 minutes, which means the cgid daemon was in an infinite loop.

        len = sizeof(unix_addr);
        sd2 = accept(sd, (struct sockaddr *)&unix_addr, &len);
        if (sd2 < 0) {
            if (errno != EINTR) {
                ap_log_error(APLOG_MARK, APLOG_ERR, errno,
                             (server_rec *)data,
                             "Error accepting on cgid socket");
            }
            continue;
        }

> Error 231 is ECONNABORTED, which is not handled by mod_cgid and puts the
> accept() call into an infinite loop.

No. ECONNABORTED will generate a log message, and the loop then goes back
into accept() and waits for a new connection. It takes an infinite number
of such connections (or the kernel acting as if there were) to create an
infinite loop there.

Perhaps the kernel is confused? Some unknown glitch may have caused a
connection to be aborted once, and the kernel has left it on an internal
queue even after accept() was called?

> I'm not sure why this socket would be shutdown() by anything. But if it does get
> ECONNABORTED, how should mod_cgid handle it?

It handles it correctly today IMHO.

Without information on the root cause of the kernel acting as if there
were an endless number of aborted connections to the mod_cgid socket, I
wouldn't suggest any change to Apache.

> Should we handle this error by setting daemon_should_exit++? Does that respawn
> a new daemon without interruption?

You may wish to make a local modification to have the cgid process
exit if, for example, 10 consecutive calls to accept() return
-1/ECONNABORTED.
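
A rough sketch of what such a local modification could look like, not a
tested patch against mod_cgid. The helper name should_exit_after_accept()
and the MAX_CONSECUTIVE_ABORTS constant are made up for illustration; the
threshold of 10 is just the example figure above.

```c
#include <errno.h>

/* Hypothetical threshold: the "10 consecutive calls" example figure. */
#define MAX_CONSECUTIVE_ABORTS 10

/*
 * Hypothetical helper, not part of mod_cgid.
 * Call it after every accept() with the returned descriptor and the
 * errno value seen. It counts back-to-back ECONNABORTED failures in
 * *consecutive_aborts and returns 1 once the threshold is reached,
 * 0 otherwise. Any other outcome (success, EINTR, another error)
 * resets the count.
 */
int should_exit_after_accept(int sd2, int err, int *consecutive_aborts)
{
    if (sd2 >= 0 || err != ECONNABORTED) {
        *consecutive_aborts = 0;   /* streak broken */
        return 0;
    }
    *consecutive_aborts += 1;
    return *consecutive_aborts >= MAX_CONSECUTIVE_ABORTS;
}
```

In the accept() loop quoted earlier, the call would go right after
accept(), passing sd2 and errno, and the loop would stop (or the process
would exit) when it returns 1. Whether the parent then respawns the
daemon cleanly is something to verify on your own build.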

You may first want to try to catch it happening again and use tusc to
see whether the child process(es) handling requests are repeatedly trying
to connect to mod_cgid's socket.  If they're not doing anything wrong,
look into applicable kernel patches.

If by chance you're using HP's Apache-based server and have support
for it, give them a call.  If anybody has heard of this before, it
would likely be them.