Posted to dev@httpd.apache.org by Ryan Bloom <rb...@raleigh.ibm.com> on 1999/11/08 13:21:58 UTC

Re: Various 2.0 bugs I'm looking at

I should have responded to this on Friday, but I was taking it easy and
just trying to get through 1500 messages.

Why is the Buff code ever trying to close sockets without using
ap_close_socket?  The native types (e.g. ap_os_file_t) are only in Apache
for a short time.  They AREN'T and SHOULDN'T be staying in the Apache
code.  In fact, I wish somebody would try to remove the rest of them ASAP.
It is on my list of things I would like to do when I have time.  They
were meant to be used so that Apache could talk to NON-APACHE modules.
That is what they were designed for.  NOT so that parts of Apache could
avoid using APR.  If that is the way they are going to be used, we
really should reconsider using APR at all, because we will forever be
finding these kinds of bugs.

Ryan

> OK, the problem was that APR and buff were both trying to close the
> same socket. Under heavy load, buff would close the socket, the socket
> would get reassigned to something else (a file, leading to some of the
> errors, or a socket, leading to the rest), then APR would close the
> socket that didn't belong to it. That last step is the new piece that
> got triggered by the patch I referenced above. I've put in a hack for
> now (preventing IOL from actually closing the socket) on Unix.
> 
> The root problem is the switching back and forth between APR and OS
> sockets. APR thinks it owns the socket and is responsible for closing
> it, and so does buff. The best solution for MPMs that don't get
> APRized is probably not to use the ap_accept call, but to use the OS
> version instead. But, the bug also naturally goes away on platforms
> that have an APR-based iol_socket.
> 
> -- 
> Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
> 
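The reuse hazard described in the quoted message can be sketched in a
few lines of C. This is only an illustration, assuming a POSIX system
where open() returns the lowest free descriptor number. The function
name stale_close_hits_new_owner is hypothetical, /dev/null stands in
for the accepted socket, and none of this is Apache or APR code.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical illustration of the double close.
 * Returns 1 if the stale second close() destroyed an unrelated
 * descriptor, 0 if it did not, and -1 if the setup itself failed.
 */
int stale_close_hits_new_owner(void)
{
    /* Stand-in for the accepted socket that two layers both "own". */
    int sock_fd = open("/dev/null", O_RDONLY);
    if (sock_fd < 0)
        return -1;

    /* First owner (think: buff) closes the descriptor. */
    if (close(sock_fd) != 0)
        return -1;

    /* Unrelated code opens something and gets the lowest free number,
     * which is the number that was just released. */
    int other_fd = open("/dev/null", O_RDONLY);
    if (other_fd < 0)
        return -1;

    /* Second owner (think: APR) still holds the old number and closes it.
     * The call succeeds, but it closes a descriptor it never owned. */
    close(sock_fd);

    /* If the unrelated descriptor is now invalid, the stale close hit it. */
    if (fcntl(other_fd, F_GETFD) == -1 && errno == EBADF)
        return 1;

    close(other_fd);
    return 0;
}
```

In a single-threaded process this should return 1. Under load in a
threaded server the reassignment depends on timing, which is why the
bug only showed up intermittently.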

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	




Re: Various 2.0 bugs I'm looking at

Posted by Manoj Kasichainula <ma...@io.com>.
On Mon, Nov 08, 1999 at 07:21:58AM -0500, Ryan Bloom wrote:
> Why is the Buff code ever trying to close sockets without using
> ap_close_socket?  The native types (e.g. ap_os_file_t) are only in Apache
> for a short time.  They AREN'T and SHOULDN'T be staying in the Apache
> code.  In fact, I wish somebody would try to remove the rest of them ASAP.
> It is on my list of things I would like to do when I have time.

I've been more concerned with getting the server stable than APRizing
it. It's much easier to find bugs resulting from a change when there
aren't hordes of other bugs lurking around. Now, at least on Unix, the
server seems to be running quite well, so I think we can start
converting more code to the APR World Order.

> They
> were meant to be used so that Apache could talk to NON-APACHE modules.

High-performance MPMs will have to use native code. I don't see a way
for APR to portably export an interface that can optimally use NT's
asynchronous I/O APIs. The Linux SIGINFO stuff is probably easier, but
still difficult. And whatever problems there are in Apache right now
from the combination of APR and native code, I imagine they will show
up in external modules as well.

So, I think it's important that APR play well with native code.

> If that is the way they are going to be used, we really should
> reconsider using APR at all, because we will forever be finding these
> kinds of bugs.

I think these sorts of bugs can be minimized if the portability code
exports as thin a layer as possible, meaning that it keeps as little
state as is possible on the various platforms, and if it gives the
programmer ultimate control over reasonable behaviors. For example, a
function to tell APR to relinquish ownership of a socket once the app
has called ap_get_os_sock() would have solved the double-close
problem just as well as the eventual APRization of the core code will.

-- 
Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/