Posted to dev@httpd.apache.org by Ben Laurie <be...@algroup.co.uk> on 1999/10/10 15:24:46 UTC

Memory Hog

I've been looking at the implementation of ap_poll() with a view to
moving mpmt_pthread over to using it. Oh dear. Every time poll is
called, a whole pollset is created (and not destroyed). The whole thing
is terribly inefficient, too.

Something needs to be done. I don't really understand why the
ap_pollset_t isn't just a struct pollfd (or a thin wrapper around one).
And why translate between POLLIN and APR_POLLIN et al. instead of just
making them the same? Also, having to scan the whole pollset to find a
particular socket is dreadful (especially since this is going to be used
in a scan through all sockets in the main loop, making it an O(n^2)
process).
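
A minimal sketch of the "thin wrapper" idea, assuming a platform that does have poll(). The thin_pollset_* names are hypothetical and not part of APR. The set is allocated once and reused, the array goes to poll() untouched, and the caller keeps the slot index returned by add, so finding a socket's result needs no scan.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical thin wrapper: the set is just an array of struct pollfd. */
typedef struct {
    struct pollfd *fds;   /* handed straight to poll() */
    int            used;  /* number of live entries */
    int            cap;   /* allocated entries */
} thin_pollset;

/* Allocate the set once; it is reused for every later poll call. */
static int thin_pollset_init(thin_pollset *ps, int cap)
{
    assert(cap > 0);
    ps->fds = malloc((size_t)cap * sizeof *ps->fds);
    ps->used = 0;
    ps->cap = ps->fds ? cap : 0;
    return ps->fds ? 0 : -1;
}

/* Add a descriptor and return its slot index, or -1 if the set is full. */
static int thin_pollset_add(thin_pollset *ps, int fd, short events)
{
    if (ps->used == ps->cap)
        return -1;
    ps->fds[ps->used].fd = fd;
    ps->fds[ps->used].events = events;
    ps->fds[ps->used].revents = 0;
    return ps->used++;
}

/* No copying and no flag translation: poll() sees the array as it is. */
static int thin_pollset_poll(thin_pollset *ps, int timeout_ms)
{
    return poll(ps->fds, (nfds_t)ps->used, timeout_ms);
}

/* O(1) lookup by slot index instead of scanning the set for a socket. */
static short thin_pollset_revents(const thin_pollset *ps, int slot)
{
    assert(slot >= 0 && slot < ps->used);
    return ps->fds[slot].revents;
}

static void thin_pollset_destroy(thin_pollset *ps)
{
    free(ps->fds);
    ps->fds = NULL;
    ps->used = ps->cap = 0;
}

/* Usage: watch the read end of a pipe; returns 0 if all steps behave. */
static int thin_pollset_demo(void)
{
    int p[2], slot, ok;
    thin_pollset ps;

    if (pipe(p) != 0)
        return -1;
    if (thin_pollset_init(&ps, 4) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    slot = thin_pollset_add(&ps, p[0], POLLIN);
    ok = slot == 0
      && thin_pollset_poll(&ps, 0) == 0        /* pipe is still empty */
      && write(p[1], "x", 1) == 1
      && thin_pollset_poll(&ps, 1000) == 1     /* now readable */
      && (thin_pollset_revents(&ps, slot) & POLLIN);
    thin_pollset_destroy(&ps);
    close(p[0]);
    close(p[1]);
    return ok ? 0 : -1;
}
```

Removing an entry is left out; one simple option would be to move the last entry into the freed slot, at the cost of changing that entry's index.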

Funnily enough, if poll() is emulated with select() the implementation
is far more efficient.
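
For reference, a rough sketch of what emulating a poll()-style call on top of select() can look like. This is not the ap_poll code; emu_poll, struct emu_pollfd and the EMU_* flags are made-up names, and the flag values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Stand-ins for the names a poll-less platform would be missing. */
#define EMU_POLLIN  0x1
#define EMU_POLLOUT 0x4

struct emu_pollfd {
    int   fd;
    short events;   /* requested: EMU_POLLIN and/or EMU_POLLOUT */
    short revents;  /* filled in by emu_poll() */
};

/* poll()-like call built on select(); timeout_ms < 0 blocks forever. */
static int emu_poll(struct emu_pollfd *fds, int nfds, int timeout_ms)
{
    fd_set rset, wset;
    struct timeval tv, *tvp = NULL;
    int i, maxfd = -1, ready = 0;

    assert(fds != NULL || nfds == 0);
    FD_ZERO(&rset);
    FD_ZERO(&wset);
    for (i = 0; i < nfds; i++) {
        fds[i].revents = 0;
        if (fds[i].fd < 0 || fds[i].fd >= FD_SETSIZE)
            continue;              /* select() cannot represent this fd */
        if (fds[i].events & EMU_POLLIN)
            FD_SET(fds[i].fd, &rset);
        if (fds[i].events & EMU_POLLOUT)
            FD_SET(fds[i].fd, &wset);
        if (fds[i].fd > maxfd)
            maxfd = fds[i].fd;
    }
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    if (select(maxfd + 1, &rset, &wset, NULL, tvp) < 0)
        return -1;
    /* Translate the returned bit sets back into per-entry revents. */
    for (i = 0; i < nfds; i++) {
        if (fds[i].fd < 0 || fds[i].fd >= FD_SETSIZE)
            continue;
        if (FD_ISSET(fds[i].fd, &rset))
            fds[i].revents |= EMU_POLLIN;
        if (FD_ISSET(fds[i].fd, &wset))
            fds[i].revents |= EMU_POLLOUT;
        if (fds[i].revents)
            ready++;
    }
    return ready;
}

/* Usage: the read end of a pipe becomes ready after one byte is written. */
static int emu_poll_demo(void)
{
    int p[2], ok;
    struct emu_pollfd pfd;

    if (pipe(p) != 0)
        return -1;
    pfd.fd = p[0];
    pfd.events = EMU_POLLIN;
    pfd.revents = 0;
    ok = emu_poll(&pfd, 1, 0) == 0           /* nothing to read yet */
      && write(p[1], "x", 1) == 1
      && emu_poll(&pfd, 1, 1000) == 1
      && (pfd.revents & EMU_POLLIN);
    close(p[0]);
    close(p[1]);
    return ok ? 0 : -1;
}
```

Unlike a real poll(), this sketch silently skips descriptors that do not fit in an fd_set rather than reporting them.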

I'd attempt a fix, except the whole reason I'm looking at this is
because I don't have poll().

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: Memory Hog

Posted by David Reid <ab...@dial.pipex.com>.
The BeOS select now also knows about write sockets! WooHoo!

d.
----- Original Message -----
From: Ryan Bloom <rb...@raleigh.ibm.com>
To: Apache List <ne...@apache.org>
Sent: 10 October 1999 20:37
Subject: Re: Memory Hog


>
> select is not more efficient than poll.  The current ap_poll when
> implemented using select is more efficient than the one implemented using
> poll.  That is a side-effect of what I was doing when I was coding it.  We
> NEED both implementations, because there are platforms that don't support
> select, or don't support it fully.  A good example of this, is BeOS, which
> doesn't pay attention to any of the select sets other than the read one.
>
> The performance of the poll based ap_poll is not tragic.  It is also not
> unfixable.  There are ways to fix the performance of ap_poll when it uses
> poll.  Nobody has done them yet.
>
> Ryan
>
> On Sun, 10 Oct 1999, Manoj Kasichainula wrote:
>
> > On Sun, Oct 10, 1999 at 02:24:46PM +0100, Ben Laurie wrote:
> > > Funnily enough, if poll() is emulated with select() the implementation
> > > is far more efficient.
> >
> > What was the purpose of poll() if select() is more efficient?
> >
> > --
> > Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
> > "Life is like an analogy." - Aaron Allston
> >
>
> _______________________________________________________________________
> Ryan Bloom rbb@raleigh.ibm.com
> 4205 S Miami Blvd
> RTP, NC 27709 It's a beautiful sight to see good dancers
> doing simple steps.  It's a painful sight to
> see beginners doing complicated patterns.
>


Re: Memory Hog

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
First of all, in the near future I will have nothing to do with whether or
not a particular system uses poll or select for this APR function.  This
is because it will be selectable at APR configuration time.  This has been
a goal for a while, but nobody has had the time to invest in learning
autoconf well enough to do it.  When this is true, I am assuming people
will use whichever method is the better performer on their system.

The general consensus I have heard is that select is better for densely
populated lists of file descriptors, and poll is better for sparsely
populated lists.  Because Apache tends to have sparsely populated lists,
poll would seem to make sense.  Of course, that requires that the
poll implementation of ap_poll is tuned much better.  :)
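
One way to picture the dense/sparse point, as a sketch only: select() is told the highest descriptor number plus one and has to consider every number below it, while poll() is handed one array entry per descriptor of interest. The two helper names below are made up for illustration and measure nothing about real kernel cost.

```c
#include <assert.h>
#include <stddef.h>

/* Descriptor numbers select() must consider: everything below max + 1. */
static int select_scan_width(const int *fds, int n)
{
    int i, maxfd = -1;

    assert(fds != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (fds[i] > maxfd)
            maxfd = fds[i];
    return maxfd + 1;
}

/* Entries poll() must consider: one per registered descriptor. */
static int poll_scan_width(const int *fds, int n)
{
    (void)fds;   /* the descriptor values themselves do not matter */
    return n;
}
```

With the sparse set {3, 500, 1000} the select width is 1001 against 3 entries for poll. With the dense set {3, 4, 5, 6} it is 7 against 4, and each select slot is a single bit where each poll entry is a whole struct pollfd.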

Ryan

On Sun, 10 Oct 1999 saeedt@ix.netcom.com wrote:

> Ryan, just a short comment about poll/select. The project that 
> I was working on had a simple rule of portability. Use
> poll() for TLI implementation and use select() for the 
> BSD Socket implementations. This guaranteed maximum portability
> among different Unixes.
> 
> Also there was a general agreement that BSD socket was more 
> efficient than TLI because more performance work was done on
> the BSD by the UC Berkeley fellows than the TLI. So this makes
> select() more efficient than poll(). We never measured the performance
> of the two by profiling. It was just an agreement among the engineers.
> 
>    My two cents,
>    ST.

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: Memory Hog

Posted by sa...@ix.netcom.com.
Ryan, just a short comment about poll/select. The project that 
I was working on had a simple rule of portability. Use
poll() for TLI implementation and use select() for the 
BSD Socket implementations. This guaranteed maximum portability
among different Unixes.

Also there was a general agreement that BSD socket was more 
efficient than TLI because more performance work was done on
the BSD by the UC Berkeley fellows than the TLI. So this makes
select() more efficient than poll(). We never measured the performance
of the two by profiling. It was just an agreement among the engineers.

   My two cents,
   ST.
  

Ryan Bloom wrote:
> 
> select is not more efficient than poll.  The current ap_poll when
> implemented using select is more efficient than the one implemented using
> poll.  That is a side-effect of what I was doing when I was coding it.  We
> NEED both implementations, because there are platforms that don't support
> select, or don't support it fully.  A good example of this, is BeOS, which
> doesn't pay attention to any of the select sets other than the read one.
> 
> The performance of the poll based ap_poll is not tragic.  It is also not
> unfixable.  There are ways to fix the performance of ap_poll when it uses
> poll.  Nobody has done them yet.
> 
> Ryan
> 
> On Sun, 10 Oct 1999, Manoj Kasichainula wrote:
> 
> > On Sun, Oct 10, 1999 at 02:24:46PM +0100, Ben Laurie wrote:
> > > Funnily enough, if poll() is emulated with select() the implementation
> > > is far more efficient.
> >
> > What was the purpose of poll() if select() is more efficient?
> >
> > --
> > Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
> > "Life is like an analogy." - Aaron Allston
> >
> 
> _______________________________________________________________________
> Ryan Bloom              rbb@raleigh.ibm.com
> 4205 S Miami Blvd
> RTP, NC 27709           It's a beautiful sight to see good dancers
>                         doing simple steps.  It's a painful sight to
>                         see beginners doing complicated patterns.

Re: Memory Hog

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
select is not more efficient than poll.  The current ap_poll when
implemented using select is more efficient than the one implemented using
poll.  That is a side-effect of what I was doing when I was coding it.  We
NEED both implementations, because there are platforms that don't support
select, or don't support it fully.  A good example of this, is BeOS, which
doesn't pay attention to any of the select sets other than the read one.

The performance of the poll based ap_poll is not tragic.  It is also not
unfixable.  There are ways to fix the performance of ap_poll when it uses
poll.  Nobody has done them yet.

Ryan

On Sun, 10 Oct 1999, Manoj Kasichainula wrote:

> On Sun, Oct 10, 1999 at 02:24:46PM +0100, Ben Laurie wrote:
> > Funnily enough, if poll() is emulated with select() the implementation
> > is far more efficient.
> 
> What was the purpose of poll() if select() is more efficient?
> 
> -- 
> Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
> "Life is like an analogy." - Aaron Allston
> 

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: Memory Hog

Posted by Manoj Kasichainula <ma...@io.com>.
On Sun, Oct 10, 1999 at 02:24:46PM +0100, Ben Laurie wrote:
> Funnily enough, if poll() is emulated with select() the implementation
> is far more efficient.

What was the purpose of poll() if select() is more efficient?

-- 
Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
"Life is like an analogy." - Aaron Allston

Re: Memory Hog

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
Because performance wasn't occupying even 1% of my brain when writing APR.
As I have expressed multiple times since starting APR, I write code by
getting it working, and when I have time I go back and improve
performance.  I am expecting performance to be an issue for me in a few
days/weeks.  If somebody else wants to go through APR looking for
performance improvements, be my guest.

Ryan

On Sun, 10 Oct 1999, Ben Laurie wrote:

> 
> 
> Ryan Bloom wrote:
> > 
> > > Something needs to be done. I don't really understand why the
> > > ap_pollset_t isn't just a struct pollfd (or a thin wrapper around one).
> > > And why translate between POLLIN and APR_POLLIN et al. instead of just
> > > making them the same? Also, having to scan the whole pollset to find a
> > > particular socket is dreadful (especially since this is going to be used
> > > in a scan through all sockets in the main loop, making it an O(n^2)
> > > process).
> > 
> > Because we can't just use POLLIN.  On systems that don't have poll, POLLIN
> > isn't defined.
> 
> I was aware of that. What I meant was why not make APR_POLLIN == POLLIN
> when POLLIN is defined?
> 
> Cheers,
> 
> Ben.
> 
> --
> http://www.apache-ssl.org/ben.html
> 
> "My grandfather once told me that there are two kinds of people: those
> who work and those who take the credit. He told me to try to be in the
> first group; there was less competition there."
>      - Indira Gandhi
> 

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: Memory Hog

Posted by Ben Laurie <be...@algroup.co.uk>.

Ryan Bloom wrote:
> 
> > Something needs to be done. I don't really understand why the
> > ap_pollset_t isn't just a struct pollfd (or a thin wrapper around one).
> > And why translate between POLLIN and APR_POLLIN et al. instead of just
> > making them the same? Also, having to scan the whole pollset to find a
> > particular socket is dreadful (especially since this is going to be used
> > in a scan through all sockets in the main loop, making it an O(n^2)
> > process).
> 
> Because we can't just use POLLIN.  On systems that don't have poll, POLLIN
> isn't defined.

I was aware of that. What I meant was why not make APR_POLLIN == POLLIN
when POLLIN is defined?
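
Roughly what that suggestion could look like, as a sketch: when the platform defines POLLIN, the APR names simply alias the native values, so event bits can be stored in a struct pollfd with no translation. APR_POLLOUT is assumed to be among the "et al." names, the fallback values are placeholders, and watch_for_read is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <poll.h>  /* assumed present here; a portable build would guard this */

#ifdef POLLIN
/* Native poll() exists: reuse its bit values so nothing needs translating. */
#define APR_POLLIN   POLLIN
#define APR_POLLOUT  POLLOUT
#else
/* No poll(): any distinct bits will do (placeholder values). */
#define APR_POLLIN   0x001
#define APR_POLLOUT  0x004
#endif

#ifdef POLLIN
/* With aliased values the APR flag goes into struct pollfd unchanged. */
static void watch_for_read(struct pollfd *pfd, int fd)
{
    assert(pfd != NULL);
    pfd->fd = fd;
    pfd->events = APR_POLLIN;
    pfd->revents = 0;
}
#endif
```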

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: Memory Hog

Posted by Ryan Bloom <rb...@ntrnet.net>.
> Something needs to be done. I don't really understand why the
> ap_pollset_t isn't just a struct pollfd (or a thin wrapper around one).
> And why translate between POLLIN and APR_POLLIN et al. instead of just
> making them the same? Also, having to scan the whole pollset to find a
> particular socket is dreadful (especially since this is going to be used
> in a scan through all sockets in the main loop, making it an O(n^2)
> process).

Because we can't just use POLLIN.  On systems that don't have poll, POLLIN
isn't defined.  I could ifdef the code, to make it faster, but just like
the rest of APR, my goal has not been performance yet.  My goal for the
entire thing has been get something that works.  We can fix performance
when we have code that requires it to perform.  There are ways to improve our
performance on poll systems, but nobody has done them yet.

One of the things on my list of things to do, is look through all of APR
with an eye towards performance, but I have found that it is far better to
get working code before you try to get optimal working code.  I would say
99% of APR is working fully, and when that number is 100%, we can look to
really focusing on performance in APR.  I believe we will have a working
APR long before we have finished all of the TODO's in Apache, so while
people were finishing the TODO's, I had planned to look at performance.
Of course, if somebody wants to get to it before I do, by all means!

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@ntrnet.net
6209 H Shanda Dr.
Raleigh, NC 27609		Ryan Bloom -- thinker, adventurer, artist,
				     writer, but mostly, friend.
-------------------------------------------------------------------------------