You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Justin Erenkrantz <je...@apache.org> on 2002/07/06 11:39:18 UTC

New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

On Fri, Jul 05, 2002 at 08:47:32PM -0400, rbb@apache.org wrote:
> This also creates a support library for APR, this is basically just a
> series of functions that APR can use internally to get the job
> done.  Since wait_for_io_or_timeout was identical between files and
> sockets, I have moved that function to the support lib, it also now uses
> apr_poll().

What's the rationale behind a 'support library' for APR?  All of the
content in newpoll.tar.gz seems like it should just stay in APR.  You
have me thoroughly confused here.  Perhaps in a new directory, but
definitely not a separate library.

I'm also not clear what the additional num parameter to apr_poll is
buying us.  Why not just use the *nsds value on input as your num
parameter?  The number of descriptors read is what *nsds is on the
output.  No signature change required.

Aha, it seems that httpd uses a broken apr_poll() semantic.  If you
read the documentation for apr_poll, it says:

 * @param nsds The number of sockets we are polling. 
...
 * The number of sockets signalled is returned in the second argument (nsds)

But, httpd-2.0 doesn't pass in the number of sockets we are polling.
flood does because I read the docs not the code.  I think we should
switch the code to match the docs.

I know I took it to mean that nsds on input is the number of fds we
are polling and on completion, it is the number of signaled fds.  I
believe that the docs have the right API.

So, why not do something like this (very rough):

(Some OS does the low magic number off the stack, but I can't
remember which for the life of me.)

apr_status_t apr_poll(apr_pollfd_t *aprset, apr_int32_t *nsds,
                      apr_interval_time_t timeout)
{
    int i, rv;
    struct pollfd *pollset;
    struct pollfd stack_pollset[SOME_LOW_MAGIC_NUMBER_LIKE_SAY_6];

    if (*nsds < SOME_LOW_MAGIC_NUMBER_LIKE_SAY_6) {
        pollset = &stack_pollset;
    }
    else {
        pollset = apr_palloc(aprset->p,
                             sizeof(struct pollfd) * *nsds);
    }

    for (i = 0; i < *nsds; i++) {
        if (aprset[i].desc_type == APR_POLL_SOCKET) {
            pollset[i].fd = aprset[i].desc.s->socketdes;
        }
        else if (aprset[i].desc_type == APR_POLL_FILE) {
            pollset[i].fd = aprset[i].desc.f->filedes;
        }
        pollset[i].events = get_event(aprset[i].events);
    }

    if (timeout > 0) {
	/* FIXME: need apr conversion macro from apr_interval_time_t
  	 * to milliseconds.
	 */
        timeout /= 1000; /* convert microseconds to milliseconds */
    }

    rv = poll(pollset, *nsds, timeout);

    if (rv < 0) {
        return errno;
    }

    for (i = 0; i < *nsds; i++) {
        aprset[i].revents = get_revent(pollset[i].revents);
    }

    *nsds = rv;
    return APR_SUCCESS;
}

It seems that you might have originally meant to go down this route
because you have:

struct pollfd *pollset = apr_palloc(aprset->p,
                                    sizeof(struct pollfd) * *nsds);

Which means that the pollset is allocated to contain *nsds
pollfds.  And, I think that means you need to have *nsds to be
equivalent to num, or you will segfault.  (Perhaps it hasn't in
your tests by sheer luck.)  -- justin

RE: New apr_poll() implementation

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Sander Striker [mailto:striker@apache.org]
> 
> > From: Ryan Bloom [mailto:rbb@covalent.net]
> > Sent: 09 July 2002 15:46
> 
> >> From: Ryan Bloom [mailto:rbb@covalent.net]
> >>> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> >>> On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> >>>> parameters.  I would like to fix that mistake for apr_poll now,
as
> >>>> long as we are changing the implementation.
> >>>
> >>> Getting back to this conversation for a brief second, I think the
> >>> additional parameter with the fd count is unneeded (but for a
> >>> different reason than IN/OUT).  The count should be stored in
> >>> apr_pollfd_t - it does not need to be passed into apr_poll().
> >>
> >> The count should absolutely NOT be stored in apr_pollfd_t.  That is
> >> user owned structure, not an APR owned structure.
> >
> > I should clarify this.  The whole point of this change, is that the
user
> > can do the following:
> 
> [...]
> > That code will work.  Notice that no where before the apr_poll, did
we
> > tell the poll implementation the size of the pollset.  Nor can we.
This
> > is where the real performance benefits of this patch come in.
Because
> > the user has complete control over the pollset, we can remove all
but
> > one pool allocation, and that one can be removed with an
optimization
> > for small pollsets.
> >
> > Now, we can make macros to add sockets and files, but we should not
EVER
> > be telling APR how many descriptors we are polling until we call
> > apr_poll().  The functions that currently setup the pollset and add
> > sockets and set events and revents are all deprecated, and were only
> > left in the patch to allow for as much source backwards compat as
> > possible while the patch is being discussed.  Once the patch has
been
> > committed, I will go through the source and start to remove those
> > functions completely.  The only real example of apr_poll() that is a
> > part of the patch is wait_for_io_or_timeout.  I probably didn't make
> > that clear enough.
> 
> Ah, no.  Without the accessor functions it doesn't make much sense
> to store the pollset size in apr_pollfd_t.  I surely didn't see
removing
> the accessor functions comming.  Not sure if I like that (+0, -0).  I
can
> see the perf benefit, but if it makes for a better API, I don't know.

You can keep the accessor functions if you want to, but you don't need
them, and they shouldn't be functions regardless.  The whole point of
the new poll implementation is that it opens up the structure for
anybody to manage the memory.  If we keep the accessors, then we should
use macros not functions.

Either way, the setup function goes away, because the user can allocate
the memory, and there is no place to give the size of the structure to
the apr_pollfd_t before the call to apr_poll().

Ryan



RE: New apr_poll() implementation

Posted by Sander Striker <st...@apache.org>.
> From: Ryan Bloom [mailto:rbb@covalent.net]
> Sent: 09 July 2002 15:46

>> From: Ryan Bloom [mailto:rbb@covalent.net]
>>> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
>>> On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
>>>> parameters.  I would like to fix that mistake for apr_poll now, as
>>>> long as we are changing the implementation.
>>>
>>> Getting back to this conversation for a brief second, I think the
>>> additional parameter with the fd count is unneeded (but for a
>>> different reason than IN/OUT).  The count should be stored in
>>> apr_pollfd_t - it does not need to be passed into apr_poll().
>> 
>> The count should absolutely NOT be stored in apr_pollfd_t.  That is
>> user owned structure, not an APR owned structure.
> 
> I should clarify this.  The whole point of this change, is that the user
> can do the following:

[...]
> That code will work.  Notice that no where before the apr_poll, did we
> tell the poll implementation the size of the pollset.  Nor can we.  This
> is where the real performance benefits of this patch come in.  Because
> the user has complete control over the pollset, we can remove all but
> one pool allocation, and that one can be removed with an optimization
> for small pollsets.
> 
> Now, we can make macros to add sockets and files, but we should not EVER
> be telling APR how many descriptors we are polling until we call
> apr_poll().  The functions that currently setup the pollset and add
> sockets and set events and revents are all deprecated, and were only
> left in the patch to allow for as much source backwards compat as
> possible while the patch is being discussed.  Once the patch has been
> committed, I will go through the source and start to remove those
> functions completely.  The only real example of apr_poll() that is a
> part of the patch is wait_for_io_or_timeout.  I probably didn't make
> that clear enough.

Ah, no.  Without the accessor functions it doesn't make much sense
to store the pollset size in apr_pollfd_t.  I surely didn't see removing
the accessor functions comming.  Not sure if I like that (+0, -0).  I can
see the perf benefit, but if it makes for a better API, I don't know.

Oh well, my $0.02,

Sander

RE: New apr_poll() implementation

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Ryan Bloom [mailto:rbb@covalent.net]
> 
> > From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> >
> > On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> > > parameters.  I would like to fix that mistake for apr_poll now, as
> long
> > > as we are changing the implementation.
> >
> > Getting back to this conversation for a brief second, I think the
> > additional parameter with the fd count is unneeded (but for a
> > different reason than IN/OUT).  The count should be stored in
> > apr_pollfd_t - it does not need to be passed into apr_poll().
> 
> The count should absolutely NOT be stored in apr_pollfd_t.  That is
user
> owned structure, not an APR owned structure.

I should clarify this.  The whole point of this change, is that the user
can do the following:

Apr_pollfd_t pollset[2];

Pollset[0].desc_type = APR_POLL_FILE;
Pollset[0].desc.file = f1;
Pollset[0].events = APR_POLLIN;
Pollset[1].desc_type = APR_POLL_SOCKET;
Pollset[1].desc.sock = s1;
Pollset[0].events = APR_POLLIN;

Apr_poll(pollset, 2, &n, pool);

That code will work.  Notice that no where before the apr_poll, did we
tell the poll implementation the size of the pollset.  Nor can we.  This
is where the real performance benefits of this patch come in.  Because
the user has complete control over the pollset, we can remove all but
one pool allocation, and that one can be removed with an optimization
for small pollsets.

Now, we can make macros to add sockets and files, but we should not EVER
be telling APR how many descriptors we are polling until we call
apr_poll().  The functions that currently setup the pollset and add
sockets and set events and revents are all deprecated, and were only
left in the patch to allow for as much source backwards compat as
possible while the patch is being discussed.  Once the patch has been
committed, I will go through the source and start to remove those
functions completely.  The only real example of apr_poll() that is a
part of the patch is wait_for_io_or_timeout.  I probably didn't make
that clear enough.

Ryan


RE: New apr_poll() implementation

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> 
> On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> > parameters.  I would like to fix that mistake for apr_poll now, as
long
> > as we are changing the implementation.
> 
> Getting back to this conversation for a brief second, I think the
> additional parameter with the fd count is unneeded (but for a
> different reason than IN/OUT).  The count should be stored in
> apr_pollfd_t - it does not need to be passed into apr_poll().

The count should absolutely NOT be stored in apr_pollfd_t.  That is user
owned structure, not an APR owned structure.

> Since you have to call apr_poll_socket_{add|remove}, you can't
> populate apr_pollfd_t on your own (which is something you can do
> with pollfd, hence why poll() needs the count).  apr_poll_setup()
> should take the maximal size and init the counter to 0.  As fds
> are added or removed, the counter is adjusted.

The apr_poll_socket_(add|remove) are deprecated, and actually aren't and
shouldn't be used.  The whole point of this patch, and the reason it
performs better is that the apr_pollfd_t structure is transparent, and
it is meant to be allocated and owned by the user.

> BTW, we also need apr_poll_file_{add|remove} (seems to be missing
> from rbb's patch).  -- Justin

We don't need it, although we may create it, as a macro.  As proof that
we don't need either however, look at wait_for_io_or_timeout.  That
function is a consumer of apr_poll now, and it acts just like any
consumer outside of APR should act.  It does not add sockets or files
with functions.

The only reason the other functions are still in the code, is for
backwards source compat.  They have been marked as deprecated in my
tree, although I did it after I posted the patch, because I forgot about
it.

Ryan



RE: New apr_poll() implementation

Posted by Sander Striker <st...@apache.org>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> Sent: 09 July 2002 09:44

> On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> > parameters.  I would like to fix that mistake for apr_poll now, as long
> > as we are changing the implementation.
> 
> Getting back to this conversation for a brief second, I think the
> additional parameter with the fd count is unneeded (but for a
> different reason than IN/OUT).  The count should be stored in
> apr_pollfd_t - it does not need to be passed into apr_poll().

+1.  It doesn't seem to make sense to keep track of that in the
application.  Mismatch of actual fd count and passed in fd count
is best prevented.
 
> Since you have to call apr_poll_socket_{add|remove}, you can't
> populate apr_pollfd_t on your own (which is something you can do
> with pollfd, hence why poll() needs the count).  apr_poll_setup()
> should take the maximal size and init the counter to 0.  As fds
> are added or removed, the counter is adjusted.

Makes sense, +1.
 
> BTW, we also need apr_poll_file_{add|remove} (seems to be missing
> from rbb's patch).  -- justin

Sander

Re: New apr_poll() implementation

Posted by Justin Erenkrantz <je...@apache.org>.
On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> parameters.  I would like to fix that mistake for apr_poll now, as long
> as we are changing the implementation.

Getting back to this conversation for a brief second, I think the
additional parameter with the fd count is unneeded (but for a
different reason than IN/OUT).  The count should be stored in
apr_pollfd_t - it does not need to be passed into apr_poll().

Since you have to call apr_poll_socket_{add|remove}, you can't
populate apr_pollfd_t on your own (which is something you can do
with pollfd, hence why poll() needs the count).  apr_poll_setup()
should take the maximal size and init the counter to 0.  As fds
are added or removed, the counter is adjusted.

BTW, we also need apr_poll_file_{add|remove} (seems to be missing
from rbb's patch).  -- justin

Re: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 11:47 AM 7/7/2002, Aaron Bannert wrote:
>On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> > I vote to fix the API so that these kinds of mistakes can't happen in
> > the future.
>
>+1, let's fix the API now. (I also strongly dislike input/output parameters.)

Microsoft's Win32 API was built on IN,OUT arguments.  They are confusing,
you lose the original value ... when you needed to hang on to it you needed
to copy to another variable before you invoked the call.  It is much simpler
to deal with the other case;

int somefunc(int inarg, int *outarg)

and pass

somefunc (foo, &foo)

when you don't care about the original value.

Since they have become far more OO/IDL/COM/.NET centric, very, very few
new functions are introduced as [IN,OUT], most are IN foo, OUT bar instead.

They learned from their mistake, so should we :-)

Bill



Re: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Aaron Bannert <aa...@clove.org>.
On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> I vote to fix the API so that these kinds of mistakes can't happen in
> the future.

+1, let's fix the API now. (I also strongly dislike input/output parameters.)

-aaron

RE: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> 
> On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> > I vote to fix the API so that these kinds of mistakes can't happen
in
> > the future.  I made a lot of mistakes when I designed APR (even
though
> > Manoj tried to convince me I was wrong).  One of those mistakes is
> > having functions use a single variable for both input and result
> > parameters.  I would like to fix that mistake for apr_poll now, as
long
> > as we are changing the implementation.
> 
> I don't think that was a mistake.  You are adding extra parameters
> where it just confuses the API.  I see no problem with using the
> same parameter on input/output - provided that it has the same
> meaning on input and output (same as apr_file_read()).

It is a mistake, and it is confusing, as should be obvious, because
people screwed up.

> Regardless, why don't we just fix apr_poll() and then discuss
> whether we should change/fix the API?  -- Justin

Because I want more opinions before I commit, for one.  And for another,
I want to fix it right.   Multiple commits doesn't make sense.  Have the
discussion, and then we can commit.  Waiting until Monday or Tuesday
isn't going to hurt anything, and I would like Jeff and Bill to have an
opportunity to comment.  Not to mention that I still need time to fix
the other platforms versions of apr_poll.  Wheil the API doesn't have to
change, the implementation is changing, and committing on just one
platform won't work in this case.

Ryan



Re: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Justin Erenkrantz <je...@apache.org>.
On Sat, Jul 06, 2002 at 12:11:59PM -0700, Ryan Bloom wrote:
> I vote to fix the API so that these kinds of mistakes can't happen in
> the future.  I made a lot of mistakes when I designed APR (even though
> Manoj tried to convince me I was wrong).  One of those mistakes is
> having functions use a single variable for both input and result
> parameters.  I would like to fix that mistake for apr_poll now, as long
> as we are changing the implementation.

I don't think that was a mistake.  You are adding extra parameters
where it just confuses the API.  I see no problem with using the
same parameter on input/output - provided that it has the same
meaning on input and output (same as apr_file_read()).

And, it does here, as on input, *nsds is the number of fds to
poll and, on output, *nsds is the number of fds polled.  The
typical Unix convention is to return the *nsds as the output of
the function, but APR has explicit error codes, so we can't do
that.  I don't see a problem with the API and I'm not convinced
that we should change it.

Regardless, why don't we just fix apr_poll() and then discuss
whether we should change/fix the API?  -- justin

RE: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> 
> On Sat, Jul 06, 2002 at 08:32:18AM -0700, Ryan Bloom wrote:
> > Because if Apache can't get it right, then I am assuming that nobody
> > else can either.  I had originally coded it to use *nsds just as you
> > describe below, and it didn't pass any tests, because throughout the
> > code people were passing 0 as *nsds.  I decided to fix an API
problem
> > that I created years ago by using the same variable for both input
and
> > output parameters in this patch.  I am not tied to adding num, but I
do
> > believe that it is the correct approach.
> 
> I vote to fix httpd rather than add an extra parameter.  Perhaps we
> need to make it clearer that *nsds is the number of fd's on input.
> But, I know that I was under that impression by reading the docs.
> However, I can see where people were confused if they read the docs.

I vote to fix the API so that these kinds of mistakes can't happen in
the future.  I made a lot of mistakes when I designed APR (even though
Manoj tried to convince me I was wrong).  One of those mistakes is
having functions use a single variable for both input and result
parameters.  I would like to fix that mistake for apr_poll now, as long
as we are changing the implementation.

BTW, I should also mention that the APR_POLL_LASTDESC enum is only there
for backwards compat.  Most code that uses apr_poll would never use that
enum.

Ryan


Re: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Justin Erenkrantz <je...@apache.org>.
On Sat, Jul 06, 2002 at 08:32:18AM -0700, Ryan Bloom wrote:
> Because if Apache can't get it right, then I am assuming that nobody
> else can either.  I had originally coded it to use *nsds just as you
> describe below, and it didn't pass any tests, because throughout the
> code people were passing 0 as *nsds.  I decided to fix an API problem
> that I created years ago by using the same variable for both input and
> output parameters in this patch.  I am not tied to adding num, but I do
> believe that it is the correct approach.

I vote to fix httpd rather than add an extra parameter.  Perhaps we
need to make it clearer that *nsds is the number of fd's on input.
But, I know that I was under that impression by reading the docs.
However, I can see where people were confused if they read the docs.

> The small array on the stack is an optimization that I was discussing
> with bpane last night.  It would require some testing to find the right
> number, and I wasn't ready to do that.  I wanted to get the
> implementation out there, and I figured we could continue to optimize it
> once it was done.

Cool.  I'd bet any number >2 will work for our purposes in httpd
as the common case seems to be 2.  But, I do recall 6 being used
somewhere and I can't fathom where I saw that.   -- justin

RE: New apr_poll() implementation was Re: [PATCH] speed up network timeout processing

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Justin Erenkrantz [mailto:jerenkrantz@apache.org]
> 
> On Fri, Jul 05, 2002 at 08:47:32PM -0400, rbb@apache.org wrote:
> > This also creates a support library for APR, this is basically just
a
> > series of functions that APR can use internally to get the job
> > done.  Since wait_for_io_or_timeout was identical between files and
> > sockets, I have moved that function to the support lib, it also now
uses
> > apr_poll().
> 
> What's the rationale behind a 'support library' for APR?  All of the
> content in newpoll.tar.gz seems like it should just stay in APR.  You
> have me thoroughly confused here.  Perhaps in a new directory, but
> definitely not a separate library.

Support library may have been the wrong words.  It is just a set of
functions for use inside of APR that don't belong to any other category
of APR functions.  It is implemented as a directory inside of APR, but
the functions aren't meant to be used outside of APR.

> I'm also not clear what the additional num parameter to apr_poll is
> buying us.  Why not just use the *nsds value on input as your num
> parameter?  The number of descriptors read is what *nsds is on the
> output.  No signature change required.

Because if Apache can't get it right, then I am assuming that nobody
else can either.  I had originally coded it to use *nsds just as you
describe below, and it didn't pass any tests, because throughout the
code people were passing 0 as *nsds.  I decided to fix an API problem
that I created years ago by using the same variable for both input and
output parameters in this patch.  I am not tied to adding num, but I do
believe that it is the correct approach.

> So, why not do something like this (very rough):
> 
> (Some OS does the low magic number off the stack, but I can't
> remember which for the life of me.)
> 
> apr_status_t apr_poll(apr_pollfd_t *aprset, apr_int32_t *nsds,
>                       apr_interval_time_t timeout)
> {
>     int i, rv;
>     struct pollfd *pollset;
>     struct pollfd stack_pollset[SOME_LOW_MAGIC_NUMBER_LIKE_SAY_6];

The small array on the stack is an optimization that I was discussing
with bpane last night.  It would require some testing to find the right
number, and I wasn't ready to do that.  I wanted to get the
implementation out there, and I figured we could continue to optimize it
once it was done.

Ryan