Posted to dev@httpd.apache.org by Aaron Bannert <aa...@clove.org> on 2002/03/07 00:02:08 UTC

Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

I see how these will be useful with an Async I/O model, but at the moment
this read mode seems either incomplete or not useful. It seems to me like
all current uses of NONBLOCK are probably unnecessary, since there is no
other way to "wait" for data to appear other than spinning endlessly.

The problem seems to be that we don't have a way to determine when that
bucket read would no longer block once we've already determined that
it would. I'm looking for a select()/poll() mechanism for a group of
buckets. Does this even fit into our model? If not, how were we planning
on making bucket brigades able to work with Async I/O?

-aaron

RE: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Ryan Bloom <rb...@covalent.net>.

> On Wed, Mar 06, 2002 at 11:42:11PM -0800, Greg Stein wrote:
> > apr_bucket_read(NONBLOCK)
> > if (got_nothing)
> >   do_some_work
> >   flush_some_buffers
> >   apr_bucket_read(BLOCK)
> > process_buckets()
> >
> >
> > In other words, you could see if you have some work to do. If not, then you
> > go off and flush out other stuff that was pending. (that is: take advantage
> > of idle time)  Once you're done with the work, then you go ahead and block
> > to get more work.
> 
> You are correct, and this pattern is used twice only in mod_include.
> 
> It seems to me that this pattern would tend to double the number of
> system calls to read, while only being successful with fast clients.
> I would have to do some testing, but I suspect that we're not actually
> improving data processing/response time but are instead increasing
> system time. Just a hunch though...

It depends on what you are doing in between the first read and the
second.  The original conversation around filters only discussed
non-blocking in terms of being able to pass the brigade faster.  So, the
conversation was always:

apr_bucket_read(NONBLOCK)
if (got_nothing)
	do some stuff
	ap_pass_brigade()
apr_bucket_read(BLOCK)

If you are passing a bunch of data to be written to the network, then
the streaming effect outweighs the system time cost, because the user
feels like they are getting the data faster.
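
For illustration, here is a minimal sketch of that read/flush/read sequence
using plain POSIX descriptors instead of APR buckets. Everything named here
(struct pending, set_nonblocking, flush_pending, read_or_flush) is
hypothetical and only stands in for the bucket and brigade calls; it is not
how httpd implements it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for output that a filter is holding back. */
struct pending {
    int out_fd;        /* where the buffered output goes */
    const char *data;  /* bytes not yet written */
    size_t len;        /* how many bytes remain */
};

/* Toggle O_NONBLOCK on a descriptor; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd, int on)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags);
}

/* Write out whatever is pending (the "ap_pass_brigade()" step). */
static void flush_pending(struct pending *p)
{
    while (p->len > 0) {
        ssize_t w = write(p->out_fd, p->data, p->len);
        if (w <= 0)
            break;  /* stop on error; a sketch, not a full retry loop */
        p->data += w;
        p->len -= (size_t)w;
    }
}

/*
 * Try a non-blocking read first.  If no input is available yet, flush the
 * pending output and only then fall back to a blocking read.
 * Returns the number of bytes read, 0 on EOF, or -1 on error.
 */
static ssize_t read_or_flush(int in_fd, char *buf, size_t len,
                             struct pending *out)
{
    ssize_t n;
    int saved;

    assert(buf != NULL && out != NULL);

    if (set_nonblocking(in_fd, 1) == -1)
        return -1;
    n = read(in_fd, buf, len);
    saved = errno;
    if (set_nonblocking(in_fd, 0) == -1)  /* restore blocking mode */
        return -1;

    if (n >= 0)
        return n;  /* data or EOF arrived without waiting */
    if (saved != EAGAIN && saved != EWOULDBLOCK) {
        errno = saved;
        return -1;  /* a real error, not "would block" */
    }

    flush_pending(out);            /* use the idle time */
    return read(in_fd, buf, len);  /* now block for more input */
}
```

On the slow path this costs a second read() plus the fcntl() calls, so the
sketch only pays off when flush_pending() actually has output to send.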

Of course, that logic doesn't work if you aren't putting a pass_brigade
inside the if clause.  :-)

Ryan

Re: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Aaron Bannert <aa...@clove.org>.

On Wed, Mar 06, 2002 at 11:42:11PM -0800, Greg Stein wrote:
> apr_bucket_read(NONBLOCK)
> if (got_nothing)
>   do_some_work
>   flush_some_buffers
>   apr_bucket_read(BLOCK)
> process_buckets()
> 
> 
> In other words, you could see if you have some work to do. If not, then you
> go off and flush out other stuff that was pending. (that is: take advantage
> of idle time)  Once you're done with the work, then you go ahead and block
> to get more work.

You are correct, and this pattern is used twice only in mod_include.

It seems to me that this pattern would tend to double the number of
system calls to read, while only being successful with fast clients.
I would have to do some testing, but I suspect that we're not actually
improving data processing/response time but are instead increasing
system time. Just a hunch though...

-aaron

Re: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Greg Stein <gs...@lyra.org>.

On Wed, Mar 06, 2002 at 03:02:08PM -0800, Aaron Bannert wrote:
> I see how these will be useful with an Async I/O model, but at the moment
> this read mode seems either incomplete or not useful. It seems to me like
> all current uses of NONBLOCK are probably unnecessary, since there is no
> other way to "wait" for data to appear other than spinning endlessly.
> 
> The problem seems to be that we don't have a way to determine when that
> bucket read would no longer block once we've already determined that
> it would. I'm looking for a select()/poll() mechanism for a group of
> buckets. Does this even fit into our model? If not, how were we planning
> on making bucket brigades able to work with Async I/O?

apr_bucket_read(NONBLOCK)
if (got_nothing)
  do_some_work
  flush_some_buffers
  apr_bucket_read(BLOCK)
process_buckets()


In other words, you could see if you have some work to do. If not, then you
go off and flush out other stuff that was pending. (that is: take advantage
of idle time)  Once you're done with the work, then you go ahead and block
to get more work.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Aaron Bannert <aa...@clove.org>.

[I should have posted this to dev@apr in the first place, so I'll follow
up with the discussion there]

Re: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Aaron Bannert <aa...@clove.org>.

On Wed, Mar 06, 2002 at 07:32:57PM -0500, Bill Stoddard wrote:
> Yep, you've nailed one of the big problems with the bucket/brigade API as it is currently
> implemented. Remember a few weeks back when I talked about how interesting it would be if
> our CGI interface could be made full-duplex? To do this, we need to be able to do
> non-blocking i/o plus be able to detect i/o event notifications (via poll/select). We
> probably do NOT want to do true async i/o. Async i/o is way complicated as compared to
> event driven i/o using /dev/poll, kqenqueue/kqdequeue, etc., and event driven i/o achieves
> most all of the interesting function anyway.
> 
> I am interested in working on adding a set of event driven APIs to APR, but I can't even
> think about it until Apache 2.0 is released.... (I even have a prototype event driven
> Apache 2.0 implemented on Windows that I shelved work on maybe 8 months ago.) If you come
> up with a design, I'll certainly find time to review it :-)

Hmmm...I just had a partial brain dump of what I've been thinking
through today, but I ended up discarding that idea. Here's what
I've been mulling over today:

Basically, I think we would have to come up with a bunch of
apr_bucket_poll*() calls that mostly mirror the apr_poll*()
functions. Under the covers we could actually call apr_poll()
when the foremost bucket on a brigade was a FILE, PIPE, or SOCKET;
and if any other foremost-buckets on a brigade in the set were of
another type we would treat it as an event.

My first impression of this model is that we're going to have a huge
amount of overhead on each apr_bucket_poll() call, since internally
we'd have to run through every brigade and build a pollset from that,
returning on any non-blockable bucket. The next question is to determine
if this is feasible, or if there is a better way.
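
A rough sketch of that idea, using POSIX poll() directly rather than
apr_poll(). No apr_bucket_poll() exists; struct bucket, enum bucket_kind,
MAX_BRIGADES and bucket_poll() below are made-up names that model only the
foremost bucket of each brigade, so treat this as an assumption-laden
illustration rather than a design.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical model of the foremost bucket of a brigade. */
enum bucket_kind {
    BUCKET_MEMORY,  /* heap/immortal style data: a read never blocks */
    BUCKET_FD       /* file, pipe or socket: a read may block */
};

struct bucket {
    enum bucket_kind kind;
    int fd;  /* only meaningful for BUCKET_FD */
};

#define MAX_BRIGADES 64  /* arbitrary cap so the pollset fits on the stack */

/*
 * Given the foremost bucket of each of n brigades, return the index of one
 * that can be read without blocking, -1 if none became ready within
 * timeout_ms, or -2 on error (including n > MAX_BRIGADES).
 */
static int bucket_poll(const struct bucket *heads, size_t n, int timeout_ms)
{
    struct pollfd pfds[MAX_BRIGADES];
    size_t i;
    int rc;

    assert(heads != NULL || n == 0);
    if (n > MAX_BRIGADES)
        return -2;

    /* Walk every brigade on every call: this is the overhead in question. */
    for (i = 0; i < n; i++) {
        if (heads[i].kind == BUCKET_MEMORY)
            return (int)i;  /* non-blockable bucket counts as an event */
        pfds[i].fd = heads[i].fd;
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    rc = poll(pfds, (nfds_t)n, timeout_ms);
    if (rc < 0)
        return -2;
    if (rc == 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR))
            return (int)i;
    }
    return -1;
}
```

The pollset is rebuilt from scratch on each call, which is where the
per-call cost comes from; a persistent pollset that is updated as buckets
are consumed would be one way to avoid that, at the price of more
bookkeeping.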

-aaron

Re: Why do we have NONBLOCK bucket reads? (or, "I want poll for buckets")

Posted by Bill Stoddard <bi...@wstoddard.com>.

Yep, you've nailed one of the big problems with the bucket/brigade API as it is currently
implemented. Remember a few weeks back when I talked about how interesting it would be if
our CGI interface could be made full-duplex? To do this, we need to be able to do
non-blocking i/o plus be able to detect i/o event notifications (via poll/select). We
probably do NOT want to do true async i/o. Async i/o is way complicated as compared to
event driven i/o using /dev/poll, kqenqueue/kqdequeue, etc., and event driven i/o achieves
most all of the interesting function anyway.

I am interested in working on adding a set of event driven APIs to APR, but I can't even
think about it until Apache 2.0 is released.... (I even have a prototype event driven
Apache 2.0 implemented on Windows that I shelved work on maybe 8 months ago.) If you come
up with a design, I'll certainly find time to review it :-)

Bill

> I see how these will be useful with an Async I/O model, but at the moment
> this read mode seems either incomplete or not useful. It seems to me like
> all current uses of NONBLOCK are probably unnecessary, since there is no
> other way to "wait" for data to appear other than spinning endlessly.
>
> The problem seems to be that we don't have a way to determine when that
> bucket read would no longer block once we've already determined that
> it would. I'm looking for a select()/poll() mechanism for a group of
> buckets. Does this even fit into our model? If not, how were we planning
> on making bucket brigades able to work with Async I/O?
>
> -aaron
>