Posted to dev@httpd.apache.org by Brian Pane <br...@apache.org> on 2005/09/25 07:59:20 UTC
RFC: nonblocking rewrite of ap_core_output_filter
I just committed a new version of ap_core_output_filter() to the
async-dev branch.
The semantics of the filter have changed in a fundamental way: the
original version does blocking writes and usually tries to write out
all of the available data before returning (except in cases where it
buffers data to avoid small writes), whereas this new version does
nonblocking writes in most cases.
The goal of the rewrite was to set the stage for a clean implementation
of asynchronous write completion in the Event and/or Leader/Followers
MPMs.
The nonblocking behavior, however, appears to be useful in all MPMs.
For example, nonblocking writes can enable mod_include to parse the
next chunk of output while awaiting an ack from the client. To avoid
unbounded memory usage, the new filter does blocking writes if it has
64KB or more of data buffered up.
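A minimal sketch of that buffering rule, for illustration only. The
names sketch_choose_write_mode and SKETCH_MAX_BUFFERED_BYTES are
hypothetical and are not taken from the committed code; only the 64KB
threshold comes from the description above.

```c
#include <stddef.h>

/* Threshold from the description above: 64KB or more forces blocking. */
#define SKETCH_MAX_BUFFERED_BYTES ((size_t)64 * 1024)

typedef enum {
    SKETCH_WRITE_NONBLOCKING,
    SKETCH_WRITE_BLOCKING
} sketch_write_mode;

/* Hypothetical helper: pick the write mode from the number of bytes
 * the filter is currently holding in memory. */
sketch_write_mode sketch_choose_write_mode(size_t bytes_buffered)
{
    if (bytes_buffered >= SKETCH_MAX_BUFFERED_BYTES) {
        /* Too much data pending: wait for the client to drain it. */
        return SKETCH_WRITE_BLOCKING;
    }
    /* Otherwise try to write without waiting and keep the remainder. */
    return SKETCH_WRITE_NONBLOCKING;
}
```

The point of the cap is that a fast content generator can no longer
outrun a slow client indefinitely: once the backlog reaches the
threshold, the filter falls back to the old blocking behavior.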
There are some significant differences from the earlier nonblocking
output filter that I posted a few weeks ago as part of the Event MPM
async write completion patch:
- If a nonblocking write attempt results in EAGAIN, the new filter
  returns APR_SUCCESS instead of APR_EAGAIN. The old patch broke too
  much existing code that wasn't prepared for EAGAIN. The new filter
  can return APR_SUCCESS without having actually written the entire
  brigade, but that's also true of the original
  ap_core_output_filter(), with its various setaside cases.
- The new filter doesn't try to ignore flush buckets like the earlier
  patch did. Instead, when it encounters a flush bucket, it does a
  blocking write of everything up to that point. The earlier patch
  tried to detect certain patterns of buckets involving flush, EOS,
  and EOC that could be interpreted as "hand this data off to the
  write completion thread instead of actually writing it out
  immediately." But that logic was too brittle, as it depended on
  knowledge of the bucket patterns that the httpd core just happens to
  produce. In the new design, the core output filter interprets a
  flush bucket as "write this data out before returning." To implement
  async write completion on top of this, we'll likely have to remove
  some of the points in the core that generate flush buckets, but
  that's a project for another day.
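The two rules above can be sketched with a toy model. This is not the
committed code: the sk_* types and functions are hypothetical, the
"socket" is just a counter of how many bytes it can accept without
blocking, and a return value of 0 stands in for APR_SUCCESS. The 64KB
cap on buffered data is left out to keep the sketch short.

```c
#include <stddef.h>

typedef enum { SK_DATA, SK_FLUSH } sk_bucket_type;

typedef struct {
    sk_bucket_type type;
    size_t len;              /* payload bytes; must be 0 for SK_FLUSH */
} sk_bucket;

/* Hypothetical stand-in for the client socket. */
typedef struct {
    size_t room;             /* bytes a nonblocking write can take now */
    size_t written;          /* total bytes accepted so far */
    int blocking_calls;      /* how often we had to wait for the client */
} sk_socket;

/* Nonblocking write: accepts what fits, possibly nothing (EAGAIN case). */
size_t sk_write_nonblocking(sk_socket *s, size_t len)
{
    size_t n = (len < s->room) ? len : s->room;
    s->room -= n;
    s->written += n;
    return n;
}

/* Blocking write: always accepts everything, however long it takes. */
void sk_write_blocking(sk_socket *s, size_t len)
{
    s->room = (len < s->room) ? s->room - len : 0;
    s->written += len;
    s->blocking_calls++;
}

/* One filter invocation. *set_aside holds the bytes left over from
 * earlier calls on entry and the bytes left unwritten on return.
 * Always returns 0, even when the nonblocking write accepted nothing. */
int sk_output_filter(sk_socket *s, const sk_bucket *b, size_t n,
                     size_t *set_aside)
{
    size_t i;
    size_t last_flush = n;   /* n means "no flush bucket in this brigade" */
    size_t must_flush = 0;   /* bytes in front of the last flush bucket */
    size_t rest = 0;         /* bytes that may be written lazily */

    for (i = 0; i < n; i++) {
        if (b[i].type == SK_FLUSH) {
            last_flush = i;
        }
    }
    for (i = 0; i < n; i++) {
        if (last_flush != n && i < last_flush) {
            must_flush += b[i].len;
        } else {
            rest += b[i].len;
        }
    }
    /* Data set aside earlier precedes everything in this brigade. */
    if (last_flush != n) {
        must_flush += *set_aside;
    } else {
        rest += *set_aside;
    }

    if (must_flush > 0) {
        sk_write_blocking(s, must_flush);     /* "write before returning" */
    }
    rest -= sk_write_nonblocking(s, rest);    /* may write 0 bytes */
    *set_aside = rest;
    return 0;
}
```

In this model a caller cannot tell from the return value whether the
data reached the network, which matches the point made above: success
only means the filter has taken responsibility for the data.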
There are a few things missing in the new version:
- It doesn't concatenate sequences of really small buckets together
  the way the original does. If you send it a brigade containing 16
  single-byte buckets, it will do a writev of 16 bytes. My inclination
  is to leave this "broken" in order to keep the code simple, unless
  someone has a real-world use case that produces such output.
- I haven't yet put in mod_logio support. This shouldn't be difficult
  to add, but I want to do some experiments with a new design: sending
  an End-Of-Request bucket that calls the request logger when all the
  buckets in front of it have been written to the network. If this
  works, the "EOR" bucket's destroy function might end up being the
  cleanest place to call the logio hooks.
- It doesn't yet do nonblocking reads on socket buckets. Can anyone
  recommend a good test case that makes use of socket buckets?
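To make the End-Of-Request idea above a bit more concrete, here is a
rough sketch of the ordering it relies on. Everything in it is
hypothetical (the eor_* names, the bucket layout, the logger); it only
illustrates that a zero-length bucket whose destroy function does the
logging cannot fire until every bucket in front of it has been written.

```c
#include <stddef.h>

/* Hypothetical bucket: plain data has destroy == NULL, while the EOR
 * marker has len == 0 and a destroy function that logs the request. */
typedef struct {
    size_t len;
    void (*destroy)(void *ctx);
    void *ctx;
} eor_bucket;

/* Hypothetical logger hook: counts how many requests were logged. */
void eor_log_request(void *ctx)
{
    int *logged = ctx;
    (*logged)++;
}

/* Write buckets in order into `room` bytes of socket space, destroying
 * each bucket once it is fully written. Returns how many buckets were
 * consumed; a partially written bucket keeps its remaining length. */
size_t eor_consume(eor_bucket *b, size_t n, size_t room)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (b[i].len > room) {
            b[i].len -= room;    /* partial write, try again later */
            break;
        }
        room -= b[i].len;
        if (b[i].destroy != NULL) {
            b[i].destroy(b[i].ctx);
        }
    }
    return i;
}
```

With a 6-byte data bucket followed by the EOR marker and only 4 bytes
of room, the first pass consumes nothing and logs nothing; the logger
runs on a later pass, once the remaining 2 bytes have gone out.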
Thanks,
Brian