Posted to dev@httpd.apache.org by Brian Pane <br...@apache.org> on 2005/09/25 07:59:20 UTC
RFC: nonblocking rewrite of ap_core_output_filter

I just committed a new version of ap_core_output_filter() to the
async-dev branch.  The semantics of the filter have changed in a
fundamental way: the original version does blocking writes and usually
tries to write out all of the available data before returning (except
in cases where it buffers data to avoid small writes), whereas this
new version does nonblocking writes in most cases.

The goal of the rewrite was to set the stage for a clean implementation
of asynchronous write completion in the Event and/or Leader/Followers
MPMs.  The nonblocking behavior, however, appears to be useful in all
MPMs.  For example, nonblocking writes can enable mod_include to parse
the next chunk of output while awaiting an ack from the client.  To
avoid unbounded memory usage, the new filter does blocking writes if it
has >= 64KB of data buffered up.
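
As a rough illustration of that buffering rule (not the committed code),
the choice between the two write modes might be sketched as below.  The
names choose_write_mode, write_mode and BUFFER_LIMIT are invented for
this sketch; only the 64KB threshold comes from the description above.

```c
#include <stddef.h>

/* Hypothetical names; only the 64KB figure comes from the post. */
#define BUFFER_LIMIT ((size_t)64 * 1024)

typedef enum { WRITE_NONBLOCKING, WRITE_BLOCKING } write_mode;

/* Decide how the next write should be attempted, given how many bytes
 * the filter is already holding for this connection.  Below the limit
 * the write is nonblocking; at or above it the filter falls back to a
 * blocking write so that buffered data cannot grow without bound. */
write_mode choose_write_mode(size_t bytes_buffered)
{
    return (bytes_buffered >= BUFFER_LIMIT) ? WRITE_BLOCKING
                                            : WRITE_NONBLOCKING;
}
```

With this rule, a connection holding 10KB would get a nonblocking write
attempt, while one holding exactly 64KB would get a blocking write.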

There are some significant differences from the earlier nonblocking
output filter that I posted a few weeks ago as part of the Event MPM
async write completion patch:

- If a nonblocking write attempt results in EAGAIN, the new filter
   returns APR_SUCCESS instead of APR_EAGAIN.  The old patch broke too
   much existing code that wasn't prepared for EAGAIN.  The new filter
   can return APR_SUCCESS without having actually written the entire
   brigade, but that's also true of the original
   ap_core_output_filter(), with its various setaside cases.

- The new filter doesn't try to ignore flush buckets like the earlier
   patch did.  Instead, when it encounters a flush bucket, it does a
   blocking write of everything up to that point.  The earlier patch
   tried to detect certain patterns of buckets involving flush, EOS,
   and EOC that could be interpreted as "hand this data off to the
   write completion thread instead of actually writing it out
   immediately."  But that logic was too brittle, as it depended on
   knowledge of the bucket patterns that the httpd core just happens
   to produce.  In the new design, the core output filter interprets a
   flush bucket as "write this data out before returning."  To
   implement async write completion on top of this, we'll likely have
   to remove some of the points in the core that generate flush
   buckets--but that's a project for another day.
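
The two behaviors above (blocking write up to a flush bucket, and a
successful return even when a nonblocking write would block) can be
sketched with a toy model.  This is only an illustration under stated
assumptions, not the filter itself: bucket, fake_socket, send_brigade
and SKETCH_SUCCESS are invented names, the "socket" is simulated by a
counter of how many bytes it will accept without blocking, and
SKETCH_SUCCESS merely stands in for APR_SUCCESS.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SUCCESS 0   /* stand-in for APR_SUCCESS in this sketch */

typedef enum { BUCKET_DATA, BUCKET_FLUSH } bucket_type;

typedef struct {
    bucket_type type;
    size_t      len;       /* payload bytes; 0 for a flush bucket */
} bucket;

/* Hypothetical stand-in for a socket.  A nonblocking write accepts at
 * most `room` bytes; a blocking write waits until all bytes are sent. */
typedef struct {
    size_t room;           /* bytes accepted without blocking */
    size_t written;        /* total bytes sent so far */
} fake_socket;

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Walk the brigade once.  Data in front of a flush bucket is written
 * with a (simulated) blocking write.  Data after the last flush bucket
 * gets one nonblocking attempt; whatever the socket does not accept is
 * reported through *set_aside and the call still returns success. */
int send_brigade(fake_socket *sock, const bucket *brigade, size_t count,
                 size_t *set_aside)
{
    size_t pending = 0;
    size_t accepted;
    size_t i;

    assert(sock != NULL && set_aside != NULL);
    assert(brigade != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (brigade[i].type == BUCKET_DATA) {
            pending += brigade[i].len;
        } else {
            /* Flush bucket: write everything up to here before going on. */
            sock->written += pending;
            sock->room -= min_size(sock->room, pending);
            pending = 0;
        }
    }

    /* Tail after the last flush: nonblocking attempt only. */
    accepted = min_size(pending, sock->room);
    sock->written += accepted;
    sock->room -= accepted;
    *set_aside = pending - accepted;

    return SKETCH_SUCCESS;   /* even if some data was only set aside */
}
```

In this model the caller cannot tell from the return value alone whether
everything reached the network, which mirrors the setaside cases
mentioned above.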

There are a few things missing in the new version:

- It doesn't concatenate sequences of really small buckets together
   the way the original does.  If you send it a brigade containing 16
   single-byte buckets, it will do a writev of 16 bytes.  My
   inclination is to leave this "broken" in order to keep the code
   simple, unless someone has a real-world use case that produces such
   output.

- I haven't yet put in mod_logio support.  This shouldn't be difficult
   to add, but I want to do some experiments with a new design:
   sending an End-Of-Request bucket that calls the request logger when
   all the buckets in front of it have been written to the network.
   If this works, the "EOR" bucket's destroy function might end up
   being the cleanest place to call the logio hooks.

- It doesn't yet do nonblocking reads on socket buckets.  Can anyone
   recommend a good test case that makes use of socket buckets?
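
To make the End-Of-Request idea a bit more concrete, here is a toy
sketch of a bucket whose destroy function triggers logging once the
buckets in front of it have been written.  It is only a guess at the
shape of such a design, not an implementation: sketch_bucket,
request_state, eor_destroy and write_and_destroy are all invented for
the example, and the "network" is just a byte counter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_bucket sketch_bucket;

struct sketch_bucket {
    size_t len;                             /* payload bytes; 0 for EOR */
    void (*destroy)(sketch_bucket *self);   /* run when the bucket is done */
    void *ctx;                              /* data for the destroy hook */
};

typedef struct {
    size_t bytes_sent;     /* bytes written to the simulated network */
    int    logged;         /* how many times the request logger ran */
    size_t bytes_at_log;   /* bytes_sent at the moment the logger ran */
} request_state;

/* Destroy hook for the hypothetical EOR bucket: this is where a request
 * logger (or logio-style accounting) would be called in this sketch. */
void eor_destroy(sketch_bucket *self)
{
    request_state *r = self->ctx;
    r->logged++;
    r->bytes_at_log = r->bytes_sent;
}

/* "Write" each bucket in order, then destroy it.  Because the EOR
 * bucket sits behind the response data, its destroy hook only runs
 * after everything in front of it has been written. */
void write_and_destroy(sketch_bucket *buckets, size_t count,
                       request_state *r)
{
    size_t i;

    assert(r != NULL);
    assert(buckets != NULL || count == 0);

    for (i = 0; i < count; i++) {
        r->bytes_sent += buckets[i].len;
        if (buckets[i].destroy != NULL)
            buckets[i].destroy(&buckets[i]);
    }
}
```

The point of the design choice is ordering: the logger sees the final
byte count without the filter needing any special end-of-request logic.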

Thanks,
Brian