You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Paul Querna <pa...@querna.org> on 2011/11/10 20:42:59 UTC

rethinking connection input filter API

The input filter API function signature is the following:

    apr_status_t func(
         ap_filter_t *f,
         apr_bucket_brigade *b,
         ap_input_mode_t mode,
         apr_read_type_e block,
         apr_off_t readbytes);

Problems:

1) This gives the caller of the API control over weither they can
block via the  apr_read_type_e block.

2) The caller also suggests how much data 'should' be read by passing
of readbytes, however many modules just pull in whatever they can, or
they have to put in extra logic to keep a backlog buffer.

3) Currently we have these modes:

    AP_MODE_READBYTES,
    AP_MODE_GETLINE,
    AP_MODE_EATCRLF,
    AP_MODE_SPECULATIVE,
    AP_MODE_EXHAUSTIVE,
    AP_MODE_INIT

This means every input filter, must implement all of these different
modes.  This significantly complicates modules like mod_ssl and
reqtimeout.  Every module to support things like speculative must
support buffering internally.

I would like to change the function signature for connection level
input filters to:

apr_status_t  func(
         ap_filter_t *f,
         apr_bucket_brigade *b);

The function always be allowed to return EAGAIN.  It would not be
responsible for polling or blocking.

This would massively simplify how most input filters work, and would
enable more creative methods of network IO inside the MPMs.

Most of the work would be in ap_rgetline_core, and figuring out how to
make it work without pushing complexity into the protocol parsers.

It would however reduce the total lines of code I believe, because we
would be centralizing all input buffering and different mode needs
into one place.

I believe we would need to add a new method to the MPMs,
ap_mpm_poll_connection() as a transitionary API.  This could be called
by ap_rgetline_core or ap_get_brigade, and in the the traditional IO
MPMs like Worker, and Prefork, select/poll on the socket until it is
available. I'm unsure about how to best fix ap_get_brigade, it is also
used within mod_proxy for client IO.

Thoughts?

Thanks,

Paul

Re: rethinking connection input filter API

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Nov 11, 2011, at 4:25 AM, Graham Leggett wrote:

> 
> My moon-on-a-stick would be for this to be ap_mpm_poll_connections() instead, and allow the option to add extra connections to the poll (for something like mod_proxy and friends, with the option to have each of these extra connections removed by registering a pool cleanup appropriately). In theory, ap_mpm_poll_connections() would return the connection that triggered the event somehow.
> 

+1

> Some thoughts about async SSL handling, specifically SSL_read or SSL_write returning WANTSREAD or WANTSWRITE, in theory EAGAIN just means "call me again", but if there was a way to express "call me again when ready to read" (EREADAGAIN), or "call me again when ready to write" (EWRITEAGAIN) we can do proper async SSL. Not sure how that would fit in with the stuff above?


Re: rethinking connection input filter API

Posted by Graham Leggett <mi...@sharp.fm>.
On 10 Nov 2011, at 9:42 PM, Paul Querna wrote:

> The input filter API function signature is the following:
>
>    apr_status_t func(
>         ap_filter_t *f,
>         apr_bucket_brigade *b,
>         ap_input_mode_t mode,
>         apr_read_type_e block,
>         apr_off_t readbytes);
>
> Problems:
>
> 1) This gives the caller of the API control over weither they can
> block via the  apr_read_type_e block.
>
> 2) The caller also suggests how much data 'should' be read by passing
> of readbytes, however many modules just pull in whatever they can, or
> they have to put in extra logic to keep a backlog buffer.
>
> 3) Currently we have these modes:
>
>    AP_MODE_READBYTES,
>    AP_MODE_GETLINE,
>    AP_MODE_EATCRLF,
>    AP_MODE_SPECULATIVE,
>    AP_MODE_EXHAUSTIVE,
>    AP_MODE_INIT
>
> This means every input filter, must implement all of these different
> modes.  This significantly complicates modules like mod_ssl and
> reqtimeout.  Every module to support things like speculative must
> support buffering internally.
>
> I would like to change the function signature for connection level
> input filters to:
>
> apr_status_t  func(
>         ap_filter_t *f,
>         apr_bucket_brigade *b);
>
> The function always be allowed to return EAGAIN.  It would not be
> responsible for polling or blocking.
>
> This would massively simplify how most input filters work, and would
> enable more creative methods of network IO inside the MPMs.
>
> Most of the work would be in ap_rgetline_core, and figuring out how to
> make it work without pushing complexity into the protocol parsers.
>
> It would however reduce the total lines of code I believe, because we
> would be centralizing all input buffering and different mode needs
> into one place.

Huge +1.

> I believe we would need to add a new method to the MPMs,
> ap_mpm_poll_connection() as a transitionary API.  This could be called
> by ap_rgetline_core or ap_get_brigade, and in the the traditional IO
> MPMs like Worker, and Prefork, select/poll on the socket until it is
> available. I'm unsure about how to best fix ap_get_brigade, it is also
> used within mod_proxy for client IO.

My moon-on-a-stick would be for this to be ap_mpm_poll_connections()  
instead, and allow the option to add extra connections to the poll  
(for something like mod_proxy and friends, with the option to have  
each of these extra connections removed by registering a pool cleanup  
appropriately). In theory, ap_mpm_poll_connections() would return the  
connection that triggered the event somehow.

Some thoughts about async SSL handling, specifically SSL_read or  
SSL_write returning WANTSREAD or WANTSWRITE, in theory EAGAIN just  
means "call me again", but if there was a way to express "call me  
again when ready to read" (EREADAGAIN), or "call me again when ready  
to write" (EWRITEAGAIN) we can do proper async SSL. Not sure how that  
would fit in with the stuff above?

Regards,
Graham
--


Re: rethinking connection input filter API

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Nov 10, 2011, at 11:42 AM, Paul Querna wrote:
> 
> This means every input filter, must implement all of these different
> modes.  This significantly complicates modules like mod_ssl and
> reqtimeout.  Every module to support things like speculative must
> support buffering internally.
> 
> I would like to change the function signature for connection level
> input filters to:
> 
> apr_status_t  func(
>         ap_filter_t *f,
>         apr_bucket_brigade *b);
> 
> The function always be allowed to return EAGAIN.  It would not be
> responsible for polling or blocking.
> 
> This would massively simplify how most input filters work, and would
> enable more creative methods of network IO inside the MPMs.
> 

Assuming that we want to keep input filters (and filters in
general), certainly the current design needs this kind of
adjustment.

++1 on centralization and simplification.