Posted to dev@perl.apache.org by barries <ba...@slaysys.com> on 2001/05/25 20:00:53 UTC

Bucket Brigade Filters as objects

Background: A couple of years ago I wrote a complete Perl input/output
filtering system much like the current Apache filter system and a bunch
of filters for it. I'm hoping to port it to mod_perl-2.0 and let
mod_perl do the heavy lifting (filters are just a pain to manage).

Having done a bunch of filtering work, I've found that a large number of
"real" filters need to be cognizant of BOS/middle/EOS (i.e. external
state) issues, and will occasionally need to manually send EOSs.  Some
will also need to maintain internal state (even if just the tail end of
whatever input they've not processed).  Since the same filter can be
added multiple times and each instance gets different state (as
mod_perl-2.0 does today), OO Perl offers a nice API for filters, though
a little magic is required since Apache filters have only one callback.

NOTE: None of this is about the proposal for higher level
Perl{In,Out}putFilterHandler padding in my previous mail. This is all
only about bucket brigade filters.

To support a one-handler API like today's, Apache::Filter needs is_BOS()
and is_EOS() accessors added.  That way a simple filter can be:

   my $count ;

   sub handler: MP_INPUT_FILTER {
       my $filter = shift ;

       if ( $filter->is_BOS() ) {
           $count = 0 ;
           $filter->print( "header\n" ) ;
       }

       while ( $filter->read( my $buffer ) ) {
           $count+= length $buffer ;
           $filter->print( ... ) ;
       }

       if ( $filter->is_EOS() ) {
           $filter->print( "$count bytes filtered\n" ) ;
       }
   }

NOTE: There's no underlying BOS cookie that corresponds to the EOS
cookie; the bucket brigade API assumes that you'll cook up the context
before calling add_xxx_filter(), rather than providing a callback to do
initialization.  It's a bit annoyingly asymmetric.  'Course, I don't
like magic cookies like EOS: they tend to get dropped by bugs all too
frequently and they introduce special cases in all of the code, so I'll
just shut up about the lack of a BOS :-).

That filter breaks if it's added twice, since both instances share the
one $count; caveat Pusher.  Perhaps we should proffer an Apache
filtering mod that adds an "on_add" handler to the
ap_register_xx_filter() calls, since that's a useful place to do
double-add prevention and state initialization.  It would make it easier
to encapsulate filters and expose them properly, allowing
SetOutputFilter directives to cause a filter init event to occur.
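
To make the double-add problem concrete, here is a minimal sketch of one
way around it: give each added instance its own counter through a
closure.  Nothing here is the real mod_perl API.  Mock::Filter is a
hypothetical stand-in for Apache::Filter that fakes the proposed
is_BOS()/is_EOS() accessors, and make_counter() is an illustrative
helper name.

```perl
use strict;
use warnings;

## Hypothetical stand-in for Apache::Filter; NOT the real mod_perl API.
{
    package Mock::Filter;

    sub new {
        my ( $class, @chunks ) = @_;
        return bless { in => [@chunks], out => '', started => 0 }, $class;
    }

    ## True until the first read(), mimicking the proposed is_BOS().
    sub is_BOS { return !$_[0]{started} }

    ## True once all input has been consumed, mimicking is_EOS().
    sub is_EOS { return !@{ $_[0]{in} } }

    ## Copies the next chunk into the caller's buffer; false when drained.
    sub read {
        my $self = shift;
        $self->{started} = 1;
        return 0 unless @{ $self->{in} };
        $_[0] = shift @{ $self->{in} };
        return length $_[0];
    }

    ## Appends to the mock's output buffer instead of a real brigade.
    sub print { my $self = shift; $self->{out} .= join '', @_; return 1 }
}

## Each call returns a handler that closes over its own private $count.
sub make_counter {
    my $count;
    return sub {
        my $filter = shift;
        if ( $filter->is_BOS ) {
            $count = 0;
            $filter->print("header\n");
        }
        while ( $filter->read( my $buffer ) ) {
            $count += length $buffer;
            $filter->print($buffer);
        }
        $filter->print("$count bytes filtered\n") if $filter->is_EOS;
    };
}

## Two independent instances of the "same" filter.
my ( $h1, $h2 ) = ( make_counter(), make_counter() );
my $f1 = Mock::Filter->new( "abc\n", "de\n" );
my $f2 = Mock::Filter->new("xyz\n");
$h1->($f1);
$h2->($f2);
print $f1->{out}, $f2->{out};
```

Because each handler owns its $count, the two instances report 7 and 4
bytes respectively rather than trampling a shared total.  An "on_add"
hook would be the natural place to create that per-instance state.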

Adding an Apache::Filter::send_EOS() would also allow a filter to purge
Apache's filter chain to a network buffer and hang around doing cleanup
or consuming more input.

Now for the OO insanity...

A natural way to accommodate both the BOS/EOS and internal state would be
to allow Apache::Filter *instances* to be subclassed.  Here's an example:

    package Algae::Filter ;

    use base qw(Apache::Filter) ;

    sub new {
        ## Called manually (see below) before manually calling
        ## ap_add_{in,out}put_filter() or just before handle_BOS()
        ## is called if a filter named "Algae::Filter" is added
        ## using, say SetOutputFilter.
        my $proto = shift ;
        my $class = ref $proto || $proto ;

        my $self = $class->SUPER::new() ;

        ## ... init internal state ...

        return $self ;
    }

    sub handle_BOS {
        my $self = shift ;

        ## Called when the first bucket brigade arrives, just before
        ## the first call to handler().

        return APR_SUCCESS ;  ## Or not...
    }

    sub handler: AP_FTYPE_FOO, MP_INPUT_FILTER {
        ## Defaults to AP_FTYPE_CONTENT and MP_OUTPUT_FILTER
        ## Might we call this handle_content()???
        my $self = shift ;
        my ( $bb ) = @_ ;  ## Optional

        ## ... process input, possibly send EOS ...
        return APR_SUCCESS ;
    }

    sub handle_EOS {
        my $self = shift ;
        ## Called after the last call to handler().
        return APR_SUCCESS ;
    }

Two common ways of adding this type of filter would be by using the
SetOutputFilter directive or by manually calling ap_add_output_filter().  

In the SetOutputFilter case, nothing would happen until the first bucket
brigade arrived at modperl_output_filter_handler()'s gate.  At that time
mod_perl would notice that no object had been built and would try to
call Algae::Filter->new(), setting the ctx->perl_ctx field to its
result.

If no new() existed, perl_ctx would not be set.  In this case the
Apache::Filter would be passed in to the other subs as-is.  This is like
today's behavior, and would be what happens for simple cases.

After calling new(), mod_perl would try to call handle_BOS(), passing in
either perl_ctx (ie $self) or the plain ol' Apache::Filter depending on
whether new() returned anything.

Then, handler() would get called with the first arg ($self or the
Apache::Filter) and the first bucket brigade. It would
also be called for each additional bucket brigade that contained data
(other than an EOS).

When the EOS-tailed bucket brigade arrived, handler() would only be
called if the brigade had additional data before the EOS, and then
handle_EOS() would be called with the same first arg (and no bucket
brigade).  If the filter didn't send_EOS() at some point manually,
mod_perl would now send the EOS.

The default Apache::Filter::handler would be a pass-through.

The reason for separate BOS and EOS handlers is to support
specialization through inheritance.  If we don't want to burden all
bucket brigade filters this way, then perhaps we can use an adapter
class that provides a handler and calls (handle_BOS(), handle_content(),
and handle_EOS()) based on the new Apache::Filter::is_BOS() and
...::is_EOS() accessors.
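
A rough sketch of such an adapter follows, runnable outside Apache.  All
package names are made up for illustration, the mock filter only fakes
the proposed is_BOS()/is_EOS() accessors, brigades are modelled as plain
strings, and 0 stands in for APR_SUCCESS.

```perl
use strict;
use warnings;

## Hypothetical adapter base class: one handler() entry point that fans
## out to three overridable events.
{
    package Sketch::FilterAdapter;

    sub new { my $class = shift; return bless { log => [] }, $class }

    ## $f is any object providing is_BOS()/is_EOS(); $bb is the "brigade".
    sub handler {
        my ( $self, $f, $bb ) = @_;
        $self->handle_BOS if $f->is_BOS;
        $self->handle_content($bb) if defined $bb && length $bb;
        $self->handle_EOS if $f->is_EOS;
        return 0;    ## stands in for APR_SUCCESS
    }

    ## Default no-op events, meant to be overridden by subclasses.
    sub handle_BOS     { return 0 }
    sub handle_content { return 0 }
    sub handle_EOS     { return 0 }
}

## A subclass that simply records which events it saw.
{
    package Sketch::Logger;
    our @ISA = ('Sketch::FilterAdapter');

    sub handle_BOS     { push @{ $_[0]{log} }, 'BOS';           return 0 }
    sub handle_content { push @{ $_[0]{log} }, "content:$_[1]"; return 0 }
    sub handle_EOS     { push @{ $_[0]{log} }, 'EOS';           return 0 }
}

## Mock of the filter object with the proposed accessors.
{
    package Sketch::MockFilter;
    sub new    { my ( $class, %flags ) = @_; return bless {%flags}, $class }
    sub is_BOS { return $_[0]{bos} ? 1 : 0 }
    sub is_EOS { return $_[0]{eos} ? 1 : 0 }
}

my $filter = Sketch::Logger->new;
$filter->handler( Sketch::MockFilter->new( bos => 1 ), 'first' );
$filter->handler( Sketch::MockFilter->new(),           'middle' );
$filter->handler( Sketch::MockFilter->new( eos => 1 ), '' );
print join( ',', @{ $filter->{log} } ), "\n";
```

Filters that don't care about stream boundaries can skip the adapter and
keep a plain handler; only subclasses of the adapter pay for the extra
dispatch.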

NOTE: neither handle_BOS() nor handle_EOS() would be passed the bucket
brigade.  All three event handlers would get $self as the first param
(or the default Apache::Filter object if new() was not called).

The second case is where user code wants to call ap_add_xxx_filter() to
add a filter.  Here's how that might look:

   my $f = Algae::Filter->new( ...parms... ) ;
   $r->add_output_filter( $f ) ;  ## or $c->...

This would use the blessed object of the filter as the name of the
filter (which could be quickly registered if not found) and the
reference in $f as the $perl_ctx.

When modperl_output_filter_handler() then gets the first bucket brigade,
it'll see the perl_ctx is present and not call new.  It'll pass in the
perl_ctx as above to the other three functions in the same way.
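
The two paths can be simulated in plain Perl.  This is only a sketch:
run_filter() is a hypothetical stand-in for the dispatch that
modperl_output_filter_handler() would do, brigades are plain strings, an
empty string models a brigade holding nothing but the EOS, and the "no
new() exists" fallback is left out for brevity.

```perl
use strict;
use warnings;

my @events;    ## records the order in which callbacks fire

## Illustrative filter class; the tag shows which path built the object.
{
    package Sketch::Algae;

    sub new {
        my ( $class, $tag ) = @_;
        return bless { tag => defined $tag ? $tag : 'lazy' }, $class;
    }

    sub handle_BOS { push @events, "BOS($_[0]{tag})"; return 0 }
    sub handler    { push @events, "data($_[1])";     return 0 }
    sub handle_EOS { push @events, "EOS($_[0]{tag})"; return 0 }
}

## $spec is either a class name (the SetOutputFilter case) or a prebuilt
## object (the add_output_filter( $f ) case).
sub run_filter {
    my ( $spec, @brigades ) = @_;

    ## Build the context lazily only if the caller did not supply one.
    my $ctx = ref $spec ? $spec : $spec->new;

    $ctx->handle_BOS;
    for my $bb (@brigades) {
        $ctx->handler($bb) if length $bb;    ## skip an EOS-only brigade
    }
    $ctx->handle_EOS;
    return $ctx;
}

run_filter( 'Sketch::Algae', 'a', 'b' );
run_filter( Sketch::Algae->new('prebuilt'), 'c', '' );
print join( ' ', @events ), "\n";
```

Either way the three callbacks see the same object, so a filter's code
does not need to know how it was added.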

Areas for improvement:
   - Perhaps, when no new() is found and isa( "Algae::Filter",
     "Apache::Filter" ) is true, the Apache::Filter object passed to the
     handlers should be blessed into "Algae::Filter" to allow easier
     method calls.  There's a large number of filters that would need no
     new()ing and this little trick would keep users from having to
     write new() just to call SUPER::new().
   - It would be nice to be able to set/get a context without needing to
     go OO.  Not sure if an Apache::Filter accessor would do the trick
     or if it would be better to pass in an optional third arg to the
     handlers.
   - We don't expose enough flexibility for a filter writer to say
     AP_FTYPE_CONTENT+1 or AP_FTYPE_CONTENT-1 to bias a filter towards
     either end of a content chain.

Wow, that's a lot of blather.  Sorry.

Comments?

- Barrie

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> That way the $r->add_xxx_filter( $name, $ctx ) DWIMs what to do with
> $ctx.

sounds good.
 
> What do you think of enabling Apache::Filter instance subclassing, given
> that filters have to inherit from it anyway to get the CODE attrs
> working?
> 
> Seems like a $r->add_xxx_filter( new Foo::Filter ) is a very simple
> interface.  It's a way of hiding the $ctx and $name in the same blessed
> reference and getting all the context to "just work" using well
> understood Perl OO semantics.  It'd probably mean giving $f some magic
> to make it drive like a HASH ref for the subclasses.

that would be cool!




Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 06:30:03PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> >    Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )
> 
> you mean ${r,c}->add_output_filter("foo", $ctx), right?

Yeah. Getting late...

> > where Foo is not related to mod_perl, you better be passing an integer
> > in $ctx which is a pointer to some C struct you cooked up in XS that
> > filter foo is expecting.  This would allow mod_perl interfaces to C
> > level filters.
> 
> ah, ok.  $ctx would default to NULL, most of the filters check if their
> context == NULL, and create a new one if so.  hmm, maybe that would also
> be a good way to tell when the first brigade arrives.

Yeah, I think it comes free if we add a

   ((mod_filter_ctx *)f->ctx)->read_count
   
which (since it's calloced) defaults to 0, and which $filter->tell returns.

What I was after was twofold: being able to pass a Perl structure in
$ctx, which would be put in f->ctx->data, or being able to pass a simple
integer value, which would be put in f->ctx, preventing the creation
of the modperl_filter_ctx_t that normally goes in f->ctx, so that
manipulation of external, non-mod_perl filters is possible.

I guess one thought is to see if the named filter is a mod_perl filter,
and create the modperl_filter_ctx_t only if it is, then stuff $ctx in the
modperl_filter_ctx_t's data slot.  $ctx could be any kind of SV in this
case.

If it's not a modperl filter, then $ctx can just be treated as an int
and put in f->ctx.  If $ctx is not an SV that cleanly turns in to an
int, it could croak().

That way the $r->add_xxx_filter( $name, $ctx ) DWIMs what to do with
$ctx.
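
As a rough illustration of that DWIM rule, the decision could look like
the following in plain Perl.  classify_ctx() and %is_perl_filter are
hypothetical names used only for this sketch; the real work would happen
in XS against f->ctx.

```perl
use strict;
use warnings;
use Carp qw(croak);

## Hypothetical registry of filter names that are implemented in Perl.
my %is_perl_filter = ( 'Algae::Filter' => 1 );

## Returns a hash describing where $ctx would be stored; illustrative only.
sub classify_ctx {
    my ( $name, $ctx ) = @_;

    ## mod_perl filter: any SV is acceptable and goes in the data slot.
    return { slot => 'f->ctx->data', value => $ctx } if $is_perl_filter{$name};

    ## Foreign (C) filter with no context: default to NULL.
    return { slot => 'f->ctx', value => 0 } unless defined $ctx;

    ## Otherwise only a plain non-negative integer (a C pointer) will do.
    croak "context for '$name' must be an integer"
        if ref $ctx || $ctx !~ /\A[0-9]+\z/;

    return { slot => 'f->ctx', value => 0 + $ctx };
}

my $perl  = classify_ctx( 'Algae::Filter', { count => 0 } );
my $c     = classify_ctx( 'foo', 4096 );
my $error = eval { classify_ctx( 'foo', 'not a pointer' ); 1 } ? '' : $@;
print "$perl->{slot} $c->{slot} $c->{value}\n";
```

Croaking early on a non-integer context keeps a stray Perl reference
from being handed to a C filter as if it were a pointer.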

What do you think of enabling Apache::Filter instance subclassing, given
that filters have to inherit from it anyway to get the CODE attrs
working?

Seems like a $r->add_xxx_filter( new Foo::Filter ) is a very simple
interface.  It's a way of hiding the $ctx and $name in the same blessed
reference and getting all the context to "just work" using well
understood Perl OO semantics.  It'd probably mean giving $f some magic
to make it drive like a HASH ref for the subclasses.

- Barrie



Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
>    Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )

you mean ${r,c}->add_output_filter("foo", $ctx), right?

> where Foo is not related to mod_perl, you better be passing an integer
> in $ctx which is a pointer to some C struct you cooked up in XS that
> filter foo is expecting.  This would allow mod_perl interfaces to C
> level filters.

ah, ok.  $ctx would default to NULL, most of the filters check if their
context == NULL, and create a new one if so.  hmm, maybe that would also
be a good way to tell when the first brigade arrives.




Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 03:52:31PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> > We're on the same page, but right now $ctx in the $r->add_xxx_filter(
> > "foo", $ctx) is written to the f->ctx field that the modperl_filter_t
> > comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
> > to carry Perl context and mapping the $ctx in that call to
> > (modperl_filter_t *f->ctx)->perl_ctx
> 
> right, there is a slot already reserved for that:
> typedef struct {
> --->SV *data;

Yeah, figured that's what data was for.

> > s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
> > really have seek(), since you might be on the 5th of 10 brigades, with
> > the first four already sent downstream.
> 
> right, unless we have the module in the middle that collects all the
> brigades into one.

Yup.

> the Perl-level context (set with ${r,c}->add_xxx_filter) should live in
> modperl_filter_ctx_t.data, would be fine to rename data perl_ctx or
> whatever.
> not sure what you mean by Apache::ap_add_filter ... C pointer part ?

Well, if you call

   Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )

where Foo is not related to mod_perl, you better be passing an integer
in $ctx which is a pointer to some C struct you cooked up in XS that
filter foo is expecting.  This would allow mod_perl interfaces to C
level filters.

- Barrie



Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> BOS=Beginning Of Stream, like EOS=End Of Stream (I assume)

right, stream, not string.
 
> I didn't see any flags in the filter, filter chain, bucket brigade or
> bucket structures/APIs that indicate that this is the first brigade.
> Normally I think stateful filters just init the context before passing
> it in to ap_add_xxx_filter().

there is a macro to test that a bucket is first in the brigade, but i
don't know of anything to check that a brigade is the first brigade in the
chain.
 
> I might throw in a filter_init callback though and see if they like it.
> that solves a lot of these problems nicely, I think and extends the
> SetOutputFilter directive to enable stateful filter init.

sounds good.
 
> We're on the same page, but right now $ctx in the $r->add_xxx_filter(
> "foo", $ctx) is written to the f->ctx field that the modperl_filter_t
> comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
> to carry Perl context and mapping the $ctx in that call to
> (modperl_filter_t *f->ctx)->perl_ctx

right, there is a slot already reserved for that:
typedef struct {
--->SV *data;
    modperl_handler_t *handler;
    PerlInterpreter *perl;
} modperl_filter_ctx_t;

not hooked up yet though.

> Hey, consistency is good, I just want the functionality.

i hear that.
 
> s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
> really have seek(), since you might be on the 5th of 10 brigades, with
> the first four already sent downstream.

right, unless we have the module in the middle that collects all the
brigades into one.
 
> > i have been planning to do the implementation, but if you want to beat me
> > to it, that's fine :)
> 
> I'll take a swing at it next week.  I can easily add the $f->eof and
> $f->tell().  Let me know what you want to do about passing a Perl
> context in to $r->add_xxx_filter() (ie do you want to retain the ability
> to set the f->ctx field to a C level pointer, and/or do you want to add
> a perl_ctx).  Perhaps the ${r,c}->add_xxx_filter()s should take a perl
> context and the Apache::ap_add_filter( .... ) should take a C pointer.

the Perl-level context (set with ${r,c}->add_xxx_filter) should live in
modperl_filter_ctx_t.data, would be fine to rename data perl_ctx or
whatever.
not sure what you mean by Apache::ap_add_filter ... C pointer part ?




Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 02:31:38PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> > I think that two things are necessary to enable a pure Perl
> > implementation:
> >    - adding is_BOS() and is_EOS() to Apache::Filter (or to
> >      Apache::{Brigade,Bucket}), and
> 
> if you really need to know bos (beginning of string?) and eos, you can use

BOS=Beginning Of Stream, like EOS=End Of Stream (I assume)

> the brigade/bucket api, rather than the mod_perl read/print filter
> methods.  i realize it might be tricky in the current state, since filter
> handlers can be called more than once per request.

I didn't see any flags in the filter, filter chain, bucket brigade or
bucket structures/APIs that indicate that this is the first brigade.
Normally I think stateful filters just init the context before passing
it in to ap_add_xxx_filter().

I might throw in a filter_init callback though and see if they like it.
that solves a lot of these problems nicely, I think and extends the
SetOutputFilter directive to enable stateful filter init.

> >    - allowing some kind of Perl scalar (including refs) to be passed in
> >      to ap_add_xxx_filter() and then in to the handler sub.
> 
> shouldn't the filter context provide this?

We're on the same page, but right now $ctx in the $r->add_xxx_filter(
"foo", $ctx) is written to the f->ctx field that the modperl_filter_t
comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
to carry Perl context and mapping the $ctx in that call to
(modperl_filter_t *f->ctx)->perl_ctx

> > Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?
> 
> would be cool to have seek(),
> truncate() and similar stdio methods map to the brigade/bucket interface
> underneath.  not to say its out of the question, but mixing methods like
> is_eos() with the stdio-like/stream-like interface doesn't feel right.

Hey, consistency is good, I just want the functionality.

s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
really have seek(), since you might be on the 5th of 10 brigades, with
the first four already sent downstream.
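
For a feel of how the stream-style spelling would read, here is a small
mock.  Sketch::Stream is a hypothetical class, not mod_perl's
Apache::Filter; its tell() and eof() merely imitate the proposed
methods, and feed() is an illustrative helper that plays the part of
brigades arriving.

```perl
use strict;
use warnings;

## Hypothetical stream-style filter mock; NOT the real mod_perl API.
{
    package Sketch::Stream;

    sub new {
        my $class = shift;
        return bless { pending => [], read_count => 0, eos => 0, out => '' },
            $class;
    }

    ## Queue one "brigade"; a true $last marks it as carrying the EOS.
    sub feed {
        my ( $self, $data, $last ) = @_;
        push @{ $self->{pending} }, $data;
        $self->{eos} = 1 if $last;
        return $self;
    }

    ## Bytes consumed so far; zero means nothing has been read yet.
    sub tell { return $_[0]{read_count} }

    ## True once the EOS has been seen and no input remains.
    sub eof { return $_[0]{eos} && !@{ $_[0]{pending} } ? 1 : 0 }

    ## Copies the next chunk into the caller's buffer; false when drained.
    sub read {
        my $self = shift;
        return 0 unless @{ $self->{pending} };
        $_[0] = shift @{ $self->{pending} };
        $self->{read_count} += length $_[0];
        return length $_[0];
    }

    sub print { my $self = shift; $self->{out} .= join '', @_; return 1 }
}

## Called once per brigade, like a real filter handler.
sub handler {
    my $f = shift;
    $f->print("header\n") unless $f->tell;    ## replaces is_BOS()
    while ( $f->read( my $buffer ) ) {
        $f->print( uc $buffer );
    }
    $f->print( $f->tell . " bytes filtered\n" ) if $f->eof;    ## replaces is_EOS()
    return 0;
}

my $f = Sketch::Stream->new;
$f->feed("ab\n");
handler($f);
$f->feed( "cd\n", 1 );
handler($f);
print $f->{out};
```

Because the read count lives in the filter's context, !tell() stays
false on the second invocation, so the header is only written once.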

> i have been planning to do the implementation, but if you want to beat me
> to it, that's fine :)

I'll take a swing at it next week.  I can easily add the $f->eof and
$f->tell().  Let me know what you want to do about passing a Perl
context in to $r->add_xxx_filter() (ie do you want to retain the ability
to set the f->ctx field to a C level pointer, and/or do you want to add
a perl_ctx).  Perhaps the ${r,c}->add_xxx_filter()s should take a perl
context and the Apache::ap_add_filter( .... ) should take a C pointer.

- Barrie



Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> I think that two things are necessary to enable a pure Perl
> implementation:
>    - adding is_BOS() and is_EOS() to Apache::Filter (or to
>      Apache::{Brigade,Bucket}), and

if you really need to know bos (beginning of string?) and eos, you can use
the brigade/bucket api, rather than the mod_perl read/print filter
methods.  i realize it might be tricky in the current state, since filter
handlers can be called more than once per request.

>    - allowing some kind of Perl scalar (including refs) to be passed in
>      to ap_add_xxx_filter() and then in to the handler sub.

shouldn't the filter context provide this?

> Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?

i'd rather beef up the stream-like interface (currently
$filter->{read,print}), so filters which use that interface don't need to
know things like bos and eos.  would be cool to have seek(),
truncate() and similar stdio methods map to the brigade/bucket interface
underneath.  not to say its out of the question, but mixing methods like
is_eos() with the stdio-like/stream-like interface doesn't feel right.

> The is_BOS() would be nice for symmetry (lack of which is one of my
> beefs about both bucket brigades and filter chains as they now exist in
> apache).

if this is something that will fit in the current api, i'm sure the group
would consider adding this functionality.
 
> Providing some Perl context structure passing mechanism is a must-have,
> too, given that the same filter may be installed several times.

agreed.

> Providing this in the right way could also obviate the need for an XS
> level is_BOS().
> 
> Allowing Apache::Filter instance subclassing would be really nice, but
> just the above should be fine.
> 
> Do you want me to take a swipe at the above (don't want to duplicate
> work you're planning on doing)?

i have been planning to do the implementation, but if you want to beat me
to it, that's fine :)





Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 11:37:01AM -0700, Doug MacEachern wrote:
>  
> i think we should focus on getting the rest of the direct C api
> mapping done, which we want in any case.  with that in place, you
> should be able to prototype the approach you outlined here as a pure
> Perl module.

Yeah, I mentioned doing it as a subclass of Apache::Filter in that email
somewhere.

I think that two things are necessary to enable a pure Perl
implementation:
   - adding is_BOS() and is_EOS() to Apache::Filter (or to
     Apache::{Brigade,Bucket}), and
   - allowing some kind of Perl scalar (including refs) to be passed in
     to ap_add_xxx_filter() and then in to the handler sub.

Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?

The is_BOS() would be nice for symmetry (lack of which is one of my
beefs about both bucket brigades and filter chains as they now exist in
apache).

Providing some Perl context structure passing mechanism is a must-have,
too, given that the same filter may be installed several times.
Providing this in the right way could also obviate the need for an XS
level is_BOS().

Allowing Apache::Filter instance subclassing would be really nice, but
just the above should be fine.

Do you want me to take a swipe at the above (don't want to duplicate
work you're planning on doing)?

- Barrie



Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> Adding a Apache::Filter::send_EOS() would also allow a filter to purge

APR::{Brigade,Bucket} apis already provide this.  i don't want to add
methods to Apache::Filter that exist elsewhere.

> Wow, that's a lot of blather.  Sorry.

that is a lot to digest, i've only skimmed it.  i do not want to make
this the default behavior for mod_perl filters, at least not just yet.  i
think we should focus on getting the rest of the direct C api mapping
done, which we want in any case.  with that in place, you should be able
to prototype the approach you outlined here as a pure Perl module.

