Posted to dev@perl.apache.org by barries <ba...@slaysys.com> on 2001/05/24 19:53:13 UTC

[PATCH] A hack at "first class filters"

Here's a quick hack that exposes a per-server PerlFilter directive.  It
calls ap_register_output_filter() for each name it is given, pairing the
name with a cobbled-together modperl_perl_filter_handler().

I can now enable a Perl filter using SetOutputFilter or in a
PerlFixupHandler (in lieu of a PerlFilterRegisterHandler) using
$r->add_output_filter() and things work "noicly":

   PerlSwitches -MApache2::TestOutputFilter
   PerlFilter Apache2::TestOutputFilter

   <LocationMatch /ftest/.*>
       SetOutputFilter Apache2::TestOutputFilter
   </LocationMatch>

or

   sub fixup_handler {
       my $r = shift ;
       # Note: the C<0> is needed since we're (for now) passing a C-level
       # NULL pointer to be put in the filter's ctx slot (f->ctx).
       $r->add_output_filter( "Apache2::TestOutputFilter", 0 ) ;
   }

Some thoughts:
  - All this is also possible for input filters, but less fun so I
    skipped it, and some naming tweaks would need to take place.
  - The filter names are queued in scfg->PerlFilter so that (someday)
    PerlModule or PerlRequire can glance at the list to see what to
    load.
  - I'm wondering if PerlFilter should be a special case of
    PerlOutputFilter when it's seen at RSRC_CONF scope?
  - An alternative is to just ap_register_xxx_filter() when code is
    loaded (due to PerlSwitches -M or PerlModule/PerlRequire commands),
    based on CODE attrs:

       use base qw(Apache::Filter);

       sub my_output_filter: AP_FTYPE_CONTENT {
       }

       sub my_input_filter: mp_input_filter, AP_FTYPE_CONTENT {
       }

    They'd get registered when the CODE attrs get handled by
    Apache::Filter.
  - This could replace current Perl{Input,Output}Filter functionality.
    As far as I can see, the overhead is the creation of a
    modperl_filter_ctx_t and a modperl_handler_t for every filter call.
  - The $r->add_output_filter() functionality is cool. I really, really
    want to use this in a piece of code I'm dying to work on.
  - It seems like it would be easy to add the ->is_bos and ->is_eos to
    the modperl_filter_ctx_t in this scheme and pass it to handlers as
    a hash.
  - Seems like the current filter code leaks a modperl_filter_t every
    filter call (which gets cleaned up in pool cleanup, but still...) 

Barrie

P.S. This applies on top of the more recent EOS patch, and should apply
without the EOS patch with some fuzz, I think.

--- ../modperl-2.0/./src/modules/perl/modperl_filter.c	Thu May 24 10:55:51 2001
+++ ./src/modules/perl/modperl_filter.c	Thu May 24 13:40:11 2001
@@ -372,6 +372,44 @@
     }
 }
 
+apr_status_t modperl_perl_filter_handler(ap_filter_t *f,
+                                         apr_bucket_brigade *bb)
+{
+    request_rec          *r = f->r;
+    modperl_handler_t    *handler ;
+    modperl_filter_ctx_t *ctx ;
+    modperl_filter_t     *filter;
+    int status;
+
+    if (APR_BRIGADE_IS_EOS(bb)) {
+        /* XXX: see about preventing this in the first place */
+        MP_TRACE_f(MP_FUNC, "first bucket is EOS, skipping callback\n");
+        return ap_pass_brigade(f->next, bb);
+    }
+    else {
+        if (!f->ctx) {
+            ctx = (modperl_filter_ctx_t *)apr_pcalloc(r->pool, sizeof(*ctx));
+            f->ctx = ctx ;
+
+            handler = modperl_handler_new(r->pool,f->frec->name);
+            ctx->handler = handler ;
+        }
+
+        filter = modperl_filter_new(f, bb, MP_OUTPUT_FILTER_MODE);
+        status = modperl_run_filter(filter, 0, 0);
+        modperl_output_filter_send_EOS(filter);
+    }
+
+    switch (status) {
+      case OK:
+        return APR_SUCCESS;
+      case DECLINED:
+        return ap_pass_brigade(f->next, bb);
+      default:
+        return status; /*XXX*/
+    }
+}
+
 apr_status_t modperl_input_filter_handler(ap_filter_t *f,
                                           apr_bucket_brigade *bb,
                                           ap_input_mode_t mode,
--- ../modperl-2.0/./src/modules/perl/modperl_filter.h	Tue May 22 11:07:28 2001
+++ ./src/modules/perl/modperl_filter.h	Thu May 24 12:18:43 2001
@@ -28,6 +28,10 @@
 int modperl_run_filter(modperl_filter_t *filter, ap_input_mode_t mode,
                        apr_size_t *readbytes);
 
+/* "perl" filters: added by httpd, so no context when handler called */
+apr_status_t modperl_perl_filter_handler(ap_filter_t *f,
+                                           apr_bucket_brigade *bb);
+
 /* output filters */
 apr_status_t modperl_output_filter_handler(ap_filter_t *f,
                                            apr_bucket_brigade *bb);
--- ../modperl-2.0/./src/modules/perl/mod_perl.c	Thu May 10 14:48:01 2001
+++ ./src/modules/perl/mod_perl.c	Thu May 24 11:16:48 2001
@@ -323,6 +323,7 @@
 static const command_rec modperl_cmds[] = {  
     MP_CMD_SRV_ITERATE("PerlSwitches", switches, "Perl Switches"),
     MP_CMD_DIR_ITERATE("PerlOptions", options, "Perl Options"),
+    MP_CMD_SRV_ITERATE("PerlFilter", filters, "Perl Filters (visible to Apache)"),
 #ifdef MP_TRACE
     MP_CMD_SRV_TAKE1("PerlTrace", trace, "Trace level"),
 #endif
--- ../modperl-2.0/./src/modules/perl/modperl_cmd.c	Thu Apr  5 22:18:15 2001
+++ ./src/modules/perl/modperl_cmd.c	Thu May 24 12:19:06 2001
@@ -67,6 +67,19 @@
     return NULL;
 }
 
+MP_CMD_SRV_DECLARE(filters)
+{
+    MP_dSCFG(parms->server);
+    *(const char **)apr_array_push(scfg->PerlFilter) = arg ;
+    /* XXX: Need to allow a cmd arg or CODE attr to specify AP_FTYPE_... */
+    ap_register_output_filter(
+        arg,
+        modperl_perl_filter_handler,
+        AP_FTYPE_CONTENT
+    );
+    return NULL;
+}
+
 #ifdef USE_ITHREADS
 
 #define MP_INTERP_SCOPE_USAGE "PerlInterpScope must be one of "
--- ../modperl-2.0/./src/modules/perl/modperl_cmd.h	Thu Apr  5 22:18:15 2001
+++ ./src/modules/perl/modperl_cmd.h	Thu May 24 11:21:01 2001
@@ -11,6 +11,7 @@
 MP_CMD_SRV_DECLARE(trace);
 MP_CMD_SRV_DECLARE(switches);
 MP_CMD_SRV_DECLARE(options);
+MP_CMD_SRV_DECLARE(filters);
 
 #ifdef USE_ITHREADS
 MP_CMD_SRV_DECLARE(interp_start);
--- ../modperl-2.0/./src/modules/perl/modperl_types.h	Thu May 10 14:48:17 2001
+++ ./src/modules/perl/modperl_types.h	Thu May 24 11:33:33 2001
@@ -112,7 +112,7 @@
 typedef struct {
     MpHV *SetVars;
     MpAV *PassEnv;
-    MpAV *PerlRequire, *PerlModule;
+    MpAV *PerlRequire, *PerlModule, *PerlFilter;
     MpAV *handlers_per_srv[MP_HANDLER_NUM_PER_SRV];
     MpAV *handlers_files[MP_HANDLER_NUM_FILES];
     MpAV *handlers_process[MP_HANDLER_NUM_PROCESS];
--- ../modperl-2.0/./src/modules/perl/modperl_config.c	Thu May 10 14:48:07 2001
+++ ./src/modules/perl/modperl_config.c	Thu May 24 11:51:06 2001
@@ -71,6 +71,7 @@
     MpSrvENABLED_On(scfg); /* mod_perl enabled by default */
     MpSrvHOOKS_ALL_On(scfg); /* all hooks enabled by default */
 
+    scfg->PerlFilter = apr_array_make(p, 2, sizeof(char *));
     scfg->argv = apr_array_make(p, 2, sizeof(char *));
 
     modperl_config_srv_argv_push((char *)ap_server_argv0);

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> That way the $r->add_xxx_filter( $name, $ctx ) DWIMs what to do with
> $ctx.

sounds good.
 
> What do you think of enabling Apache::Filter instance subclassing, given
> that filters have to inherit from it anyway to get the CODE attrs
> working?
> 
> Seems like a $r->add_xxx_filter( new Foo::Filter ) is a very simple
> interface.  It's a way of hiding the $ctx and $name in the same blessed
> reference and getting all the context to "just work" using well
> understood Perl OO semantics.  It'd probably mean giving $f some magic
> to make it drive like a HASH ref for the subclasses.

that would be cool!


---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 06:30:03PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> >    Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )
> 
> you mean ${r,c}->add_output_filter("foo", $ctx), right?

Yeah. Getting late...

> > where Foo is not related to mod_perl, you better be passing an integer
> > in $ctx which is a pointer to some C struct you cooked up in XS that
> > filter foo is expecting.  This would allow mod_perl interfaces to C
> > level filters.
> 
> ah, ok.  $ctx would default to NULL, most of the filters check if their
> context == NULL, and create a new one if so.  hmm, maybe that would also
> be a good way to tell when the first brigade arrives.

Yeah, I think it comes free if we add a

   ((modperl_filter_ctx_t *)f->ctx)->read_count
   
which (since it's calloced) defaults to 0, and which $filter->tell returns.
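
A rough C sketch of that idea, for illustration only.  It assumes a
stripped-down stand-in (sketch_filter_ctx) rather than the real
modperl_filter_ctx_t, and the helper names sketch_ctx_new(),
sketch_note_read(), sketch_tell() and sketch_is_first() are made up
here.  Because calloc() zero-fills, a fresh context reports 0 until
something has been read:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for modperl_filter_ctx_t; only the counter matters. */
typedef struct {
    void  *data;        /* would hold the Perl-level context */
    size_t read_count;  /* bytes read so far, across all brigades */
} sketch_filter_ctx;

/* Allocate a zero-filled context, much as apr_pcalloc() would. */
static sketch_filter_ctx *sketch_ctx_new(void)
{
    return calloc(1, sizeof(sketch_filter_ctx));
}

/* Record that len bytes were consumed from the current brigade. */
static void sketch_note_read(sketch_filter_ctx *ctx, size_t len)
{
    assert(ctx != NULL);
    ctx->read_count += len;
}

/* The value a Perl-level $filter->tell could return. */
static size_t sketch_tell(const sketch_filter_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->read_count;
}

/* "Beginning of stream" is then simply "nothing read yet". */
static int sketch_is_first(const sketch_filter_ctx *ctx)
{
    return sketch_tell(ctx) == 0;
}
```

One consequence of this approach: a filter that is handed an empty
first brigade would still see tell() == 0 on the next call, so "first
brigade" really means "no data consumed yet".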

What I was after was twofold: being able to pass a Perl structure in
$ctx, which would be put in f->ctx->data, or being able to pass a simple
integer value, which would be put in f->ctx, preventing the creation
of the modperl_filter_ctx_t that normally goes in f->ctx, so that
manipulation of external, non-mod_perl filters is possible.

I guess one thought is to see if the named filter is a mod_perl filter,
and create the modperl_filter_ctx_t only if it is, then stuff $ctx in the
modperl_filter_ctx_t's data slot.  $ctx could be any kind of SV in this
case.

If it's not a modperl filter, then $ctx can just be treated as an int
and put in f->ctx.  If $ctx is not an SV that cleanly turns in to an
int, it could croak().

That way the $r->add_xxx_filter( $name, $ctx ) DWIMs what to do with
$ctx.
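
The decision described above could be sketched in C roughly as
follows.  This is an illustration under assumptions, not mod_perl code:
sketch_ctx_arg, sketch_filter_ctx, sketch_filter and
sketch_attach_ctx() are hypothetical names, the "is this a mod_perl
filter" test is reduced to a flag, and a -1 return marks the spot where
the Perl-facing wrapper would croak().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical view of the $ctx argument once it crosses into C. */
typedef struct {
    void    *sv;             /* opaque stand-in for the Perl scalar (an SV *) */
    int      looks_like_int; /* nonzero if the scalar converts cleanly to an int */
    intptr_t int_value;      /* the converted value, valid when looks_like_int */
} sketch_ctx_arg;

/* Hypothetical stand-in for modperl_filter_ctx_t. */
typedef struct {
    void *data;              /* Perl-level context */
} sketch_filter_ctx;

/* Hypothetical stand-in for ap_filter_t; only the ctx slot is modelled. */
typedef struct {
    void *ctx;
} sketch_filter;

/*
 * Decide what lands in f->ctx.  Returns 0 on success and -1 where the
 * Perl-facing wrapper would croak().
 */
static int sketch_attach_ctx(sketch_filter *f, int is_modperl_filter,
                             const sketch_ctx_arg *arg)
{
    assert(f != NULL && arg != NULL);

    if (is_modperl_filter) {
        /* mod_perl filter: wrap the scalar, whatever kind it is. */
        sketch_filter_ctx *ctx = calloc(1, sizeof(*ctx));
        if (ctx == NULL) {
            return -1;
        }
        ctx->data = arg->sv;
        f->ctx = ctx;
        return 0;
    }

    /* Foreign (C-level) filter: only a clean integer may be passed through. */
    if (!arg->looks_like_int) {
        return -1;
    }
    f->ctx = (void *)arg->int_value;
    return 0;
}
```

In the foreign-filter branch no wrapper struct is allocated at all, so
the C filter sees exactly the pointer value the caller supplied.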

What do you think of enabling Apache::Filter instance subclassing, given
that filters have to inherit from it anyway to get the CODE attrs
working?

Seems like a $r->add_xxx_filter( new Foo::Filter ) is a very simple
interface.  It's a way of hiding the $ctx and $name in the same blessed
reference and getting all the context to "just work" using well
understood Perl OO semantics.  It'd probably mean giving $f some magic
to make it drive like a HASH ref for the subclasses.

- Barrie

---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
>    Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )

you mean ${r,c}->add_output_filter("foo", $ctx), right?

> where Foo is not related to mod_perl, you better be passing an integer
> in $ctx which is a pointer to some C struct you cooked up in XS that
> filter foo is expecting.  This would allow mod_perl interfaces to C
> level filters.

ah, ok.  $ctx would default to NULL, most of the filters check if their
context == NULL, and create a new one if so.  hmm, maybe that would also
be a good way to tell when the first brigade arrives.


---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 03:52:31PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> > We're on the same page, but right now $ctx in the $r->add_xxx_filter(
> > "foo", $ctx) is written to the f->ctx field that the modperl_filter_t
> > comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
> > to carry Perl context and mapping the $ctx in that call to
> ((modperl_filter_t *)f->ctx)->perl_ctx
> 
> right, there is a slot already reserved for that:
> typedef struct {
> --->SV *data;

Yeah, figured that's what data was for.

> > s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
> > really have seek(), since you might be on the 5th of 10 brigades, with
> the first four already sent downstream.
> 
> right, unless we have the module in the middle that collects all the
> brigades into one.

Yup.

> the Perl-level context (set with ${r,c}->add_xxx_filter) should live in
> modperl_filter_ctx_t.data, would be fine to rename data perl_ctx or
> whatever.
> not sure what you mean by Apache::ap_add_filter ... C pointer part ?

Well, if you call

   Apache::ap_add_output_filter( "foo", AP_FTYPE_CONTENT, $ctx )

where Foo is not related to mod_perl, you better be passing an integer
in $ctx which is a pointer to some C struct you cooked up in XS that
filter foo is expecting.  This would allow mod_perl interfaces to C
level filters.

- Barrie

---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> BOS=Beginning Of Stream, like EOS=End Of Stream (I assume)

right, stream, not string.
 
> I didn't see any flags in the filter, filter chain, bucket brigade or
> bucket structures/APIs that indicate that this is the first brigade.
> Normally I think stateful filters just init the context before passing
> it in to ap_add_xxx_filter().

there is a macro to test that a bucket is first in the brigade, but i
don't know of anything to check that a brigade is the first brigade in the
chain.
 
> I might throw in a filter_init callback though and see if they like it.
> that solves a lot of these problems nicely, I think and extends the
> SetOutputFilter directive to enable stateful filter init.

sounds good.
 
> We're on the same page, but right now $ctx in the $r->add_xxx_filter(
> "foo", $ctx) is written to the f->ctx field that the modperl_filter_t
> comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
> to carry Perl context and mapping the $ctx in that call to
> ((modperl_filter_t *)f->ctx)->perl_ctx

right, there is a slot already reserved for that:
typedef struct {
--->SV *data;
    modperl_handler_t *handler;
    PerlInterpreter *perl;
} modperl_filter_ctx_t;

not hooked up yet though.

> Hey, consistency is good, I just want the functionality.

i hear that.
 
> s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
> really have seek(), since you might be on the 5th of 10 brigades, with
> the first four already sent downstream.

right, unless we have the module in the middle that collects all the
brigades into one.
 
> > i have been planning to do the implementation, but if you want to beat me
> > to it, that's fine :)
> 
> I'll take a swing at it next week.  I can easily add the $f->eof and
> $f->tell().  Let me know what you want to do about passing a Perl
> context in to $r->add_xxx_filter() (ie do you want to retain the ability
> to set the f->ctx field to a C level pointer, and/or do you want to add
> a perl_ctx).  Perhaps the ${r,c}->add_xxx_filter()s should take a perl
> context and the Apache::ap_add_filter( .... ) should take a C pointer.

the Perl-level context (set with ${r,c}->add_xxx_filter) should live in
modperl_filter_ctx_t.data, would be fine to rename data perl_ctx or
whatever.
not sure what you mean by Apache::ap_add_filter ... C pointer part ?


---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 02:31:38PM -0700, Doug MacEachern wrote:
> On Fri, 25 May 2001, barries wrote:
>  
> > I think that two things are necessary to enable a pure Perl
> > implementation:
> >    - adding is_BOS() and is_EOS() to Apache::Filter (or to
> >      Apache::{Brigade,Bucket}), and
> 
> if you really need to know bos (beginning of string?) and eos, you can use

BOS=Beginning Of Stream, like EOS=End Of Stream (I assume)

> the brigade/bucket api, rather than the mod_perl read/print filter
> methods.  i realize it might be tricky in the current state, since filter
> handlers can be called more than once per request.

I didn't see any flags in the filter, filter chain, bucket brigade or
bucket structures/APIs that indicate that this is the first brigade.
Normally I think stateful filters just init the context before passing
it in to ap_add_xxx_filter().

I might throw in a filter_init callback though and see if they like it.
that solves a lot of these problems nicely, I think and extends the
SetOutputFilter directive to enable stateful filter init.

> >    - allowing some kind of Perl scalar (including refs) to be passed in
> >      to ap_add_xxx_filter() and then in to the handler sub.
> 
> shouldn't the filter context provide this?

We're on the same page, but right now $ctx in the $r->add_xxx_filter(
"foo", $ctx) is written to the f->ctx field that the modperl_filter_t
comes to reside in.  So I'm proposing a perl_ctx in the modperl_filter_t
to carry Perl context and mapping the $ctx in that call to
((modperl_filter_t *)f->ctx)->perl_ctx

> > Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?
> 
> would be cool to have seek(),
> truncate() and similar stdio methods map to the brigade/bucket interface
> underneath.  not to say its out of the question, but mixing methods like
> is_eos() with the stdio-like/stream-like interface doesn't feel right.

Hey, consistency is good, I just want the functionality.

s/is_eos()/eof()/ then.  And s/is_BOS()/!tell()/ for that matter.  Can't
really have seek(), since you might be on the 5th of 10 brigades, with
the first four already sent downstream.

> i have been planning to do the implementation, but if you want to beat me
> to it, that's fine :)

I'll take a swing at it next week.  I can easily add the $f->eof and
$f->tell().  Let me know what you want to do about passing a Perl
context in to $r->add_xxx_filter() (ie do you want to retain the ability
to set the f->ctx field to a C level pointer, and/or do you want to add
a perl_ctx).  Perhaps the ${r,c}->add_xxx_filter()s should take a perl
context and the Apache::ap_add_filter( .... ) should take a C pointer.

- Barrie

---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> I think that two things are necessary to enable a pure Perl
> implementation:
>    - adding is_BOS() and is_EOS() to Apache::Filter (or to
>      Apache::{Brigade,Bucket}), and

if you really need to know bos (beginning of string?) and eos, you can use
the brigade/bucket api, rather than the mod_perl read/print filter
methods.  i realize it might be tricky in the current state, since filter
handlers can be called more than once per request.

>    - allowing some kind of Perl scalar (including refs) to be passed in
>      to ap_add_xxx_filter() and then in to the handler sub.

shouldn't the filter context provide this?

> Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?

i'd rather beef up the stream-like interface (currently
$filter->{read,print}), so filters which use that interface don't need to
know things like bos and eos.  would be cool to have seek(),
truncate() and similar stdio methods map to the brigade/bucket interface
underneath.  not to say its out of the question, but mixing methods like
is_eos() with the stdio-like/stream-like interface doesn't feel right.

> The is_BOS() would be nice for symmetry (lack of which is one of my
> beefs about both bucket brigades and filter chains as they now exist in
> apache).

if this is something that will fit in the current api, i'm sure the group
would consider adding this functionality.
 
> Providing some Perl context structure passing mechanism is a must-have,
> too, given that the same filter may be installed several times.

agreed.

> Providing this in the right way could also obviate the need for an XS
> level is_BOS().
> 
> Allowing Apache::Filter instance subclassing would be really nice, but
> just the above should be fine.
> 
> Do you want me to take a swipe at the above (don't want to duplicate
> work you're planning on doing)?

i have been planning to do the implementation, but if you want to beat me
to it, that's fine :)



---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 11:37:01AM -0700, Doug MacEachern wrote:
>  
> i think we should focus on getting the rest of the direct C api
> mapping done, which we want in any case.  with that in place, you
> should be able to prototype the approach you outlined here as a pure
> Perl module.

Yeah, I mentioned doing it as a subclass of Apache::Filter in that email
somewhere.

I think that two things are necessary to enable a pure Perl
implementation:
   - adding is_BOS() and is_EOS() to Apache::Filter (or to
     Apache::{Brigade,Bucket}), and
   - allowing some kind of Perl scalar (including refs) to be passed in
     to ap_add_xxx_filter() and then in to the handler sub.

Would exposing filter->eos be sufficient for Apache::Filter->is_EOS()?

The is_BOS() would be nice for symmetry (lack of which is one of my
beefs about both bucket brigades and filter chains as they now exist in
apache).

Providing some Perl context structure passing mechanism is a must-have,
too, given that the same filter may be installed several times.
Providing this in the right way could also obviate the need for an XS
level is_BOS().

Allowing Apache::Filter instance subclassing would be really nice, but
just the above should be fine.

Do you want me to take a swipe at the above (don't want to duplicate
work you're planning on doing)?

- Barrie

---------------------------------------------------------------------


Re: Bucket Brigade Filters as objects

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> Adding an Apache::Filter::send_EOS() would also allow a filter to purge

APR::{Brigade,Bucket} apis already provide this.  i don't want to add
methods to Apache::Filter that exist elsewhere.

> Wow, that's a lot of blather.  Sorry.

that is a lot to digest, i've only skimmed it.  i do not want to make
this the default behavior for mod_perl filters, at least not just yet.  i
think we should focus on getting the rest of the direct C api mapping
done, which we want in any case.  with that in place, you should be able
to prototype the approach you outlined here as a pure Perl module.


---------------------------------------------------------------------


Bucket Brigade Filters as objects

Posted by barries <ba...@slaysys.com>.
Background: A couple of years ago I wrote a complete Perl input/output
filtering system much like the current Apache filter system and a bunch
of filters for it. I'm hoping to port it to mod_perl-2.0 and let
mod_perl do the heavy lifting (filters are just a pain to manage).

Having done a bunch of filtering stuff, a large number of "real" filters
will need to be cognizant of BOS/middle/EOS (ie external state) issues,
and will occasionally need to manually send EOSs.  Some will also need
to maintain internal state (even if just the tail end of whatever input
they've not processed).  Since the same filter can be added multiple
times and each get different state (as mod_perl-2.0 does today), OO
Perl offers a nice API for filters, though a little magic is required
since Apache filters have only one callback.

NOTE: None of this is about the proposal for higher level
Perl{In,Out}putFilterHandler padding in my previous mail. This is all
only about bucket brigade filters.

To support a one-handler API like today's, Apache::Filter needs is_BOS()
and is_EOS() accessors added.  That way a simple filter can be:

   my $count ;

   sub handler: MP_INPUT_FILTER {
       my $filter = shift ;

       if ( $filter->is_BOS() ) {
           $count = 0 ;
           $filter->print( "header\n" ) ;
       }

       while ( $filter->read( my $buffer ) ) {
           $count+= length $buffer ;
           $filter->print( ... ) ;
       }

       if ( $filter->is_EOS() ) {
           $filter->print( "$count bytes filtered\n" ) ;
       }
   }

NOTE: There's no underlying BOS cookie that corresponds to the EOS
cookie, the bucket brigade API assumes that you'll cook up the context
before calling add_xxx_filter(), rather than providing a callback to do
initialization.  It's a bit annoyingly non-symmetric.  'course I don't
like magic cookies like EOS, they tend to get dropped by bugs all too
frequently and they introduce special cases in all of the code, so I'll
just shut up about the lack of a BOS :-).

That filter breaks if it's added twice, caveat Pusher.  Perhaps we
should proffer an Apache filtering mod that adds an "on_add" handler
to the ap_register_xx_filter() calls, since that's a useful place to do
double-add prevention and state initialization.  It would make it easier
to encapsulate filters and expose them properly, allowing
SetOutputFilter directives to cause a filter init event to occur.

Adding an Apache::Filter::send_EOS() would also allow a filter to purge
Apache's filter chain to a network buffer and hang around doing cleanup
or consuming more input.

Now for the OO insanity...

A natural way to accommodate both the BOS/EOS and internal state would be
to allow Apache::Filter *instances* to be subclassed.  Here's an example:

    package Algae::Filter ;

    use base qw(Apache::Filter) ;

    sub new {
        ## Called manually (see below) before manually calling
        ## ap_add_{in,out}put_filter() or just before handle_BOS()
        ## is called if a filter named "Algae::Filter" is added
        ## using, say SetOutputFilter.
        my $proto = shift ;
        my $class = ref $proto || $proto ;

        my $self = $class->SUPER::new() ;

        ## ... init internal state ...

        return $self ;
    }

    sub handle_BOS {
        my $self = shift ;

        ## Called when the first bucket brigade arrives, just before the
        ## first call to handler().

        return APR_SUCCESS ;  ## Or not...
    }

    sub handler: AP_FTYPE_FOO, MP_INPUT_FILTER {
        ## Defaults to AP_FTYPE_CONTENT and MP_OUTPUT_FILTER
        ## Might we call this handle_content()???
        my $self = shift ;
        my ( $bb ) = @_ ;  ## Optional

        ## ... process input, possibly send EOS ...
        return APR_SUCCESS ;
    }

    sub handle_EOS {
        my $self = shift ;
        ## Called after the last call to handler().
        return APR_SUCCESS ;
    }

Two common ways of adding this type of filter would be by using the
SetOutputFilter directive or by manually calling ap_add_output_filter().  

In the SetOutputFilter case, nothing would happen until the first bucket
brigade arrived at modperl_output_filter_handler()'s gate.  At that time
mod_perl would notice that no object had been built and would try to
call Algae::Filter->new(), setting the ctx->perl_ctx field to its
result.

If no new() existed, perl_ctx would not be set.  In this case the
Apache::Filter would be passed in to the other subs as-is.  This is like
today's behavior, and would be what happens for simple cases.

After calling new(), mod_perl would try to call handle_BOS(), passing in
either perl_ctx (ie $self) or the plain ol' Apache::Filter depending on
whether new() returned anything.

Then, handler() would get called with the first arg ($self or the
Apache::Filter) and the first bucket brigade. It would
also be called for each additional bucket brigade that contained data
(other than an EOS).

When the EOS-tailed bucket brigade arrived, handler() would only be
called if the brigade had additional data before the EOS, and then
handle_EOS() would be called with the same first arg (and no bucket
brigade).  If the filter didn't send_EOS() at some point manually,
mod_perl would now send the EOS.

The default Apache::Filter::handler would be a pass-through.
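
The call sequence above can be sketched in C as a small dispatcher.
This is only an illustration of the proposal, reading it as "handler()
runs for brigades that carry data, handle_BOS() runs once before the
first of them, handle_EOS() runs once the EOS arrives".  All names
(sketch_brigade, sketch_filter, sketch_dispatch() and the trace_*
callbacks) are hypothetical, and a brigade is reduced to a byte count
plus an EOS flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened view of a bucket brigade. */
typedef struct {
    size_t data_len;  /* bytes of payload ahead of any EOS bucket */
    int    has_eos;   /* nonzero if the brigade ends in an EOS bucket */
} sketch_brigade;

typedef struct sketch_filter sketch_filter;

struct sketch_filter {
    int   seen_first;  /* set once the first brigade has been seen */
    int   eos_sent;    /* set once an EOS has gone downstream */
    void (*handle_bos)(sketch_filter *);
    void (*handler)(sketch_filter *, const sketch_brigade *);
    void (*handle_eos)(sketch_filter *);
    void *self;        /* stands in for perl_ctx / $self */
};

/* Called once per incoming brigade; fans out to the three callbacks. */
static void sketch_dispatch(sketch_filter *f, const sketch_brigade *bb)
{
    assert(f != NULL && bb != NULL);

    if (!f->seen_first) {
        f->seen_first = 1;
        if (f->handle_bos) {
            f->handle_bos(f);
        }
    }
    if (bb->data_len > 0 && f->handler) {
        f->handler(f, bb);
    }
    if (bb->has_eos) {
        if (f->handle_eos) {
            f->handle_eos(f);
        }
        if (!f->eos_sent) {
            f->eos_sent = 1;  /* the filter did not send EOS itself */
        }
    }
}

/* Example callbacks: append one letter per event to the string in self.
 * The caller must supply a buffer large enough for every event. */
static void trace_add(sketch_filter *f, char c)
{
    char  *t = f->self;
    size_t n = strlen(t);
    t[n] = c;
    t[n + 1] = '\0';
}

static void trace_bos(sketch_filter *f) { trace_add(f, 'B'); }

static void trace_content(sketch_filter *f, const sketch_brigade *bb)
{
    (void)bb;
    trace_add(f, 'C');
}

static void trace_eos(sketch_filter *f) { trace_add(f, 'E'); }
```

Feeding the dispatcher one data brigade followed by a bare EOS brigade
would call the BOS, content and EOS callbacks once each, in that order.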

The reason for separate BOS and EOS handlers is to support
specialization through inheritance.  If we don't want to burden all
bucket brigade filters this way, then perhaps we can use an adapter
class that provides a handler and calls (handle_BOS(), handle_content(),
and handle_EOS()) based on the new Apache::Filter::is_BOS() and
...::is_EOS() accessors.

NOTE: neither handle_BOS() nor handle_EOS() would be passed the bucket
brigade.  All three event handlers would get $self as the first param
(or the default Apache::Filter object if new() was not called).

The second case is where user code wants to call ap_add_xxx_filter() to
add a filter.  Here's how that might look:

   my $f = Algae::Filter->new( ...parms... ) ;
   $r->add_output_filter( $f ) ;  ## or $c->...

This would use the class that $f is blessed into as the name of the
filter (which could be quickly registered if not found) and the
reference in $f as the $perl_ctx.

When modperl_output_filter_handler() then gets the first bucket brigade,
it'll see the perl_ctx is present and not call new.  It'll pass in the
perl_ctx as above to the other three functions in the same way.

Areas for improvement:
   - Perhaps, when no new() is found and isa( "Algae::Filter",
     "Apache::Filter" ) is true, the Apache::Filter object passed to the
     handlers should be blessed in to "Algae::Filter" to allow easier
     method calls.  There's a large number of filters that would need no
     new()ing and this little trick would keep users from having to
     write new() just to call SUPER::new().
   - It would be nice to be able to set/get a context without needing to
     go OO.  Not sure if an Apache::Filter accessor would do the trick
     or if it would be better to pass in an optional third arg to the
     handlers.
   - We don't expose enough flexibility for a filter writer to say
     AP_FTYPE_CONTENT+1 or AP_FTYPE_CONTENT-1 to bias a filter towards
     either end of a content chain.
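
To make the last point concrete, here is a small C sketch of how a
numeric filter type could bias placement in a chain.  It is an
assumption-laden illustration: SKETCH_FTYPE_CONTENT is a made-up
value, the chain is a fixed-size array rather than Apache's linked
list, and the tie-breaking rule (a new filter goes after existing
filters of the same type) is chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FTYPE_CONTENT 10  /* hypothetical numeric value */
#define SKETCH_MAX_FILTERS    8

typedef struct {
    const char *name;
    int         ftype;  /* lower values run earlier in the chain */
} sketch_filter;

typedef struct {
    sketch_filter items[SKETCH_MAX_FILTERS];
    size_t        count;
} sketch_chain;

static void sketch_chain_init(sketch_chain *c)
{
    assert(c != NULL);
    c->count = 0;
}

/* Insert keeping the chain sorted by ftype; returns -1 if the chain is full. */
static int sketch_chain_add(sketch_chain *c, const char *name, int ftype)
{
    size_t pos = 0;
    size_t i;

    assert(c != NULL);
    if (c->count == SKETCH_MAX_FILTERS) {
        return -1;
    }
    /* Skip every filter that should run before, or at the same rank as, the new one. */
    while (pos < c->count && c->items[pos].ftype <= ftype) {
        pos++;
    }
    /* Shift the tail up by one slot to open a gap at pos. */
    for (i = c->count; i > pos; i--) {
        c->items[i] = c->items[i - 1];
    }
    c->items[pos].name  = name;
    c->items[pos].ftype = ftype;
    c->count++;
    return 0;
}
```

With this ordering, a filter registered at SKETCH_FTYPE_CONTENT - 1
lands ahead of plain content filters and one at
SKETCH_FTYPE_CONTENT + 1 lands behind them, regardless of the order in
which they were added.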

Wow, that's a lot of blather.  Sorry.

Comments?

- Barrie

---------------------------------------------------------------------


Re: Filters != handlers [Was: [PATCH] A hack at "first class filters"]

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:
 
> Ok, I think I was underestimating how much padding you wanted on the
> direct C API mapping.  I thought that that would be 0 padding (like how
> the ctx param to $r->add_output_filter() is treated).

i would like 0 padding for the 'direct' mapping.  the padding aside from
Perl*FilterHandler's taking care of registration/adding is to provide the
stdio-like interface, $filter->{read,print,etc}.  and maybe a bit more to
have an option where brigades are merged so a filter handler is only
called once per-request.
 
> So I was seeing three layers: Perl*FilterHandler-installed filters with
> lots of padding, The "Bucket Brigade" filters, with enough padding to
> make bucket brigade writing easy yet efficient (relative to
> Perl*FilterHandlers, anyway), and C level API, which is very low level
> and probably needs a little XS or Inline::C code to make much use of it.
> That's what I get for reading not-yet-polished code (the C level API)
> and assuming...

much of the brigade/bucket api has been polished, though there might be
some missing methods.  it's the registration/adding of filters that's not
really been touched yet.

> I think the main point of what I'm talking about here (the "Bucket
> Brigade filters") is to get enough padding and automation built in that
> it doesn't take any more Perl code to write them than it does to write
> Perl*Handlers, even though it's trickier code.  I couldn't see how to add
> that level of padding to the C API without hiding the low level pieces.

right, $filter->{read,print,etc} is attempting to hide the low level.
 
> Ok, now I see two alternatives: I'm saying use CODE attrs on subs you
> want auto-registered and call the C level ap_register_xxx_filter()
> wrapper if you want to register something manually, passing in the
> optional AP_FTYPE_FOO constant.
> 
> You're saying to always use the ap_register_xxx_filter() manually all
> the time unless you use the Perl*FilterHandler directive.  In this case
> the ap_register_xxx_filter() would need to arbitrate between any CODE
> attrs and whatever AP_FTYPE_FOO the user passed in (if they passed one
> in).

i suppose if no AP_FTYPE_ is passed to ap_register_xxx_filter(), mod_perl
could first look for sub attributes, if there are none, then default to
AP_FTYPE_CONTENT.
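
A minimal sketch of that lookup order (explicit argument, then sub
attribute, then the AP_FTYPE_CONTENT default), in plain Perl.  The
My::Filters package and resolve_ftype() are hypothetical names, and the
filter types are kept as strings rather than real Apache constants:

```perl
use strict;
use warnings;

package My::Filters;

# Filter types recorded from sub attributes, keyed by stringified code ref.
our %ATTR_FTYPE;

# Perl calls this hook at compile time for each attributed sub in the package.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    my @unknown;
    for my $attr (@attrs) {
        if ($attr =~ /^AP_FTYPE_\w+$/) { $ATTR_FTYPE{$code} = $attr }
        else                            { push @unknown, $attr }
    }
    return @unknown;    # anything returned here is rejected as invalid
}

sub header_filter : AP_FTYPE_HTTP_HEADER { }
sub plain_filter { }

package main;

# Hypothetical resolver: explicit argument first, then the sub attribute,
# then the default.
sub resolve_ftype {
    my ($code, $explicit) = @_;
    return $explicit if defined $explicit;
    return $My::Filters::ATTR_FTYPE{$code} // 'AP_FTYPE_CONTENT';
}

my $from_attr    = resolve_ftype(\&My::Filters::header_filter);
my $from_default = resolve_ftype(\&My::Filters::plain_filter);
my $from_arg     = resolve_ftype(\&My::Filters::header_filter, 'AP_FTYPE_NETWORK');
print "$from_attr $from_default $from_arg\n";
```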




Re: Filters != handlers [Was: [PATCH] A hack at "first class filters"]

Posted by barries <ba...@slaysys.com>.
On Fri, May 25, 2001 at 11:06:47AM -0700, Doug MacEachern wrote:
>
> i think we're on the same page, this is pretty much the same as what 
> i outlined under 'direct C api mapping'.

Ok, I think I was underestimating how much padding you wanted on the
direct C API mapping.  I thought that that would be 0 padding (like how
the ctx param to $r->add_output_filter() is treated).

So I was seeing three layers: Perl*FilterHandler-installed filters with
lots of padding, the "Bucket Brigade" filters, with enough padding to
make bucket brigade writing easy yet efficient (relative to
Perl*FilterHandlers, anyway), and C level API, which is very low level
and probably needs a little XS or Inline::C code to make much use of it.
That's what I get for reading not-yet-polished code (the C level API)
and assuming...

I think the main point of what I'm talking about here (the "Bucket
Brigade filters") is to get enough padding and automation built in that
it doesn't take any more Perl code to write them than it does to write
Perl*Handlers, even though it's trickier code.  I couldn't see how to add
that level of padding to the C API without hiding the low level pieces.

> > mod_perl could call ap_register_xxx_filter() on all subs compiled with
> > either an AP_FTYPE_FOO or a new MP_INPUT_FILTER or MP_OUTPUT_FILTER sub
> > attr.
> 
> except for this part.  for the direct mapping, mod_perl should not do
> anything special.  it will be up to the Perl filter module to call
> Apache::register_xxx_filter with AP_FTYPE_FOO.  i'm ok with defaulting
> arguments, e.g. ftype could default to AP_FTYPE_CONTENT.

Ok, now I see two alternatives: I'm saying use CODE attrs on subs you
want auto-registered and call the C level ap_register_xxx_filter()
wrapper if you want to register something manually, passing in the
optional AP_FTYPE_FOO constant.

You're saying to always use the ap_register_xxx_filter() manually all
the time unless you use the Perl*FilterHandler directive.  In this case
the ap_register_xxx_filter() would need to arbitrate between any CODE
attrs and whatever AP_FTYPE_FOO the user passed in (if they passed one
in).

Either way seems pretty reasonable. It just seems like using the CODE
attrs to trigger something automatic is fun & a nice level of helpful
padding.  When wouldn't you want it (seeing as you can always leave off
the CODE attrs and pass the AP_FTYPE_FOO in to ap_register_xxx_filter())?

Thanks,

Barrie



Re: Filters != handlers [Was: [PATCH] A hack at "first class filters"]

Posted by Doug MacEachern <do...@covalent.net>.
On Fri, 25 May 2001, barries wrote:

> On Thu, May 24, 2001 at 05:52:39PM -0700, Doug MacEachern wrote:
> > let's consider everything before adding new code.
> 
> Ok :-).  I have a reply in queue that works through your ideation w/
> questions & suggestions.  But first, let's look at really bifurcating
> the API into a Perl*Filter API and a more Apache-esque API.
> 
> I'm starting to think that packaging low level Apache filters as though
> they were Perl*Handlers is a misleading API: they're different from
> plain handlers and treating them so is likely to mislead developers and
> provides different semantics than the underlying Apache API provides,
> leading to impedance mismatch in mod_perl's implementation and in
> developers' mental models.
> 
> Other handlers get called once per request or subrequest.  Filters will
> often get called 2 or many, many times (see any handler that skips
> sending an EOS bucket, as well as mod_isapi and mod_proxy). Writing a
> filter sub requires a much more event-driven mindset than writing a
> handler sub: even the example reverse filter has a surprise in it waiting
> for somebody that slaps it in front of mod_proxy or mod_isapi or any
> other handler that sends multiple bbs of content.
> 
> Handler subs are simple use-once event handlers: no intra-request state
> needs to be kept.  Filter subs will often need to be aware of BOS/EOS
> state provided by mod_perl, and will also need to keep some
> intra-request state.

to me, a Perl*Handler is just a configuration directive which specifies
the name of a Perl subroutine.  and that subroutine is called during a
given apache plugin callback.  all Perl*Handlers, including filters use
the same mechanism for resolving the name, loading the module if
needed, calling the subroutine, etc.
 
> So, how about two APIs?

isn't this what i've been saying all along? :)
 
> API 1: Perl{In,Out}putFilterHandlers
> 
> Perl{In,Out}putFilter directives act much like today's, but use the
> filter name when registering filters and AP_FTYPE_...CODE attrs to
> modify when ap_add_xxx_filter()s get called.  The only reason for using
> the filter names is to enable debug dumps of the filter stack to be
> read, and to obey the principle of "least surprise", meaning that there's
> no need for the special-ness of MODPERL_{IN,OUT}PUT_FILTER_NAME to be
> grokked by hapless mod_perl-internals-geek-wannabes like me.

sure, i'm all for using the Perl name to register the filter.
 
> However, to make these Perl{In,Out}putFilterHandlers consistent with
> other Perl*Handlers, mod_perl would need
>    - to buffer all filter input in an ever-growing bucket brigade until
>      EOS is seen,
>    - then call the Perl*FilterHandler sub (once!),

i would consider this as an option.  it could actually be a standalone
filter module that collects all brigades into one before sending it
further down the chain.  if it's proven to be efficient, we can make it the
default.

>    - passing it $r blessed into an Apache::RequestFilter (or some such)
>      class which would have alternative I/O subs that called the brigade
>      APIs.  Would also provide a filter() accessor to get at the
>      "real" Apache::Filter object for the rare case when that might be
>      needed in such a high level handler.

$filter->r gives you the request_rec.  Perl*Handlers in 2.0 are passed
the same arguments as C handlers are passed.

>    - and pass on the EOS when the sub exited.
> 
> The coding style would be very consistent with the existing mechanisms
> and all the hairball stuff you can trip over with filters is balled up
> into a tangle of wiring hidden behind nice, padded walls. No BOS/EOS
> worries, no statefulness worries, just memory and (possibly) latency
> worries.

i hope we can get to that.  the C bucket brigade api is klunky.  the
$filter->{read,write} methods, written as part of the padding, still need
to be implemented for input filters.
 
> The worries in that scheme, as well as the lack of flexibility in
> calling ap_add_xxx_filter() need a low-level API to handle.
> 
> API 2: Bucket Brigade filters

i think we're on the same page, this is pretty much the same as what 
i outlined under 'direct C api mapping'.

> mod_perl could call ap_register_xxx_filter() on all subs compiled with
> either an AP_FTYPE_FOO or a new MP_INPUT_FILTER or MP_OUTPUT_FILTER sub
> attr.

except for this part.  for the direct mapping, mod_perl should not do
anything special.  it will be up to the Perl filter module to call
Apache::register_xxx_filter with AP_FTYPE_FOO.  i'm ok with defaulting
arguments, e.g. ftype could default to AP_FTYPE_CONTENT.





Filters != handlers [Was: [PATCH] A hack at "first class filters"]

Posted by barries <ba...@slaysys.com>.
On Thu, May 24, 2001 at 05:52:39PM -0700, Doug MacEachern wrote:
> let's consider everything before adding new code.

Ok :-).  I have a reply in queue that works through your ideation w/
questions & suggestions.  But first, let's look at really bifurcating
the API into a Perl*Filter API and a more Apache-esque API.

I'm starting to think that packaging low level Apache filters as though
they were Perl*Handlers is a misleading API: they're different from
plain handlers and treating them so is likely to mislead developers and
provides different semantics than the underlying Apache API provides,
leading to impedance mismatch in mod_perl's implementation and in
developers' mental models.

Other handlers get called once per request or subrequest.  Filters will
often get called 2 or many, many times (see any handler that skips
sending an EOS bucket, as well as mod_isapi and mod_proxy). Writing a
filter sub requires a much more event-driven mindset than writing a
handler sub: even the example reverse filter has a surprise in it waiting
for somebody that slaps it in front of mod_proxy or mod_isapi or any
other handler that sends multiple bbs of content.

Handler subs are simple use-once event handlers: no intra-request state
needs to be kept.  Filter subs will often need to be aware of BOS/EOS
state provided by mod_perl, and will also need to keep some
intra-request state.
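
A minimal sketch of why that state matters, in plain Perl with no
mod_perl involved.  Each "brigade" is modelled as a plain string and the
sub names are hypothetical; the stateless version reverses each line it
sees per call, while the stateful one carries a partial line over to the
next call:

```perl
use strict;
use warnings;

# Two "brigades"; the line "def" is split across them.
my @brigades = ("abc\nde", "f\nghi\n");

# Stateless version: reverses whatever lines it sees in each call.
sub naive_reverse {
    my ($chunk) = @_;
    return join '', map { scalar(reverse($_)) . "\n" } split /\n/, $chunk;
}

# Stateful version: keeps a trailing partial line in $ctx between calls.
sub stateful_reverse {
    my ($ctx, $chunk, $eos) = @_;
    my $data = $ctx->{leftover} . $chunk;
    my $out  = '';
    # Consume complete lines only.
    while ($data =~ s/^([^\n]*)\n//) {
        my $line = $1;
        $out .= scalar(reverse $line) . "\n";
    }
    # At end of stream, flush whatever partial line remains.
    $out .= scalar(reverse $data) if $eos && length $data;
    $ctx->{leftover} = $eos ? '' : $data;
    return $out;
}

my $naive = join '', map { naive_reverse($_) } @brigades;

my %ctx    = (leftover => '');
my $result = '';
for my $i (0 .. $#brigades) {
    $result .= stateful_reverse(\%ctx, $brigades[$i], $i == $#brigades);
}
print $result;
```

The naive version emits "ed" and "f" as two separate lines, because it
has no way of knowing that the second call continues the first.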

So, how about two APIs?

API 1: Perl{In,Out}putFilterHandlers

Perl{In,Out}putFilter directives act much like today's, but use the
filter name when registering filters and AP_FTYPE_...CODE attrs to
modify when ap_add_xxx_filter()s get called.  The only reason for using
the filter names is to enable debug dumps of the filter stack to be
read, and to obey the principle of "least surprise", meaning that there's
no need for the special-ness of MODPERL_{IN,OUT}PUT_FILTER_NAME to be
grokked by hapless mod_perl-internals-geek-wannabes like me.

However, to make these Perl{In,Out}putFilterHandlers consistent with
other Perl*Handlers, mod_perl would need
   - to buffer all filter input in an ever-growing bucket brigade until
     EOS is seen,
   - then call the Perl*FilterHandler sub (once!),
   - passing it $r blessed into an Apache::RequestFilter (or some such)
     class which would have alternative I/O subs that called the brigade
     APIs.  Would also provide a filter() accessor to get at the
     "real" Apache::Filter object for the rare case when that might be
     needed in such a high level handler.
   - and pass on the EOS when the sub exited.

The coding style would be very consistent with the existing mechanisms
and all the hairball stuff you can trip over with filters is balled up
into a tangle of wiring hidden behind nice, padded walls. No BOS/EOS
worries, no statefulness worries, just memory and (possibly) latency
worries.

The worries in that scheme, as well as the lack of flexibility in
calling ap_add_xxx_filter() need a low-level API to handle.
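
A minimal sketch of the API 1 buffering scheme, in plain Perl with no
mod_perl involved.  make_buffering_filter() is a hypothetical name, and
chunks of text stand in for bucket brigades:

```perl
use strict;
use warnings;

# Hypothetical wrapper: collect every chunk and call the handler once at EOS.
sub make_buffering_filter {
    my ($handler) = @_;
    my $buffer = '';
    return sub {
        my ($chunk, $eos) = @_;
        $buffer .= $chunk;
        return '' unless $eos;          # hold everything back until EOS
        my $out = $handler->($buffer);  # the single handler call
        $buffer = '';
        return $out;
    };
}

my $calls  = 0;
my $filter = make_buffering_filter(sub { $calls++; return uc $_[0] });

my @chunks = ('first ', 'second ', 'third');
my $output = '';
for my $i (0 .. $#chunks) {
    $output .= $filter->($chunks[$i], $i == $#chunks);
}
print "$output (handler calls: $calls)\n";
```

The handler sees the whole body in one call, at the cost of holding all
of it in memory and sending nothing downstream until EOS arrives.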

API 2: Bucket Brigade filters

No new directives.

The apache filters design assumes that admins will do a static link or a
LoadModule and the modules will then ap_register_xxx_filter() their
filters.  What if mod_perl just went with the flow?

mod_perl could call ap_register_xxx_filter() on all subs compiled with
either an AP_FTYPE_FOO or a new MP_INPUT_FILTER or MP_OUTPUT_FILTER sub
attr.  So you do a PerlSwitches -M or PerlModule or use/require them to
pull the thing in to memory and get them registered with apache.  A
little finesse would be needed to not register the same sub twice.

Then you use any existing httpd config or API techniques to
ap_add_xxx_filter() them.

These low-level perl filters are different and will need some additional
explanation.  Giving them a different configuration API makes it more
likely that said alternative explanation will be noticed before buggy
code is written and the inevitable mailing list traffic ensues.

Pros:
   - no new directives. 
   - no need for Perl coders to be aware of the registration mechanism,
     just Set{In,Out}putFilter
   - Works with / consistent with existing Apache infrastructure.
   - API clearly differentiated from Perl*Handlers, paralleling the way
     that Perl filters are different from Perl*Handlers.
   - much more powerful and fine-grained control possible.

Cons:
   - Different from Perl*Handler API.

- Barrie



Re: [PATCH] A hack at "first class filters"

Posted by Doug MacEachern <do...@covalent.net>.
good work, let's consider everything before adding new code.  below
lays out what i have in mind, combined with your approach...

direct C api mapping
--------------------

Apache::register_output_filter($filtername, $callback, $filter_type)

Apache::register_input_filter($filtername, $callback, $filter_type)

    filter_type can be one of:
      Apache::FTYPE_CONTENT
      Apache::FTYPE_HTTP_HEADER
      Apache::FTYPE_TRANSCODE
      Apache::FTYPE_CONNECTION
      Apache::FTYPE_NETWORK

$r->add_output_filter($name, $ctx)
$c->add_output_filter($name, $ctx)

$r->add_input_filter($name, $ctx)
$c->add_input_filter($name, $ctx)

note: $ctx will default to NULL
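
A minimal mock of how the register/add split might behave, in plain Perl
with no mod_perl involved.  MockApache and MockRequest are hypothetical
stand-ins that only mirror the argument lists proposed above; the filter
type is stored but not used for ordering here:

```perl
use strict;
use warnings;

package MockApache;    # hypothetical stand-in, not the real Apache package

my %registered;        # filter name => { callback, ftype }

sub register_output_filter {
    my ($name, $callback, $ftype) = @_;
    $ftype = 'FTYPE_CONTENT' unless defined $ftype;    # defaulted argument
    $registered{$name} = { callback => $callback, ftype => $ftype };
    return;
}

sub lookup { my ($name) = @_; return $registered{$name} }

package MockRequest;   # hypothetical stand-in for $r

sub new { my ($class) = @_; return bless { chain => [] }, $class }

# Adding only works for names that were registered earlier.
sub add_output_filter {
    my ($self, $name, $ctx) = @_;    # $ctx stays undef when omitted
    my $entry = MockApache::lookup($name)
        or die "unregistered filter: $name\n";
    my %slot = (%$entry, name => $name, ctx => $ctx);
    push @{ $self->{chain} }, \%slot;
    return;
}

# Pass data through every filter on the chain, in the order added.
sub run {
    my ($self, $data) = @_;
    $data = $_->{callback}->($data, $_->{ctx}) for @{ $self->{chain} };
    return $data;
}

package main;

MockApache::register_output_filter('My::Upper', sub { uc $_[0] });
MockApache::register_output_filter('My::Tag', sub { "[$_[0]]" }, 'FTYPE_CONTENT');

my $r = MockRequest->new;
$r->add_output_filter('My::Upper');
$r->add_output_filter('My::Tag');
my $out = $r->run('hi');
print "$out\n";
```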

directives
----------

PerlInputFilterHandler

PerlOutputFilterHandler

each will be the equivalent of:

ap_register_{input,output}_filter($handler_name, $handler, $filter_type)

where:
 $handler_name is the Perl name; at the moment it is "MODPERL_OUTPUT" and
 "MODPERL_INPUT", but it would be easy to switch that to the handler name

 $handler is the modperl C callback

 $filter_type defaults to AP_FTYPE_CONTENT, subroutine attributes can
 be used to specify one of the filter types listed above

 based on attributes, add_{input,output}_filter may need to happen at
 different times, e.g. input filters who want to filter headers +
 content vs. input filters who want to filter only content

alternative to those directives would be:

PerlInputFilter

PerlOutputFilter

combined with:

SetInputFilter

SetOutputFilter

pros: can use Perl{Input,Output}Filter to register the filter in
      httpd.conf, rather than using the API.  can then call
      $r->add_{input,output}_filter($filter_name) at request time

cons: in the common case, requires two directives that use the same
      values (the $handler_name)

 - and/or -

PerlSetInputFilter

PerlSetOutputFilter

as aliases for SetInputFilter, SetOutputFilter which also take care of
filter registration (which PerlInputFilter, PerlOutputFilter would
have done)

pros: similar to Set{Input,Output}Filter
      only need to use one directive

cons: the filter module needs to register the filter in order to add
      the filter at request time without using a directive
      however: PerlModule Apache::FooFilter
      where Apache::FooFilter registers the filter, can provide this
      functionality without requiring new Perl{Input,Output}Filter
      directives

 - in any case -

with the C api mapping it should be possible for a PerlModule to
register the filter(s), then use the standard Set{Input,Output}Filter
directives and the add_{input,output}_filter api at request time.

note: no need to maintain a list of PerlFilters (like PerlModule,
PerlRequire) since the directives trigger modperl_handler_new(), just
like all other Perl*Handlers

{get,set,push}_handlers
-----------------------
would be nice for Perl{Input,Output}FilterHandler to work with the
modperl {get,set,push}_handlers api just like other Perl*Handlers

