You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by "Roy T. Fielding" <fi...@kiwi.ICS.UCI.EDU> on 2000/07/07 04:36:35 UTC

Re: BIG filtering patch.

>This patch uses hooks, but it is completely recursive, and I expect to add
>filter names at some point.  

What is the rationale for using hooks?  They just seem to be
obfuscating what should be a simple stack of filters.  We don't
want filters to be self-sorting in any case, since that would
interfere with layered encodings and subrequests.  It also doesn't
serve our purposes to be calling a core register function to insert
filters when all we ever want to do is push them onto a stack and
pop them off a stack.

I think if we can agree on an answer to that one question then it will
be relatively easy to merge all of our stuff into a single sandbox and
jump start a coding frenzy.

The usage of ap_bucket_brigade_to_iol seems a bit odd as well, since
an iol itself should just be a filter, but that can wait.

....Roy

Re: BIG filtering patch.

Posted by Ben Laurie <be...@algroup.co.uk>.

Greg Stein wrote:
> I'm with Ryan on not understanding how a "stack" would work in the Apache
> execution environment. I much prefer the types of ordering mechanisms that
> are in mine and Ryan's patches. (I disagree with his using ap_hooks to do
> it, but will take it over a stack any day :-)

I'm intrigued to know why? It'd be goods if hooks did everything that
should be expected of them, and from what I'm hearing, I think that
they're a pretty good fit for the problem - perhaps not an exact one,
but that may just mean that they need a little work...

> I'll also point out that my ap_add_filter() adds filters "at the end"
> (meaning "after") of the particular group. In other words, a content filter
> can add a filter and be guaranteed it will be inserted after "self".

Seems to me that this is why hooks work better - a content-filter, for
example, may know that it has to come before or after certain other
filters, if they are being used. Hooks let you do that.

BTW, an unaddressed problem with hooks is that we don't have a name
registry, and they need unique names. We should fix this.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

Coming to ApacheCon Europe 2000? http://apachecon.com/

Re: BIG filtering patch.

Posted by rb...@covalent.net.

> If we agree on the linked list that I use in my patches, filter registration
> (ap_register_filter), and inserting filters into the linked list
> (ap_add_filter), then I would recommend that we simply check that into
> Apache today. Certainly, the definition of the callback and bucket
> structure(s) will change in one direction or another, but we can get some of
> the framework into place (thus reducing the patch set size).

I disagree here.  My big patch is almost working.  I expect another half
hour to hour of debugging.  Then, I have to stop for the night to do some
writing.  Tomorrow or Sunday, I plan to rip the filter registration stuff
out of my patch, and replace it with the stuff from your patch.  In doing
this, I will make sure that the two merge seamlessly.  By Sunday night, I
will have posted two patches to this list.  The first will be the patch I
am finishing now.  The second will be the same patch with your filter
registration scheme.  I don't think two days is too long to ask to make
sure everything works seamlessly.

I expect to post the first patch tonight, so if we decide to go with the
other filter registration scheme, then anybody can add it if the get to it
before me.  I don't want to add this thing into the repository piece-meal,
at least not this part.  As soon as we add the ability to register a
filter function, people will need to be able to actually create filters.

> Ryan: if you're okay with the ap_filter_type, ap_filter_t (with a void* for
> the bucket_cb field), ap_register_filter, and ap_add_filter concepts and
> structures, then I'll extract a sub-patch for just those items and post that
> for review.
> 
> It would be great to have that in there, much like we have insert_filter
> in there already.

See above.  Let's just wait, and do it right.  I'll post my first patch
ASAP.  Then, we can create a second patch, and let them both be
reviewed.  Then, we commit one or the other.

Plus, I haven't read Roy's response yet, but I know I saw one.  I want his
input on how to do this.  This is the other reason I want two versions of
my patch.  If it is obviously easy to replace the filter registration in
the patch, then we can fix it later if needed.  If it isn't easy, then the
patch needs to change.  Does that make sense?

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: BIG filtering patch.

Posted by Greg Stein <gs...@lyra.org>.

On Fri, Jul 07, 2000 at 08:29:54AM -0700, rbb@covalent.net wrote:
>...
> > We don't
> > want filters to be self-sorting in any case, since that would
> > interfere with layered encodings and subrequests.  
> 
> I guess I don't see this, but I am willing to be educated.  In my mind, it

For the record, I'm with Ryan on this one. I *do* believe that some amount
of "self-sorting" is required. I don't think that Apache is deterministic
enough to ensure that the control flow will push filters onto a stack in the
correct order.

>...
> 2)  We add a content filter while we are filtering the data.  In my head,
> the function that is inserting the filter has to know enough to put the
> new filter into the list after the current filter.  The sorting algorithm
> shouldn't touch it, because when we inserted it into the list, we gave it
> the right number.
> 
> 
> Having said all of that.  I kept that section of my latest patch as clean
> as possible for a reason.  If people really want to remove the hooks
> stuff, it should be relatively easy to do.  We need a way for filters to
> easily insert their filters, and those filters need to be
> order-able.  Greg's patch relied on the insert-filter hook to do the
> ordering.  My patch asks the filters to order themselves.  I haven't given
> any thought to how a stack would work, but my gut feeling is that it
> wouldn't be great, simply because we take the chance of asking mod_include
> to run after we have gzip'ed the data.

Clarification here: my patch allows filters to specify their ordering in two
ways: the insert-filter hook, and by filter "type". I have defined only two
types at the moment because I wasn't sure we needed more granularity.
However, it is simply an enum, so we can have types that match up to each of
those Ryan listed later in this email.
[ this "type" is effectively a self-sorting thing for filters ]

I'm with Ryan on not understanding how a "stack" would work in the Apache
execution environment. I much prefer the types of ordering mechanisms that
are in mine and Ryan's patches. (I disagree with his using ap_hooks to do
it, but will take it over a stack any day :-)

I'll also point out that my ap_add_filter() adds filters "at the end"
(meaning "after") of the particular group. In other words, a content filter
can add a filter and be guaranteed it will be inserted after "self".

>...
> > I think if we can agree on an answer to that one question then it will
> > be relatively easy to merge all of our stuff into a single sandbox and
> > jump start a coding frenzy.
> 
> I assume you are talking about the filter stack from Greg's patch, and the
> bucket's from my latest patch.  I can do that merge next week if nobody
> else beats me to it.

If we agree on the linked list that I use in my patches, filter registration
(ap_register_filter), and inserting filters into the linked list
(ap_add_filter), then I would recommend that we simply check that into
Apache today. Certainly, the definition of the callback and bucket
structure(s) will change in one direction or another, but we can get some of
the framework into place (thus reducing the patch set size).

Ryan: if you're okay with the ap_filter_type, ap_filter_t (with a void* for
the bucket_cb field), ap_register_filter, and ap_add_filter concepts and
structures, then I'll extract a sub-patch for just those items and post that
for review.

It would be great to have that in there, much like we have insert_filter
in there already.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: BIG filtering patch.

Posted by rb...@covalent.net.

On Thu, 6 Jul 2000, Roy T. Fielding wrote:

> >This patch uses hooks, but it is completely recursive, and I expect to add
> >filter names at some point.  
> 
> What is the rationale for using hooks?  They just seem to be

There are a couple of rationals for using hooks, I will try to explain
here.  If it still doesn't make sense, please let me know.  :-)

> obfuscating what should be a simple stack of filters.  

A long time ago (at the beginning of this whole discussion) we decided
this shouldn't be a simple stack of filters.  The argument was put forth
that there are basically five types of filters, and there is an obvious
ordering for how those five types should be called.

The five types are:

content-generators     handled by the handler hook
content-filters        anything that munges data
content-encoders       anything that changes the representation of the
		       data without changing the meaning
content-processors     anything that changes the meta-data about the data
content-transporters   The stuff that actually transports the data.

The argument was made, and I still agree with it, that all content-filters
should be called before the encoders, which should be called before the
processors, which should be called before the transporters.

Using the hooks allows us to easily order the filters in these categories.

There are two other features of the hooks that I like:

The code is there, and it works.  And, module writers will already know
how the hooks work.

Adding a stack implementation into Apache just adds more code.  Our tables
can implement the same thing, with code that we know works.  Not that a
stack is hard to implement.  It's just more code that we don't really need
IMHO.

The easier we make it for module writers to add a filter, the more filters
there will be.  I was really hoping to not make module writers learn two
API's for inserting hooks into the server.  One for filters, and one for
everything else.

I can take the hooks stuff out, I'm not married to it, but I think it has
a lot of features we don't want to remove, like the automatic ordering of
filters.  

> We don't
> want filters to be self-sorting in any case, since that would
> interfere with layered encodings and subrequests.  

I guess I don't see this, but I am willing to be educated.  In my mind, it
all comes back to when filters are inserted onto the stack.  I guess I
look at it and think it isn't valid to insert a filter that is going to
munge the data once we are encoding the data.  This means that by the time
we are done munging data, all of the mungers have been added to the stack
and we can forget about them.  The same thing goes for all of the rest of
the types of filters.  I see the self-sorting as an issue in two cases:

1)  We try to add a content-filter while we are encoding the data.  This
is invalid in my mind (although I could be wrong) and not something to
worry about.

2)  We add a content filter while we are filtering the data.  In my head,
the function that is inserting the filter has to know enough to put the
new filter into the list after the current filter.  The sorting algorithm
shouldn't touch it, because when we inserted it into the list, we gave it
the right number.

Having said all of that.  I kept that section of my latest patch as clean
as possible for a reason.  If people really want to remove the hooks
stuff, it should be relatively easy to do.  We need a way for filters to
easily insert their filters, and those filters need to be
order-able.  Greg's patch relied on the insert-filter hook to do the
ordering.  My patch asks the filters to order themselves.  I haven't given
any thought to how a stack would work, but my gut feeling is that it
wouldn't be great, simply because we take the chance of asking mod_include
to run after we have gzip'ed the data.

Does that make sense?

> It also doesn't
> serve our purposes to be calling a core register function to insert
> filters when all we ever want to do is push them onto a stack and
> pop them off a stack.

I assume you are talking about the core_insert_filters.  This is just for
the core to insert it's filters.  All of the patches currently have a
function that allows modules to insert their filters, whether they are
pushed onto a stack, added through the hooks, or added to a linked list,
the modules need to have a place to insert the filters.  The very first
patch I made didn't have this feature, and modules just inserted filters
whenever they saw there filters were necessary.  People didn't like that
idea.

I expect that over time we will find that we really need three or four
functions to insert filters on the stack.  I know I want a function in
mod_cgi[d] so that we can allow mod_php, mod_perl, mod_include, etc to see
the output of a cgi script.  I don't think we want mod_cgi[d] to be
responsible for actually inserting the filter onto the stack.  We just
want mod_cgi to tell the other modules that they can insert their own
filters.

> I think if we can agree on an answer to that one question then it will
> be relatively easy to merge all of our stuff into a single sandbox and
> jump start a coding frenzy.

I assume you are talking about the filter stack from Greg's patch, and the
bucket's from my latest patch.  I can do that merge next week if nobody
else beats me to it.

> The usage of ap_bucket_brigade_to_iol seems a bit odd as well, since
> an iol itself should just be a filter, but that can wait.

That function is there to let us flush the data to the disk or to the
network.  Originally, I had called it ap_bucket_brigade_to_disk, which
opened a file, and flushed the whole brigade to the disk.  Then I realized
that basically the same code would be needed to write out to the network,
so I changed the name.  :-)  We can easily make iol's a filter later, but
we will still need a function to flush the data out of memory and to the
disk.  That's what this function is meant to do.

HTH,

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------