Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2001/08/29 20:48:44 UTC

Broken SetOutputFilter/SetInputFilter semantics

Folks,

  I'm working on the AddOutputFilterByType/AddInputFilterByType patch to core.c.
I'll drop it into core.c, since we only have three other ByType directives (AddIcon
and AddDesc from mod_autoindex, and ExpiresByType in mod_expires.)

  I discovered something peculiar about SetOutputFilter/SetInputFilter.  Unlike every
similar SetFoo directive, we keep stacking them on, and have no remove function.
Warning: I'm discussing _only_ content filters here; connection and parent request
filters aren't even considered!!!

----

  Essentially, a user trying a SetFoo directive expects it to override any SetFoo
in a parent resource (.htaccess over <Location>, <Location> over <Files>, <Files>
over <Dir> and <Dir> over <VHost>.)

  Right now we add 'just one more'.  But we can't take them out.  This is borked.

  In the mod_mime patch I tossed in yesterday, each recognized extension can add filters
(in the order the extensions occur in the filename, L2R order.)  But I added the
corresponding RemoveInputFilter/RemoveOutputFilter to remove the filename association in a
nested container.  And each AddInputFilter directive _replaces_ the previous AddInputFilter
directive for that same extension.
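
  The per-extension behaviour described above can be sketched in Python.  This is only
an illustrative model, not mod_mime's C code: the names merge_ext_filters and filters_for
are hypothetical, and applying removals before additions within one container is an
assumption the text does not settle.

```python
def merge_ext_filters(parent, adds=None, removes=()):
    """Build the extension -> filters map for a nested container.

    parent  : map inherited from the enclosing container
    adds    : AddFooFilter associations declared in this container
    removes : extensions named by RemoveFooFilter in this container
    """
    merged = dict(parent)              # never mutate the parent's map
    for ext in removes:
        merged.pop(ext, None)          # drop the association entirely
    for ext, filters in (adds or {}).items():
        merged[ext] = list(filters)    # replace, do not append
    return merged


def filters_for(filename, ext_map):
    """Collect filters for each extension, left to right."""
    chain = []
    for ext in filename.split(".")[1:]:    # skip the base name
        chain.extend(ext_map.get(ext, []))
    return chain


server = {"utf-8": ["charset"], "shtml": ["includes"]}
nested = merge_ext_filters(server,
                           adds={"shtml": ["decoration"]},
                           removes=["utf-8"])
```

Here filters_for("test.utf-8.shtml", server) yields charset then includes, while the
nested container sees only decoration, because the .shtml association was replaced and
the .utf-8 one removed.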

----

  Note that SetHandler shares part of this problem: it can't be overridden in a nested
container.  This is wrong, so consider that UnSetHandler (no arg) will undo that problem.
Therefore SetFooFilter gets an UnSetFooFilter (no arg), and either can be undone
in a child container.  That's part of the problem, so onwards...

----

  That means that RemoveFooFilter BogusFilter .bak will take any filters away from the
.bak extension that were defined before for the .bak extension.  That does NOT mean
that .bak will cause other extensions to be removed, so foo.shtml.bak would KEEP its
Includes filter.  That is how Add|Remove mod_mime things work.

  For mod_mime's semantics, the syntax is AddFooFilter filters ext [ext...].  We could not
use the syntax AddFooFilters filter [filter...] ext since it's ambiguous. Instead I used
semicolons to allow multiple filters for the same extension, so an shtml-gz *could* be
defined to run through GUnZip;Includes (not that I see many doing this.  It will be an
obscure, rarely used but useful feature.)
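
  A small Python sketch of the two points above, the semicolon-delimited filter list and
the fact that removing the .bak association leaves other extensions alone.  The helper
names are hypothetical and this is a model of the described behaviour, not the patch
itself.

```python
def parse_filter_spec(spec):
    """Split a semicolon-delimited filter list such as 'GUnZip;Includes'."""
    return [name for name in spec.split(";") if name]


def filters_for(filename, ext_filters):
    """Collect filters for each extension in the filename, left to right."""
    chain = []
    for ext in filename.split(".")[1:]:
        chain.extend(ext_filters.get(ext, []))
    return chain


ext_filters = {
    "shtml": parse_filter_spec("Includes"),
    "shtml-gz": parse_filter_spec("GUnZip;Includes"),
    "bak": parse_filter_spec("BogusFilter"),
}

before = filters_for("foo.shtml.bak", ext_filters)

# Model of RemoveFooFilter for .bak: only the .bak association goes away.
ext_filters.pop("bak", None)
after = filters_for("foo.shtml.bak", ext_filters)
```

Before the removal foo.shtml.bak picks up Includes and BogusFilter; afterwards it still
keeps Includes from the .shtml extension.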

  So now we have a ton of extensions adding filters, will soon have AddFooFilterByType
adding filters, and finally SetFooFilter adding more at the end.  Handlers were never
this difficult to configure :(

  I'm suggesting, today, that we support the following syntax for all these directives;

{Add|Remove}{Input|Output}Filter [+|-]foo[;[+|-]bar...] ext [ext...]

{Add|Remove}{Input|Output}FilterByType [+|-]foo[;[+|-]bar...] [mime-type]

Set{Input|Output}Filter [+|-]foo[;[+|-]bar...]   OR  UnSet{Input|Output}Filter

  Again, UnSetFooFilter takes no args, it undoes the SetFilter, that's all.  It has no 
effect on the Add/RemoveFooFilter stuff.  Apache will get an UnSetHandler that does the
very same thing.

---

  These filters would be run in the order above.  An absolute syntax in any of the above
will CLEAR the content filter list.  Any of these directives will OVERRIDE the prior
definition of the same directive.  So in this example;

AddOutputFilter +charset .utf-8

AddOutputFilter +includes .shtml

<Directory />
    SetOutputFilter none
</Directory>

<Directory /webroot>
    SetOutputFilter +decoration
</Directory>

<Directory /webroot/someweb>
    UnSetOutputFilter
</Directory>

  If we ask for the file /something/test.utf-8.shtml, mod_mime hits utf-8, which adds the 
charset filter, we hit .shtml, which adds includes, and then the core filter clears them
all.

  If we ask for the file /webroot/test.utf-8.shtml, when the core filter merged the
<Dir> blocks it started with none, then changed its mind to +decoration.  So mod_mime hits
utf-8, which adds the charset filter, then .shtml, which adds includes, and the core filter
adds the decoration filter.

  If we ask for the file /webroot/someweb/test.shtml, mod_mime adds includes, and then
the core filter (after merging the <Directory> blocks) doesn't change a thing, because
the UnSetOutputFilter cancelled the inherited SetOutputFilter.
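
  The three cases above can be checked with a short Python model.  It is a sketch under
stated assumptions, not the proposed core.c logic: all function names are hypothetical,
"none" is treated as an absolute (unprefixed) token that adds nothing, and a spec that
contains any absolute token clears the list before the rest is applied.

```python
UNSET = object()   # stands for UnSetOutputFilter in a <Directory> block

EXT_FILTERS = {"utf-8": ["charset"], "shtml": ["includes"]}
DIRECTORY_SET = {
    "/": "none",
    "/webroot": "+decoration",
    "/webroot/someweb": UNSET,
}


def mime_chain(path):
    """Filters added by extensions, in left-to-right filename order."""
    name = path.rsplit("/", 1)[-1]
    chain = []
    for ext in name.split(".")[1:]:
        chain.extend(EXT_FILTERS.get(ext, []))
    return chain


def merged_set_spec(path):
    """The deepest matching block overrides; UNSET leaves no spec at all."""
    spec = None
    for prefix in sorted(DIRECTORY_SET, key=len):
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            value = DIRECTORY_SET[prefix]
            spec = None if value is UNSET else value
    return spec


def apply_spec(chain, spec):
    """Apply a [+|-]foo[;[+|-]bar...] spec to the filter list."""
    result = list(chain)
    if spec is None:
        return result
    tokens = [t for t in spec.split(";") if t]
    if any(t[0] not in "+-" for t in tokens):
        result = []                      # absolute syntax clears the list
    for t in tokens:
        if t == "none":
            continue
        if t[0] == "-":
            result = [f for f in result if f != t[1:]]
        else:
            result.append(t[1:] if t[0] == "+" else t)
    return result


def resolve(path):
    return apply_spec(mime_chain(path), merged_set_spec(path))
```

With this model, resolve("/something/test.utf-8.shtml") is empty,
resolve("/webroot/test.utf-8.shtml") is charset, includes, decoration, and
resolve("/webroot/someweb/test.shtml") is just includes.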

----

  I'm suggesting ALL of these filters run in the mime negotiation and fixups phases.  That
way, any module that uses the insert_filters hook will ALWAYS override these user configured
filters, so that modules will rarely be broken by user configuration errors.  Here are the
phases I suggest;

mime_types hook  :::  Add[foo]OutputFilter is run in mod_mime's hook

fixups hook      :::  Add[foo]OutputFilterByType is run in the core's hook

insert_hooks     :::  SetHandler (absolute) is run in the core's hook

----

  If we don't get this right, httpd-2.0 will be resoundingly booed by the Administrator
community a year from now.  Please - help me out with your comments ;)

Bill




Re: Broken SetOutputFilter/SetInputFilter semantics

Posted by Joshua Slive <jo...@slive.ca>.
[warning: annoying rambling to follow]

On Wed, 29 Aug 2001, William A. Rowe, Jr. wrote:

>   I'm suggesting, today, that we support the following syntax for all these directives;
>
> {Add|Remove}{Input|Output}Filter [+|-]foo[;[+|-]bar...] ext [ext...]
>
> {Add|Remove}{Input|Output}FilterByType [+|-]foo[;[+|-]bar...] [mime-type]
>
> Set{Input|Output}Filter [+|-]foo[;[+|-]bar...]   OR  UnSet{Input|Output}Filter
>
>   Again, UnSetFooFilter takes no args, it undoes the SetFilter, that's all.  It has no
> effect on the Add/RemoveFooFilter stuff.  Apache will get an UnSetHandler that does the
> very same thing.

It's great you are addressing this problem, since it will be an obvious
source of user annoyance.  But I don't know about your solution.

One problem with it is that the +|- syntax is confusing.  We have this for
Options, and I almost never see it used correctly.  It is especially bad
because it makes the final configuration very dependent on the
configuration processing order, which almost nobody understands
completely.

Another issue is the order in which the filters are applied. If I say
AddOutputFilter decompress .gz
AddOutputFilter includes .shtml
how do we decide which filter comes first?  The config file order of the
directives? The order of the extensions?  (Ouch!)
The same question applies to your syntax above.  The ordering of
the filters is rather non-obvious after several
SetOutputFilter +foo;-bar
SetOutputFilter +bar;-foo;+foobar
rounds.

A third problem is that I don't believe Apache 2.0 supports argument-less
directives.

Unfortunately, I can't come up with a great alternative.
A couple ideas:

1. Add an {input|output}FilterOrder directive which specifies the order
of filter processing.

2. Have Set{Input|Output}Filter take the special filter name "none" which
clears the filter stack.  (I believe we already have a similar, though
undocumented, handler that just forces processing back to the core.)

Then you could do things like

<directory /foo>
SetOutputFilter includes
</directory>

<directory /foo/bar>
SetOutputFilter decompress
OutputFilterOrder decompress includes
</directory>

<directory /foo/bar/baz>
SetOutputFilter none
SetOutputFilter includes
</directory>

I admit that last one is slightly non-obvious, but I don't think it is any
worse than the alternatives.
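
To make the intended result of those three blocks concrete, here is a rough Python
model.  It is only a sketch of the two ideas above: the function name is hypothetical,
and placing filters not named in OutputFilterOrder after the named ones is an
assumption.

```python
def apply_directives(inherited, directives):
    """Apply one <directory> block's directives, in order, to the inherited stack."""
    stack = list(inherited)
    for name, args in directives:
        if name == "SetOutputFilter":
            if args == ["none"]:
                stack = []                       # special name clears the stack
            else:
                stack.extend(a for a in args if a not in stack)
        elif name == "OutputFilterOrder":
            rank = {f: i for i, f in enumerate(args)}
            # Stable sort: unlisted filters keep their order after listed ones.
            stack.sort(key=lambda f: rank.get(f, len(args)))
    return stack


foo = apply_directives([], [("SetOutputFilter", ["includes"])])
bar = apply_directives(foo, [
    ("SetOutputFilter", ["decompress"]),
    ("OutputFilterOrder", ["decompress", "includes"]),
])
baz = apply_directives(bar, [
    ("SetOutputFilter", ["none"]),
    ("SetOutputFilter", ["includes"]),
])
```

Under these assumptions /foo runs includes, /foo/bar runs decompress then includes, and
/foo/bar/baz ends up with includes alone.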

A possible alternative would be a ClearOutputFilter directive which
can remove a specific list of filters from the stack after they
have been set by SetOutputFilter.  (The better syntax would be
RemoveOutputFilter, but that is already taken.)



Joshua.