Posted to dev@httpd.apache.org by Graham Leggett <mi...@sharp.fm> on 2004/08/24 01:54:32 UTC

AddOutputFilterByType oddness

Hi all,

I have just set up the most recent httpd v2.0.51-dev tree, and have 
configured a filter that strips leading whitespace from HTML:

AddOutputFilterByType STRIP text/html

The content is served by mod_proxy.

This seems to work fine for HTML requests, but I have noticed that this 
filter is also being applied to images (thus corrupting them). 
Why would the above directive apply to all content, instead of text/html 
only as is configured?

Looking at the following docs:

http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype

it says that filters are not applied to proxied requests (it does not 
give a reason why not). From the test above, however, this statement is 
false: filters are applied to proxied requests - all proxied requests.

Am I doing something wrong, or is AddOutputFilterByType broken?

Regards,
Graham
--

Re: AddOutputFilterByType oddness

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, September 23, 2004 12:43 PM +0100 Nick Kew <ni...@webthing.com> 
wrote:

> Basically it does the lookup/dispatch once per filter in the filterchain
> per request.  It checks that filter's providers until it finds a match.
> So for anything you could do with an [Add|Set]OutputFilter[ByType]
> that's one lookup per request.

Okay, so if I have three rules and ten filters, we'll be doing thirty checks, 
right?  And, this will happen even if mod_filter isn't configured - as 
mod_filter still needs to check ten times that it doesn't have anything to do, 
right?  Hmm.  How expensive is this again?

> mod_filter takes the content-type as it is at that point in the chain.
>
> Isn't the real nightmare where a filter calls ap_set_content_type and
> some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
> me is the idea of adding filters *as a side-effect*.

How wouldn't it be a side-effect?  It's intentional from the admin 
perspective, but a side-effect from the developer's perspective.

>>	  And, then
>> mod_deflate needs to be conditionally added (sub-case #1: it needs to be
>> added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
>> How and where is it added?  Are you inserting dummy filters?
>
> I'm not sure I follow.  It will dispatch to deflate based on the
> content-type (or other dispatch criterion) as it is at that point
> in the chain.

The question is at which point in the chain does deflate get added?

> So if the handler sets application/xml but that goes through an XSLT
> filter which sets it to text/html, then mod_filter sees application/xml
> if it's before the XSLT filter in the chain, or text/html after it.
>
> How can AddOutputFilterByType expect to cope with that?

I thought you suggested that mod_filter could easily handle this case.  I'm 
still not seeing how.

> But FWIW I have that working locally with
>
> FilterDeclare   filter1 Content-Type    CONTENT_SET
> FilterDeclare   filter2 Content-Length  CONTENT_SET
>
> FilterProvider  filter1 filter2 $text
> FilterProvider  filter2 DEFLATE >4000
>
> FilterChain     filter1
>
> to deflate all "text/*" documents of 4k or greater.

Can I comment that I think a clearer configuration syntax is going to be 
needed if we are going to axe all of the current filter directives?

AddOutputFilterByType, for all of its internal oddness, is a simple directive 
for an administrator to understand.  So, perhaps keep 'AddOutputFilterByType' 
and have it internally converted to a mod_filter directive.  But, I'm just 
not overly excited about moving all filter configuration directives to 
something akin to mod_rewrite.  Ouch.  -- justin

Re: AddOutputFilterByType oddness

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew
> <ni...@webthing.com> wrote:
>
> > It seems to me heavily counterintuitive that mixing ByType directives
> > with anything else means that the ByType filters *always* come last.
> > And that Remove won't affect them, but will affect others.
>
> I think we could get Remove*Filter to also delete the content-type filters.
>
> > Indeed.  mod_filter addresses this by configuring at the last moment,
> > so any earlier set_content_type()s are irrelevant.  I don't suppose it's
> > a panacea for everything, but I do think it's a significant improvement
> > on what we have.
>
> I'm concerned about the overhead of mod_filter having to check all of its
> rules each time a filter is invoked.  This is why I started to look through
> the code last night to see how it worked and how invasive it is.

It's improving with time (except when I introduce bugs...).  Merging in
the structs with util_filter saves on having to do superfluous lookups.

Basically it does the lookup/dispatch once per filter in the filterchain
per request.  It checks that filter's providers until it finds a match.
So for anything you could do with an [Add|Set]OutputFilter[ByType]
that's one lookup per request.
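
The per-filter lookup described above can be modelled in a few lines of C.
This is only a sketch under assumptions, not mod_filter's code:
provider_rule and select_provider are hypothetical names, and matching is
reduced to a substring test on the dispatch value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one provider rule: run `name` when the
 * dispatch value contains `match` as a substring. */
typedef struct {
    const char *name;
    const char *match;
} provider_rule;

/* Walk the providers in order and return the name of the first one
 * whose pattern matches.  NULL means the smart filter has nothing to
 * do for this request. */
const char *select_provider(const provider_rule *rules, size_t n,
                            const char *dispatch_value)
{
    size_t i;

    assert(rules != NULL || n == 0);
    if (dispatch_value == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        if (strstr(dispatch_value, rules[i].match) != NULL) {
            return rules[i].name;
        }
    }
    return NULL;
}
```

In this model the cost per smart filter is one pass over its own providers,
stopping at the first match, rather than a pass over every rule in the
configuration.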

> How would you handle the situation when filter #1 sets C-T to be
> "text/plain" and then filter #2 sets C-T to be "text/html"?

mod_filter takes the content-type as it is at that point in the chain.

Isn't the real nightmare where a filter calls ap_set_content_type and
some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
me is the idea of adding filters *as a side-effect*.

>	  And, then
> mod_deflate needs to be conditionally added (sub-case #1: it needs to be
> added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
> How and where is it added?  Are you inserting dummy filters?

I'm not sure I follow.  It will dispatch to deflate based on the
content-type (or other dispatch criterion) as it is at that point
in the chain.

So if the handler sets application/xml but that goes through an XSLT
filter which sets it to text/html, then mod_filter sees application/xml
if it's before the XSLT filter in the chain, or text/html after it.

How can AddOutputFilterByType expect to cope with that?
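
To make the XSLT case concrete, here is a minimal sketch of a dispatcher
that looks at the content type as it stands at its own position in the
chain.  All names (request_model, stage_fn, xslt_stage, dispatches_at) are
hypothetical and nothing here is httpd API; it only illustrates the
ordering effect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a request: only the content type matters here. */
typedef struct {
    const char *content_type;
} request_model;

/* A stage may rewrite the content type, as an XSLT filter would. */
typedef void (*stage_fn)(request_model *r);

void xslt_stage(request_model *r)
{
    r->content_type = "text/html";
}

/* Run the first `position` stages, then report whether a dispatcher
 * sitting at that position would match `pattern` (substring test). */
int dispatches_at(request_model *r, const stage_fn *chain, size_t position,
                  const char *pattern)
{
    size_t i;

    assert(r != NULL && pattern != NULL);
    for (i = 0; i < position; i++) {
        chain[i](r);
    }
    return r->content_type != NULL
        && strstr(r->content_type, pattern) != NULL;
}
```

With a handler type of application/xml and xslt_stage as the only stage, a
"text/html" dispatcher at position 0 does not match, while the same
dispatcher at position 1 does.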

>
> > From the user's perspective, it's simply more powerful and flexible.
> > Works with any request or response headers (not just content-type) or
> > environment variables.  Gets rid of constraints on ordering, like
> > AddOutputFilterByType filter always coming after other filters
> > regardless of ordering in httpd.conf.
> >
> > Example: I have a user who wants to insert mod_deflate in a reverse
> > proxy, but only for selected content-types AND not if the content
> > length is below a threshold.  How would he do that with the old filter
> > framework?
>
> I guess I'm not clear what the syntax is (I guess I should go read the
> docs).

That particular scenario is complex, and requires mod_filter to be
used as its own provider.  The point is, we *can* now support complex
setups (or will be able to - that chaining is still broken in CVS).

But FWIW I have that working locally with

FilterDeclare   filter1 Content-Type    CONTENT_SET
FilterDeclare   filter2 Content-Length  CONTENT_SET

FilterProvider  filter1 filter2 $text
FilterProvider  filter2 DEFLATE >4000

FilterChain     filter1

to deflate all "text/*" documents of 4k or greater.
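
Read as plain logic, the chained configuration above amounts to two tests
applied in sequence.  The sketch below assumes that "$text" means "the
Content-Type contains the string text" and that ">4000" is a strict
numeric comparison on Content-Length; would_deflate and its min_length
parameter are hypothetical names, not part of mod_filter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stage 1 (filter1): hand over to stage 2 only when the Content-Type
 * contains "text".  Stage 2 (filter2): choose DEFLATE only when the
 * Content-Length is strictly greater than min_length.  A negative
 * content_length stands for "unknown" and never matches. */
int would_deflate(const char *content_type, long content_length,
                  long min_length)
{
    assert(min_length >= 0);
    if (content_type == NULL || strstr(content_type, "text") == NULL) {
        return 0;
    }
    return content_length > min_length;
}
```

Under these assumptions would_deflate("text/html", 5000, 4000) is true,
while a 5000-byte image/gif or a 4000-byte text/html response is left
alone.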


>	  I definitely don't want to see the filters be configured like
> mod_rewrite.  It needs to be fairly straightforward, but still fairly
> simplistic.  I don't want to have users have to read a complicated manual
> or docs to set up filters.  KISS.

Indeed.  Do you think the examples in the manual page are too complex?

Bear in mind that the third example is no more complex than the first two,
yet suddenly enables a frequently-requested capability that simply isn't
possible with the old filtering.

> Well, the point by you committing it into our tree is that the rest of us
> are now responsible for it.  That's why I brought up the code style issue:

OK, OKOK!   I promise to look harder at the code style guidelines!
And I _did_ ask on the list a couple of weeks before introducing to CVS.

> I looked yesterday afternoon (and haven't seen any commits since then).  I

That'll be the latest version.  Which FWIW was introduced prematurely
because it introduced a new feature demanded by a user.  Only that turned
out to be broken, which is why I'm re-hacking that now.

-- 
Nick Kew

Re: AddOutputFilterByType oddness

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew 
<ni...@webthing.com> wrote:

> It seems to me heavily counterintuitive that mixing ByType directives
> with anything else means that the ByType filters *always* come last.
> And that Remove won't affect them, but will affect others.

I think we could get Remove*Filter to also delete the content-type filters.

> Indeed.  mod_filter addresses this by configuring at the last moment,
> so any earlier set_content_type()s are irrelevant.  I don't suppose it's
> a panacea for everything, but I do think it's a significant improvement
> on what we have.

I'm concerned about the overhead of mod_filter having to check all of its 
rules each time a filter is invoked.  This is why I started to look through 
the code last night to see how it worked and how invasive it is.

How would you handle the situation when filter #1 sets C-T to be 
"text/plain" and then filter #2 sets C-T to be "text/html"?  And, then 
mod_deflate needs to be conditionally added (sub-case #1: it needs to be 
added for 'text/plain'; sub-case #2: it needs to be added for 'text/html'). 
How and where is it added?  Are you inserting dummy filters?

> From the user's perspective, it's simply more powerful and flexible.
> Works with any request or response headers (not just content-type) or
> environment variables.  Gets rid of constraints on ordering, like
> AddOutputFilterByType filter always coming after other filters
> regardless of ordering in httpd.conf.
>
> Example: I have a user who wants to insert mod_deflate in a reverse
> proxy, but only for selected content-types AND not if the content
> length is below a threshold.  How would he do that with the old filter
> framework?

I guess I'm not clear what the syntax is (I guess I should go read the 
docs).  I definitely don't want to see the filters be configured like 
mod_rewrite.  It needs to be fairly straightforward, but still fairly 
simplistic.  I don't want to have users have to read a complicated manual 
or docs to set up filters.  KISS.

> From a developers perspective, I wrote it for myself, and have at least
> two other developers using it operationally in their product.  Time will
> tell what others may use it for.

Well, the point by you committing it into our tree is that the rest of us 
are now responsible for it.  That's why I brought up the code style issue: 
we already have a number of modules that were never fully integrated or 
reviewed.  And, then the person who dropped the code ran away and left the 
code in a goofy state.  (See mod_rewrite, mod_ssl, mod_cache, etc.)

> When was that?  I made quite a lot of updates to the style towards
> conforming (like eliminating tabs and realigning some braces) before
> committing to CVS, but I'm willing to believe I need to look more
> carefully.

I looked yesterday afternoon (and haven't seen any commits since then).  I 
will say the most distracting parts are the odd spacing (i.e. parenthesis 
and semi-colons) as well as line spacing.  Unfortunately, I get distracted 
by shiny things such as improper code style such that I can't focus on the 
code itself.  =)  -- justin

Re: AddOutputFilterByType oddness

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <ni...@webthing.com>
> wrote:
>
> > I've said it before and I'll say it again: AddOutputFilterByType is
> > fundamentally unsatisfactory.  This confusion is an effect, not cause.
>
> Suffice to say, I disagree.
>
> > * Configuration is inconsistent with other filter directives.  The
> >   relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
> >   and, from a user POV, broken.
>
> I think it's really clear from the user's perspective.  I think the problem
> comes in on the developer's side.

It seems to me heavily counterintuitive that mixing ByType directives
with anything else means that the ByType filters *always* come last.
And that Remove won't affect them, but will affect others.

> > * Tying it to ap_set_content_type is, to say the least, hairy.
> >   IMO we shouldn't *require* modules to call this, and it's utterly
> >   unreasonable to expect that it will never be called more than once
> >   for a request, given the number of modules that might take an interest.
> >   Especially when subrequests and internal redirects may be involved.
>
> We have *always* mandated that ap_set_content_type() should be called rather
> than setting r->content_type.  (I wish we could remove content_type from
> request_rec instead.)

Indeed.  But that doesn't prevent it being called multiple times, perhaps
from different modules.  So using it to insert filters leaves lots of
potential for trouble.

> > * It's a complexity just waiting for modules to break on it.
>
> Anything that depends upon content-type like this is going to be hairy because
> there may be several 'right' answers during the course of the request.

Indeed.  mod_filter addresses this by configuring at the last moment,
so any earlier set_content_type()s are irrelevant.  I don't suppose it's
a panacea for everything, but I do think it's a significant improvement
on what we have.

> > I've made some more updates to mod_filter since I last posted on the
> > subject, and I'm getting some very positive feedback from real users.
> > For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> > it with mod_filter.
>
> I've yet to see a clear and concise statement as to how mod_filter will solve
> this problem in a better and more efficient way.  (Especially from a user's
> perspective, but also from a developer's perspective.)

From the user's perspective, it's simply more powerful and flexible.
Works with any request or response headers (not just content-type) or
environment variables.  Gets rid of constraints on ordering, like
AddOutputFilterByType filter always coming after other filters
regardless of ordering in httpd.conf.

Example: I have a user who wants to insert mod_deflate in a reverse
proxy, but only for selected content-types AND not if the content
length is below a threshold.  How would he do that with the old filter
framework?

From a developer's perspective, I wrote it for myself, and have at least
two other developers using it operationally in their product.  Time will
tell what others may use it for.

> I will also comment that I looked in the mod_filter code the other day and was
> disappointed that it doesn't follow our coding style at all or even have
> comments that help people understand what it is trying to do inside the .c
> file.

When was that?  I made quite a lot of updates to the style towards
conforming (like eliminating tabs and realigning some braces) before
committing to CVS, but I'm willing to believe I need to look more
carefully.

-- 
Nick Kew

Re: Style changes (was: AddOutputFilterByType oddness)

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, September 23, 2004 5:38 PM +0200 Graham Leggett 
<mi...@sharp.fm> wrote:

> At the same time I (and others) have been criticised in the past for trying
> to propose patches that do style changes or that correct whitespace.
>
> What is the official policy on this?
>
> I am quite happy to do as many style cleanups as are necessary, on
> condition nobody objects to me doing so.

Some guidelines: Separate style changes from functional changes.  You should 
also do large style commits before doing large functional commits (there are a 
couple of cases where it makes sense to reverse that though).  Style changes 
only get applied to the HEAD - they can't be backported to 2.0.

I think if you follow that, you won't get much flak.  But, there are a number 
of modules that are getting hard to maintain because they don't comply.  So, I 
think it behooves us to try to clean them up before we start 2.2.  -- justin

Style changes (was: AddOutputFilterByType oddness)

Posted by Graham Leggett <mi...@sharp.fm>.
Justin Erenkrantz wrote:

> I'm getting annoyed by people doing massive code drops (i.e. mod_filter, 
> mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and 
> have no comments.  It makes it much harder to go and fix bugs in 'em.  

At the same time I (and others) have been criticised in the past for 
trying to propose patches that do style changes or that correct whitespace.

What is the official policy on this?

I am quite happy to do as many style cleanups as are necessary, on 
condition nobody objects to me doing so.

Regards,
Graham
--


Re: AddOutputFilterByType oddness

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <ni...@webthing.com> 
wrote:

> I've said it before and I'll say it again: AddOutputFilterByType is
> fundamentally unsatisfactory.  This confusion is an effect, not cause.

Suffice to say, I disagree.

> * Configuration is inconsistent with other filter directives.  The
>   relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
>   and, from a user POV, broken.

I think it's really clear from the user's perspective.  I think the problem 
comes in on the developer's side.

> * Tying it to ap_set_content_type is, to say the least, hairy.
>   IMO we shouldn't *require* modules to call this, and it's utterly
>   unreasonable to expect that it will never be called more than once
>   for a request, given the number of modules that might take an interest.
>   Especially when subrequests and internal redirects may be involved.

We have *always* mandated that ap_set_content_type() should be called rather 
than setting r->content_type.  (I wish we could remove content_type from 
request_rec instead.)

> * It's a complexity just waiting for modules to break on it.

Anything that depends upon content-type like this is going to be hairy because 
there may be several 'right' answers during the course of the request.

> I've made some more updates to mod_filter since I last posted on the
> subject, and I'm getting some very positive feedback from real users.
> For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> it with mod_filter.

I've yet to see a clear and concise statement as to how mod_filter will solve 
this problem in a better and more efficient way.  (Especially from a user's 
perspective, but also from a developer's perspective.)

> mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
> in no hurry to do that.  What I can also do is re-implement all the
> outputfilter directives within mod_filter and its updated framework.

I will also comment that I looked in the mod_filter code the other day and was 
disappointed that it doesn't follow our coding style at all or even have 
comments that help people understand what it is trying to do inside the .c 
file.  This all makes it very difficult to understand the code.  I'd greatly 
appreciate it if mod_filter (and the code that you inserted elsewhere - i.e. 
in util_filter.c) would conform to our style guidelines and had some comments 
inside of it that say what it does (or is trying to do).

For example, some of the things it does just make no sense at all. 
filter_bucket_type() is completely bogus and needs to be tossed.  The 
type->name field in the bucket should be used instead.

I'm getting annoyed by people doing massive code drops (i.e. mod_filter, 
mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and have 
no comments.  It makes it much harder to go and fix bugs in 'em.  -- justin

Re: AddOutputFilterByType oddness

Posted by Nick Kew <ni...@webthing.com>.
On Sat, 18 Sep 2004, Justin Erenkrantz wrote:

> > But ap_add_output_filters_by_type() explicitly does nothing for a
> > proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
> > text/plain text/html" seems to work as expected here for a forward proxy
> > with this applied: maybe I'm missing something fundamental...
>
> My recollection is initially it didn't have the proxy check, then FirstBill
> had a reason why proxied requests shouldn't work with AddOutputFilterByType.

I've said it before and I'll say it again: AddOutputFilterByType is
fundamentally unsatisfactory.  This confusion is an effect, not cause.

* Configuration is inconsistent with other filter directives.  The
  relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
  and, from a user POV, broken.
* Tying it to ap_set_content_type is, to say the least, hairy.
  IMO we shouldn't *require* modules to call this, and it's utterly
  unreasonable to expect that it will never be called more than once
  for a request, given the number of modules that might take an interest.
  Especially when subrequests and internal redirects may be involved.
* It's a complexity just waiting for modules to break on it.

I've made some more updates to mod_filter since I last posted on the
subject, and I'm getting some very positive feedback from real users.
For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
it with mod_filter.

mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
in no hurry to do that.  What I can also do is re-implement all the
outputfilter directives within mod_filter and its updated framework.

-- 
Nick Kew

Re: AddOutputFilterByType oddness

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, September 16, 2004 5:11 PM +0100 Joe Orton <jo...@redhat.com> 
wrote:

> But ap_add_output_filters_by_type() explicitly does nothing for a
> proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
> text/plain text/html" seems to work as expected here for a forward proxy
> with this applied: maybe I'm missing something fundamental...

My recollection is initially it didn't have the proxy check, then FirstBill 
had a reason why proxied requests shouldn't work with AddOutputFilterByType.

Would have to search the archives to remember why.  *sigh*  -- justin

Re: AddOutputFilterByType oddness

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Aug 25, 2004 at 02:40:39PM +0200, Graham Leggett wrote:
> Justin Erenkrantz wrote:
> >Ultimately, all that is needed is a call to ap_set_content_type() before 
> >any bytes are written to the client to get AddOutputFilterByType to 
> >work. Perhaps with the recent momentum behind mod_proxy work, someone 
> >could investigate that and get mod_proxy fixed.
> 
> ap_set_content_type() is called on line 769 of proxy_http.c:

But ap_add_output_filters_by_type() explicitly does nothing for a
proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
text/plain text/html" seems to work as expected here for a forward proxy
with this applied: maybe I'm missing something fundamental...

--- server/core.c~	2004-08-31 09:16:56.000000000 +0100
+++ server/core.c	2004-09-16 16:48:09.000000000 +0100
@@ -2875,11 +2875,10 @@
     conf = (core_dir_config *)ap_get_module_config(r->per_dir_config,
                                                    &core_module);
 
-    /* We can't do anything with proxy requests, no content-types or if
-     * we don't have a filter configured.
+    /* We can't do anything with no content-type or if we don't have a
+     * filter configured.
      */
-    if (r->proxyreq != PROXYREQ_NONE || !r->content_type ||
-        !conf->ct_output_filters) {
+    if (!r->content_type || !conf->ct_output_filters) {
         return;
     }
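
The effect of the patch on the early-return guard can be restated as two
predicates.  This is a stand-alone sketch, not the code in server/core.c:
guard_input and both function names are hypothetical, and the three fields
merely stand in for r->proxyreq, r->content_type and
conf->ct_output_filters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the values the guard reads. */
typedef struct {
    int proxied;               /* models r->proxyreq != PROXYREQ_NONE */
    const char *content_type;  /* models r->content_type */
    int has_ct_filters;        /* models conf->ct_output_filters != NULL */
} guard_input;

/* Guard before the patch: proxied requests are always skipped. */
int skip_before_patch(const guard_input *in)
{
    assert(in != NULL);
    return in->proxied || !in->content_type || !in->has_ct_filters;
}

/* Guard after the patch: only a missing content type or a missing
 * by-type configuration causes the early return. */
int skip_after_patch(const guard_input *in)
{
    assert(in != NULL);
    return !in->content_type || !in->has_ct_filters;
}
```

The two predicates differ only for a proxied request that has both a
content type and by-type filters configured: the old guard skips it, the
new one lets the filters be added.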
 


Re: AddOutputFilterByType oddness

Posted by Graham Leggett <mi...@sharp.fm>.
Justin Erenkrantz wrote:

>> Putting on an end user hat I see no reason why AddOutputFilterByType
>> shouldn't do exactly what it says it does.

> I believe it has more to do with mod_proxy than the filter design.  No 
> one, at the time we added AddOutputFilterByType, wanted to rewrite 
> mod_proxy to be knowledgeable about filters.

I wrote mod_proxy to be knowledgeable about filters shortly after v2.0 
came about, it was one of the first major modules to support filters.

> Ultimately, all that is needed is a call to ap_set_content_type() before 
> any bytes are written to the client to get AddOutputFilterByType to 
> work. Perhaps with the recent momentum behind mod_proxy work, someone 
> could investigate that and get mod_proxy fixed.

ap_set_content_type() is called on line 769 of proxy_http.c:

if ((buf = apr_table_get(r->headers_out, "Content-Type"))) {
     ap_set_content_type(r, apr_pstrdup(p, buf));
}

Is there anything else that needs to be done to make 
AddOutputFilterByType to work?

Is apr_table_get() case sensitive?
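
On that last question: APR tables compare keys without regard to case, so
a lookup for "Content-Type" also finds a header stored as "content-type".
The sketch below is a stand-alone model of that behaviour, not APR code;
header_entry and header_get are hypothetical names, and strcasecmp() is
the POSIX function from <strings.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical model of one header table entry. */
typedef struct {
    const char *key;
    const char *val;
} header_entry;

/* Return the value of the first entry whose key equals `key` ignoring
 * case, or NULL when no entry matches. */
const char *header_get(const header_entry *h, size_t n, const char *key)
{
    size_t i;

    assert(key != NULL);
    for (i = 0; i < n; i++) {
        if (strcasecmp(h[i].key, key) == 0) {
            return h[i].val;
        }
    }
    return NULL;
}
```

If the lookup behaves this way, the case of the header name sent by the
backend should not be what stops ap_set_content_type() from being called.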

Regards,
Graham
--

Re: AddOutputFilterByType oddness

Posted by Jess Holle <je...@ptc.com>.
If I understand this correctly, this is a necessity for 
mod_proxy/mod_proxy_ajp to replace mod_jk; otherwise it would be a 
significant regression from mod_jk (where this issue was fixed last 
year, as I recall).

--
Jess Holle

Justin Erenkrantz wrote:

> --On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett 
> <mi...@sharp.fm> wrote:
>
>> Putting on an end user hat I see no reason why AddOutputFilterByType
>> shouldn't do exactly what it says it does.
>
> I believe it has more to do with mod_proxy than the filter design.  No 
> one, at the time we added AddOutputFilterByType, wanted to rewrite 
> mod_proxy to be knowledgeable about filters.
>
> Ultimately, all that is needed is a call to ap_set_content_type() 
> before any bytes are written to the client to get 
> AddOutputFilterByType to work. Perhaps with the recent momentum behind 
> mod_proxy work, someone could investigate that and get mod_proxy 
> fixed.  -- justin


Re: AddOutputFilterByType oddness

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett 
<mi...@sharp.fm> wrote:

> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

I believe it has more to do with mod_proxy than the filter design.  No one, 
at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to 
be knowledgeable about filters.

Ultimately, all that is needed is a call to ap_set_content_type() before 
any bytes are written to the client to get AddOutputFilterByType to work. 
Perhaps with the recent momentum behind mod_proxy work, someone could 
investigate that and get mod_proxy fixed.  -- justin

Re: AddOutputFilterByType oddness

Posted by Geoffrey Young <ge...@modperlcookbook.org>.

Graham Leggett wrote:
> Nick Kew wrote:
> 
>>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>>> configured a filter that strips leading whitespace from HTML:
>>>
>>> AddOutputFilterByType STRIP text/html
>>>
>>> The content is served by mod_proxy.
> 
> 
>> As it stands, that can't work.
> 
> 
> Then as it stands filters are broken.
> 
> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

for the record, I know of other cases where AddOutputFilterByType just
doesn't cut it, specifically wrt filter_init.  see

  http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=107090791508163&w=2

for more details.  while I'm trying to address a separate issue there, the
example tarball shows that AddOutputFilterByType is broken for even some
core module setups.

HTH

--Geoff

Re: AddOutputFilterByType oddness

Posted by Graham Leggett <mi...@sharp.fm>.
Nick Kew wrote:

>>I have just set up the most recent httpd v2.0.51-dev tree, and have
>>configured a filter that strips leading whitespace from HTML:
>>
>>AddOutputFilterByType STRIP text/html
>>
>>The content is served by mod_proxy.

> As it stands, that can't work.

Then as it stands filters are broken.

Putting on an end user hat I see no reason why AddOutputFilterByType 
shouldn't do exactly what it says it does.

> It's a manifestation of the problem I'm addressing by reviewing
> the filter architecture: see http://www.apachetutor.org/dev/smart-filter
> and the "Ideas for smart filtering" thread here.

Reading the above, it seems that people are allergic to having filters 
look at headers to decide whether they should be valid or not.

Having a totally generic non HTTP filter sounds like a nice idea, but in 
practice it's a real pain in the ass. The filters need the knowledge 
contained in the headers regardless otherwise they simply won't work. 
They can either access the headers directly, or they can access some 
generic interface that wraps the headers into something generic for the 
filters to access. Right now it seems filters do neither.

This is really annoying for an end user. Having developed the filter we 
need for our application, we deploy it and now find we cannot use it. 
For us it's back to the drawing board. :(

Regards,
Graham
--

Re: AddOutputFilterByType oddness

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 24 Aug 2004, Nick Kew wrote:

> I actually have an implementation based on the discussion document and
> addressing the concerns people raised in the thread.  I hope to find
> time to finish the accompanying documentation and post it here round
> about this coming weekend.

OK, since you seem to have a real-life use for it, here goes.  As I
said before, I wasn't planning to post without a little more testing
and accompanying documents and discussion, but what the ****?
I'm sure I'll regret this premature posting ....

Mini-Synopsis:


# 1. Declare a smart filter that dispatches on Content-Type
FilterDeclare	myfilter	Content-Type


# 2. Declare your filter as a Provider, to run whenever Content-Type
#    includes the string "text/html"
FilterProvider	myfilter	STRIP	$text/html


# 3. Set the smart filter chain to this filter where you want to apply it
<Location scope-of-your-proxy>
	FilterChain	=myfilter
</Location>

-- 
Nick Kew

Re: AddOutputFilterByType oddness

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 24 Aug 2004, Graham Leggett wrote:

> I have just set up the most recent httpd v2.0.51-dev tree, and have
> configured a filter that strips leading whitespace from HTML:
>
> AddOutputFilterByType STRIP text/html
>
> The content is served by mod_proxy.

As it stands, that can't work.

It's a manifestation of the problem I'm addressing by reviewing
the filter architecture: see http://www.apachetutor.org/dev/smart-filter
and the "Ideas for smart filtering" thread here.

I actually have an implementation based on the discussion document and
addressing the concerns people raised in the thread.  I hope to find
time to finish the accompanying documentation and post it here round
about this coming weekend.

> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>
> it says that filters are not applied by proxied requests (It does not
> give a reason why not).

The URL above makes it clear what's happening there.

-- 
Nick Kew

Re: AddOutputFilterByType oddness

Posted by Graham Leggett <mi...@sharp.fm>.
William A. Rowe, Jr. wrote:

>>>Is your DefaultType set to text/html?
>>
>>It's set like so:
>>
>>DefaultType text/plain

> You are proxying content?  What does the HEAD /image.gif HTTP/1.0
> report for content type from the backend server?

It says this:

[root@gatekeeper root]# telnet gatekeeper.fma.co.za 80
Trying 196.30.143.210...
Connected to gatekeeper.fma.co.za.
Escape character is '^]'.
HEAD /patricia/policy/images/tabaccounting1.gif HTTP/1.1
Host: gatekeeper.fma.co.za

HTTP/1.1 200 OK
Date: Tue, 24 Aug 2004 10:05:54 GMT
Server: Apache-Coyote/1.1
ETag: W/"1636-1092965561000"
Last-Modified: Fri, 20 Aug 2004 01:32:41 GMT
Content-Type: image/gif
Connection: close

Connection closed by foreign host.

Regards,
Graham
--

Re: AddOutputFilterByType oddness

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 07:23 PM 8/23/2004, Graham Leggett wrote:
>Paul Querna wrote:
>
>>Is your DefaultType set to text/html?
>
>It's set like so:
>
>DefaultType text/plain

You are proxying content?  What does the HEAD /image.gif HTTP/1.0
report for content type from the backend server?

Bill



Re: AddOutputFilterByType oddness

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> Is your DefaultType set to text/html?

It's set like so:

DefaultType text/plain

> On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
> 
>>Hi all,
>>
>>I have just set up the most recent httpd v2.0.51-dev tree, and have 
>>configured a filter that strips leading whitespace from HTML:
>>
>>AddOutputFilterByType STRIP text/html
>>
>>The content is served by mod_proxy.
>>
>>This seems to work fine for HTML requests, but I have noticed that this 
>>filter is also being applied to images as well (thus corrupting them). 
>>Why would the above directive apply to all content, instead of text/html 
>>only as is configured?
>>
>>Looking at the following docs:
>>
>>http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>>
>>it says that filters are not applied by proxied requests (It does not 
>>give a reason why not). From the test above however this statement is 
>>false, filters are applied to proxy requests - all proxied requests.
>>
>>Am I doing something wrong, or is AddOutputFilterByType broken?

Regards,
Graham
--

Re: AddOutputFilterByType oddness

Posted by Paul Querna <ch...@force-elite.com>.
Is your DefaultType set to text/html?

On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
> Hi all,
> 
> I have just set up the most recent httpd v2.0.51-dev tree, and have 
> configured a filter that strips leading whitespace from HTML:
> 
> AddOutputFilterByType STRIP text/html
> 
> The content is served by mod_proxy.
> 
> This seems to work fine for HTML requests, but I have noticed that this 
> filter is also being applied to images as well (thus corrupting them). 
> Why would the above directive apply to all content, instead of text/html 
> only as is configured?
> 
> Looking at the following docs:
> 
> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
> 
> it says that filters are not applied by proxied requests (It does not 
> give a reason why not). From the test above however this statement is 
> false, filters are applied to proxy requests - all proxied requests.
> 
> Am I doing something wrong, or is AddOutputFilterByType broken?
> 
> Regards,
> Graham
> --