You are viewing a plain text version of this content. The canonical link for it is here.
Posted to bugs@httpd.apache.org by bu...@apache.org on 2011/12/15 17:15:47 UTC

DO NOT REPLY [Bug 52342] New: ap_internal_redirect dropping filters means inconsistent behaviour for includes

https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

             Bug #: 52342
           Summary: ap_internal_redirect dropping filters means
                    inconsistent behaviour for includes
           Product: Apache httpd-2
           Version: 2.2.21
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Core
        AssignedTo: bugs@httpd.apache.org
        ReportedBy: matthew.byng-maddick@bbc.co.uk
    Classification: Unclassified


Created attachment 28076
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28076
mod_rewrite and mod_http patch to allow internal_redirect to retain filters and
notes

In our setup, we have people using .htaccess files (still). We also have lots
of proxying between various different service groups of the website.

In order to make the proxying consistent, we have a global main config
ProxyPass rule, which points to the correct service group for that environment
(eg. live, stage, test etc).

So, in main config:
ProxyPass /_proxy_/pal/ http://pal.<environment>.<site>/

And then in .htaccess:
RewriteRule mything /_proxy_/pal/myotherthing

Which, because it's in .htaccess gets internal-redirected and properly handled
by the mod_proxy stage. All good.

We *also* use mod_include extensively on this platform, and we have an
extension "sssi" to which we make sure we add an INCLUDES Output filter.

What we end up with, however, is that if you have, in .htaccess:
RewriteRule ^mything\.sssi$ http://pal.<environment>.<site>/mysite/mything.sssi
[P]
then the include gets parsed, as expected. If, on the other hand, you have:
RewriteRule ^mything\.sssi$ /_proxy_/pal/mysite/mything.sssi
then the includes filter gets stripped (described below) and the inclusion just
gets injected into the page instead of being parsed as sssi.

This is basically due to the default behaviour of internal_internal_redirect()
in modules/http/http_request.c, which strips output filters (except for the
sub_req filter in the case that that's being used) and replaces a completely
new r->notes table.

This is sensible behaviour when an internal redirect is being used to, say,
generate an error page, or similar, but is less useful when you then get
different behaviour in mod_rewrite depending on whether you put the RewriteRule
in main config or .htaccess and where in main config it goes.

The attached patch adds a new ap_internal_redirect_filter() function in the
httpd public API, which keeps the input and output filter chain and notes
(while still replacing the environment) by changing the calling convention
slightly for internal_internal_redirect() which isn't part of the public API,
and adds a new RewriteOptions option to mod_rewrite which allows it to do the
ap_internal_redirect_filter() when it redirects, rather than always doing the
ap_internal_redirect() which will strip the filter.

(the reason for the 1<<3 as the value rather than 1<<2 is that we're also
currently applying 1<<2 as part of the bug in 48304 - which is now merged with
httpd-2.4.x)

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

--- Comment #1 from Graham Leggett <mi...@sharp.fm> 2011-12-17 13:36:10 UTC ---
I'm smelling a bug - if you're using AddOuputFilter, that's all processed
within mod_mime.c in the type_checker hook, which in turn runs before fixups,
and this in turn is triggered from ap_process_request_internal(), which is in
turn triggered from ap_internal_redirect(), which is called by mod_rewrite in
your case, so the output filters should be re-added when the request comes
round again, the trick is finding out why this isn't happening.

Further digging finds two possible causes, and we might be suffering both:

- Something inside mod_mime.c:find_ct is aborting early and isn't reaching the
point where output_filters are added. Would it be possible to put a breakpoint
here and see if we ever reach this code? We'll reach it at least once for the
initial request, key though is that we should reach this again for the internal
direct, if we don't, we'd need to see why it's bailing out early. Most
specifically, do we reach this code:

           if (exinfo->output_filters && r->proxyreq == PROXYREQ_NONE) {
               const char *filter, *filters = exinfo->output_filters;
               while (*filters
                   && (filter = ap_getword(r->pool, &filters, ';'))) {
                   ap_add_output_filter(filter, NULL, r, r->connection);
               }
               if (conf->multimatch & MULTIMATCH_FILTERS) {
                   found = 1;
               }
           }

I suspect we might be reaching this code, only to find we've detected it as a
proxy request, and therefore aren't adding filters (In other words "r->proxyreq
== PROXYREQ_NONE" is false).

I can kinda-sorta see why we might want to not apply extensions to proxy
requests, but then at the same time I also see cases like this one where we
would want to - the URL ends with .sssi, whether it comes from a proxy or not,
who cares, we want the INCLUDES filter.

I suspect the fix might be to add a config directive that allows the end user
to ignore the "r->proxyreq == PROXYREQ_NONE" bit, or to just take that away
entirely (in 2.4).

In fact, this might be the cause of the inconsistency between using a ProxyPass
and the mod_rewrite [P] flag - in the ProxyPass case, r->proxyreq ==
PROXYREQ_NONE is false and this code is bypassed. In the [P] flag case,
r->proxyreq == PROXYREQ_NONE is true because the [P] flag hasn't been evaluated
yet (I *think* that happens 1 phase later, in fixups), so we get our filters in
this case.

Try this patch in other words, does this make it work?

Index: modules/http/mod_mime.c
===================================================================
--- modules/http/mod_mime.c    (revision 1214429)
+++ modules/http/mod_mime.c    (working copy)
@@ -895,7 +895,7 @@
             * setting redundant filters.    2, we insert these in the types
             * config hook, which may be too early (dunno.)
             */
-            if (exinfo->input_filters && r->proxyreq == PROXYREQ_NONE) {
+            if (exinfo->input_filters) {
                const char *filter, *filters = exinfo->input_filters;
                while (*filters
                    && (filter = ap_getword(r->pool, &filters, ';'))) {
@@ -905,7 +905,7 @@
                    found = 1;
                }
            }
-            if (exinfo->output_filters && r->proxyreq == PROXYREQ_NONE) {
+            if (exinfo->output_filters) {
                const char *filter, *filters = exinfo->output_filters;
                while (*filters
                    && (filter = ap_getword(r->pool, &filters, ';'))) {

- AddOutputFilter doesn't actually mean "add" (and this is broken). Tracing the
code, when we merge configs from one location/directory to another, we end up
replacing the previous filter definition for that extension, instead of adding
it. Or to put this another way, if you have this:

AddOutputFilter FOO html
<Location /baz>
 AddOutputFilter BAR html
</Location>

The BAR filter ends up replacing the FOO filter completely for the html type,
which is broken - add should mean just that, add, not replace.

To test this, put a breakpoint inside mod_mime:overlay_extension_mappings, and
see if you reach this code:

   if (overlay_info->output_filters) {
       new_info->output_filters = overlay_info->output_filters;
   }

In theory, you should reach this code twice. If this bug is present, you'll
reach a point where new_info->output_filters evaluates to "INCLUDES" (or
something containing INCLUDES), and overlay_info->output_filters evaluates to
something else (like "BUFFER", or "DEFLATE", or...).

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

Matthew Byng-Maddick <ma...@bbc.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Core                        |mod_include

--- Comment #6 from Matthew Byng-Maddick <ma...@bbc.co.uk> 2012-01-23 19:26:45 UTC ---
So, having dug a lot deeper, and with a lot more ap_log_rerror(APLOG_MARK,
APLOG_DEBUG...) lines, I've finally found why this doesn't work, it's not
actually to do with keeping the filters or not, though obviously keeping the
existing filter, and the existing context in the internal-redirect case does
help (but is wrong).

It's actually what happens in mod_include, basically, in handle_include()
(which handles the <!--#include tag, we make the subrequest (as a file or uri),
which creates a completely empty r->request_config, fine. Before we call
ap_run_sub_req() on the new subrequest request_rec (which is actually called
"rr" in that function) we do:
        /* See the Kludge in includes_filter for why.
         * Basically, it puts a bread crumb in here, then looks
         * for the crumb later to see if its been here.
         */
        if (rr) {
            ap_set_module_config(rr->request_config, &include_module, r);
        }

In the actual filtering call (includes_filter()), we find:
    if ((parent = ap_get_module_config(r->request_config, &include_module))) {
        /* Kludge --- for nested includes, we want to keep the subprocess
         * environment of the base document (for compatibility); that means
         * torquing our own last_modified date as well so that the
         * LAST_MODIFIED variable gets reset to the proper value if the
         * nested document resets <!--#config timefmt -->.
         */
        r->subprocess_env = r->main->subprocess_env;
        apr_pool_join(r->main->pool, r->pool);
        r->finfo.mtime = r->main->finfo.mtime;
    }
    else {
        /* we're not a nested include, so we create an initial
         * environment */
        ap_add_common_vars(r);
        ap_add_cgi_vars(r);
        add_include_vars(r);
    }

This is obviously absolutely horrid, and explains why this doesn't work. I'm
guessing that the actual handling should be something like:
    initr=r;
    while(initr->prev) {
        initr = initr->prev;
    }
    if ((parent = ap_get_module_config(initr->request_config,
&include_module))) {
        r->subprocess_env = parent->subprocess_env;
...
using parent (as the token stuck in) rather than r->main.

With that in place, it can still be perfectly possible to set filters based on
existing stuff, without necessarily needing Graham's patches to mod_mime which
I definitely have concerns about.

This seems like the right solution here, though because you don't have a proper
r->request_config set up, I'm wondering if it can be done with a switch on
mod_include, so as not to change existing functionality generically.

Cheers

MBM

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

Matthew Byng-Maddick <ma...@bbc.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #28076|0                           |1
        is obsolete|                            |

--- Comment #7 from Matthew Byng-Maddick <ma...@bbc.co.uk> 2012-01-24 18:33:07 UTC ---
Created attachment 28201
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28201
Allows SSI to consider following internal-redirects on its subrequests as valid

The above has the kind of properties we want - that we follow the
internal-redirect chain back to its source, which means that we can see the
original sub-request's tagging (or not), and hence act just like we would
within that sub-request. This is, of course, a kludge on a kludge, and in many
ways I don't like it.

There are other ways of making sure that the mod_include filter actually gets
activated, of course, which mean that applying stuff by extension where you're
using mod_proxy possibly isn't the right thing after all.

That the location that the hack acts on is the source location rather than the
destination is a bit of an annoyance, but if you're using mod_rewrite and
.htaccess (and probably SSI) the source is likely to be the one you have more
control over...

Cheers

MBM

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

--- Comment #3 from Joe Orton <jo...@redhat.com> 2012-01-18 16:46:48 UTC ---
"The BAR filter ends up replacing the FOO filter completely for the html type,
which is broken - add should mean just that, add, not replace."

The AddBlah directives *all* work this way with mod_mime; I agree it is not
particularly intuitive, but that it how they have always worked (and have been
documented!).  If you are trying to set up the filter chain for a reverse proxy
config surely SetOutputFilter is more appropriate?

Is it possible to create a minimal self-contained test case which demonstrates
the issue you're seeing here, using built-in modules?  I'm struggling to
understand what exactly the issue is here.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

--- Comment #5 from Joe Orton <jo...@redhat.com> 2012-01-19 12:05:09 UTC ---
(In reply to comment #4)
> Hi Joe,
> 
> The proxy thing is perhaps confusing the issue. Maybe this is more easily done
> by doing something like:
> 
> <Proxy balancer://.../**/*.sssi>
>   SetOutputFilter INCLUDES
> </Proxy>

LocationMatch would be the obvious choice to me.

> I'm actually more concerned, as I say above, by the difference in behaviour of
> the filters going:
> main->sub_request, and main->(sub_request->internal_redirect), which I think is
> very broken.

I don't understand your explanation of what it is broken, I'm afraid.  You can
try again, but a test case would be a way to communicate unambiguously.

In the case where, say:

1. main is uri /X
2. main requires a subreq for, say, SSI exec, so main->subreq is /X/Y
3. filter FOO is configured to apply to ONLY /X/Y, and is added to
r->output_filters
4. main->subreq does an internal redirect to /X/Z
5. main->subreq's handler is run

at step (5), is your contention that FOO should still be in the filter chain
for main->subreq's r->output_filters?  Or am I missing the point completely?

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

--- Comment #2 from Matthew Byng-Maddick <ma...@bbc.co.uk> 2012-01-16 12:26:31 UTC ---
I've spent a lot more time thinking about this, and Graham (minfrin) and I have
had some discussions about this in private email...

There's a definite issue in that a sub-request delegation, while it, obviously,
creates a new request_rec doesn't actually scan through the filters and update
f->r for each filter in the chain, if an internal_redirect happens, it creates
the new request_rec, and does scan the filters and reset f->r to the redirected
request_rec.

While it's obvious that if you're not in a sub-request context this is the
right behaviour for the internal_redirect, the inconsistency between a plain
sub-request and a sub-request that you end up internal_redirecting seems wrong.
My patch is definitely wrong in (a) keeping the filter chain - in fact, my
initial claims were wrong - though I don't like them, because of the order of
processing, and (b) adjusting the notes.

Now, I suspect that we can't just change this unilaterally - because it is such
a major change in behaviour - so the question is what should happen? Do we keep
the separate "don't update the f->r s" function, and allow mod_rewrite to call
that with a RewriteOption - this seems really ugly, as there are presumably
other situations where the internal_redirect could happen in a sub-request and
those won't necessarily call it? Do we have some flag set on any request_rec,
but then you need to traverse the request_rec calling chain to find out what
you're supposed to be doing? Other suggestions?

I may patch our local copy of mod_rewrite to solve my immediate problems with
something like the above, and apply minfrin's patch within this ticket to drop
the r->proxyreq check off mod_mime.c, but it would be nice if there were a
sensible way of solving this in a consistent way...

Cheers

MBM

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 52342] ap_internal_redirect dropping filters means inconsistent behaviour for includes

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=52342

--- Comment #4 from Matthew Byng-Maddick <ma...@bbc.co.uk> 2012-01-18 16:54:25 UTC ---
Hi Joe,

The proxy thing is perhaps confusing the issue. Maybe this is more easily done
by doing something like:

<Proxy balancer://.../**/*.sssi>
  SetOutputFilter INCLUDES
</Proxy>

Or there are possibly other ways of working around this.

I'm actually more concerned, as I say above, by the difference in behaviour of
the filters going:
main->sub_request, and main->(sub_request->internal_redirect), which I think is
very broken.

Cheers

MBM

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org