Posted to modules-dev@httpd.apache.org by rm...@tuxteam.de on 2011/05/04 11:34:46 UTC
using mod_proxy for subrequests
Hello list,
as the subject line says, I'm trying to run a subrequest through
mod_proxy and need to post-process the subrequest's response data.
Looking at older posts on this list it seems as if the only way to
accomplish this is:
(1) create a subrequest with ap_sub_req_lookup_uri(...)
(2) modify parts of the created subrequest (filename, handler, proxyreq
etc.)
(3) Install a filter that captures the response data
(4) run that subrequest
Now, (1) seems inelegant since it does need a valid URI which has
nothing to do with the final proxy request. Hence the value of the
subrequest's status has no meaning -- but isn't this exactly the purpose
of subrequests? To quote Nick Kew '....to run a fast partial request, to
gather information: what would happen if we ran this request?'
Is there really no way to create a subrequest directly aiming at
mod_proxy?
It would be utterly nice to be able to access a (proxied) subrequest's
metadata (content-type, etag etc.) before running the filter.
Any ideas? Maybe a nice API extension for Apache or mod_proxy?
TIA Ralf Mattes
Re: using mod_proxy for subrequests
Posted by rm...@tuxteam.de.
On Wed, May 04, 2011 at 02:00:33PM +0200, Sorin Manolache wrote:
>
> I didn't mean that I'm really clueless. I trawled through the apache
> sources quite extensively and I decided to do it. And there's a
> commercial/financial stake in my case too.
>
> If you look at mod_proxy's sources, there're 4 places in which r->main
> is checked, two in ap_proxy_http_request, one in
> ap_proxy_backend_broke and one in mod_proxy_ajp.c
>
> In the first place, If-Match, If-None-Match, If-Range,
> If-Modified-Since, If-Unmodified-Since are not passed through in the
> subrequest.
>
> In the second place, for subrequests:
>
> *) the connection is marked to be closed after the request
> *) Content-Length and Transfer-Encoding are removed
> *) the main request body, if any, is not forwarded to the subrequest's backend.
>
> So if you set subreq->main to NULL you won't have the effects listed above.
>
> In ap_proxy_backend_broke, if r is a subrequest and the backend broke,
> the main request response is marked as non-cacheable.
>
> I didn't look into mod_proxy_ajp.c.
Yes, but what makes me feel quite uneasy is the fact that both your
solution and mine rely on "internal" knowledge and assumptions
built on that. From a programmer's point of view this is o.k. in an open
source implementation but this creates administrative nightmares ...
What happens if the programmers of mod_proxy decide to change their
internal processing? After all, line 426 ff. in mod_proxy.c aren't part
of a published API. So, maybe years after installing your fine module,
an innocent software update breaks it ... 8-/
I guess an exported mod_proxy function to fetch metadata would be a nice
thing to have.
Cheers, RalfD
> Sorin
Re: using mod_proxy for subrequests
Posted by Sorin Manolache <so...@gmail.com>.
On Wed, May 4, 2011 at 12:39, <rm...@tuxteam.de> wrote:
> On Wed, May 04, 2011 at 11:36:35AM +0200, Sorin Manolache wrote:
>> On Wed, May 4, 2011 at 11:34, <rm...@tuxteam.de> wrote:
>> > Hello list,
>> >
>> > as the subject line says, I'm trying to run a subrequest through
>> > mod_proxy and need to post-process the subrequest's response data.
>> > Looking at older posts on this list it seems as if the only way to
>> > accomplish this is:
>> >
>> > (1) create a subrequest with ap_sub_req_lookup_uri(...)
>> >
>> > (2) modify parts of the created subrequest (filename, handler, proxyreq
>> > etc.)
>> >
>> > (3) Install a filter that captures the response data
>> >
>> > (4) run that subrequest
>>
>> Use it in conjunction with RewriteRules:
>>
>> RewriteCond %{IS_SUBREQ} true
>> RewriteRule ^/some_name$ http://backend.host.net/path?query_string [P]
>
> Hmm, I don't seem to get what you do differently compared with my
> approach:
>
>
>> request_rec *subr = ap_sub_req_method_uri("GET", "/some_name", r, NULL);
>
> Same as my (1)
> Here, "/some_name" is still an arbitrary URI and _not_ the proxy URI I
> want to query. BTW, this does clutter the URL namespace, a big no-no in
> my use case ...
>
>> ap_add_output_filter(post_processing_filter_name, filter_context,
>> subr, subr->connection);
>
> Same as my (3)
>
>> int status = ap_run_sub_req(subr);
>> int http_status = subr->status;
>> // optional: subr->main = r;
>> if (ap_is_HTTP_ERROR(status) || ap_is_HTTP_ERROR(http_status)) {
>> // some error handling
>> }
>
> And you still need to _run_ the subrequest to get at the response
> status etc.
>
>>
>> There are some subtleties here:
>>
>> 1. The rewrite rules are run in the translate_name hook. If you want
>> to use %{ENV:request_note_name} in your rewrite rule, you have to copy
>> them somehow (for example in another translate_name callback that is
>> run before the mod_rewrite callbacks) from the main request notes to
>> the subrequest notes.
>>
>> 2. Subrequests are not kept alive. In order to keep them alive, you
>> could try to hook APR_OPTIONAL_HOOK(proxy, fixups, &proxy_fixups,
>> NULL, NULL, APR_HOOK_MIDDLE). In the proxy_fixups callback, you can
>> set subr->main = NULL. Then, after ap_run_sub_req, you can re-set
>> subr->main = r (the "optional" line in the code example above).
>
> But that means losing all request context in the subrequest! One of
> the main reasons to use mod_proxy instead of
> some-arbitrary-webclient-lib is the fact that mod_proxy passes all
> incoming headers to the backend server. A must in my case.
The request_rec structure of the subrequest is already correctly set
up when I cut its link to the main request.
>> I'm
>> using this trick but I do not know all its consequences.
>
> Hmmm - bold. The costs of server downtime might easily exceed my
> monthly income in this case :-)
I didn't mean that I'm really clueless. I trawled through the apache
sources quite extensively and I decided to do it. And there's a
commercial/financial stake in my case too.
If you look at mod_proxy's sources, there're 4 places in which r->main
is checked, two in ap_proxy_http_request, one in
ap_proxy_backend_broke and one in mod_proxy_ajp.c
In the first place, If-Match, If-None-Match, If-Range,
If-Modified-Since, If-Unmodified-Since are not passed through in the
subrequest.
In the second place, for subrequests:
*) the connection is marked to be closed after the request
*) Content-Length and Transfer-Encoding are removed
*) the main request body, if any, is not forwarded to the subrequest's backend.
So if you set subreq->main to NULL you won't have the effects listed above.
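
The r->main checks described above can be modelled in a few lines. This is only a sketch under stated assumptions, not mod_proxy's actual code: request_rec is a cut-down stand-in holding just the main pointer, and forwards_header() and eq_nocase() are hypothetical helpers invented for illustration. For simplicity it treats the conditional headers and Content-Length/Transfer-Encoding alike as "not forwarded" for subrequests.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Cut-down stand-in for httpd's request_rec (assumption, not the real struct). */
typedef struct request_rec {
    struct request_rec *main;   /* parent request; NULL for a main request */
} request_rec;

/* Headers the message lists as dropped or removed for subrequests. */
static const char *const subreq_dropped[] = {
    "If-Match", "If-None-Match", "If-Range",
    "If-Modified-Since", "If-Unmodified-Since",
    "Content-Length", "Transfer-Encoding",
};

/* Case-insensitive string equality (HTTP header names ignore case). */
static int eq_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Hypothetical helper: 1 if the header would be forwarded, 0 if not. */
int forwards_header(const request_rec *r, const char *name)
{
    size_t i;

    assert(r != NULL && name != NULL);
    if (r->main == NULL)
        return 1;               /* looks like a main request: nothing stripped */
    for (i = 0; i < sizeof subreq_dropped / sizeof subreq_dropped[0]; i++) {
        if (eq_nocase(name, subreq_dropped[i]))
            return 0;
    }
    return 1;
}
```

In this model, forwards_header(&subr, "If-Match") is 0 while subr.main points at the parent and becomes 1 once subr.main is set to NULL, which is the effect the trick relies on.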
In ap_proxy_backend_broke, if r is a subrequest and the backend broke,
the main request response is marked as non-cacheable.
I didn't look into mod_proxy_ajp.c.
Sorin
Re: using mod_proxy for subrequests
Posted by rm...@tuxteam.de.
On Wed, May 04, 2011 at 11:36:35AM +0200, Sorin Manolache wrote:
> On Wed, May 4, 2011 at 11:34, <rm...@tuxteam.de> wrote:
> > Hello list,
> >
> > as the subject line says, I'm trying to run a subrequest through
> > mod_proxy and need to post-process the subrequest's response data.
> > Looking at older posts on this list it seems as if the only way to
> > accomplish this is:
> >
> > (1) create a subrequest with ap_sub_req_lookup_uri(...)
> >
> > (2) modify parts of the created subrequest (filename, handler, proxyreq
> > etc.)
> >
> > (3) Install a filter that captures the response data
> >
> > (4) run that subrequest
>
> Use it in conjunction with RewriteRules:
>
> RewriteCond %{IS_SUBREQ} true
> RewriteRule ^/some_name$ http://backend.host.net/path?query_string [P]
Hmm, I don't seem to get what you do differently compared with my
approach:
> request_rec *subr = ap_sub_req_method_uri("GET", "/some_name", r, NULL);
Same as my (1)
Here, "/some_name" is still an arbitrary URI and _not_ the proxy URI I
want to query. BTW, this does clutter the URL namespace, a big no-no in
my use case ...
> ap_add_output_filter(post_processing_filter_name, filter_context,
> subr, subr->connection);
Same as my (3)
> int status = ap_run_sub_req(subr);
> int http_status = subr->status;
> // optional: subr->main = r;
> if (ap_is_HTTP_ERROR(status) || ap_is_HTTP_ERROR(http_status)) {
> // some error handling
> }
And you still need to _run_ the subrequest to get at the response
status etc.
>
> There are some subtleties here:
>
> 1. The rewrite rules are run in the translate_name hook. If you want
> to use %{ENV:request_note_name} in your rewrite rule, you have to copy
> them somehow (for example in another translate_name callback that is
> run before the mod_rewrite callbacks) from the main request notes to
> the subrequest notes.
>
> 2. Subrequests are not kept alive. In order to keep them alive, you
> could try to hook APR_OPTIONAL_HOOK(proxy, fixups, &proxy_fixups,
> NULL, NULL, APR_HOOK_MIDDLE). In the proxy_fixups callback, you can
> set subr->main = NULL. Then, after ap_run_sub_req, you can re-set
> subr->main = r (the "optional" line in the code example above).
But that means losing all request context in the subrequest! One of
the main reasons to use mod_proxy instead of
some-arbitrary-webclient-lib is the fact that mod_proxy passes all
incoming headers to the backend server. A must in my case.
> I'm
> using this trick but I do not know all its consequences.
Hmmm - bold. The costs of server downtime might easily exceed my
monthly income in this case :-)
cheers, RalfD
> Sorin
>
>
> >
> > Now, (1) seems inelegant since it does need a valid URI which has
> > nothing to do with the final proxy request. Hence the value of the
> > subrequest's status has no meaning -- but isn't this exactly the purpose
> > of subrequests? To quote Nick Kew '....to run a fast partial request, to
> > gather information: what would happen if we ran this request?'
> > Is there really no way to create a subrequest directly aiming at
> > mod_proxy?
> > It would be utterly nice to be able to access a (proxied) subrequest's
> > metadata (content-type, etag etc.) before running the filter.
> >
> > Any ideas? Maybe a nice API extension for Apache or mod_proxy?
> >
> > TIA Ralf Mattes
> >
> >
Re: using mod_proxy for subrequests
Posted by Sorin Manolache <so...@gmail.com>.
On Wed, May 4, 2011 at 11:34, <rm...@tuxteam.de> wrote:
> Hello list,
>
> as the subject line says, I'm trying to run a subrequest through
> mod_proxy and need to post-process the subrequest's response data.
> Looking at older posts on this list it seems as if the only way to
> accomplish this is:
>
> (1) create a subrequest with ap_sub_req_lookup_uri(...)
>
> (2) modify parts of the created subrequest (filename, handler, proxyreq
> etc.)
>
> (3) Install a filter that captures the response data
>
> (4) run that subrequest
Use it in conjunction with RewriteRules:
RewriteCond %{IS_SUBREQ} true
RewriteRule ^/some_name$ http://backend.host.net/path?query_string [P]
request_rec *subr = ap_sub_req_method_uri("GET", "/some_name", r, NULL);
/* attach the capturing filter to the subrequest before running it */
ap_add_output_filter(post_processing_filter_name, filter_context,
                     subr, subr->connection);
int status = ap_run_sub_req(subr);
int http_status = subr->status;
// optional: subr->main = r;
if (ap_is_HTTP_ERROR(status) || ap_is_HTTP_ERROR(http_status)) {
    // some error handling
}
There are some subtleties here:
1. The rewrite rules are run in the translate_name hook. If you want
to use %{ENV:request_note_name} in your rewrite rule, you have to copy
them somehow (for example in another translate_name callback that is
run before the mod_rewrite callbacks) from the main request notes to
the subrequest notes.
2. Subrequests are not kept alive. In order to keep them alive, you
could try to hook APR_OPTIONAL_HOOK(proxy, fixups, &proxy_fixups,
NULL, NULL, APR_HOOK_MIDDLE). In the proxy_fixups callback, you can
set subr->main = NULL. Then, after ap_run_sub_req, you can re-set
subr->main = r (the "optional" line in the code example above). I'm
using this trick but I do not know all its consequences.
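
The detach/re-attach trick in point 2 boils down to a save-and-restore around the call that runs the subrequest. A minimal sketch, assuming a cut-down stand-in for request_rec and two hypothetical helpers (run_detached() and fake_run(), neither of which is part of httpd). In the setup described above the pointer is cleared inside the proxy fixups callback rather than directly before the call; the sketch flattens that for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for httpd's request_rec (assumption, not the real struct). */
typedef struct request_rec {
    struct request_rec *main;   /* parent request; NULL for a main request */
    int ran_as_subrequest;      /* illustration only: what the runner observed */
} request_rec;

typedef int (*run_fn)(request_rec *);

/* Hypothetical stand-in for the call that runs the subrequest.
 * It only records whether the request still looked like a subrequest. */
int fake_run(request_rec *subr)
{
    subr->ran_as_subrequest = (subr->main != NULL);
    return 0;
}

/* Hypothetical helper: hide the parent link while `run` executes,
 * then restore it regardless of the result. */
int run_detached(request_rec *subr, request_rec *r, run_fn run)
{
    int rc;

    assert(subr != NULL && run != NULL);
    subr->main = NULL;      /* code checking r->main now sees a main request */
    rc = run(subr);
    subr->main = r;         /* corresponds to the "optional" re-set line */
    return rc;
}
```

Calling run_detached(&subr, r, fake_run) leaves subr.main pointing at r again, while fake_run() saw it as NULL for the duration of the call.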
Sorin
>
> Now, (1) seems inelegant since it does need a valid URI which has
> nothing to do with the final proxy request. Hence the value of the
> subrequest's status has no meaning -- but isn't this exactly the purpose
> of subrequests? To quote Nick Kew '....to run a fast partial request, to
> gather information: what would happen if we ran this request?'
> Is there really no way to create a subrequest directly aiming at
> mod_proxy?
> It would be utterly nice to be able to access a (proxied) subrequest's
> metadata (content-type, etag etc.) before running the filter.
>
> Any ideas? Maybe a nice API extension for Apache or mod_proxy?
>
> TIA Ralf Mattes
>
>