Posted to users@httpd.apache.org by Hannes Schmidt <ha...@ucsc.edu> on 2013/08/17 11:11:34 UTC

[users@httpd] mod_proxy ignores incompleteness of chunked-coding response from backend

Hi,

We use Apache's mod_proxy to reverse-proxy a web application that sends
back large XML responses whose lengths are unknown in advance. IOW, the
responses sent by the application (aka the backend) use the chunked
transfer encoding. We configured mod_proxy to use chunked coding on the
frontend as well, via "SetEnv proxy-sendchunked".

If the application runs into an error condition while it is in the middle
of writing the response body, the HTTP status of 200 has already been sent.
The only way to communicate that error downstream is to close the
connection without sending the terminating zero-length chunk. This works
well if we point the user agent, e.g. curl, directly at the web
application. Curl properly detects the incomplete response and exits with a
non-zero status code, despite the HTTP status code of 200. If we point
curl at Apache, this stops working and curl exits with 0, falsely
indicating success. We believe that the premature closing of the connection
by the backend goes unnoticed somewhere between ap_http_filter() and
ap_proxy_http_process_response(). Consequently, ap_http_chunk_filter()
terminates the frontend response with a zero-length chunk, even though the
backend response wasn't terminated by one.
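
To make the termination rule concrete, here is a minimal sketch of a check
that a captured chunked body actually ends with the zero-length chunk. The
function name is made up for illustration, it works on raw body bytes (such
as the captures in [4] and [5] with the headers stripped), and it is not
how Apache or curl implement the check.

```python
def chunked_body_is_complete(raw: bytes) -> bool:
    """Return True only if `raw` is a chunked body that ends with the
    terminating zero-length chunk and its final CRLF."""
    pos = 0
    while True:
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            return False  # data ended before a full chunk-size line arrived
        # Ignore optional chunk extensions after ';'.
        size_field = raw[pos:eol].split(b";", 1)[0].strip()
        try:
            size = int(size_field, 16)
        except ValueError:
            return False  # not a valid hexadecimal chunk size
        if size < 0:
            return False
        pos = eol + 2
        if size == 0:
            # Last chunk: skip optional trailer lines, then require the
            # empty line that ends the body.
            while True:
                eol = raw.find(b"\r\n", pos)
                if eol == -1:
                    return False
                if eol == pos:
                    return True
                pos = eol + 2
        end = pos + size
        if raw[end:end + 2] != b"\r\n":
            return False  # chunk data or its trailing CRLF is cut short
        pos = end + 2


if __name__ == "__main__":
    print(chunked_body_is_complete(b"5\r\nhello\r\n0\r\n\r\n"))
    print(chunked_body_is_complete(b"5\r\nhello\r\n"))
```

A body cut off after the last data chunk fails the check, which is the
condition the backend relies on to signal an error.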

IOW, an incomplete response body on the backend is turned into a complete
response body on the frontend. We are reasonably confident that this is in
violation of RFC 2616, sections 3.4 and 3.6.1. I captured the backend and
frontend HTTP exchanges in [4] and [5]. Note the terminating 0 chunk in the
frontend exchange [5] that is missing from the backend exchange [4].
Interestingly, ap_http_chunk_filter() is already able to handle dropped
backend connections but that particular code path isn't activated because
no error bucket with HTTP_BAD_GATEWAY is ever added to the brigade.

I am unsure, at this point, how to fix this properly, mostly because I
don't know my way around Apache's inner workings. I do have a patch [1]
that fixes this behavior in our particular case but I am not sure whether
1) it catches all such conditions or 2) it doesn't severely break other
cases. I wrote a small Python HTTP server [2] that emulates our web
application's behavior. It needs Tornado ("pip install tornado") and can be
run with "python server.py 8881". The relevant section of our httpd.conf is
at [3].

This occurs in 2.2.15 and 2.2.25. The patch is against the latter. I didn't
try 2.4.x. For the sake of completeness, the curl invocation reads

curl -s 'http://localhost:8080/cghub/metadata/analysisObject?fake_error=1' \
  && echo success || echo failure

or

curl -s 'http://localhost:8881/cghub/metadata/analysisObject?fake_error=1' \
  && echo success || echo failure

to hit the backend directly.
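
For anyone who wants to reproduce the client-side detection without curl
or Tornado, the following is a rough standard-library sketch. It is not the
server.py from [2]: TruncatingHandler, fetch and demo are hypothetical
names, and the handler simply sends one chunk and closes the connection
without the terminating zero-length chunk. Python's http.client raises
IncompleteRead in that case, which plays the role of curl's non-zero exit
status.

```python
import http.client
import socketserver
import threading


class TruncatingHandler(socketserver.StreamRequestHandler):
    """Hypothetical backend: one chunk, then close without the last chunk."""

    def handle(self):
        # Consume the request head up to the blank line.
        while self.rfile.readline() not in (b"\r\n", b"\n", b""):
            pass
        self.wfile.write(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/xml\r\n"
            b"Transfer-Encoding: chunked\r\n"
            b"\r\n"
            b"6\r\n<root>\r\n"
        )
        # Returning here closes the socket; "0\r\n\r\n" is never sent.


def fetch(host, port, path="/"):
    """Return (status, body_bytes, complete) for a GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        try:
            return resp.status, resp.read(), True
        except http.client.IncompleteRead as exc:
            # The body ended before the terminating zero-length chunk.
            return resp.status, exc.partial, False
    finally:
        conn.close()


def demo():
    """Start the truncating backend on a free port and fetch from it."""
    with socketserver.TCPServer(("127.0.0.1", 0), TruncatingHandler) as server:
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            return fetch("127.0.0.1", port)
        finally:
            server.shutdown()
            thread.join()


if __name__ == "__main__":
    status, body, complete = demo()
    print(status, body, "complete" if complete else "INCOMPLETE")
```

Pointing fetch() at the Apache port instead of the backend port should
show whether the proxy passes the truncation through or hides it.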

[1] (patch) https://gist.github.com/hannes-ucsc/3f60c5fc5dd8c6bf23cc
[2] (server.py) https://gist.github.com/hannes-ucsc/a8ce89e3ce7967ffa833
[3] (httpd.conf) https://gist.github.com/hannes-ucsc/32df3a1adf6085bdb2cd
[4] (backend.txt) https://gist.github.com/hannes-ucsc/f38dfcc5b57caf318d34
[5] (frontend.txt) https://gist.github.com/hannes-ucsc/779966430e407c703543

--
Hannes Schmidt
Software Application Developer
Data Migration Engineer
Cancer Genomics Hub
University of California, Santa Cruz

(206) 696-2316 (cell)
hannes@ucsc.edu

[users@httpd] Re: mod_proxy ignores incompleteness of chunked-coding response from backend

Posted by Hannes Schmidt <ha...@ucsc.edu>.

Just FYI, we filed a bug about this:

https://issues.apache.org/bugzilla/show_bug.cgi?id=55475
