You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Ruediger Pluem <rp...@apache.org> on 2008/06/02 21:39:51 UTC

Re: mod_proxy race condition bug #37770


On 05/17/2008 09:24 PM, Ruediger Pluem wrote:

> 
> In conjunction with the last patch attached to 37770 the following patch
> should finally prevent this problem at the cost of a performance penalty:
> 
> Index: modules/proxy/proxy_util.c
> ===================================================================
> --- modules/proxy/proxy_util.c  (Revision 657321)
> +++ modules/proxy/proxy_util.c  (Arbeitskopie)
> @@ -2168,6 +2168,11 @@
>      else {
>          conn->addr = worker->cp->addr;
>      }
> +    /* Close a possible existing socket if we are told to do so */
> +    if (conn->close) {
> +        socket_cleanup(conn);
> +        conn->close = 0;
> +    }
> 
>      if (err != APR_SUCCESS) {
>          return ap_proxyerror(r, HTTP_BAD_GATEWAY,
> Index: modules/proxy/mod_proxy_http.c
> ===================================================================
> --- modules/proxy/mod_proxy_http.c      (Revision 657321)
> +++ modules/proxy/mod_proxy_http.c      (Arbeitskopie)
> @@ -1876,6 +1876,20 @@
>          ap_proxy_ssl_connection_cleanup(backend, r);
>      }
> 
> +    /*
> +     * In the case that we are handling a reverse proxy connection and 
> this
> +     * is not a request that is coming over an already kept alive 
> connection
> +     * with the client, do NOT reuse the connection to the backend, 
> because
> +     * we cannot forward a failure to the client in this case as the 
> client
> +     * does NOT expects this in this situation.
> +     * Yes, this creates a performance penalty.
> +     * XXX: Make this behaviour configurable: Either via an environment
> +     * variable or via a worker configuration directive.
> +     */
> +    if ((r->proxyreq == PROXYREQ_REVERSE) && (!c->keepalives)) {
> +        backend->close = 1;
> +    }
> +
>      /* Step One: Determine Who To Connect To */
>      if ((status = ap_proxy_determine_connection(p, r, conf, worker, 
> backend,
>                                                  uri, &url, proxyname,
> 
> Therefore it should be configurable. Any preferences whether to do this 
> via an
> environment variable or via a worker specific property?

I would like to fold the approach of the patch into trunk to reduce PR37770
to become (hopefully) an RTFM bug. The approach of the patch above in conjunction
with the reslist TTL fix in upcoming APR-UTIL 1.3.x should get us there.
Any opinions about how to make this configurable (environment variable or
via a worker specific property)?

Regards

Rüdiger

Re: mod_proxy race condition bug #37770

Posted by Adam Woodworth <mi...@gmail.com>.

Sounds good, glad I was able to help.  I'll keep an eye on this in a
future httpd release.


On Sat, Sep 20, 2008 at 6:42 AM, Ruediger Pluem <rp...@apache.org> wrote:
>
>
> On 09/20/2008 12:21 PM, Nick Kew wrote:
>>
>> On 19 Sep 2008, at 23:08, Ruediger Pluem wrote:
>>
>>> On 09/19/2008 11:24 PM, Adam Woodworth wrote:
>>>>
>>>> The problem with this is that it will also do this for connections to
>>>> the backend that timed out.  For our application, we want actual
>>>> timeouts to be caught as normal and kicked back through to the client
>>>> as a 50x error.
>>>
>>> Hm. Good point. I have to think about it, because a timeout error can
>>> justify  a repeated request (like when a firewall between frontend and
>>> backend cut the connection and now the packets are simply dropped by
>>> the firewall and a fresh one would fix this).
>>
>> I think Adam is right.  The justification for the RFC-breaking connection
>> drop is IIRC that it's merely propagating the same from the backend
>> to the client.  That doesn't apply in the case of a timeout.
>>
>
> Yes, I came to the same conclusion in the meantime. The firewall drop can be
> prevented by setting keepalive to on. Furthermore the typical keepalive
> timeouts
> for http connections are way below the typical firewall timeouts. So if we
> get back APR_TIMEUP we can safely assume that the backend did not respond in
> a timely manner. Looking at his patch I think it looks basically fine.
> Hope to find some time for a closer look and a commit later on.
>
> Regards
>
> Rüdiger
>

Re: mod_proxy race condition bug #37770

Posted by Ruediger Pluem <rp...@apache.org>.

On 09/20/2008 12:21 PM, Nick Kew wrote:
> 
> On 19 Sep 2008, at 23:08, Ruediger Pluem wrote:
> 
>> On 09/19/2008 11:24 PM, Adam Woodworth wrote:
>>>
>>> The problem with this is that it will also do this for connections to
>>> the backend that timed out.  For our application, we want actual
>>> timeouts to be caught as normal and kicked back through to the client
>>> as a 50x error.
>>
>> Hm. Good point. I have to think about it, because a timeout error can
>> justify  a repeated request (like when a firewall between frontend and
>> backend cut the connection and now the packets are simply dropped by
>> the firewall and a fresh one would fix this).
> 
> I think Adam is right.  The justification for the RFC-breaking connection
> drop is IIRC that it's merely propagating the same from the backend
> to the client.  That doesn't apply in the case of a timeout.
> 

Yes, I came to the same conclusion in the meantime. The firewall drop can be
prevented by setting keepalive to on. Furthermore the typical keepalive timeouts
for http connections are way below the typical firewall timeouts. So if we
get back APR_TIMEUP we can safely assume that the backend did not respond in
a timely manner. Looking at his patch I think it looks basically fine.
Hope to find some time for a closer look and a commit later on.

Regards

Rüdiger

Re: mod_proxy race condition bug #37770

Posted by Jim Jagielski <ji...@apache.org>.

On Sep 20, 2008, at 6:21 AM, Nick Kew wrote:

>
> On 19 Sep 2008, at 23:08, Ruediger Pluem wrote:
>
>> On 09/19/2008 11:24 PM, Adam Woodworth wrote:
>>>
>>> The problem with this is that it will also do this for connections  
>>> to
>>> the backend that timed out.  For our application, we want actual
>>> timeouts to be caught as normal and kicked back through to the  
>>> client
>>> as a 50x error.
>>
>> Hm. Good point. I have to think about it, because a timeout error can
>> justify  a repeated request (like when a firewall between frontend  
>> and
>> backend cut the connection and now the packets are simply dropped by
>> the firewall and a fresh one would fix this).
>
> I think Adam is right.  The justification for the RFC-breaking  
> connection
> drop is IIRC that it's merely propagating the same from the backend
> to the client.  That doesn't apply in the case of a timeout.
>

+1

Re: mod_proxy race condition bug #37770

Posted by Nick Kew <ni...@webthing.com>.

On 19 Sep 2008, at 23:08, Ruediger Pluem wrote:

> On 09/19/2008 11:24 PM, Adam Woodworth wrote:
>>
>> The problem with this is that it will also do this for connections to
>> the backend that timed out.  For our application, we want actual
>> timeouts to be caught as normal and kicked back through to the client
>> as a 50x error.
>
> Hm. Good point. I have to think about it, because a timeout error can
> justify  a repeated request (like when a firewall between frontend and
> backend cut the connection and now the packets are simply dropped by
> the firewall and a fresh one would fix this).

I think Adam is right.  The justification for the RFC-breaking  
connection
drop is IIRC that it's merely propagating the same from the backend
to the client.  That doesn't apply in the case of a timeout.

-- 
Nick Kew

Re: mod_proxy race condition bug #37770

Posted by Ruediger Pluem <rp...@apache.org>.

On 09/19/2008 11:24 PM, Adam Woodworth wrote:
> Now that we're using 2.2.9 in the field, I've noticed a problem with
> r657443, or at least it's a problem for our application.
> 
> r657443 added code to shutdown the connection w/o any response with an
> EOC in the case that a second (or later) request on a keepalive
> connection failed to get a response from the backend.
> 
> The problem with this is that it will also do this for connections to
> the backend that timed out.  For our application, we want actual
> timeouts to be caught as normal and kicked back through to the client
> as a 50x error.

Hm. Good point. I have to think about it, because a timeout error can
justify  a repeated request (like when a firewall between frontend and
backend cut the connection and now the packets are simply dropped by
the firewall and a fresh one would fix this).
Bug me here if I forget about it.

Regards

Rüdiger

Re: mod_proxy race condition bug #37770

Posted by Jim Jagielski <ji...@jaguNET.com>.

On Sep 19, 2008, at 5:24 PM, Adam Woodworth wrote:

> Now that we're using 2.2.9 in the field, I've noticed a problem with
> r657443, or at least it's a problem for our application.
>
> r657443 added code to shutdown the connection w/o any response with an
> EOC in the case that a second (or later) request on a keepalive
> connection failed to get a response from the backend.
>
> The problem with this is that it will also do this for connections to
> the backend that timed out.  For our application, we want actual
> timeouts to be caught as normal and kicked back through to the client
> as a 50x error.
>
> I made a simple one line patch to our copy of httpd 2.2.9 and it seems
> to work.  I just check the rc code to see if it's a timeout error and
> if so skip the new code from r657443.
>
> I thought I'd raise this to the httpd developers -- do you think it's
> something that should be part of the httpd code?  Having a timeout to
> the backend causing the client to re-request seems not so good.
>

Agreed. Good catch.

>
> *** mod_proxy_http.c.bak	Thu Sep 18 20:42:05 2008
> --- mod_proxy_http.c	Fri Sep 19 15:39:32 2008
> ***************
> *** 1376,1382 ****
>               * seamonkey only display an empty page in this case  
> and do
>               * not do a retry.
>               */
> !             if (r->proxyreq == PROXYREQ_REVERSE && c->keepalives) {
>                  apr_bucket *eos;
>
>                  ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
> --- 1376,1385 ----
>               * seamonkey only display an empty page in this case  
> and do
>               * not do a retry.
>               */
> !             if (r->proxyreq == PROXYREQ_REVERSE && c->keepalives &&
> rc != APR_TIMEUP) {
>                  apr_bucket *eos;
>
>                  ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
>
>
>
> On Mon, Jun 2, 2008 at 4:21 PM, Ruediger Pluem <rp...@apache.org>  
> wrote:
>>
>>
>> On 06/02/2008 10:11 PM, Adam Woodworth wrote:
>>>>
>>>> From a usage standpoint I have no real preference other than I  
>>>> need to
>>>
>>> be able to configure it from httpd.conf on a per-host basis. :)
>>>
>>> What is a worker specific property?
>>
>> All the properties you put behind ProxyPass / balancermember or via  
>> proxyset
>> in a block
>> like
>>
>> <Proxy http://www.somebackend.com/blah>
>> </Proxy>
>>
>> Regards
>>
>> Rüdiger
>>
>>
>

Re: mod_proxy race condition bug #37770

Posted by Adam Woodworth <mi...@gmail.com>.

Now that we're using 2.2.9 in the field, I've noticed a problem with
r657443, or at least it's a problem for our application.

r657443 added code to shutdown the connection w/o any response with an
EOC in the case that a second (or later) request on a keepalive
connection failed to get a response from the backend.

The problem with this is that it will also do this for connections to
the backend that timed out.  For our application, we want actual
timeouts to be caught as normal and kicked back through to the client
as a 50x error.

I made a simple one line patch to our copy of httpd 2.2.9 and it seems
to work.  I just check the rc code to see if it's a timeout error and
if so skip the new code from r657443.

I thought I'd raise this to the httpd developers -- do you think it's
something that should be part of the httpd code?  Having a timeout to
the backend causing the client to re-request seems not so good.

*** mod_proxy_http.c.bak	Thu Sep 18 20:42:05 2008
--- mod_proxy_http.c	Fri Sep 19 15:39:32 2008
***************
*** 1376,1382 ****
               * seamonkey only display an empty page in this case and do
               * not do a retry.
               */
!             if (r->proxyreq == PROXYREQ_REVERSE && c->keepalives) {
                  apr_bucket *eos;

                  ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
--- 1376,1385 ----
               * seamonkey only display an empty page in this case and do
               * not do a retry.
               */
!             if (r->proxyreq == PROXYREQ_REVERSE && c->keepalives &&
rc != APR_TIMEUP) {
                  apr_bucket *eos;

                  ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,

On Mon, Jun 2, 2008 at 4:21 PM, Ruediger Pluem <rp...@apache.org> wrote:
>
>
> On 06/02/2008 10:11 PM, Adam Woodworth wrote:
>>>
>>> From a usage standpoint I have no real preference other than I need to
>>
>> be able to configure it from httpd.conf on a per-host basis. :)
>>
>> What is a worker specific property?
>
> All the properties you put behind ProxyPass / balancermember or via proxyset
> in a block
> like
>
> <Proxy http://www.somebackend.com/blah>
> </Proxy>
>
> Regards
>
> Rüdiger
>
>

Re: mod_proxy race condition bug #37770

Posted by Ruediger Pluem <rp...@apache.org>.


On 06/02/2008 10:11 PM, Adam Woodworth wrote:
>>>From a usage standpoint I have no real preference other than I need to
> be able to configure it from httpd.conf on a per-host basis. :)
> 
> What is a worker specific property?

All the properties you put behind ProxyPass / balancermember or via proxyset in a block
like

<Proxy http://www.somebackend.com/blah>
</Proxy>

Regards

Rüdiger

Re: mod_proxy race condition bug #37770

Posted by Adam Woodworth <mi...@gmail.com>.

>From a usage standpoint I have no real preference other than I need to
be able to configure it from httpd.conf on a per-host basis. :)

What is a worker specific property?


> I would like to fold the approach of the patch into trunk to reduce PR37770
> to become (hopefully) an RTFM bug. The approach of the patch above in
> conjunction
> with the reslist TTL fix in upcoming APR-UTIL 1.3.x should get us there.
> Any opinions about how to make this configurable (environment variable or
> via a worker specific property)?
>
> Regards
>
> Rüdiger
>