You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Ruediger Pluem <rp...@apache.org> on 2020/04/17 19:43:30 UTC

Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c server/protocol.c


On 4/17/20 3:07 PM, ylavic@apache.org wrote:
> Author: ylavic
> Date: Fri Apr 17 13:07:46 2020
> New Revision: 1876664
> 
> URL: http://svn.apache.org/viewvc?rev=1876664&view=rev
> Log:
> core, h2: send EOR for early HTTP request failure.
> 
> The core output filters depend on EOR being sent at some point for correct
> accounting of setaside limits and lifetime.
> 
> Rework ap_read_request() early failure (including in post_read_request() hooks)
> so that it always sends the EOR after ap_die().
> 
> Apply the same scheme in h2_request_create_rec() which is the HTTP/2 to HTTP/1
> counterpart.
> 
> Modified:
>     httpd/httpd/trunk/modules/http2/h2_request.c
>     httpd/httpd/trunk/server/protocol.c
> 

> Modified: httpd/httpd/trunk/server/protocol.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/protocol.c?rev=1876664&r1=1876663&r2=1876664&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/protocol.c (original)
> +++ httpd/httpd/trunk/server/protocol.c Fri Apr 17 13:07:46 2020

> @@ -1453,14 +1444,12 @@ request_rec *ap_read_request(conn_rec *c
>          }
>      }
>  
> -    apr_brigade_destroy(tmp_bb);

Why don't we destroy it any longer?

> -
>      /* update what we think the virtual host is based on the headers we've
>       * now read. may update status.
>       */
> -    
> -    access_status = ap_update_vhost_from_headers_ex(r, conf->strict_host_check == AP_CORE_CONFIG_ON);
> -    if (conf->strict_host_check == AP_CORE_CONFIG_ON && access_status != HTTP_OK) { 
> +    strict_host_check = (conf->strict_host_check == AP_CORE_CONFIG_ON);
> +    access_status = ap_update_vhost_from_headers_ex(r, strict_host_check);
> +    if (strict_host_check && access_status != HTTP_OK) { 
>           if (r->server == ap_server_conf) { 
>               ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r, APLOGNO(10156)
>                             "Requested hostname '%s' did not match any ServerName/ServerAlias "
> @@ -1472,22 +1461,13 @@ request_rec *ap_read_request(conn_rec *c
>                             "current connection is %s:%u)", 
>                             r->hostname, r->server->defn_name, r->server->defn_line_number);
>           }
> -         r->status = access_status;
> +         goto die_early;
>      }
> -
> -    access_status = r->status;
> -
> -    /* Toggle to the Host:-based vhost's timeout mode to fetch the
> -     * request body and send the response body, if needed.
> -     */
> -    if (cur_timeout != r->server->timeout) {
> -        apr_socket_timeout_set(csd, r->server->timeout);
> -        cur_timeout = r->server->timeout;
> +    if (r->status != HTTP_OK) { 
> +        access_status = r->status;
> +        goto die_early;

Why do we die_early here and not just die later?

>      }
>  
> -    /* we may have switched to another server */
> -    r->per_dir_config = r->server->lookup_defaults;
> -
>      if ((!r->hostname && (r->proto_num >= HTTP_VERSION(1, 1)))
>          || ((r->proto_num == HTTP_VERSION(1, 1))
>              && !apr_table_get(r->headers_in, "Host"))) {
> @@ -1498,10 +1478,23 @@ request_rec *ap_read_request(conn_rec *c
>           * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
>           * a Host: header, and the server MUST respond with 400 if it doesn't.
>           */
> -        access_status = HTTP_BAD_REQUEST;
>          ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(00569)
>                        "client sent HTTP/1.1 request without hostname "
>                        "(see RFC2616 section 14.23): %s", r->uri);
> +        access_status = HTTP_BAD_REQUEST;
> +        goto die_early;

Why do we die_early here and not just die later?


> +    }
> +
> +    /* we may have switched to another server */
> +    r->per_dir_config = r->server->lookup_defaults;
> +    conf = ap_get_core_module_config(r->server->module_config);
> +
> +    /* Toggle to the Host:-based vhost's timeout mode to fetch the
> +     * request body and send the response body, if needed.
> +     */
> +    if (cur_timeout != r->server->timeout) {
> +        apr_socket_timeout_set(csd, r->server->timeout);
> +        cur_timeout = r->server->timeout;
>      }
>  
>      /*

Regards

Rüdiger

Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c server/protocol.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Mon, Apr 20, 2020 at 8:51 AM Ruediger Pluem:
>
> On 4/19/20 5:26 PM, Yann Ylavic wrote:
> > +
> > +    /* Validate Host/Expect headers and select vhost. */
> >      if (!ap_check_request_header(r)) {
> >          access_status = r->status;
> > -        goto die_early;
> > +        goto die_before_hooks;
> >      }
>
> Shouldn't we ensure that
>
> r->per_dir_config = r->server->lookup_defaults;
>
> is done before we jump to die_before_hooks ?

Yes, correct, ap_check_request_header() can fail for (e.g.) invalid
Expect header, in which case r->server may still have changed.

>
> Otherwise I think looks good. As said my biggest worry was that we do not use vhost specific ErrorDocuments in cases where we used
> them.

Thanks for the review, hopefully everything addressed in r1876784.


Regards,
Yann.

Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c server/protocol.c

Posted by Ruediger Pluem <rp...@apache.org>.


On 4/19/20 5:26 PM, Yann Ylavic wrote:
> On Fri, Apr 17, 2020 at 11:50 PM Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
>>> On 4/17/20 3:07 PM, ylavic
>>>> -    apr_brigade_destroy(tmp_bb);
>>> Why don't we destroy it any longer?
>> tmp_bb is reused below in the die_early code, and always _cleanup()
>> after use in ap_read_request().

Thanks for pointing out. I guess what got me confused is that we use the same
variable name tmp_bb at the beginning for the brigade created from r->pool, but
in after die we override it with one we get via ap_acquire_brigade. I know that
we need to do this because we use the brigade to sent the EOR bucket which destroys
r->pool when destroyed. But maybe we should use a different name in this section
e.g. eor_bb instead of tmp_bb.


> 
> ap_read_request.diff
> 
> Index: server/protocol.c
> ===================================================================
> --- server/protocol.c	(revision 1876675)
> +++ server/protocol.c	(working copy)

> @@ -1526,9 +1526,19 @@ request_rec *ap_read_request(conn_rec *conn)
>          }
>      }
>  
> +    /*
> +     * Add the HTTP_IN filter here to ensure that ap_discard_request_body
> +     * called by ap_die and by ap_send_error_response works correctly on
> +     * status codes that do not cause the connection to be dropped and
> +     * in situations where the connection should be kept alive.
> +     */
> +    ap_add_input_filter_handle(ap_http_input_filter_handle,
> +                               NULL, r, r->connection);
> +
> +    /* Validate Host/Expect headers and select vhost. */
>      if (!ap_check_request_header(r)) {
>          access_status = r->status;
> -        goto die_early;
> +        goto die_before_hooks;
>      }

Shouldn't we ensure that

r->per_dir_config = r->server->lookup_defaults;

is done before we jump to die_before_hooks ?

Otherwise I think looks good. As said my biggest worry was that we do not use vhost specific ErrorDocuments in cases where we used
them.

Regards

Rüdiger

Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c server/protocol.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Apr 17, 2020 at 11:50 PM Yann Ylavic <yl...@gmail.com> wrote:
>
> On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
> >
> > On 4/17/20 3:07 PM, ylavic
> > >
> > > -    apr_brigade_destroy(tmp_bb);
> >
> > Why don't we destroy it any longer?
>
> tmp_bb is reused below in the die_early code, and always _cleanup()
> after use in ap_read_request().
> If the brigade is not used anymore, there is no real advantage in
> using _destroy() over _cleanup() IMHO, the former will unregister the
> cleanup callback (thus walk the list) but the callback is a noop after
> apr_brigade_cleanup() anyway..
>
> >
> > > +    if (r->status != HTTP_OK) {
> > > +        access_status = r->status;
> > > +        goto die_early;
> >
> > Why do we die_early here and not just die later?
>
> While both "die_early" and "die" labels finally call ap_die(), the
> former is branched only before we call any hook, thus before anything
> can call ap_die() by itself. It matters because ap_die() can recurse,
> so we should set r->status = HTTP_OK only for the first call.
>
> But thinking more about it, die_early is also cleaning up the input
> filters (for potential further reads to always return APR_EOF), and
> this may be suitable only for the case where we could not read the
> request line or the headers.
> So possibly the code is missing a third label (e.g. die_first) in
> between die_early and die (at r->status = HTTP_OK), which we would
> branch instead of die_early when failing with a "suitable" input state
> (but before post_read_request() hooks still after which we can only
> call ap_die() without resetting r->status).
> Actually it depends on whether (or not) we want ap_die() (i.e.
> ErrorDocument configuration) to be able to read the body at these
> points of failure, and if so we should probably insert the HTTP_IN
> filter earlier too.
>
> WDYT?

The patch would look like the attached. It also renames the labels to
(try to) be more explicit.

Regards,
Yann.

Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c server/protocol.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
>
> On 4/17/20 3:07 PM, ylavic
> >
> > -    apr_brigade_destroy(tmp_bb);
>
> Why don't we destroy it any longer?

tmp_bb is reused below in the die_early code, and always _cleanup()
after use in ap_read_request().
If the brigade is not used anymore, there is no real advantage in
using _destroy() over _cleanup() IMHO, the former will unregister the
cleanup callback (thus walk the list) but the callback is a noop after
apr_brigade_cleanup() anyway..

>
> > +    if (r->status != HTTP_OK) {
> > +        access_status = r->status;
> > +        goto die_early;
>
> Why do we die_early here and not just die later?

While both "die_early" and "die" labels finally call ap_die(), the
former is branched only before we call any hook, thus before anything
can call ap_die() by itself. It matters because ap_die() can recurse,
so we should set r->status = HTTP_OK only for the first call.

But thinking more about it, die_early is also cleaning up the input
filters (for potential further reads to always return APR_EOF), and
this may be suitable only for the case where we could not read the
request line or the headers.
So possibly the code is missing a third label (e.g. die_first) in
between die_early and die (at r->status = HTTP_OK), which we would
branch instead of die_early when failing with a "suitable" input state
(but before post_read_request() hooks still after which we can only
call ap_die() without resetting r->status).
Actually it depends on whether (or not) we want ap_die() (i.e.
ErrorDocument configuration) to be able to read the body at these
points of failure, and if so we should probably insert the HTTP_IN
filter earlier too.

WDYT?

>
> > @@ -1498,10 +1478,23 @@ request_rec *ap_read_request(conn_rec *c
> >           * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
> >           * a Host: header, and the server MUST respond with 400 if it doesn't.
> >           */
> > -        access_status = HTTP_BAD_REQUEST;
> >          ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(00569)
> >                        "client sent HTTP/1.1 request without hostname "
> >                        "(see RFC2616 section 14.23): %s", r->uri);
> > +        access_status = HTTP_BAD_REQUEST;
> > +        goto die_early;
>
> Why do we die_early here and not just die later?

Same here.

Regards,
Yann.