You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Ruediger Pluem <rp...@apache.org> on 2020/04/17 19:43:30 UTC
Re: svn commit: r1876664 - in /httpd/httpd/trunk:
modules/http2/h2_request.c server/protocol.c
On 4/17/20 3:07 PM, ylavic@apache.org wrote:
> Author: ylavic
> Date: Fri Apr 17 13:07:46 2020
> New Revision: 1876664
>
> URL: http://svn.apache.org/viewvc?rev=1876664&view=rev
> Log:
> core, h2: send EOR for early HTTP request failure.
>
> The core output filters depend on EOR being sent at some point for correct
> accounting of setaside limits and lifetime.
>
> Rework ap_read_request() early failure (including in post_read_request() hooks)
> so that it always sends the EOR after ap_die().
>
> Apply the same scheme in h2_request_create_rec() which is the HTTP/2 to HTTP/1
> counterpart.
>
> Modified:
> httpd/httpd/trunk/modules/http2/h2_request.c
> httpd/httpd/trunk/server/protocol.c
>
> Modified: httpd/httpd/trunk/server/protocol.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/protocol.c?rev=1876664&r1=1876663&r2=1876664&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/protocol.c (original)
> +++ httpd/httpd/trunk/server/protocol.c Fri Apr 17 13:07:46 2020
> @@ -1453,14 +1444,12 @@ request_rec *ap_read_request(conn_rec *c
> }
> }
>
> - apr_brigade_destroy(tmp_bb);
Why don't we destroy it any longer?
> -
> /* update what we think the virtual host is based on the headers we've
> * now read. may update status.
> */
> -
> - access_status = ap_update_vhost_from_headers_ex(r, conf->strict_host_check == AP_CORE_CONFIG_ON);
> - if (conf->strict_host_check == AP_CORE_CONFIG_ON && access_status != HTTP_OK) {
> + strict_host_check = (conf->strict_host_check == AP_CORE_CONFIG_ON);
> + access_status = ap_update_vhost_from_headers_ex(r, strict_host_check);
> + if (strict_host_check && access_status != HTTP_OK) {
> if (r->server == ap_server_conf) {
> ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r, APLOGNO(10156)
> "Requested hostname '%s' did not match any ServerName/ServerAlias "
> @@ -1472,22 +1461,13 @@ request_rec *ap_read_request(conn_rec *c
> "current connection is %s:%u)",
> r->hostname, r->server->defn_name, r->server->defn_line_number);
> }
> - r->status = access_status;
> + goto die_early;
> }
> -
> - access_status = r->status;
> -
> - /* Toggle to the Host:-based vhost's timeout mode to fetch the
> - * request body and send the response body, if needed.
> - */
> - if (cur_timeout != r->server->timeout) {
> - apr_socket_timeout_set(csd, r->server->timeout);
> - cur_timeout = r->server->timeout;
> + if (r->status != HTTP_OK) {
> + access_status = r->status;
> + goto die_early;
Why do we die_early here and not just die later?
> }
>
> - /* we may have switched to another server */
> - r->per_dir_config = r->server->lookup_defaults;
> -
> if ((!r->hostname && (r->proto_num >= HTTP_VERSION(1, 1)))
> || ((r->proto_num == HTTP_VERSION(1, 1))
> && !apr_table_get(r->headers_in, "Host"))) {
> @@ -1498,10 +1478,23 @@ request_rec *ap_read_request(conn_rec *c
> * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
> * a Host: header, and the server MUST respond with 400 if it doesn't.
> */
> - access_status = HTTP_BAD_REQUEST;
> ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(00569)
> "client sent HTTP/1.1 request without hostname "
> "(see RFC2616 section 14.23): %s", r->uri);
> + access_status = HTTP_BAD_REQUEST;
> + goto die_early;
Why do we die_early here and not just die later?
> + }
> +
> + /* we may have switched to another server */
> + r->per_dir_config = r->server->lookup_defaults;
> + conf = ap_get_core_module_config(r->server->module_config);
> +
> + /* Toggle to the Host:-based vhost's timeout mode to fetch the
> + * request body and send the response body, if needed.
> + */
> + if (cur_timeout != r->server->timeout) {
> + apr_socket_timeout_set(csd, r->server->timeout);
> + cur_timeout = r->server->timeout;
> }
>
> /*
Regards
RĂ¼diger
Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c
server/protocol.c
Posted by Yann Ylavic <yl...@gmail.com>.
On Mon, Apr 20, 2020 at 8:51 AM Ruediger Pluem:
>
> On 4/19/20 5:26 PM, Yann Ylavic wrote:
> > +
> > + /* Validate Host/Expect headers and select vhost. */
> > if (!ap_check_request_header(r)) {
> > access_status = r->status;
> > - goto die_early;
> > + goto die_before_hooks;
> > }
>
> Shouldn't we ensure that
>
> r->per_dir_config = r->server->lookup_defaults;
>
> is done before we jump to die_before_hooks ?
Yes, correct, ap_check_request_header() can fail for (e.g.) invalid
Expect header, in which case r->server may still have changed.
>
> Otherwise I think looks good. As said my biggest worry was that we do not use vhost specific ErrorDocuments in cases where we used
> them.
Thanks for the review, hopefully everything addressed in r1876784.
Regards,
Yann.
Re: svn commit: r1876664 - in /httpd/httpd/trunk:
modules/http2/h2_request.c server/protocol.c
Posted by Ruediger Pluem <rp...@apache.org>.
On 4/19/20 5:26 PM, Yann Ylavic wrote:
> On Fri, Apr 17, 2020 at 11:50 PM Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
>>> On 4/17/20 3:07 PM, ylavic
>>>> - apr_brigade_destroy(tmp_bb);
>>> Why don't we destroy it any longer?
>> tmp_bb is reused below in the die_early code, and always _cleanup()
>> after use in ap_read_request().
Thanks for pointing out. I guess what got me confused is that we use the same
variable name tmp_bb at the beginning for the brigade created from r->pool, but
in after die we override it with one we get via ap_acquire_brigade. I know that
we need to do this because we use the brigade to sent the EOR bucket which destroys
r->pool when destroyed. But maybe we should use a different name in this section
e.g. eor_bb instead of tmp_bb.
>
> ap_read_request.diff
>
> Index: server/protocol.c
> ===================================================================
> --- server/protocol.c (revision 1876675)
> +++ server/protocol.c (working copy)
> @@ -1526,9 +1526,19 @@ request_rec *ap_read_request(conn_rec *conn)
> }
> }
>
> + /*
> + * Add the HTTP_IN filter here to ensure that ap_discard_request_body
> + * called by ap_die and by ap_send_error_response works correctly on
> + * status codes that do not cause the connection to be dropped and
> + * in situations where the connection should be kept alive.
> + */
> + ap_add_input_filter_handle(ap_http_input_filter_handle,
> + NULL, r, r->connection);
> +
> + /* Validate Host/Expect headers and select vhost. */
> if (!ap_check_request_header(r)) {
> access_status = r->status;
> - goto die_early;
> + goto die_before_hooks;
> }
Shouldn't we ensure that
r->per_dir_config = r->server->lookup_defaults;
is done before we jump to die_before_hooks ?
Otherwise I think looks good. As said my biggest worry was that we do not use vhost specific ErrorDocuments in cases where we used
them.
Regards
RĂ¼diger
Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c
server/protocol.c
Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Apr 17, 2020 at 11:50 PM Yann Ylavic <yl...@gmail.com> wrote:
>
> On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
> >
> > On 4/17/20 3:07 PM, ylavic
> > >
> > > - apr_brigade_destroy(tmp_bb);
> >
> > Why don't we destroy it any longer?
>
> tmp_bb is reused below in the die_early code, and always _cleanup()
> after use in ap_read_request().
> If the brigade is not used anymore, there is no real advantage in
> using _destroy() over _cleanup() IMHO, the former will unregister the
> cleanup callback (thus walk the list) but the callback is a noop after
> apr_brigade_cleanup() anyway..
>
> >
> > > + if (r->status != HTTP_OK) {
> > > + access_status = r->status;
> > > + goto die_early;
> >
> > Why do we die_early here and not just die later?
>
> While both "die_early" and "die" labels finally call ap_die(), the
> former is branched only before we call any hook, thus before anything
> can call ap_die() by itself. It matters because ap_die() can recurse,
> so we should set r->status = HTTP_OK only for the first call.
>
> But thinking more about it, die_early is also cleaning up the input
> filters (for potential further reads to always return APR_EOF), and
> this may be suitable only for the case where we could not read the
> request line or the headers.
> So possibly the code is missing a third label (e.g. die_first) in
> between die_early and die (at r->status = HTTP_OK), which we would
> branch instead of die_early when failing with a "suitable" input state
> (but before post_read_request() hooks still after which we can only
> call ap_die() without resetting r->status).
> Actually it depends on whether (or not) we want ap_die() (i.e.
> ErrorDocument configuration) to be able to read the body at these
> points of failure, and if so we should probably insert the HTTP_IN
> filter earlier too.
>
> WDYT?
The patch would look like the attached. It also renames the labels to
(try to) be more explicit.
Regards,
Yann.
Re: svn commit: r1876664 - in /httpd/httpd/trunk: modules/http2/h2_request.c
server/protocol.c
Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Apr 17, 2020 at 9:43 PM Ruediger Pluem
>
> On 4/17/20 3:07 PM, ylavic
> >
> > - apr_brigade_destroy(tmp_bb);
>
> Why don't we destroy it any longer?
tmp_bb is reused below in the die_early code, and always _cleanup()
after use in ap_read_request().
If the brigade is not used anymore, there is no real advantage in
using _destroy() over _cleanup() IMHO, the former will unregister the
cleanup callback (thus walk the list) but the callback is a noop after
apr_brigade_cleanup() anyway..
>
> > + if (r->status != HTTP_OK) {
> > + access_status = r->status;
> > + goto die_early;
>
> Why do we die_early here and not just die later?
While both "die_early" and "die" labels finally call ap_die(), the
former is branched only before we call any hook, thus before anything
can call ap_die() by itself. It matters because ap_die() can recurse,
so we should set r->status = HTTP_OK only for the first call.
But thinking more about it, die_early is also cleaning up the input
filters (for potential further reads to always return APR_EOF), and
this may be suitable only for the case where we could not read the
request line or the headers.
So possibly the code is missing a third label (e.g. die_first) in
between die_early and die (at r->status = HTTP_OK), which we would
branch instead of die_early when failing with a "suitable" input state
(but before post_read_request() hooks still after which we can only
call ap_die() without resetting r->status).
Actually it depends on whether (or not) we want ap_die() (i.e.
ErrorDocument configuration) to be able to read the body at these
points of failure, and if so we should probably insert the HTTP_IN
filter earlier too.
WDYT?
>
> > @@ -1498,10 +1478,23 @@ request_rec *ap_read_request(conn_rec *c
> > * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
> > * a Host: header, and the server MUST respond with 400 if it doesn't.
> > */
> > - access_status = HTTP_BAD_REQUEST;
> > ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(00569)
> > "client sent HTTP/1.1 request without hostname "
> > "(see RFC2616 section 14.23): %s", r->uri);
> > + access_status = HTTP_BAD_REQUEST;
> > + goto die_early;
>
> Why do we die_early here and not just die later?
Same here.
Regards,
Yann.