You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by William A Rowe Jr <wr...@rowe-clan.net> on 2016/08/26 03:02:23 UTC

Backporting HttpProtocolOptions survey

A couple key questions now that the full refactoring of legacy vs. strict
is mostly complete (there remain potential issues with some of the 3-4 yr
old changes on trunk which I'll raise in other posts.) But speaking only to
the request line and request header parsing...

1. Does it make sense to emit these parsing failures at the info level? Or
debug level (or in trunk/2.4, only at the trace level?) Granted some
legitimate internal diagnostics may be required, so it needs to have some
potential visibility, but the vast majority of such traffic is abusive and
doesn't need a place in most error logs.

2. Should we ban \r\n\v\f unequivocally from request and request header
fields altogether, or is there a legitimate need to support these? Or
should these follow the UnsafeWhitespace toggle and be permitted?

3. Do we need multiple layers of 'Strict'ness, or should there be a single
toggle, or no toggle, no tolerant input at all in the next 2.2/2.4 releases?

4. Should the next 2.4/2.2 releases default to Strict at all? Or remain
permissive (Unsafe) and allow the user to toggle these to Strict(...
Whitespace... URI)?

Real world direct observation especially appreciated from actual
deployments.

Re: Backporting HttpProtocolOptions survey

Posted by Jacob Champion <ch...@gmail.com>.
On 08/25/2016 08:02 PM, William A Rowe Jr wrote:
> A couple key questions now that the full refactoring of legacy vs.
> strict is mostly complete (there remain potential issues with some of
> the 3-4 yr old changes on trunk which I'll raise in other posts.) But
> speaking only to the request line and request header parsing...
>
> 1. Does it make sense to emit these parsing failures at the info level?
> Or debug level (or in trunk/2.4, only at the trace level?) Granted some
> legitimate internal diagnostics may be required, so it needs to have
> some potential visibility, but the vast majority of such traffic is
> abusive and doesn't need a place in most error logs.

Debug.

> 2. Should we ban \r\n\v\f unequivocally from request and request header
> fields altogether, or is there a legitimate need to support these? Or
> should these follow the UnsafeWhitespace toggle and be permitted?

No opinion.

> 3. Do we need multiple layers of 'Strict'ness, or should there be a
> single toggle, or no toggle, no tolerant input at all in the next
> 2.2/2.4 releases?

Multiple, at the very least to separate discouraged-but-conformant 
behavior from nonconformant behavior. (I know having multiple options is 
messy, and having too many knobs is already a problem. But I think that 
exposing only one, loosely defined, "allow a bunch of stuff" setting 
will just mean it becomes the next cargo-cult directive in everyone's 
configurations.)

> 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain
> permissive (Unsafe) and allow the user to toggle these to Strict(...
> Whitespace... URI)?

Strict by default.

--Jacob


Re: Backporting HttpProtocolOptions survey

Posted by Jim Jagielski <ji...@jaguNET.com>.
+1

> On Aug 26, 2016, at 7:10 AM, Ruediger Pluem <rp...@apache.org> wrote:
> 
> 
> Debug
> 
> We should ban it unequivocally.
> 
> Only a single toggle.
> 
> Default should be strict.
> 


Re: Backporting HttpProtocolOptions survey

Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Aug 26, 2016 at 2:02 PM, Eric Covener <co...@gmail.com> wrote:
> On Fri, Aug 26, 2016 at 7:10 AM, Ruediger Pluem <rp...@apache.org> wrote:
>>
>>
>> On 08/26/2016 05:02 AM, William A Rowe Jr wrote:
>>> A couple key questions now that the full refactoring of legacy vs. strict is mostly complete (there remain potential
>>> issues with some of the 3-4 yr old changes on trunk which I'll raise in other posts.) But speaking only to the request
>>> line and request header parsing...
>>>
>>> 1. Does it make sense to emit these parsing failures at the info level? Or debug level (or in trunk/2.4, only at the
>>> trace level?) Granted some legitimate internal diagnostics may be required, so it needs to have some potential
>>> visibility, but the vast majority of such traffic is abusive and doesn't need a place in most error logs.
>>
>> Debug
>>
>>>
>>> 2. Should we ban \r\n\v\f unequivocally from request and request header fields altogether, or is there a legitimate need
>>> to support these? Or should these follow the UnsafeWhitespace toggle and be permitted?
>>
>> We should ban it unequivocally.
>>
>>>
>>> 3. Do we need multiple layers of 'Strict'ness, or should there be a single toggle, or no toggle, no tolerant input at
>>> all in the next 2.2/2.4 releases?
>>
>> Only a single toggle.
>>
>>>
>>> 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain permissive (Unsafe) and allow the user to toggle
>>> these to Strict(... Whitespace... URI)?
>>
>> Default should be strict.
>>
>
> +1

+1

Re: Backporting HttpProtocolOptions survey

Posted by Eric Covener <co...@gmail.com>.
On Fri, Aug 26, 2016 at 7:10 AM, Ruediger Pluem <rp...@apache.org> wrote:
>
>
> On 08/26/2016 05:02 AM, William A Rowe Jr wrote:
>> A couple key questions now that the full refactoring of legacy vs. strict is mostly complete (there remain potential
>> issues with some of the 3-4 yr old changes on trunk which I'll raise in other posts.) But speaking only to the request
>> line and request header parsing...
>>
>> 1. Does it make sense to emit these parsing failures at the info level? Or debug level (or in trunk/2.4, only at the
>> trace level?) Granted some legitimate internal diagnostics may be required, so it needs to have some potential
>> visibility, but the vast majority of such traffic is abusive and doesn't need a place in most error logs.
>
> Debug
>
>>
>> 2. Should we ban \r\n\v\f unequivocally from request and request header fields altogether, or is there a legitimate need
>> to support these? Or should these follow the UnsafeWhitespace toggle and be permitted?
>
> We should ban it unequivocally.
>
>>
>> 3. Do we need multiple layers of 'Strict'ness, or should there be a single toggle, or no toggle, no tolerant input at
>> all in the next 2.2/2.4 releases?
>
> Only a single toggle.
>
>>
>> 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain permissive (Unsafe) and allow the user to toggle
>> these to Strict(... Whitespace... URI)?
>
> Default should be strict.
>

+1

Re: Backporting HttpProtocolOptions survey

Posted by Ruediger Pluem <rp...@apache.org>.

On 08/26/2016 05:02 AM, William A Rowe Jr wrote:
> A couple key questions now that the full refactoring of legacy vs. strict is mostly complete (there remain potential
> issues with some of the 3-4 yr old changes on trunk which I'll raise in other posts.) But speaking only to the request
> line and request header parsing...
> 
> 1. Does it make sense to emit these parsing failures at the info level? Or debug level (or in trunk/2.4, only at the
> trace level?) Granted some legitimate internal diagnostics may be required, so it needs to have some potential
> visibility, but the vast majority of such traffic is abusive and doesn't need a place in most error logs.

Debug

> 
> 2. Should we ban \r\n\v\f unequivocally from request and request header fields altogether, or is there a legitimate need
> to support these? Or should these follow the UnsafeWhitespace toggle and be permitted?

We should ban it unequivocally.

> 
> 3. Do we need multiple layers of 'Strict'ness, or should there be a single toggle, or no toggle, no tolerant input at
> all in the next 2.2/2.4 releases?

Only a single toggle.

> 
> 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain permissive (Unsafe) and allow the user to toggle
> these to Strict(... Whitespace... URI)?

Default should be strict.

> 
> Real world direct observation especially appreciated from actual deployments.
> 
> 
> 

Regards

R�diger


Re: Backporting HttpProtocolOptions survey

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Mon, Aug 29, 2016 at 1:04 PM, Ruediger Pluem <rp...@apache.org> wrote:

>
> On 08/29/2016 06:25 PM, William A Rowe Jr wrote:
> > Thanks all for the feedback. Status and follow-up questions inline
> >
> > On Thu, Aug 25, 2016 at 10:02 PM, William A Rowe Jr <wrowe@rowe-clan.net
> <ma...@rowe-clan.net>> wrote:
> >
> >     A couple key questions now that the full refactoring of legacy vs.
> strict is mostly complete (there remain potential
> >     issues with some of the 3-4 yr old changes on trunk which I'll raise
> in other posts.) But speaking only to the
> >     request line and request header parsing...
> >
> >     3. Do we need multiple
> >     layers of 'Strict'ness, or should there be a single toggle, or no
> toggle, no tolerant input at all in the next
> >     2.2/2.4 releases?
> >
> > Discussion item:
> >
> > I am not sold that StrictURI can be collapsed into this flag. Right now,
> not even
> > httpd itself promises to correctly encode resulting URI's, AIUI. Until
> we have our
> > own house in order, it seems we need to remain flexible about this. The
> \t\v\r\f\0
> > characters are always now prohibited, so it's considerably more safe.
> Strict further
> > bans all unencoded ctrl's in the URL. So StrictURI takes this one step
> further, and
> > bans all unencoded obs-text along with SP / '"' / '<' / '>' / '\' / '^'
> / '`' / '{' / '|' / '}'
> >
> > Since it's expected that a number of sites will have to relax UnsafeURI
> due to
> > these encoding issues, even with the resulting URI's generated by httpd
> servers,
> > and will have to do so for *public facing* interfaces, I strongly
> believe that this
> > flag  needs to remain distinct, or we will have lots of servers with
> entirely unsafe
> > parsing, not with only limited exposure by accepting bad URIs. Thoughts?
>
> Given the situation you describe it sounds sensible to keep this distinct.



> >
> >     4. Should the next 2.4/2.2 releases default to Strict at all? Or
> remain permissive (Unsafe) and allow the user to
> >     toggle these to Strict(... Whitespace... URI)?
> >
> >     Real world direct observation especially appreciated from actual
> deployments.
> >
> > Strict (and StrictURI) remain the default. The Allow0.9 and
> LenientMethods
>
> StrictURI as a default only makes sense if we have our own house in order
> (see above), otherwise it should be opt in.
>

Relative to our own house, I discovered that ';' is currently in the list
of those characters we insist on encoding. According to RFC3986,
while ';' has a special meaning, and as a sub-delim there is a potentially
distinct value of %3B... our own behavior seems entirely broken.

When we receive either ';' or %3B it is decoded to ';' in our r->uri, as we
decode all of the pct-escaped chars.  Take this example;

http://example.com/foo;enc=en%3Bus/test.html

(Ignore the fact that my example has a much more logical way to choose
the language variant of a foo path segment).

r->uri becomes /foo;enc=en;us/test.html - this is part of the path to the
file system we will look for. The httpd server has no special logic to
handle
such args or properties as described in the last paragraph of section 3.3,
they aren't an httpd consideration at all.

But worse, if the character ';' is distinct from %3B at the origin server,
we
are passing a proxied path of /foo%3Benc=en%3Bus/test.html - and also
returning redirects in that form. The character ';' is smashed and cannot
be recovered, although it's allowed by section 3.3, and it's plain-text
meaning is both more prevalent and more often correct.

Seems we are better off handing back or handing off /foo;enc=en;us/test.html
than the current /foo%3Benc=en%3Bus/test.html - the current behavior
is a much more pervasive error than the second case of embedded %3B
within a ';' section.

This appears to be our only mis-encoding from path to a composed URI.
The rest of this logic reflects section 3.3 rules, so I'll commit this
addition
shortly to this allowed (don't-escape) list, and we will still accept either
form on input;

if (!apr_isalnum(c) && !strchr("$-_.+!*'(),:@&=/~", c)) {
    flags |= T_OS_ESCAPE_PATH;
}

Re: Backporting HttpProtocolOptions survey

Posted by Ruediger Pluem <rp...@apache.org>.

On 08/29/2016 06:25 PM, William A Rowe Jr wrote:
> Thanks all for the feedback. Status and follow-up questions inline
> 
> On Thu, Aug 25, 2016 at 10:02 PM, William A Rowe Jr <wrowe@rowe-clan.net <ma...@rowe-clan.net>> wrote:
> 
>     A couple key questions now that the full refactoring of legacy vs. strict is mostly complete (there remain potential
>     issues with some of the 3-4 yr old changes on trunk which I'll raise in other posts.) But speaking only to the
>     request line and request header parsing...
> 
>     1. Does it make sense to emit these parsing failures at the info level? Or debug level (or in trunk/2.4, only at the
>     trace level?) Granted some legitimate internal diagnostics may be required, so it needs to have some potential
>     visibility, but the vast majority of such traffic is abusive and doesn't need a place in most error logs.
> 
> This is now changed, all are DEBUG except where the admin toggles for Limit's
> are broken (since this is a relatively common case where an admin sets some
> arbitrarily short limit without understanding fields such as User-Agent or Cookies.)
> 
> Discussion item: 
> 
> Limit* based failures could also become debug. Thoughts?

We shouldn't change this for 2.2/2.4. For trunk I am +0.

> 
>     2. Should we ban \r\n\v\f unequivocally from request and request header fields altogether, or is there a legitimate
>     need to support these? Or should these follow the UnsafeWhitespace toggle and be permitted?
> 
> I am changing this right now. Specifically,
> 
>  * Leading whitespace on the URI line prohibited (and always was)
>  * \r\n\v\f as well as \t always trigger 400 in the request line (new behavior)
>  * multiple or trailing spaces only permitted in Unsafe mode (always permitted before)
>  * trailing text following trailing spaces after protocol is always 400 (new behavior)
>  
>  * Spaces in or trailing the request field name always prohibited (new per RFC7230 3.2.4)
>  * \r\n\v\f always triggers 400 in the request line (new behavior)
> 
> Discussion items: 
> 
> I'm working on the patch to require CRLF line termination within ap_rgetline_core().
> This is necessary because any CR is eaten before this function completes. It's 
> not clear how this can be done trivially without negative impacts on other consumers 
> of the function. But since current 'fold' flag, which we don't use, was a bool (0 or 1), 
> adding a second control bit-flag of value 2 for CR-required seems the most obvious 
> way to go, with a minor MMN bump. Thoughts?

Seems sensible.

> 
> Two choices, CRLF is always required, or verifying this becomes a byproduct 
> of toggling the Strict (vs Unsafe) mode. Thoughts?
> 
>     3. Do we need multiple 
> 
>     layers of 'Strict'ness, or should there be a single toggle, or no toggle, no tolerant input at all in the next
>     2.2/2.4 releases?
> 
> This is now changed for whitespace, no such toggle anymore, it falls under Strict|Unsafe.
> 
> Discussion item: 
> 
> I am not sold that StrictURI can be collapsed into this flag. Right now, not even
> httpd itself promises to correctly encode resulting URI's, AIUI. Until we have our
> own house in order, it seems we need to remain flexible about this. The \t\v\r\f\0
> characters are always now prohibited, so it's considerably more safe. Strict further 
> bans all unencoded ctrl's in the URL. So StrictURI takes this one step further, and 
> bans all unencoded obs-text along with SP / '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}'
> 
> Since it's expected that a number of sites will have to relax UnsafeURI due to
> these encoding issues, even with the resulting URI's generated by httpd servers,
> and will have to do so for *public facing* interfaces, I strongly believe that this
> flag  needs to remain distinct, or we will have lots of servers with entirely unsafe 
> parsing, not with only limited exposure by accepting bad URIs. Thoughts?

Given the situation you describe it sounds sensible to keep this distinct.

> 
>     4. Should the next 2.4/2.2 releases default to Strict at all? Or remain permissive (Unsafe) and allow the user to
>     toggle these to Strict(... Whitespace... URI)?
> 
>     Real world direct observation especially appreciated from actual deployments.
> 
> Strict (and StrictURI) remain the default. The Allow0.9 and LenientMethods

StrictURI as a default only makes sense if we have our own house in order (see above), otherwise it should be opt in.

> remain the defaults.
> 
> Discussion item: 
> 
> RegisteredMethods was always wrong for proxy, CGI and other cases, was
> nonsense per the spec, and cannot become a default. The Strict mode now 
> requires all methods to conform to 'token' text, so we have significant added
> protection here.
> 
> Do folks believe we want to ship with Require1.0 out of the box as the default,
> even with the 2.2.x/2.4.x backports? It seems the RFC is correct on this, that
> nearly all of these will be badly formed HTTP/1.x requests. Thoughts?
> 
> 

Are there really 0.9 clients still in the wild?

Regards

R�diger



Re: Backporting HttpProtocolOptions survey

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
Thanks all for the feedback. Status and follow-up questions inline

On Thu, Aug 25, 2016 at 10:02 PM, William A Rowe Jr <wr...@rowe-clan.net>
wrote:

> A couple key questions now that the full refactoring of legacy vs. strict
> is mostly complete (there remain potential issues with some of the 3-4 yr
> old changes on trunk which I'll raise in other posts.) But speaking only to
> the request line and request header parsing...
>
> 1. Does it make sense to emit these parsing failures at the info level? Or
> debug level (or in trunk/2.4, only at the trace level?) Granted some
> legitimate internal diagnostics may be required, so it needs to have some
> potential visibility, but the vast majority of such traffic is abusive and
> doesn't need a place in most error logs.
>
This is now changed, all are DEBUG except where the admin toggles for
Limit's
are broken (since this is a relatively common case where an admin sets some
arbitrarily short limit without understanding fields such as User-Agent or
Cookies.)

Discussion item:

Limit* based failures could also become debug. Thoughts?

2. Should we ban \r\n\v\f unequivocally from request and request header
> fields altogether, or is there a legitimate need to support these? Or
> should these follow the UnsafeWhitespace toggle and be permitted?
>
I am changing this right now. Specifically,

 * Leading whitespace on the URI line prohibited (and always was)
 * \r\n\v\f as well as \t always trigger 400 in the request line (new
behavior)
 * multiple or trailing spaces only permitted in Unsafe mode (always
permitted before)
 * trailing text following trailing spaces after protocol is always 400
(new behavior)

 * Spaces in or trailing the request field name always prohibited (new per
RFC7230 3.2.4)
 * \r\n\v\f always triggers 400 in the request line (new behavior)

Discussion items:

I'm working on the patch to require CRLF line termination within
ap_rgetline_core().
This is necessary because any CR is eaten before this function completes.
It's
not clear how this can be done trivially without negative impacts on other
consumers
of the function. But since current 'fold' flag, which we don't use, was a
bool (0 or 1),
adding a second control bit-flag of value 2 for CR-required seems the most
obvious
way to go, with a minor MMN bump. Thoughts?

Two choices, CRLF is always required, or verifying this becomes a byproduct
of toggling the Strict (vs Unsafe) mode. Thoughts?

3. Do we need multiple
>
layers of 'Strict'ness, or should there be a single toggle, or no toggle,
> no tolerant input at all in the next 2.2/2.4 releases?
>
This is now changed for whitespace, no such toggle anymore, it falls under
Strict|Unsafe.

Discussion item:

I am not sold that StrictURI can be collapsed into this flag. Right now,
not even
httpd itself promises to correctly encode resulting URI's, AIUI. Until we
have our
own house in order, it seems we need to remain flexible about this. The
\t\v\r\f\0
characters are always now prohibited, so it's considerably more safe.
Strict further
bans all unencoded ctrl's in the URL. So StrictURI takes this one step
further, and
bans all unencoded obs-text along with SP / '"' / '<' / '>' / '\' / '^' /
'`' / '{' / '|' / '}'

Since it's expected that a number of sites will have to relax UnsafeURI due
to
these encoding issues, even with the resulting URI's generated by httpd
servers,
and will have to do so for *public facing* interfaces, I strongly believe
that this
flag  needs to remain distinct, or we will have lots of servers with
entirely unsafe
parsing, not with only limited exposure by accepting bad URIs. Thoughts?

4. Should the next 2.4/2.2 releases default to Strict at all? Or remain
> permissive (Unsafe) and allow the user to toggle these to Strict(...
> Whitespace... URI)?
>
> Real world direct observation especially appreciated from actual
> deployments.
>
Strict (and StrictURI) remain the default. The Allow0.9 and LenientMethods
remain the defaults.

Discussion item:

RegisteredMethods was always wrong for proxy, CGI and other cases, was
nonsense per the spec, and cannot become a default. The Strict mode now
requires all methods to conform to 'token' text, so we have significant
added
protection here.

Do folks believe we want to ship with Require1.0 out of the box as the
default,
even with the 2.2.x/2.4.x backports? It seems the RFC is correct on this,
that
nearly all of these will be badly formed HTTP/1.x requests. Thoughts?

Re: Backporting HttpProtocolOptions survey

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Aug 25, 2016 22:02, "William A Rowe Jr" <wr...@rowe-clan.net> wrote:

> 3. Do we need multiple layers of 'Strict'ness, or should there be a
single toggle, or no toggle, no tolerant input at all in the next 2.2/2.4
releases?

My thoughts on three toggles ran like this...

Unsafe allows things httpd has offered which run counter to the current
RFC723x series of specs. Admins supporting errant user-agents would unlock
this alone.

UnsafeWhitespace allows unusual whitespace defined in RFC7230 section 3.5
that httpd has permitted. It is cautioned against but doesn't fit that
first pattern. If this is the only error encountered in a necessary
user-agents, This is all the admin should permit. This is the easiest to
fold into a general Unsafe flag.

UnsafeURI might be the single most common error encountered, and flows from
RFC3986's precise grammar. I expect more admins will have to permit this
exception than either of the two above.

Anyways, just wanted to share my thoughts on why two or three flags may be
appropriate.