You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by William A Rowe Jr <wr...@rowe-clan.net> on 2016/08/03 23:33:54 UTC

HTTP/1.1 strict ruleset

So it seems pretty absurd we are coming back to this over
three years later, but is there any reason to preserve pre-RFC 2068
behaviors? I appreciate that Stefan was trying to avoid harming
existing deployment scenarios, but even as I'm about to propose
that we backport all of this to 2.4 and 2.2, I have several questions;

1. offer a logging-only option? Why? It seems like a simple
   choice, follow the spec, or don't. If you want to see what's
   going on, Wireshark, Fiddler and dozens of other tools let
   you inspect the conversation.

2. leave the default as 'not-strict'? Seems we should most
   strongly recommend that the server observe RFC's 2068,
   2616 and 723x, and not tolerate ancient behavior by default
   unless the admin insists on being foolish.

3. retain these legacy faulty behaviors in httpd 2.next/3.0?
   Seems that once we agree on a backport, the ancient
   side of this logic should all just disappear from trunk.

4. detail the error to the error log? Again, there are inspection
   tools, but more importantly, no visual user-agent is going
   to send this garbage, and automated requests are going
   to discard the 400 response. Seems we can save a lot of
   code throwing away the details that just don't help, and
   are generally the product of abusive traffic.

Thoughts?


-----------------

http://svn.apache.org/viewvc?view=revision&revision=1426877
Author: sf
Date: Sun Dec 30 01:23:24 2012 UTC (3 years, 7 months ago)
Changed paths: 9
Log Message:
Add an option to enforce stricter HTTP conformance

This is a first stab, the checks will likely have to be revised.
For now, we check

 * if the request line contains control characters
 * if the request uri has fragment or username/password
 * that the request method is standard or registered with RegisterHttpMethod
 * that the request protocol is of the form HTTP/[1-9]+.[0-9]+,
   or missing for 0.9
 * if there is garbage in the request line after the protocol
 * if any request header contains control characters
 * if any request header has an empty name
 * for the host name in the URL or Host header:
   - if an IPv4 dotted decimal address: Reject octal or hex values, require
     exactly four parts
   - if a DNS host name: Reject non-alphanumeric characters besides '.' and
     '-'. As a side effect, this rejects multiple Host headers.
 * if any response header contains control characters
 * if any response header has an empty name
 * that the Location response header (if present) has a valid scheme and is
   absolute

If we have a host name both from the URL and the Host header, we replace the
Host header with the value from the URL to enforce RFC conformance.

There is a log-only mode, but the loglevels of the logged messages need some
thought/work. Currently, the  checks for incoming data log for 'core' and
the
checks for outgoing data log for 'http'. Maybe we need a way to configure
the
loglevels separately from the core/http loglevels.

Re: HTTP/1.1 strict ruleset

Posted by Yann Ylavic <yl...@gmail.com>.
On Fri, Aug 5, 2016 at 12:02 AM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> It would be helpful if other PMC members would weigh in yea or nay on
> dropping out-of-spec behaviors from 2.4 and 2.2 maintenance branches.

IMHO we should keep an opt *out* strict mode for new errors 400 we
generate such early (no workaround), at least for 2.2, but probably
also for 2.4.
It shouldn't overcomplicate the code anyway.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 4, 2016 at 8:05 PM, William A Rowe Jr <wr...@rowe-clan.net>
wrote:

> On Thu, Aug 4, 2016 at 5:21 PM, Roy T. Fielding <fi...@gbiv.com> wrote:
>
>> On Aug 4, 2016, at 3:02 PM, William A Rowe Jr <wr...@rowe-clan.net>
>> wrote:
>>
>> If consensus here agrees that no out-of-spec behavior should be tolerated
>> anymore, I'll jump on board. I'm already in the consensus block that says
>> we should not ship a new major.minor without disallowing all of this
>> garbage.
>>
>> It would be helpful if other PMC members would weigh in yea or nay on
>> dropping out-of-spec behaviors from 2.4 and 2.2 maintenance branches.
>>
>>
>> That would be weird.  One of us is going to create a patch.  That
>> specific patch is
>> going to be voted upon for backport.  If anyone wants to veto it, they
>> are free
>> to do so with justification.
>>
>
> You don't seem to comprehend the idea behind consensus, which is what I'm
> appealing for. You first among them all were perfectly happy to champion
> stupid
> design considerations as 'beyond Bill's authority' as a committer and
> reviewer,
> and sadly while I sat in the chair of httpd and got to eat dirt.
>
> I'm not trying that again, httpd 2.0 does not entirely compile against apr
> 2.0-dev,
> and likely never will due to this stalemate.
>
> I'm entirely willing to invite vetos with my code, if I wasn't I wouldn't
> commit.
> But if you go back through the archives, you will realize several of you
> were
> entirely on the wrong side of problem-solving. If you would personally like
> to invite vetoes, please be our guest.
>

And FWIW you were on record that a veto does not demand a justification,
so let that settle in.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 4, 2016 at 5:21 PM, Roy T. Fielding <fi...@gbiv.com> wrote:

> On Aug 4, 2016, at 3:02 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> If consensus here agrees that no out-of-spec behavior should be tolerated
> anymore, I'll jump on board. I'm already in the consensus block that says
> we should not ship a new major.minor without disallowing all of this
> garbage.
>
> It would be helpful if other PMC members would weigh in yea or nay on
> dropping out-of-spec behaviors from 2.4 and 2.2 maintenance branches.
>
>
> That would be weird.  One of us is going to create a patch.  That specific
> patch is
> going to be voted upon for backport.  If anyone wants to veto it, they are
> free
> to do so with justification.
>

You don't seem to comprehend the idea behind consensus, which is what I'm
appealing for. You first among them all were perfectly happy to champion
stupid
design considerations as 'beyond Bill's authority' as a committer and
reviewer,
and sadly while I sat in the chair of httpd and got to eat dirt.

I'm not trying that again, httpd 2.0 does not entirely compile against apr
2.0-dev,
and likely never will due to this stalemate.

I'm entirely willing to invite vetos with my code, if I wasn't I wouldn't
commit.
But if you go back through the archives, you will realize several of you
were
entirely on the wrong side of problem-solving. If you would personally like
to invite vetoes, please be our guest.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 11, 2016 at 6:14 PM, Roy T. Fielding <fi...@gbiv.com> wrote:

> I have no idea why there is any need to discuss EBCDIC here, since HTTP
> itself
> is never EBCDIC.  We should not be transforming any input to EBCDIC until
> after the request has been parsed.
>

Just as a point of information, that isn't how this works, it isn't
how any of this works. All of the raw text off the wire has been
turned into platform-specific encoding on EBCDIC architectures
after a simple bucket-fetch readline. There is no realistic way
to code ASCII logic on a non-ASCII C compiler, so I agree with
the design model. If you would like to refactor that and deal with
the consequences, be our guest.

In any case, if you have a working parser implementation, I will be happy to
> review it regardless of my preferences.  If it is better than what we
> have, then
> it will still get my +1 for trunk regardless of longer term plans.
>

I think we are on the same page, I welcome your's and other's
improvements and enhancements, and we should ship a release.
These ideas aren't irreconcilable.

Thanks for your support, I expect we will all agree, in some round
about way, a few months from now.

Cheers,

Bill

Re: HTTP/1.1 strict ruleset

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
> On Aug 11, 2016, at 9:59 AM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
> On Thu, Aug 11, 2016 at 11:54 AM, Eric Covener <covener@gmail.com <ma...@gmail.com>> wrote:
> On Thu, Aug 11, 2016 at 12:44 PM, William A Rowe Jr <wrowe@rowe-clan.net <ma...@rowe-clan.net>> wrote:
> > Just to be clear, that is now 2 votes for eliminating the 'classic parser'
> > from all
> > of trunk, 2.4.x and 2.2.x branches, and using only the strict parser,
> > unconditionally.
> >
> > That's actually 3 votes for removing it from trunk, which I planned to do
> > already,
> > after 2.4 and 2.2 backports are in-sync with trunk.
> 
> Without yet reviewing the votes, I would (personally) think this kind
> of split makes it your call as the one neck deep in the issue & doing
> all the work.    Thank you for your work on this.
> 
> Maybe one last summary of your call, and a short window for strong
> objection/veto?
> 
> Certainly, that's what the backport proposal of everything from the initial
> commit by sf all the way to the present state will accomplish in STATUS.
> 
> With so many evolutions of various bits, a summary patch will be provided,
> of course. But it's helpful to me to know the opinions of Jim and Roy and
> everyone else in advance of proposing that backport.

I am having trouble keeping up while dealing with summer parenting issues.

I have no doubt that a strict parser is necessary, for some definition of strict.
I have no idea why there is any need to discuss EBCDIC here, since HTTP itself
is never EBCDIC.  We should not be transforming any input to EBCDIC until
after the request has been parsed.

I am not convinced that we need a wholesale rewrite of the parser code to
accomplish anything. Since most of the changes to trunk were tossed and repeated
multiple times due to unfamiliarity with the train parsing code or unawareness
that the read is already handling obs-fold and spurious newlines, I still think
we should just commit the simple fix (with your added logging statements)
and remove the bits from trunk that we don't actually need.

That doesn't mean we shouldn't attempt a better parser.  However, I would like
to review that as one clean diff with demonstrated performance tests.
That means setting up a test harness and proving that it is actually better.
For example, we might want to try using (or at least learning from) other parsers
that can work on the entire received buffer in one pass, rather than limit ourselves
to the existing line-at-a-time process, and simultaneously deprecate or remove
handling of obs-fold and HTTP/0.9.

In any case, if you have a working parser implementation, I will be happy to
review it regardless of my preferences.  If it is better than what we have, then
it will still get my +1 for trunk regardless of longer term plans.

....Roy


Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 11, 2016 at 11:54 AM, Eric Covener <co...@gmail.com> wrote:

> On Thu, Aug 11, 2016 at 12:44 PM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> > Just to be clear, that is now 2 votes for eliminating the 'classic
> parser'
> > from all
> > of trunk, 2.4.x and 2.2.x branches, and using only the strict parser,
> > unconditionally.
> >
> > That's actually 3 votes for removing it from trunk, which I planned to do
> > already,
> > after 2.4 and 2.2 backports are in-sync with trunk.
>
> Without yet reviewing the votes, I would (personally) think this kind
> of split makes it your call as the one neck deep in the issue & doing
> all the work.    Thank you for your work on this.
>
> Maybe one last summary of your call, and a short window for strong
> objection/veto?
>

Certainly, that's what the backport proposal of everything from the initial
commit by sf all the way to the present state will accomplish in STATUS.

With so many evolutions of various bits, a summary patch will be provided,
of course. But it's helpful to me to know the opinions of Jim and Roy and
everyone else in advance of proposing that backport.

Re: HTTP/1.1 strict ruleset

Posted by Eric Covener <co...@gmail.com>.
On Thu, Aug 11, 2016 at 12:44 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> Just to be clear, that is now 2 votes for eliminating the 'classic parser'
> from all
> of trunk, 2.4.x and 2.2.x branches, and using only the strict parser,
> unconditionally.
>
> That's actually 3 votes for removing it from trunk, which I planned to do
> already,
> after 2.4 and 2.2 backports are in-sync with trunk.

Without yet reviewing the votes, I would (personally) think this kind
of split makes it your call as the one neck deep in the issue & doing
all the work.    Thank you for your work on this.

Maybe one last summary of your call, and a short window for strong
objection/veto?

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 18, 2016 at 4:26 AM, Plüm, Rüdiger, Vodafone Group <
ruediger.pluem@vodafone.com> wrote:

>
> > -----Original Message-----
> > From: Yann Ylavic [mailto:ylavic.dev@gmail.com]
> > Sent: Donnerstag, 11. August 2016 22:40
> > To: httpd-dev
> > Subject: Re: HTTP/1.1 strict ruleset
> >
> > On Thu, Aug 11, 2016 at 6:56 PM, William A Rowe Jr <wr...@rowe-clan.net>
> > wrote:
> > >
> > > I haven't dug terribly deeply into the proxy mechanics yet, but the
> same
> > > parser for headers is used for response header processing as well as
> the
> > > request processing.
> >
> > They don't share the same code, though, ap_proxy_read_headers() would
> > need the same "strictification" than ap_get_mime_headers(_ex)()
> > currently, or be replaced by the latter.
> >
> > > It seems that the two need some potentially different
> > > rulesets. If you are running a forward proxy, you would want to be
> quite
> > > strict about the responses. If you are only a gateway of trusted
> backend
> > > servers and apps, you might want to be more tolerant (although Roy and
> > > Jim may disagree with me on this.)
> >
> > +1, behind 2.2 proxies (but possibly 2.4 too), there are some outdated
> > backends/applications (supporting SSLv3 only...) that don't receive
> > many (if any) maintenance but just work, and for that reason where
> > placed behing a proxy.
>
> So I guess we should be strict on the client side on every branch, but
> have an opt out
> for the backend of a gateway in 2.2 / 2.4.
>

That's my read, we need to be more strict in what we accept by default,
and provide administrators various 'outs' to otherwise still consume the
branch (2.2 or 2.4) that they have build their infrastructure around, at
least until they are able to clean up their local messes.

Some rules can and should be unavoidable. RFC7230 calls out several
of these, while remaining relaxed around others. That's why it took me
four different HttpProtocolOptions behavior toggles, in the latest parser
commit a couple hrs ago, to satisfy these varying layers of significance.
I even found one we could relax under 3.5 and chose not to do so (that
section is really clear that leading and trailing whitespace might be
parsed, but why introduce a new incorrect behavior we never offered
before?)

When I re-crafted HttpProtocolOptions yesterday, I'd already foreseen
the need for all four of these current options, and more as we apply these
rules to the response from a forward-proxied server (which we probably
want to enforce a distinct ruleset on). It is probably necessary for us
to create an HttpProxyProtocolOptions for back-end response handling,
rather than overloading this directive with extra-wordy components.

But I'm also anticipating that by popular demand, we will end up relaxing
one or more rules based on yet another independent toggle, based on the
frequency that never-before-valid behaviors are present in the ecosystem.

All productive suggestions are welcome, I'm not strongly tied to any of
the http://httpd.apache.org/docs/trunk/mod/core.html#httpprotocoloptions
keywords, except in as much as I tried to borrow the precise language
used in RFC7230 when it applied. Edits for legibility are also welcome.
Maybe an example for good measure :)

Cheers,

Bill

RE: HTTP/1.1 strict ruleset

Posted by Plüm, Rüdiger, Vodafone Group <ru...@vodafone.com>.

> -----Original Message-----
> From: Yann Ylavic [mailto:ylavic.dev@gmail.com]
> Sent: Donnerstag, 11. August 2016 22:40
> To: httpd-dev
> Subject: Re: HTTP/1.1 strict ruleset
> 
> On Thu, Aug 11, 2016 at 6:56 PM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> >
> > I haven't dug terribly deeply into the proxy mechanics yet, but the same
> > parser for headers is used for response header processing as well as the
> > request processing.
> 
> They don't share the same code, though, ap_proxy_read_headers() would
> need the same "strictification" than ap_get_mime_headers(_ex)()
> currently, or be replaced by the latter.
> 
> > It seems that the two need some potentially different
> > rulesets. If you are running a forward proxy, you would want to be quite
> > strict about the responses. If you are only a gateway of trusted backend
> > servers and apps, you might want to be more tolerant (although Roy and
> > Jim may disagree with me on this.)
> 
> +1, behind 2.2 proxies (but possibly 2.4 too), there are some outdated
> backends/applications (supporting SSLv3 only...) that don't receive
> many (if any) maintenance but just work, and for that reason where
> placed behing a proxy.

So I guess we should be strict on the client side on every branch, but have an opt out
for the backend of a gateway in 2.2 / 2.4.

Regards

Rüdiger

Re: HTTP/1.1 strict ruleset

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Aug 11, 2016 at 6:56 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> I haven't dug terribly deeply into the proxy mechanics yet, but the same
> parser for headers is used for response header processing as well as the
> request processing.

They don't share the same code, though, ap_proxy_read_headers() would
need the same "strictification" than ap_get_mime_headers(_ex)()
currently, or be replaced by the latter.

> It seems that the two need some potentially different
> rulesets. If you are running a forward proxy, you would want to be quite
> strict about the responses. If you are only a gateway of trusted backend
> servers and apps, you might want to be more tolerant (although Roy and
> Jim may disagree with me on this.)

+1, behind 2.2 proxies (but possibly 2.4 too), there are some outdated
backends/applications (supporting SSLv3 only...) that don't receive
many (if any) maintenance but just work, and for that reason where
placed behing a proxy.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Aug 11, 2016 15:09, "Eric Covener" <co...@gmail.com> wrote:
>
> On Thu, Aug 11, 2016 at 4:04 PM, Jim Jagielski <ji...@jagunet.com> wrote:
> >>  It seems that the two need some potentially different
> >> rulesets. If you are running a forward proxy, you would want to be
quite
> >> strict about the responses. If you are only a gateway of trusted
backend
> >> servers and apps, you might want to be more tolerant (although Roy and
> >> Jim may disagree with me on this.)
>
> Devils advocate: Trusted backend + spectre of xss could put you right
> back in strict mindset.

Pardon the language, but absofuckinglutely!

The idea of these potential overrides is to restore a server which is a
gateway to a critical, but horridly coded backend, to gey it running again
for a time.

It is not a perpetual means to avoiding the responsibility to fix the
broken backend. It isn't an excuse to avoid writing proper backends. This
is the gist of Luca's and my small disagreement over last-modified handling.

I was thinking over lunch that perhaps we *should* backport the entire
removal of the legacy parser in a later 2.4 or final 2.2 release. Not
necessarily at this next T&R, but soon, later on down the line, after we
have offered months for admins to push their dev teams to get the issues
sorted.

Re: HTTP/1.1 strict ruleset

Posted by Eric Covener <co...@gmail.com>.
On Thu, Aug 11, 2016 at 4:04 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>  It seems that the two need some potentially different
>> rulesets. If you are running a forward proxy, you would want to be quite
>> strict about the responses. If you are only a gateway of trusted backend
>> servers and apps, you might want to be more tolerant (although Roy and
>> Jim may disagree with me on this.)

Devils advocate: Trusted backend + spectre of xss could put you right
back in strict mindset.

Re: HTTP/1.1 strict ruleset

Posted by Jim Jagielski <ji...@jaguNET.com>.
> On Aug 11, 2016, at 12:56 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
>  It seems that the two need some potentially different 
> rulesets. If you are running a forward proxy, you would want to be quite
> strict about the responses. If you are only a gateway of trusted backend
> servers and apps, you might want to be more tolerant (although Roy and
> Jim may disagree with me on this.)
> 

FWIW, I agree w/ you on this.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 11, 2016 at 11:49 AM, Eric Covener <co...@gmail.com> wrote:

> On Thu, Aug 11, 2016 at 12:44 PM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> > Since I've heard little support in these past weeks for leaving an HTTP
> > strict
> > 'logging-only' option, I'm going to rip that out, but replace it with
> > options to
> > independently toggle HTTPUnsafe and HTTPResponseUnsafe values, so that
> > the server can continue to deliberately process oddball backends that
> don't
> > conform, while requiring strict behavior of originating user-agents.
>
> Does the latter refer stuff being read from origins in mod_proxy_http
> or just what we're willing to put on the wire in general vs. what we
> parse on the way in?
>

I haven't dug terribly deeply into the proxy mechanics yet, but the same
parser for headers is used for response header processing as well as the
request processing. It seems that the two need some potentially different
rulesets. If you are running a forward proxy, you would want to be quite
strict about the responses. If you are only a gateway of trusted backend
servers and apps, you might want to be more tolerant (although Roy and
Jim may disagree with me on this.)

Re: HTTP/1.1 strict ruleset

Posted by Eric Covener <co...@gmail.com>.
On Thu, Aug 11, 2016 at 12:44 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> Since I've heard little support in these past weeks for leaving an HTTP
> strict
> 'logging-only' option, I'm going to rip that out, but replace it with
> options to
> independently toggle HTTPUnsafe and HTTPResponseUnsafe values, so that
> the server can continue to deliberately process oddball backends that don't
> conform, while requiring strict behavior of originating user-agents.

Does the latter refer stuff being read from origins in mod_proxy_http
or just what we're willing to put on the wire in general vs. what we
parse on the way in?

-- 
Eric Covener
covener@gmail.com

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 11, 2016 at 7:20 AM, Jim Jagielski <ji...@jagunet.com> wrote:

>
> > On Aug 4, 2016, at 6:21 PM, Roy T. Fielding <fi...@gbiv.com> wrote:
> >
> >
> > Leaving existing users in a broken state of non-compliance with the
> primary
> > Internet standard we are claiming to implement just because of
> unsubstantiated
> > FUD is far more frustrating.  Bugs get fixed. Users choose whether or
> not to install.
> > If we find a real problem in a deployed client that causes the bug fix
> to be intolerable,
> > then of course we will need to configure workarounds.  But we are not in
> that place.
>
> +1
>

Just to be clear, that is now 2 votes for eliminating the 'classic parser'
from all
of trunk, 2.4.x and 2.2.x branches, and using only the strict parser,
unconditionally.

That's actually 3 votes for removing it from trunk, which I planned to do
already,
after 2.4 and 2.2 backports are in-sync with trunk.

I'm a little concerned that oddball back-ends will be seriously disrupted
by this
change and unable to otherwise upgrade or adopt the rest of our worthwhile
bug fixes. But I don't think that the oddball behavior should be a default,
I like
the suggestion that this whole mess be renamed HTTPUnsafe rather than
HTTPStrict. Let their configuration directive be a warning that it's not a
good
idea to enable it.

Since I've heard little support in these past weeks for leaving an HTTP
strict
'logging-only' option, I'm going to rip that out, but replace it with
options to
independently toggle HTTPUnsafe and HTTPResponseUnsafe values, so that
the server can continue to deliberately process oddball backends that don't
conform, while requiring strict behavior of originating user-agents.

But if there is a third vote against retaining the legacy support, on any
branch,
the answer is to drop all of these toggles and rip out the old code.

Re: HTTP/1.1 strict ruleset

Posted by Jim Jagielski <ji...@jaguNET.com>.
> On Aug 4, 2016, at 6:21 PM, Roy T. Fielding <fi...@gbiv.com> wrote:
> 
> 
> Leaving existing users in a broken state of non-compliance with the primary
> Internet standard we are claiming to implement just because of unsubstantiated
> FUD is far more frustrating.  Bugs get fixed. Users choose whether or not to install.
> If we find a real problem in a deployed client that causes the bug fix to be intolerable,
> then of course we will need to configure workarounds.  But we are not in that place.
> 

+1


Re: HTTP/1.1 strict ruleset

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
> On Aug 4, 2016, at 3:02 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
> On Thu, Aug 4, 2016 at 3:46 PM, Roy T. Fielding <fielding@gbiv.com <ma...@gbiv.com>> wrote:
> > On Aug 3, 2016, at 4:33 PM, William A Rowe Jr <wrowe@rowe-clan.net <ma...@rowe-clan.net>> wrote:
> >
> > So it seems pretty absurd we are coming back to this over
> > three years later, but is there any reason to preserve pre-RFC 2068
> > behaviors? I appreciate that Stefan was trying to avoid harming
> > existing deployment scenarios, but even as I'm about to propose
> > that we backport all of this to 2.4 and 2.2, I have several questions;
> 
> In general, I don't see a need for any "strict" options. The only changes we
> made to parsing in RFC7230 were for the sake of security and known failures
> to interoperate. This is exactly the feature we are supposed to be handling
> automatically on behalf of our users: secure, correct, and interoperable
> handling and generation of HTTP messaging.  We should not need to configure it.
> 
> Note that the MUST requirements in RFC7230 are not optional. We either implement
> them as specified or we are not compliant with HTTP. 
>  
> Understood. And that describes my attitude toward 2.6/3.0 next release.
> 
> We live in an ecosystem with literally hundreds of thousands of legitimate
> custom http clients asking httpd server for datum. Most projects would
> effectively declare their last major.minor release static, and fix the defects
> while doing all enhancement in their next release. This isn't that project.

By that logic, we can't make any changes in minor versions.  I disagree that
we have ever treated versioning in that way.  If a user doesn't like a bug fix,
they don't have to install the new version, or they have the code to unfix it.

> Because httpd fixes and introduces dozens of bugs each major.minor
> subversion release, and we truly agree that we want every user to move
> to the most recently released major.minor, breaking hundreds of these
> applications with *no recourse* in their build or configuration is frustrating.

Leaving existing users in a broken state of non-compliance with the primary
Internet standard we are claiming to implement just because of unsubstantiated
FUD is far more frustrating.  Bugs get fixed. Users choose whether or not to install.
If we find a real problem in a deployed client that causes the bug fix to be intolerable,
then of course we will need to configure workarounds.  But we are not in that place.

> If consensus here agrees that no out-of-spec behavior should be tolerated
> anymore, I'll jump on board. I'm already in the consensus block that says
> we should not ship a new major.minor without disallowing all of this garbage.
> 
> It would be helpful if other PMC members would weigh in yea or nay on
> dropping out-of-spec behaviors from 2.4 and 2.2 maintenance branches. 

That would be weird.  One of us is going to create a patch.  That specific patch is
going to be voted upon for backport.  If anyone wants to veto it, they are free
to do so with justification.

....Roy


Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Thu, Aug 4, 2016 at 3:46 PM, Roy T. Fielding <fi...@gbiv.com> wrote:

> > On Aug 3, 2016, at 4:33 PM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> >
> > So it seems pretty absurd we are coming back to this over
> > three years later, but is there any reason to preserve pre-RFC 2068
> > behaviors? I appreciate that Stefan was trying to avoid harming
> > existing deployment scenarios, but even as I'm about to propose
> > that we backport all of this to 2.4 and 2.2, I have several questions;
>
> In general, I don't see a need for any "strict" options. The only changes
> we
> made to parsing in RFC7230 were for the sake of security and known failures
> to interoperate. This is exactly the feature we are supposed to be handling
> automatically on behalf of our users: secure, correct, and interoperable
> handling and generation of HTTP messaging.  We should not need to
> configure it.
>
> Note that the MUST requirements in RFC7230 are not optional. We either
> implement
> them as specified or we are not compliant with HTTP.
>

Understood. And that describes my attitude toward 2.6/3.0 next release.

We live in an ecosystem with literally hundreds of thousands of legitimate
custom http clients asking httpd server for datum. Most projects would
effectively declare their last major.minor release static, and fix the
defects
while doing all enhancement in their next release. This isn't that project.

Because httpd fixes and introduces dozens of bugs each major.minor
subversion release, and we truly agree that we want every user to move
to the most recently released major.minor, breaking hundreds of these
applications with *no recourse* in their build or configuration is
frustrating.

If consensus here agrees that no out-of-spec behavior should be tolerated
anymore, I'll jump on board. I'm already in the consensus block that says
we should not ship a new major.minor without disallowing all of this
garbage.

It would be helpful if other PMC members would weigh in yea or nay on
dropping out-of-spec behaviors from 2.4 and 2.2 maintenance branches.

Re: HTTP/1.1 strict ruleset

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
> On Aug 3, 2016, at 4:33 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
> So it seems pretty absurd we are coming back to this over
> three years later, but is there any reason to preserve pre-RFC 2068
> behaviors? I appreciate that Stefan was trying to avoid harming
> existing deployment scenarios, but even as I'm about to propose
> that we backport all of this to 2.4 and 2.2, I have several questions;

In general, I don't see a need for any "strict" options. The only changes we
made to parsing in RFC7230 were for the sake of security and known failures
to interoperate. This is exactly the feature we are supposed to be handling
automatically on behalf of our users: secure, correct, and interoperable
handling and generation of HTTP messaging.  We should not need to configure it.

Note that the MUST requirements in RFC7230 are not optional. We either implement
them as specified or we are not compliant with HTTP.  So, the specific issues of

https://tools.ietf.org/html/rfc7230#section-3

   A sender MUST NOT send whitespace between the start-line and the
   first header field.  A recipient that receives whitespace between the
   start-line and the first header field MUST either reject the message
   as invalid or consume each whitespace-preceded line without further
   processing of it (i.e., ignore the entire line, along with any
   subsequent lines preceded by whitespace, until a properly formed
   header field is received or the header section is terminated).

   The presence of such whitespace in a request might be an attempt to
   trick a server into ignoring that field or processing the line after
   it as a new request, either of which might result in a security
   vulnerability if other implementations within the request chain
   interpret the same message differently.  Likewise, the presence of
   such whitespace in a response might be ignored by some clients or
   cause others to cease parsing.

and

https://tools.ietf.org/html/rfc7230#section-3.2.4

   No whitespace is allowed between the header field-name and colon.  In
   the past, differences in the handling of such whitespace have led to
   security vulnerabilities in request routing and response handling.  A
   server MUST reject any received request message that contains
   whitespace between a header field-name and colon with a response code
   of 400 (Bad Request).  A proxy MUST remove any such whitespace from a
   response message before forwarding the message downstream.


must be complied with regardless of any "strict" config setting.

Some of those other things under "strict" seem a bit wonky. For example,
changing the Host header field when the incoming request URI is absolute
is fine by default but needs to be a configurable option for gateways.
Trying to validate IPv4/6 vs DNS doesn't work in intranet environments
that use local name servers.  The Location field-value is no longer required
to be absolute ("https://tools.ietf.org/html/rfc7231#section-7.1.2").

> 1. offer a logging-only option? Why? It seems like a simple
>    choice, follow the spec, or don't. If you want to see what's
>    going on, Wireshark, Fiddler and dozens of other tools let
>    you inspect the conversation.
> 
> 2. leave the default as 'not-strict'? Seems we should most
>    strongly recommend that the server observe RFC's 2068,
>    2616 and 723x, and not tolerate ancient behavior by default
>    unless the admin insists on being foolish.

As far as the Internet is concerned, RFC723x is the new law of the land.
There is no reason to support obsolete RFCs.  No reason at all.  This has
nothing to do with semantic versioning or binary compatibility -- it is
simply doing what the product says it does: serve HTTP.

> 3. retain these legacy faulty behaviors in httpd 2.next/3.0?
>    Seems that once we agree on a backport, the ancient
>    side of this logic should all just disappear from trunk.
> 
> 4. detail the error to the error log? Again, there are inspection
>    tools, but more importantly, no visual user-agent is going
>    to send this garbage, and automated requests are going
>    to discard the 400 response. Seems we can save a lot of
>    code throwing away the details that just don't help, and
>    are generally the product of abusive traffic.
> 
> Thoughts?

I think we just need to state in the log the reason for a 400 error. I don't like
sending invalid client-provided data back in a response, even when encoded.

Whitespace before the first header field can log a static message.
Whitespace after a field-name could log the field-name (no need to log the
field value). Invalid characters can be noted as "in a field-name" without
further data, or as "in a field-value" with only the field-name logged.

These are all post-error details off the critical path, so I don't buy the CPU
argument.  However, I do think our error handling in protocol.c has become
so verbose that it obscures the rest of the code.  Maybe it would be better if
we just stopped caring about 80-column viewing for calls to ap_log_*.

....Roy


Re: HTTP/1.1 strict ruleset

Posted by Christian Folini <ch...@netnea.com>.
On Wed, Aug 03, 2016 at 06:58:26PM -0500, William A Rowe Jr wrote:
> > I see a lot of value in logging when not applying the strict parsing,
> > so you can passively assess your traffic for a day/week/month.
> 
> That requires additional CPU, and significantly more code complexity.
> In fact, I wonder whether such 'logging-only' behavior shouldn't simply
> be a no-choice default? I also wonder if those tools or others such as
> mod_security won't already provide such an option and we can wash
> our hands of this 'extra effort'?

ModSecurity Core Rules committer here.

As you know it's all in the rules with ModSecurity and the 
OWASP ModSecurity Core Rules (CRS) are the most widespread ruleset 
on the net.

We block per default, but all the checks can run log-only. 

They are listed in these rulefiles:
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0.0-rc1/rules/REQUEST-911-METHOD-ENFORCEMENT.conf
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0.0-rc1/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0.0-rc1/rules/REQUEST-921-PROTOCOL-ATTACK.conf

The default policy definitions: 
https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.0.0-rc1/modsecurity_crs_10_setup.conf.example

(Links are for the upcoming major release 3.0, RC1 will be out within
days now).

Overall, I think the rules are not overly aggressive. Apache has been
liberal so far and we try to avoid too many false positives due to
crazy clients and bad implementations. Missing Accept headers,
silly Range headers and numerical Host headers as frequent source
of false positives spring to mind.

Also, I think the coverage is not very systematic. Joining forces and
providing a systematic coverage for all aspects of RFC 2068 for 
CRS 3.1 would be very interesting for our project. If it would simplify
the httpd code base to refer users to ModSecurity and CRS, the
CRS could profit a lot from the endorsement (and the httpd-dev
experience brought to our rules resulting in a higher security level
overall).

A possible issue is the fact that ModSecurity runs fairly late in the
lifecycle. In fact, the default hook for the first ModSecurity rule
phase has been shifted backwards a few years ago. I take it a httpd
implementation of protocol enforcement rules would run immediately after
receiving the request line and then as the headers come in. ModSecurity
would definitely run later. However, there have been discussions to
introduce additional rule phase(s) into the ModSecurity engine / module
in the past and if there is a need from the Apache project, then the
development might be open in this regard (but it would certainly take
quite a while to get this out the door).

Cheers,

Christian Folini

-- 
https://www.feistyduck.com/training/modsecurity-training-course
mailto:christian.folini@netnea.com
twitter: @ChrFolini

Re: HTTP/1.1 strict ruleset

Posted by Eric Covener <co...@gmail.com>.
> Well, is there any point of detailing the specific offending field names?
> The bad text received? To the consumer in the response error message,
> or strictly to the logs?

I don't think it's too important in the response.

> We should log (once) the cause of a 400, but are the details interesting?
> Or is it enough to report that a bad field name, a bad header value etc
> caused the fault without any sprintf() style processing?

Kind of difficult to talk about in the abstract. I can't say the cause
needs to be "clear" in the logs but I think the offending data (line
if it's line-based even if we could already be lost for bad input) and
some hint about what we were expecting or didn't like.

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Wed, Aug 3, 2016 at 6:51 PM, Eric Covener <co...@gmail.com> wrote:

> On Wed, Aug 3, 2016 at 7:33 PM, William A Rowe Jr <wr...@rowe-clan.net>
> wrote:
> > So it seems pretty absurd we are coming back to this over
> > three years later, but is there any reason to preserve pre-RFC 2068
> > behaviors? I appreciate that Stefan was trying to avoid harming
> > existing deployment scenarios, but even as I'm about to propose
> > that we backport all of this to 2.4 and 2.2, I have several questions;
> >
> > 1. offer a logging-only option? Why? It seems like a simple
> >    choice, follow the spec, or don't. If you want to see what's
> >    going on, Wireshark, Fiddler and dozens of other tools let
> >    you inspect the conversation.
>
> I see a lot of value in logging when not applying the strict parsing,
> so you can passively assess your traffic for a day/week/month.


That requires additional CPU, and significantly more code complexity.
In fact, I wonder whether such 'logging-only' behavior shouldn't simply
be a no-choice default? I also wonder if those tools or others such as
mod_security won't already provide such an option and we can wash
our hands of this 'extra effort'?

> 4. detail the error to the error log? Again, there are inspection
> >    tools, but more importantly, no visual user-agent is going
> >    to send this garbage, and automated requests are going
> >    to discard the 400 response. Seems we can save a lot of
> >    code throwing away the details that just don't help, and
> >    are generally the product of abusive traffic.
>
> I don't have a good understanding of the savings or where the line
> would be drawn on depth, but i think it's important to log (or stash
> where it can be logged) a high level reason that the core/http_filters
> ditched a request based on syntax.
>

Well, is there any point of detailing the specific offending field names?
The bad text received? To the consumer in the response error message,
or strictly to the logs?

We should log (once) the cause of a 400, but are the details interesting?
Or is it enough to report that a bad field name, a bad header value etc
caused the fault without any sprintf() style processing?

Re: HTTP/1.1 strict ruleset

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
On Wed, Aug 3, 2016 at 6:51 PM, Eric Covener <co...@gmail.com> wrote:

>
> > 2. leave the default as 'not-strict'? Seems we should most
> >    strongly recommend that the server observe RFC's 2068,
> >    2616 and 723x, and not tolerate ancient behavior by default
> >    unless the admin insists on being foolish.
>
> I am with you on default-to-strict in 2.4 and up.  I'm hesitant about 2.2.
>

Since the project consensus is to support 2.2 through up to next
summer, and these issues pose interop concerns, I don't think that
we can separate 2.2 from 2.4 discussions at this level.

Re: HTTP/1.1 strict ruleset

Posted by Eric Covener <co...@gmail.com>.
On Wed, Aug 3, 2016 at 7:33 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> So it seems pretty absurd we are coming back to this over
> three years later, but is there any reason to preserve pre-RFC 2068
> behaviors? I appreciate that Stefan was trying to avoid harming
> existing deployment scenarios, but even as I'm about to propose
> that we backport all of this to 2.4 and 2.2, I have several questions;
>
> 1. offer a logging-only option? Why? It seems like a simple
>    choice, follow the spec, or don't. If you want to see what's
>    going on, Wireshark, Fiddler and dozens of other tools let
>    you inspect the conversation.

I see a lot of value in logging when not applying the strict parsing,
so you can passively assess your traffic for a day/week/month.

> 2. leave the default as 'not-strict'? Seems we should most
>    strongly recommend that the server observe RFC's 2068,
>    2616 and 723x, and not tolerate ancient behavior by default
>    unless the admin insists on being foolish.

I am with you on default-to-strict in 2.4 and up.  I'm hesitant about 2.2.

> 3. retain these legacy faulty behaviors in httpd 2.next/3.0?
>    Seems that once we agree on a backport, the ancient
>    side of this logic should all just disappear from trunk.

I don't see a good reason to keep the behavior in current trunk, but
preserving the merge-ability sounds worthwhile.

>
> 4. detail the error to the error log? Again, there are inspection
>    tools, but more importantly, no visual user-agent is going
>    to send this garbage, and automated requests are going
>    to discard the 400 response. Seems we can save a lot of
>    code throwing away the details that just don't help, and
>    are generally the product of abusive traffic.

I don't have a good understanding of the savings or where the line
would be drawn on depth, but i think it's important to log (or stash
where it can be logged) a high level reason that the core/http_filters
ditched a request based on syntax.