Posted to dev@httpd.apache.org by Stefan Fritsch <sf...@sfritsch.de> on 2012/11/07 12:26:23 UTC

Rethinking "be liberal in what you accept"

Hi,

Considering the current state of web security, the old principle of "be 
liberal in what you accept" seems increasingly inadequate for web servers. 
It causes lots of issues like response splitting, header injection, 
cross-site scripting, etc. The book "The Tangled Web" by Michal Zalewski is 
a good read on this topic; the chapter on HTTP is available for free 
download at http://nostarch.com/tangledweb .

Also, nowadays performance bottlenecks are usually in places other than 
request parsing. A few more cycles spent on additional checks won't make 
much difference. Therefore, I think it would make sense to integrate some 
sanity checks right into the httpd core. Initially, these checks would 
have to be enabled explicitly in the configuration.

Examples of such checks [RFC 2616 sections in brackets]:

Request line:
- Don't interpret all kinds of junk as "HTTP/1.0" (like "HTTP/ab" or
   "FOO") [3.1]
- If a method is not registered, bail out early.
   This would prevent CGIs from answering requests with strange methods like
   "HELO" or "http://foo/bar". This must be configurable or there must be
   at least a directive to easily register custom methods.  Otherwise, at
   least forbid strange characters in the method. [The method is a token,
   which should not contain control characters and separators; 2.2, 5.1]
- Forbid control characters in the URL
- Forbid fragment parts in the URL (i.e. "#...", which should never be sent
   by the browser)
- Forbid special characters in the scheme part of absoluteURI requests,
   e.g. "<>"
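
A minimal sketch of the first two request-line checks, assuming plain
NUL-terminated strings as input. This is not httpd code; the names
strict_is_token() and strict_parse_http_version() are made up for
illustration. The first function tests the RFC 2616 token rule [2.2], the
second accepts only "HTTP/<digits>.<digits>" [3.1] instead of falling back
to "HTTP/1.0".

```c
#include <string.h>

/* Returns 1 if c is a separator as defined in RFC 2616 section 2.2. */
static int strict_is_separator(unsigned char c)
{
    return c != '\0' && strchr("()<>@,;:\\\"/[]?={} \t", c) != NULL;
}

/* Returns 1 if s is a non-empty token: US-ASCII only, no control
 * characters (0-31, 127) and no separators. */
int strict_is_token(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++) {
        if (*p <= 31 || *p >= 127 || strict_is_separator(*p))
            return 0;
    }
    return 1;
}

/* Parses one run of decimal digits; returns the position after it,
 * or NULL if there is no digit or the number is absurdly long. */
static const char *strict_parse_number(const char *p, int *out)
{
    int n = 0;

    if (*p < '0' || *p > '9')
        return NULL;
    for (; *p >= '0' && *p <= '9'; p++) {
        if (n > 999)
            return NULL;
        n = n * 10 + (*p - '0');
    }
    *out = n;
    return p;
}

/* Returns 1 and fills major/minor only for "HTTP/<digits>.<digits>";
 * anything else (e.g. "HTTP/ab" or "FOO") returns 0. */
int strict_parse_http_version(const char *s, int *major, int *minor)
{
    const char *p;
    int maj, min;

    if (strncmp(s, "HTTP/", 5) != 0)
        return 0;
    p = strict_parse_number(s + 5, &maj);
    if (p == NULL || *p != '.')
        return 0;
    p = strict_parse_number(p + 1, &min);
    if (p == NULL || *p != '\0')
        return 0;
    *major = maj;
    *minor = min;
    return 1;
}
```

Note that "HELO" is a perfectly valid token, so the token rule alone only
catches things like "http://foo/bar"; rejecting "HELO" needs the
registered-method check.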

Request headers:
- In the Host header, only allow reasonable characters, i.e. no control
   characters, no "<>&". Maybe: only allow ASCII letters, digits, and
   "-_.:[]"
- Maybe replace the Host header with the request's hostname, if they are
   different. In:
 	GET http://foo/ HTTP/1.1
 	Host: bar
   The "Host: bar" MUST be ignored according to RFC 2616 [5.2]. As many
   webapps likely don't do that, we could replace the Host header to avoid
   any confusion.
- Don't accept requests with multiple Content-Length headers. [4.2]
- Don't accept control characters in header values (in particular single CRs,
   which we don't treat specially, but other proxies may). [4.2]
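
The request-header checks could look roughly like the following sketch.
The function names are hypothetical and the header names are passed as a
plain array rather than an httpd table, purely to keep the example
self-contained. The Host check implements the stricter "only ASCII
letters, digits, and -_.:[]" variant; the value check allows horizontal
tab but no other control character.

```c
#include <stddef.h>
#include <string.h>

static int strict_is_ascii_alnum(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
        || (c >= 'a' && c <= 'z');
}

/* Returns 1 if host is non-empty and contains only ASCII letters,
 * digits and the characters "-_.:[]". */
int strict_host_ok(const char *host)
{
    const unsigned char *p = (const unsigned char *)host;

    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++) {
        if (!strict_is_ascii_alnum(*p) && strchr("-_.:[]", *p) == NULL)
            return 0;
    }
    return 1;
}

/* Returns 1 if value has no control characters other than HT;
 * a single CR or LF makes it fail. */
int strict_header_value_ok(const char *value)
{
    const unsigned char *p;

    for (p = (const unsigned char *)value; *p != '\0'; p++) {
        if ((*p < 32 && *p != '\t') || *p == 127)
            return 0;
    }
    return 1;
}

static int strict_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Case-insensitive comparison of two header names. */
static int strict_name_eq(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (strict_lower((unsigned char)*a) != strict_lower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Counts how often name occurs in names[0..n-1]; a result above 1 for
 * "Content-Length" would mean the request gets rejected. */
size_t strict_count_header(const char *const *names, size_t n,
                           const char *name)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (strict_name_eq(names[i], name))
            count++;
    }
    return count;
}
```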

Response headers:
- Maybe error out if an output header value or name contains CR/LF/NUL (or
   all control characters?) [4.2]
- Check that some headers appear only once, e.g. Content-Length.
- Potentially check in some headers (e.g. Content-Disposition) that key=value
   pairs appear only once (this may go too far or be too expensive).
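
For the first response-header check, a sketch could be as small as the
following. It takes an explicit length because an embedded NUL would
otherwise hide the rest of the buffer; the function name is an
assumption, not an existing httpd function.

```c
#include <stddef.h>

/* Returns 1 if buf[0..len-1] contains no CR, LF or NUL byte.
 * Meant to be applied to both the name and the value of an
 * outgoing header before it is written to the client. */
int strict_out_header_ok(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\r' || buf[i] == '\n' || buf[i] == '\0')
            return 0;
    }
    return 1;
}
```

With this, a value such as "x\r\nSet-Cookie: a=b", the classic response
splitting payload, would be refused instead of being sent as two headers.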

Other:
- Maybe forbid control characters in username + password (after base64
   decoding)
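
A sketch of the credentials check, assuming the input is the base64 part
of an "Authorization: Basic ..." header. It decodes on the fly and rejects
the credentials as soon as a decoded byte is a control character. The
function name is hypothetical, and padding is handled leniently to keep
the example short.

```c
#include <stddef.h>

/* Maps one base64 character to its 6-bit value, or -1 if invalid. */
static int strict_b64_val(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Returns 1 if b64 decodes to at least one byte and no decoded byte is
 * a control character (0-31, 127). Returns 0 on invalid base64 too.
 * Simplification: the number of '=' padding characters is not checked. */
int strict_basic_credentials_ok(const char *b64)
{
    const unsigned char *p = (const unsigned char *)b64;
    unsigned int acc = 0;
    int bits = 0;
    size_t decoded = 0;

    for (; *p != '\0' && *p != '='; p++) {
        int v = strict_b64_val(*p);

        if (v < 0)
            return 0;
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            unsigned char byte;

            bits -= 8;
            byte = (unsigned char)((acc >> bits) & 0xFF);
            if (byte < 32 || byte == 127)
                return 0;
            decoded++;
        }
    }
    while (*p == '=')
        p++;
    return *p == '\0' && decoded > 0;
}
```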

As a related issue, it should be possible to disable HTTP/0.9.

The dividing line between this and modules like mod_security should be 
that we only check things that are forbidden by some standard and that we 
only look at the protocol, not the body.  Also, I would only allow 
switching the checks on and off, with no further configurability. And the 
checks should be implemented efficiently, i.e. don't parse things several 
times to do the checks, normally don't use regexes, etc.
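
To illustrate what "efficiently" could mean here, the following sketch
classifies every byte with a 256-entry lookup table in a single pass and
returns the union of the classes seen. The individual checks then only
test flags instead of rescanning the string or running a regex. All names
are made up for this example.

```c
#include <stddef.h>

enum { STRICT_CTL = 1, STRICT_SEP = 2 };

static unsigned char strict_class[256];

/* Fills the table once, e.g. at server start. */
void strict_init_table(void)
{
    const char *seps = "()<>@,;:\\\"/[]?={} \t";
    int c;

    for (c = 0; c < 256; c++)
        strict_class[c] = (c < 32 || c == 127) ? STRICT_CTL : 0;
    for (; *seps != '\0'; seps++)
        strict_class[(unsigned char)*seps] |= STRICT_SEP;
}

/* One pass over s: returns the OR of the classes of all bytes. */
unsigned strict_scan(const char *s, size_t len)
{
    unsigned flags = 0;
    size_t i;

    for (i = 0; i < len; i++)
        flags |= strict_class[(unsigned char)s[i]];
    return flags;
}
```

A method would then be acceptable if the result has neither STRICT_CTL nor
STRICT_SEP set, while a header value would only need STRICT_CTL to be
clear (with an exception for HT, which this sketch puts in both classes).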

What do you think?

Cheers,
Stefan

Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Wed, 7 Nov 2012, Nick Kew wrote:
>> What do you think?
>
> I've made occasional efforts in this direction in the past,
> but never seen much interest in bringing such functionality
> into core (as opposed to WAF).
>
> One such: http://people.apache.org/~niq/mod_taint.html

What you proposed there was broader in scope: it uses regular expressions, 
which allow lots of flexibility and let it be adjusted to your webapps. 
I really only want to interpret the RFCs more strictly, and do that fast.

Looking at mod_taint, I think it may be useful for 2.2. But in 2.4, quite 
a bit of it can be done with <If>:

<If "%{req:foo} !~ /^(\w)$/" >
   Require all denied
</If>

Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Wed, 7 Nov 2012, Graham Leggett wrote:
> On 07 Nov 2012, at 3:34 PM, Stefan Fritsch <sf...@sfritsch.de> wrote:
>
>> One could of course. But not everyone has lua, lua is slower than C, and even doing it in a module instead of core is sometimes more work. For example, currently we set r->protocol to "HTTP/1.0" even if the original request contained junk. So a module would have to re-parse the whole request line to get the protocol string sent by the client.
>>
>> My goal is something that the vast majority of people would want to activate, and would create pressure on manufacturers of clients that cannot cope with it to fix their clients.
>
> +1.
>
> Perhaps a "strict mode" that defaults to off in 2.4, and on in 2.5+?

Exactly.

Re: Rethinking "be liberal in what you accept"

Posted by Graham Leggett <mi...@sharp.fm>.
On 07 Nov 2012, at 3:34 PM, Stefan Fritsch <sf...@sfritsch.de> wrote:

> One could of course. But not everyone has lua, lua is slower than C, and even doing it in a module instead of core is sometimes more work. For example, currently we set r->protocol to "HTTP/1.0" even if the original request contained junk. So a module would have to re-parse the whole request line to get the protocol string sent by the client.
> 
> My goal is something that the vast majority of people would want to activate, and would create pressure on manufacturers of clients that cannot cope with it to fix their clients.

+1.

Perhaps a "strict mode" that defaults to off in 2.4, and on in 2.5+?

Regards,
Graham


Re: Rethinking "be liberal in what you accept"

Posted by Ben Laurie <be...@links.org>.
On Wed, Nov 7, 2012 at 1:34 PM, Stefan Fritsch <sf...@sfritsch.de> wrote:
> On Wed, 7 Nov 2012, Jim Jagielski wrote:
>
>> Certainly once mod_lua is more "production ready", we could
>> use that, couldn't we?
>
>
> One could of course. But not everyone has lua, lua is slower than C, and
> even doing it in a module instead of core is sometimes more work. For
> example, currently we set r->protocol to "HTTP/1.0" even if the original
> request contained junk. So a module would have to re-parse the whole request
> line to get the protocol string sent by the client.
>
> My goal is something that the vast majority of people would want to
> activate, and would create pressure on manufacturers of clients that cannot
> cope with it to fix their clients.

+1.

Good luck with getting clients fixed, tho :-)

Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Wed, 7 Nov 2012, Jim Jagielski wrote:
>> One could of course. But not everyone has lua, lua is slower than C, and even doing it in a module instead of core is sometimes more work.
>
> My response was in regards to mod_taint...

Sorry, then I misunderstood.

Cheers,
Stefan


Re: Rethinking "be liberal in what you accept"

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Nov 7, 2012, at 8:34 AM, Stefan Fritsch <sf...@sfritsch.de> wrote:

> On Wed, 7 Nov 2012, Jim Jagielski wrote:
> 
>> Certainly once mod_lua is more "production ready", we could
>> use that, couldn't we?
> 
> One could of course. But not everyone has lua, lua is slower than C, and even doing it in a module instead of core is sometimes more work.

My response was in regard to mod_taint...


Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Wed, 7 Nov 2012, Jim Jagielski wrote:

> Certainly once mod_lua is more "production ready", we could
> use that, couldn't we?

One could, of course. But not everyone has Lua, Lua is slower than C, and 
even doing it in a module instead of core is sometimes more work. For 
example, currently we set r->protocol to "HTTP/1.0" even if the original 
request contained junk. So a module would have to re-parse the whole 
request line to get the protocol string sent by the client.

My goal is something that the vast majority of people would want to 
activate, and would create pressure on manufacturers of clients that 
cannot cope with it to fix their clients.
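
To make the re-parsing point concrete, this is roughly what a module would
have to do today, sketched on a plain string. In httpd the raw request
line should be available as r->the_request; the helper name and its
interface are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Copies the third space-separated field of a raw request line
 * ("METHOD SP URI SP PROTOCOL") into out. Returns 1 on success and 0 if
 * the field is missing, is followed by more fields, or does not fit. */
int strict_request_line_protocol(const char *line, char *out, size_t outlen)
{
    const char *p = line;
    size_t len;
    int field;

    /* Skip the method and the URI. */
    for (field = 0; field < 2; field++) {
        p = strchr(p, ' ');
        if (p == NULL)
            return 0;
        while (*p == ' ')
            p++;
    }
    len = strcspn(p, " ");
    if (len == 0 || p[len] != '\0' || len >= outlen)
        return 0;
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```

For "GET / HTTP/ab" this yields "HTTP/ab", the string the client really
sent, which is exactly the information that is lost once r->protocol has
been set to "HTTP/1.0".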

Re: Rethinking "be liberal in what you accept"

Posted by Jim Jagielski <ji...@jaguNET.com>.
Certainly once mod_lua is more "production ready", we could
use that, couldn't we?

On Nov 7, 2012, at 6:54 AM, Nick Kew <ni...@webthing.com> wrote:

> On Wed, 7 Nov 2012 12:26:23 +0100 (CET)
> Stefan Fritsch <sf...@sfritsch.de> wrote:
> 
> 
>> What do you think?
> 
> I've made occasional efforts in this direction in the past,
> but never seen much interest in bringing such functionality
> into core (as opposed to WAF).
> 
> One such: http://people.apache.org/~niq/mod_taint.html
> 
> -- 
> Nick Kew
> 


Re: Rethinking "be liberal in what you accept"

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 7 Nov 2012 12:26:23 +0100 (CET)
Stefan Fritsch <sf...@sfritsch.de> wrote:


> What do you think?

I've made occasional efforts in this direction in the past,
but never seen much interest in bringing such functionality
into core (as opposed to WAF).

One such: http://people.apache.org/~niq/mod_taint.html

-- 
Nick Kew

Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Thursday 08 November 2012, Nick Kew wrote:
> > I intended to add a directive to easily register custom methods
> > (i.e. call  ap_method_register()). Do you think there is reason
> > to allow arbitrary methods, and not just a configured list of
> > allowed ones?
> 
> If methods are to be actively checked, a module needs an API to
> register a method with the checker.

ap_method_register() already exists (probably since 2.0). AFAICS, 
modules that handle custom methods internally already have to use it.

> A configuration option might
> still be required for backends (e.g. proxy or CGI) but would
> perhaps be secondary?

Still, that was the case I was talking about. I have checked that CGI 
forwards the method without validating it, and I expect that proxy and 
e.g. mod_fcgid do that, too.

Re: Rethinking "be liberal in what you accept"

Posted by Nick Kew <ni...@webthing.com>.
On Thu, 8 Nov 2012 11:18:37 +0100 (CET)
Stefan Fritsch <sf...@sfritsch.de> wrote:

> On Wed, 7 Nov 2012, Tim Bannister wrote:
> 
> > On 7 Nov 2012, at 11:26, Stefan Fritsch wrote:
> >> If a method is not registered, bail out early.
> >
> >
> > Good idea, but it would be nice to be able to use <Limit> or <LimitExcept> to re-allow it.

I'd disagree with that mechanism.  But as to your point:

> I intended to add a directive to easily register custom methods (i.e. call 
> ap_method_register()). Do you think there is reason to allow arbitrary 
> methods, and not just a configured list of allowed ones?

If methods are to be actively checked, a module needs an API to
register a method with the checker.  A configuration option might
still be required for backends (e.g. proxy or CGI) but would perhaps
be secondary?


-- 
Nick Kew

Re: Rethinking "be liberal in what you accept"

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Wed, 7 Nov 2012, Tim Bannister wrote:

> On 7 Nov 2012, at 11:26, Stefan Fritsch wrote:
>> If a method is not registered, bail out early.
>
>
> Good idea, but it would be nice to be able to use <Limit> or <LimitExcept> to re-allow it.

I intended to add a directive to easily register custom methods (i.e. call 
ap_method_register()). Do you think there is reason to allow arbitrary 
methods, and not just a configured list of allowed ones?
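
A toy model of the "configured list of allowed methods" idea, to show the
intended behaviour. It does not use httpd's ap_method_register(); the
table, its size and the function names are all hypothetical. The built-in
entries are the RFC 2616 methods, and a configuration directive would
call the register function for each custom method.

```c
#include <stddef.h>
#include <string.h>

#define STRICT_MAX_METHODS 64

static const char *strict_methods[STRICT_MAX_METHODS] = {
    "GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "TRACE", "CONNECT"
};
static size_t strict_nmethods = 8;

/* Returns 1 if m is in the table. Methods are case-sensitive. */
int strict_method_known(const char *m)
{
    size_t i;

    for (i = 0; i < strict_nmethods; i++) {
        if (strcmp(strict_methods[i], m) == 0)
            return 1;
    }
    return 0;
}

/* Adds m to the table. The pointer is stored, not copied, so m must
 * outlive the table. Returns 0 only if the table is full. */
int strict_method_register(const char *m)
{
    if (strict_method_known(m))
        return 1;
    if (strict_nmethods >= STRICT_MAX_METHODS)
        return 0;
    strict_methods[strict_nmethods++] = m;
    return 1;
}
```

A request with method "HELO" would be refused early by the lookup, while
a site that really needs e.g. "PURGE" would register it once in the
configuration.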


About <Limit>/<LimitExcept>: I think they are inherently broken, and I 
won't add any new functionality to them. See e.g.

http://mail-archives.apache.org/mod_mbox/httpd-dev/201010.mbox/%3C201010192146.28217.sf%40sfritsch.de%3E 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201010.mbox/%3C201010221727.45305.sf%40sfritsch.de%3E

if you are interested in why.

Re: Rethinking "be liberal in what you accept"

Posted by Tim Bannister <is...@jellybaby.net>.
On 7 Nov 2012, at 11:26, Stefan Fritsch wrote:

> considering the current state of web security, the old principle of "be liberal in what you accept" seems increasingly inadequate for web servers. It causes lots of issues like response splitting, header injection, cross site scripting, etc. The book "Tangled Web" by Michal Zalewski is a good read on this topic, the chapter on HTTP is available for free download at http://nostarch.com/tangledweb .

> If a method is not registered, bail out early.


Good idea, but it would be nice to be able to use <Limit> or <LimitExcept> to re-allow it.

-- 
Tim Bannister – isoma@jellybaby.net


Re: Rethinking "be liberal in what you accept"

Posted by Christian Folini <ch...@netnea.com>.
On Thu, Nov 08, 2012 at 11:47:31AM +0100, Apache Lounge wrote:
> What about mod_security, has a lot of similar checks and even more.

ModSec can perform all these checks via regexes, but it carries a 
certain overhead in performance and administration. The protocol 
checks are part of bigger rulesets, and their hits will be mixed 
in the logs with other security findings of varying severity.

The standard, state-of-the-art ModSec deployment with the official
Core Rule Set works with a scoring mechanism that does not block
a request instantly. So depending on the combination of violations
in a request, a bogus request line may stay below the threshold
of the core rules and pass.
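
The difference between the two approaches can be sketched as follows. The
structure, the score and the threshold are invented for illustration and
are not the values used by the actual rule set.

```c
#include <stddef.h>

struct strict_finding {
    const char *rule;        /* human-readable description */
    int score;               /* made-up anomaly score */
    int protocol_violation;  /* 1 if the request breaks the HTTP spec */
};

/* Scoring model: block only if the summed score reaches the threshold. */
int scoring_blocks(const struct strict_finding *f, size_t n, int threshold)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += f[i].score;
    return sum >= threshold;
}

/* Strict model: any single protocol violation blocks the request. */
int strict_blocks(const struct strict_finding *f, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (f[i].protocol_violation)
            return 1;
    }
    return 0;
}
```

With a single finding "bogus request line" scored at 3 and a threshold of
5, the scoring model lets the request through, while the strict model
rejects it.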

A simple, single directive to stop any protocol violations once 
and for all is preferable in my eyes.

regs,

Christian Folini

-- 
Christian Folini - <ch...@netnea.com>

Re: Rethinking "be liberal in what you accept"

Posted by Apache Lounge <in...@apachelounge.com>.
What about mod_security? It has a lot of similar checks, and even more.
