Posted to dev@httpd.apache.org by Martin Kraemer <Ma...@Fujitsu-Siemens.com> on 2002/05/13 13:56:20 UTC

[PATCH] 1.3: Stricter check on request_line format

We have discussed this topic before: a stricter check
should be applied to the request line, to prevent arbitrary
user input from ending up in the access_log and error_log. Such input
could be misused to spoof accesses to nonexistent (or inaccessible)
resources, of course without the client actually getting access to them.

The solution I present here consists of two separate parts:

1) Detecting and disallowing garbage in the request line.
   A 'planted' string like this one:
    % echo 'GET / HTTP/1.0" 304 -\r207.46.197.102 - - [01/May/2002:00:02:25 +0200] "GET / HTTP/1.0\n' \
      | netcat www.apache.org 80
   has until now looked like an access from 207.46.197.102,
   at least to the admin doing a "tail access_log", and maybe to
   some log analyzers too. So, any garbage after the correct
   syntax for a request
     <method> <uri> <protocol>
   (like in "GET / HTTP/1.1") should be disallowed and logged.

2) Escaping dangerous characters in the log files.
   Today, you can cause almost any character to appear in the logs.
   Simply do a "echo '<ESC>[2J' | netcat localhost 80", and if the
   administrator has a "tail -f access_log" running in some window,
   that window will be cleared at this point.
   We don't want to break existing log analyzers, but escaping
   such garbage is definitely useful. What I do in this patch is
   add another escaping class, T_ESCAPE_LOGITEM, and use it on
   items that *might* have come from the user (i.e., the 'tainted'
   items: request line, request MIME header lines, possibly DNS
   reverse lookup names, ...). I escape invalid chars in \xXX
   syntax, except for '"', '\\' itself and '\r', '\n', '\b', '\t',
   which are simply escaped by preceding them with a '\\'. Thus it is
   possible to distinguish a (client-provided) '\\' from a '\\' inserted
   by the escaper. The extra escaping of '"' was added to prevent
   spoofing attempts (in the CLF, the request line is enclosed in '"',
   and it was easy to add a '"' to the request and confuse the reader
   of the log file; see the example above).
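
The escaping rules in 2) can be sketched roughly as follows. This is
not the code from the patch: the function name escape_logitem_sketch
and its buffer-based signature are hypothetical, "invalid" is assumed
to mean "not printable according to isprint() in the C locale", and the
control characters '\r', '\n', '\b', '\t' are assumed to be written as
a backslash followed by the corresponding letter.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not taken from the patch). Writes an escaped copy
 * of src into dst, using at most dstlen bytes including the trailing NUL,
 * and returns dst. If dst is too small, the output is cut off before the
 * first escape sequence that no longer fits. */
static char *escape_logitem_sketch(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789abcdef";
    size_t o = 0;

    if (dstlen == 0)
        return dst;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        char buf[4];
        size_t n = 2;

        buf[0] = '\\';
        switch (c) {
        case '"':  buf[1] = '"';  break;   /* would end the quoted CLF field */
        case '\\': buf[1] = '\\'; break;   /* keeps client backslashes distinct */
        case '\r': buf[1] = 'r';  break;   /* assumption: letter form */
        case '\n': buf[1] = 'n';  break;
        case '\b': buf[1] = 'b';  break;
        case '\t': buf[1] = 't';  break;
        default:
            if (isprint(c)) {              /* ordinary character: copy as-is */
                buf[0] = (char)c;
                n = 1;
            } else {                       /* anything else becomes \xXX */
                buf[1] = 'x';
                buf[2] = hex[c >> 4];
                buf[3] = hex[c & 0x0f];
                n = 4;
            }
        }

        if (o + n >= dstlen)               /* always leave room for the NUL */
            break;
        memcpy(dst + o, buf, n);
        o += n;
    }
    dst[o] = '\0';
    return dst;
}
```

With this sketch, the <ESC>[2J sequence from the example would be
logged as the seven visible characters \x1b[2J, and a '"' planted in
the request line would show up as \" instead of closing the quoted
field.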

The patch does not (yet) try to pass *every* log item through the
log filter, because Apache users may want to add special characters
to the log at their discretion (using LogFormat). But the typical
abuse is caught (and logged) in a sensible manner.

Comments?

    Martin
-- 
<Ma...@Fujitsu-Siemens.com>         |     Fujitsu Siemens
Fon: +49-89-636-46021, FAX: +49-89-636-47655 | 81730  Munich,  Germany

Re: [PATCH] 1.3: Stricter check on request_line format

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Mon, May 13, 2002 at 01:58:38PM +0200, Kraemer, Martin wrote:
> @@ -1045,12 +1045,26 @@
>      r->assbackwards = (ll[0] == '\0');
>      r->protocol = ap_pstrdup(r->pool, ll[0] ? ll : "HTTP/0.9");
>  
> -    if (2 == sscanf(r->protocol, "HTTP/%u.%u", &major, &minor)
> +    if (3 == sscanf(r->protocol, "HTTP/%u.%u%n", &major, &minor, &n)

No, that should have been
       if (2 == sscanf(r->protocol, "HTTP/%u.%u%n", &major, &minor, &n)
because %n does not increment the item count that sscanf returns.
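
A small standalone illustration of that point, as a sketch only: the
helper name parse_protocol_strict is hypothetical, and the way n is
used afterwards (checking that nothing follows the version number) is
an assumption, since the quoted hunk does not show the rest of the
condition.

```c
#include <stdio.h>

/* Hypothetical helper (not taken from the patch). Returns 1 if protocol
 * matches HTTP/<major>.<minor> with nothing after the minor number,
 * 0 otherwise. */
static int parse_protocol_strict(const char *protocol,
                                 unsigned *major, unsigned *minor)
{
    int n = 0;

    /* Two %u conversions are assigned, so success is a return value of 2.
     * %n stores the number of characters consumed so far in n, but it is
     * not counted in the return value. */
    if (sscanf(protocol, "HTTP/%u.%u%n", major, minor, &n) != 2)
        return 0;

    /* Assumed use of n: reject anything trailing the version number. */
    return protocol[n] == '\0';
}
```

Here "HTTP/1.0" is accepted, while the planted token that begins with
HTTP/1.0" 304 - is rejected because characters remain after the minor
version. Note that %u itself skips leading whitespace and accepts a
sign, so a really strict check would need more than sscanf.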

   Martin

Re: [PATCH] 1.3: Stricter check on request_line format

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
Okay, once again _with_ patch:

    Martin