Posted to dev@httpd.apache.org by Martin Kraemer <Ma...@Fujitsu-Siemens.com> on 2002/05/13 13:56:20 UTC
[PATCH] 1.3: Stricter check on request_line format
We have discussed this topic in the past: a stricter check
should be applied to the request line, in order to prevent arbitrary
user input from ending up in the access_log and error_log. It could be
misused to spoof accesses to nonexistent (or inaccessible) resources,
of course without the client actually getting access to them.
The solution I present here consists of two separate parts:
1) Detecting and disallowing garbage in the request line.
Stuff like the 'planted' string:
% echo 'GET / HTTP/1.0" 304 -\r207.46.197.102 - - [01/May/2002:00:02:25 +0200] "GET / HTTP/1.0\n' \
| netcat www.apache.org 80
has until now looked like an access from 207.46.197.102,
at least to the admin doing a "tail access_log", and maybe to
some log analyzers too. So, any garbage after the correct
syntax for a request
<method> <uri> <protocol>
(like in "GET / HTTP/1.1") should be disallowed and logged.
2) Escaping dangerous characters in the log files.
Today, you can cause almost any character to appear in the logs.
Simply do an "echo '<ESC>[2J' | netcat localhost 80" and if the
administrator has a "tail -f access_log" running in some window,
it will be cleared at this point.
Now we don't want to break existing log analyzers, but escaping
such garbage is definitely useful. What I do in this patch is
to add another escaping class T_ESCAPE_LOGITEM and use it on
items that *might* have come from the user (i.e., the 'tainted'
items: request line, request MIME header lines, possibly DNS
reverse lookup names, ...). I escape invalid chars in \xXX
syntax, except for '"', '\\' itself and '\r', '\n', '\b', '\t'
which are simply escaped by preceding them with a '\\'. Thus it is
possible to distinguish a (client-provided) '\\' from a '\\' inserted
by the escaper. The extra escaping of '"' was added to prevent
spoofing attempts (in the CLF, the request line is enclosed in '"',
and it was easy to add a '"' to the request and confuse the reader
of the log file, see example above).
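To make the escaping rules concrete, here is a minimal standalone sketch in C.
It is not the code from the patch: the name escape_logitem_sketch is
hypothetical, it uses malloc instead of an Apache pool, and it rests on three
assumptions of mine: "invalid" means any byte that isprint() rejects in the "C"
locale, hex digits are lower case, and '\r', '\n', '\b', '\t' are written as a
backslash followed by the corresponding letter.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the patch's function): returns a malloc'ed,
 * escaped copy of `in`, or NULL if allocation fails. Caller frees it. */
static char *escape_logitem_sketch(const char *in)
{
    static const char hex[] = "0123456789abcdef";
    const unsigned char *s;
    char *out, *d;

    assert(in != NULL);

    /* Worst case: every input byte expands to the four bytes "\xXX". */
    out = malloc(4 * strlen(in) + 1);
    if (out == NULL)
        return NULL;
    d = out;

    for (s = (const unsigned char *)in; *s != '\0'; ++s) {
        switch (*s) {
        /* Characters that get a simple backslash escape. */
        case '"':  *d++ = '\\'; *d++ = '"';  break;
        case '\\': *d++ = '\\'; *d++ = '\\'; break;
        case '\r': *d++ = '\\'; *d++ = 'r';  break;
        case '\n': *d++ = '\\'; *d++ = 'n';  break;
        case '\b': *d++ = '\\'; *d++ = 'b';  break;
        case '\t': *d++ = '\\'; *d++ = 't';  break;
        default:
            if (isprint(*s)) {
                /* Printable characters pass through unchanged. */
                *d++ = (char)*s;
            } else {
                /* Assumption: everything else is "invalid" -> \xXX. */
                *d++ = '\\';
                *d++ = 'x';
                *d++ = hex[*s >> 4];
                *d++ = hex[*s & 0x0f];
            }
            break;
        }
    }
    *d = '\0';
    return out;
}
```

With these assumptions, a client-supplied '"' or '\r' can no longer terminate
the quoted request field or overwrite the start of the line, and the <ESC> of
the screen-clearing example is logged as the four characters \x1b.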
The patch does not (yet) try to filter *every* log item through the
log filter, because Apache users may want to add special characters
to the log at their discretion (using LogFormat). But the typical
abuse is caught (and logged) in a sensible manner.
Comments?
Martin
--
<Ma...@Fujitsu-Siemens.com> | Fujitsu Siemens
Fon: +49-89-636-46021, FAX: +49-89-636-47655 | 81730 Munich, Germany
Re: [PATCH] 1.3: Stricter check on request_line format
Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Mon, May 13, 2002 at 01:58:38PM +0200, Kraemer, Martin wrote:
> @@ -1045,12 +1045,26 @@
> r->assbackwards = (ll[0] == '\0');
> r->protocol = ap_pstrdup(r->pool, ll[0] ? ll : "HTTP/0.9");
>
> - if (2 == sscanf(r->protocol, "HTTP/%u.%u", &major, &minor)
> + if (3 == sscanf(r->protocol, "HTTP/%u.%u%n", &major, &minor, &n)
No, that should have been
if (2 == sscanf(r->protocol, "HTTP/%u.%u%n", &major, &minor, &n)
because %n does not increment the item count.
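As a standalone illustration of that point (a sketch only, not the code from
the patch; the function name parse_http_version_sketch is hypothetical), the
value stored by %n can be used to reject anything that follows the version
number, while the return value of sscanf still counts only the two %u
conversions:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 and fills *major / *minor if `protocol`
 * is "HTTP/<major>.<minor>" with nothing after it, and 0 otherwise. */
static int parse_http_version_sketch(const char *protocol,
                                     unsigned *major, unsigned *minor)
{
    int n = 0;

    assert(protocol != NULL && major != NULL && minor != NULL);

    /* %n stores the number of characters consumed so far, but it is not
     * counted in the return value, so success is still 2, not 3. */
    if (sscanf(protocol, "HTTP/%u.%u%n", major, minor, &n) != 2)
        return 0;

    /* Anything left after the minor version is trailing garbage. */
    return protocol[n] == '\0';
}
```

Note that %u itself skips leading white space and accepts a sign, so a strict
check would need more than this; the sketch only shows the %n technique.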
Martin
Re: [PATCH] 1.3: Stricter check on request_line format
Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
Okay, once again _with_ patch:
Martin