Posted to modperl@perl.apache.org by Robin Berjon <ro...@knowscape.com> on 2002/01/23 17:41:05 UTC

Cross-site scripting vulnerability in Apache::Util

Hi,

Picking up on the recent discussion about XSS vulnerabilities, Geoff prompted 
me to check that Apache::Util does indeed do the right thing. Taking the 
example from 
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-03/msg00750.html 
I made a simple handler that compares the output of Apache::Util::escape_html 
with that of HTML::Entities::encode_entities. I hope I didn't get it wrong; I'm 
by no means a security expert ;-) However, it is clear that the outputs differ: 
Apache::Util does not encode some potentially dangerous characters.

Here is the module:

package XSSEntities;
use strict;
use vars qw($VERSION $XSS);
use Apache::Constants   qw(OK);
use Apache::Util        qw();
use HTML::Entities      qw();

$VERSION = '0.01';
$XSS = "\x8b" . "h1>This should *not* be a big large header" . "\x8b" . 
"/h1>";

sub handler {
    my $r = shift;
    $r->send_http_header('text/html;charset=iso-8859-1');
    $r->print(<<'    EOTOP');
    <html>
    <head>
      <title>XSSEntities</title>
    </head>
    <body>
    EOTOP

    my $htEnt = HTML::Entities::encode_entities($XSS);
    my $apUtl = Apache::Util::escape_html($XSS);

    $r->print("HTML::Entities: $htEnt <br />\n");
    $r->print("Apache::Util: $apUtl <br />\n");

    $r->print(<<'    EOBOT');
    </body>
    </html>
    EOBOT

    return OK;
}
1;

HTML::Entities correctly turns \x8b into &#139; while Apache::Util leaves it 
untouched. That character is treated by certain buggy browsers as < and can 
thus be used to fake tags. Note that just because your browser isn't 
vulnerable (i.e. it doesn't buy the fake h1) doesn't mean that the problem 
isn't there :-) The page source makes it explicit.

This is with 1.25 but I don't think it has changed since. The solution is to 
do what HTML::Entities does, which is basically to apply 
sprintf "&#x%X;", ord($char) to control and high-bit chars. I'd submit a patch 
but I'm not too fluent with C/XS.
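
For readers who want to see the idea in C, here is a minimal sketch of the
approach described above. It is not the Apache::Util code: the function name
escape_html_8bit is hypothetical, the input is assumed to be in a single-byte
encoding such as ISO-8859-1, and decimal references are used rather than hex
purely for readability.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Apache::Util: returns a malloc'd copy of
 * s in which markup characters, control bytes (other than tab, LF, CR) and
 * bytes >= 127 are replaced by entities. Assumes a single-byte encoding.
 * The caller frees the result. */
char *escape_html_8bit(const char *s)
{
    size_t len = strlen(s);
    /* Worst case is six output bytes per input byte ("&quot;" or "&#255;"). */
    char *out = malloc(len * 6 + 1);
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];

        if (c == '<')
            p += sprintf(p, "&lt;");
        else if (c == '>')
            p += sprintf(p, "&gt;");
        else if (c == '&')
            p += sprintf(p, "&amp;");
        else if (c == '"')
            p += sprintf(p, "&quot;");
        else if (c >= 127 || (c < 32 && c != '\t' && c != '\n' && c != '\r'))
            p += sprintf(p, "&#%u;", (unsigned)c);  /* decimal numeric reference */
        else
            *p++ = (char)c;
    }
    *p = '\0';
    return out;
}
```

With this sketch, the byte \x8b followed by "h1>" comes out as
"&#139;h1&gt;", so a browser that treats \x8b as < no longer sees a tag.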

Hope this helps,

-- 
_______________________________________________________________________
Robin Berjon <ro...@knowscape.com> -- CTO
k n o w s c a p e : // venture knowledge agency www.knowscape.com
-----------------------------------------------------------------------
Critic, n.: A person who boasts himself hard to please because nobody
tries to please him.


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Stas Bekman <st...@stason.org>.
> however it comes about is fine, I guess.  however, if Apache::Util in 1.3 is left
> un-patched then we're kinda giving a false impression that calling
> Apache::Util::escape_html() is sufficient to thwart XSS attacks when it really only keeps
> all but the most clever away.


I guess we should document this first of all, till it gets fixed. So 
there will be no surprises.


>>So what spec are you working with?
>>
> 
> robin and I were reading
> 
> http://www.cl.cam.ac.uk/~mgk25/unicode.html
> 
> but there may be others.


thanks!


>>Can we just reap the functionality from some Perl core module in
>>bleadperl that does it right?
>>
> 
> well, the problem that robin and I were contemplating is that Apache::Util is supposed to
> be fast because it uses XS.  if we went to a pure perl implementation we would lose the
> speed and duplicate something like HTML::Entities (although it would be easier to solve
> the problem).
> 
> that said, perhaps there is C code in utf8.c (or wherever) that we can steal to make life
> easier.  we probably need to get someone involved who understands the issues better than I
> do :)

Well, I suggested reaping from bleadperl, which is mostly written in C :) 
But having nicely implemented code in Perl is a good start. It's much 
easier to rewrite in C than to start from scratch.

_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
For additional commands, e-mail: dev-help@perl.apache.org


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Ron Savage <ro...@savage.net.au>.
Folks

Unicode refs:

Unicode|HTML|Weaving the Multilingual Web|http://www.w3.org/Talks/1999/0830-tutorial-unicode-mjd/Overview.html
Unicode|Unicode|http://www-4.ibm.com/software/developer/library/globalsoft.html
Unicode|UTF-8 and Unicode FAQ for Unix/Linux|http://www.cl.cam.ac.uk/~mgk25/unicode.html
Unicode|Zvon Character Reference|http://www.zvon.org/xxl/characterReference/Output/index.html

Cheers
Ron Savage
ron@savage.net.au
http://savage.net.au/index.html



Re: Cross-site scripting vulnerability in Apache::Util

Posted by Geoffrey Young <ge...@modperlcookbook.org>.
Stas Bekman wrote:
> 
> Geoffrey Young wrote:
> 
> >>However I'm not sure your patch does the right thing re UTF-8, unless there's
> >>some magic involved that I'm not seeing :-/ I'm no expert on how to deal with
> >>UTF-8 in C (or even in Perl) but it looks like you're only addressing 8bit
> >>encodings.
> >>
> >
> >
> > ok, after some to and fro with robin over on #modperl it looks like we discovered a few
> > things...
> >
> > first, Apache::Util is not UTF-8 compliant, since it currently mangles C strings
> > byte-by-byte, which introduces the possibility that all or part of a multi-byte character
> > could be mangled.
> >
> > second, the patch follows suit and expands the range of 1-byte values it mangles,
> > which makes it even less UTF-8 friendly.
> >
> > so, basically what we're thinking is that the new Apache::Util is more secure for
> > non-UTF-8 encodings, while more broken for UTF-8.  but UTF-8 is unusable with Apache::Util
> > in either case, so the patch is probably a good thing.
> >
> > other ideas/eyeballs are welcome here, since we've just been going over the spec and
> > making some conjectures - neither of us is an expert here by any means.
> >
> > once other people chime in, we can whip up a doc patch for Apache::Util as well.
> 
> Apache::Util hasn't been ported to mod_perl 2.0 yet, and I was thinking of
> doing that at some point. So we could work on Apache::Util for 2.0 and
> then backport it to 1.x. That sounds like a more promising scenario.

however it comes about is fine, I guess.  however, if Apache::Util in 1.3 is left
un-patched then we're kinda giving a false impression that calling
Apache::Util::escape_html() is sufficient to thwart XSS attacks when it really only keeps
all but the most clever away.

> 
> So what spec are you working with?

robin and I were reading

http://www.cl.cam.ac.uk/~mgk25/unicode.html

but there may be others.

> 
> Can we just reap the functionality from some Perl core module in
> bleadperl that does it right?

well, the problem that robin and I were contemplating is that Apache::Util is supposed to
be fast because it uses XS.  if we went to a pure perl implementation we would lose the
speed and duplicate something like HTML::Entities (although it would be easier to solve
the problem).

that said, perhaps there is C code in utf8.c (or wherever) that we can steal to make life
easier.  we probably need to get someone involved who understands the issues better than I
do :)

--Geoff


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Stas Bekman <st...@stason.org>.
Geoffrey Young wrote:

>>However I'm not sure your patch does the right thing re UTF-8, unless there's
>>some magic involved that I'm not seeing :-/ I'm no expert on how to deal with
>>UTF-8 in C (or even in Perl) but it looks like you're only addressing 8bit
>>encodings.
>>
> 
> 
> ok, after some to and fro with robin over on #modperl it looks like we discovered a few
> things...
> 
> first, Apache::Util is not UTF-8 compliant, since it currently mangles C strings
> byte-by-byte, which introduces the possibility that all or part of a multi-byte character
> could be mangled.
> 
> second, the patch follows suit and expands the range of 1-byte values it mangles,
> which makes it even less UTF-8 friendly.
> 
> so, basically what we're thinking is that the new Apache::Util is more secure for
> non-UTF-8 encodings, while more broken for UTF-8.  but UTF-8 is unusable with Apache::Util
> in either case, so the patch is probably a good thing.
> 
> other ideas/eyeballs are welcome here, since we've just been going over the spec and
> making some conjectures - neither of us is an expert here by any means.
> 
> once other people chime in, we can whip up a doc patch for Apache::Util as well.

Apache::Util hasn't been ported to mod_perl 2.0 yet, and I was thinking of 
doing that at some point. So we could work on Apache::Util for 2.0 and 
then backport it to 1.x. That sounds like a more promising scenario.

So what spec are you working with?

Can we just reap the functionality from some Perl core module in 
bleadperl that does it right?


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Geoffrey Young <ge...@modperlcookbook.org>.
> 
> However I'm not sure your patch does the right thing re UTF-8, unless there's
> some magic involved that I'm not seeing :-/ I'm no expert on how to deal with
> UTF-8 in C (or even in Perl) but it looks like you're only addressing 8bit
> encodings.


ok, after some to and fro with robin over on #modperl it looks like we discovered a few
things...

first, Apache::Util is not UTF-8 compliant, since it currently mangles C strings
byte-by-byte, which introduces the possibility that all or part of a multi-byte character
could be mangled.

second, the patch follows suit and expands the range of 1-byte values it mangles,
which makes it even less UTF-8 friendly.

so, basically what we're thinking is that the new Apache::Util is more secure for
non-UTF-8 encodings, while more broken for UTF-8.  but UTF-8 is unusable with Apache::Util
in either case, so the patch is probably a good thing.
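
To make the byte-by-byte problem concrete: in UTF-8 the character U+00E9 is
the two bytes 0xC3 0xA9, so escaping each high byte separately yields
"&#195;&#169;" instead of "&#233;". The following is a rough sketch of what a
UTF-8-aware variant could look like. It is not the patch and not anything in
Apache::Util; the names utf8_decode and escape_html_utf8 are hypothetical, and
replacing malformed bytes with U+FFFD is just one possible policy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Decode one UTF-8 sequence from the NUL-terminated string at s. Stores the
 * code point in *cp and returns the number of bytes consumed, or 0 if the
 * bytes do not form a well-formed sequence. */
static int utf8_decode(const unsigned char *s, unsigned long *cp)
{
    static const unsigned long min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    unsigned long c;
    int n, i;

    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { n = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; c = s[0] & 0x07; }
    else return 0;                  /* stray continuation or invalid lead byte */

    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)  /* also stops at the terminating NUL */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                   /* overlong form, out of range, or surrogate */
    *cp = c;
    return n;
}

/* Hypothetical helper: escapes markup characters by name and every code
 * point >= 127 as one decimal reference. Malformed bytes become U+FFFD.
 * Returns a malloc'd string that the caller frees. */
char *escape_html_utf8(const char *s)
{
    const unsigned char *in = (const unsigned char *)s;
    /* Worst case is eight output bytes per input byte ("&#65533;"). */
    char *out = malloc(strlen(s) * 8 + 1);
    char *p = out;

    if (out == NULL)
        return NULL;

    while (*in != '\0') {
        unsigned long cp;
        int n = utf8_decode(in, &cp);

        if (n == 0) {               /* malformed: emit U+FFFD, skip one byte */
            p += sprintf(p, "&#65533;");
            in++;
            continue;
        }
        in += n;
        if (cp == '<')       p += sprintf(p, "&lt;");
        else if (cp == '>')  p += sprintf(p, "&gt;");
        else if (cp == '&')  p += sprintf(p, "&amp;");
        else if (cp == '"')  p += sprintf(p, "&quot;");
        else if (cp >= 127)  p += sprintf(p, "&#%lu;", cp);
        else                 *p++ = (char)cp;
    }
    *p = '\0';
    return out;
}
```

In this sketch a lone \x8b is not valid UTF-8, so it is replaced rather than
passed through. Whether to replace, drop, or reject such bytes is a design
choice the thread has not settled.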

other ideas/eyeballs are welcome here, since we've just been going over the spec and
making some conjectures - neither of us is an expert here by any means.

once other people chime in, we can whip up a doc patch for Apache::Util as well.

thanks

--Geoff


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Robin Berjon <ro...@knowscape.com>.
On Thursday 24 January 2002 15:34, Geoffrey Young wrote:
> > HTML::Entities correctly turns \x8b into &#139; while Apache::Util leaves
> > it untouched. That character is treated by certain buggy browsers as <
> > and can thus be used to fake tags. Note that just because your browser
> > isn't vulnerable (i.e. it doesn't buy the fake h1) doesn't mean that the
> > problem isn't there :-) The page source makes it explicit.
> >
> > This is with 1.25 but I don't think it has changed since. The solution is
> > to do what HTML::Entities does, which is basically to apply
> > sprintf "&#x%X;", ord($char) to control and high-bit chars. I'd submit a
> > patch but I'm not too fluent with C/XS.
>
> I'm probably worse with C than Robin, but here's a patch that seems to fix
> the problem (as I understand it, that is).
>
> the solution is different from HTML::Entities in that it always uses the
> numeric form (like &#184;) for characters from 127 to 255, whereas
> HTML::Entities uses stuff like &cedil;

The latter part doesn't matter as browsers now recognize numeric entities the 
vast majority of the time (and when they don't, they also don't recognize the 
more extended named entities that HTML::Entities has).

However I'm not sure your patch does the right thing re UTF-8, unless there's 
some magic involved that I'm not seeing :-/ I'm no expert on how to deal with 
UTF-8 in C (or even in Perl) but it looks like you're only addressing 8bit 
encodings.

-- 
_______________________________________________________________________
Robin Berjon <ro...@knowscape.com> -- CTO
k n o w s c a p e : // venture knowledge agency www.knowscape.com
-----------------------------------------------------------------------
Earth is a beta site.


Re: Cross-site scripting vulnerability in Apache::Util

Posted by Geoffrey Young <ge...@modperlcookbook.org>.
> 
> HTML::Entities correctly turns \x8b into &#139; while Apache::Util leaves it
> untouched. That character is treated by certain buggy browsers as < and can
> thus be used to fake tags. Note that just because your browser isn't
> vulnerable (i.e. it doesn't buy the fake h1) doesn't mean that the problem
> isn't there :-) The page source makes it explicit.
> 
> This is with 1.25 but I don't think it has changed since. The solution is to
> do what HTML::Entities does, which is basically to apply
> sprintf "&#x%X;", ord($char) to control and high-bit chars. I'd submit a patch
> but I'm not too fluent with C/XS.
> 

I'm probably worse with C than Robin, but here's a patch that seems to fix the problem (as
I understand it, that is).

the solution is different from HTML::Entities in that it always uses the numeric form
(like &#184;) for characters from 127 to 255, whereas HTML::Entities uses stuff like &cedil;

anyway, with the usual caveats of myself not being a C guy, input on a better way to do
this is not only welcomed, but encouraged :)

--Geoff

Index: Util.xs
===================================================================
RCS file: /home/cvspublic/modperl/src/modules/perl/Util.xs,v
retrieving revision 1.9
diff -u -r1.9 Util.xs
--- Util.xs     4 Mar 2000 20:55:47 -0000       1.9
+++ Util.xs     24 Jan 2002 14:31:46 -0000
@@ -36,6 +36,7 @@
 {
     int i, j;
     SV *x;
+    static char highbits[7];  /* "&#255;" plus the trailing NUL */
 
     /* first, count the number of extra characters */
     for (i = 0, j = 0; s[i] != '\0'; i++)
@@ -43,7 +44,8 @@
            j += 3;
        else if (s[i] == '&')
            j += 4;
-        else if (s[i] == '"')
+        else if (s[i] == '"' || 
+                ((unsigned char)s[i] > 126) && (unsigned char)s[i] <= 255)
            j += 5;
 
     if (j == 0)
@@ -67,6 +69,11 @@
            memcpy(&SvPVX(x)[j], "&quot;", 6);
            j += 5;
        }
+        else if ((unsigned char)s[i] > 126 && (unsigned char)s[i] <= 255) {
+            sprintf(highbits, "&#%i;", (unsigned char)s[i]);
+           memcpy(&SvPVX(x)[j], highbits, 6);
+           j += 5;
+        }
        else
            SvPVX(x)[j] = s[i];
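
A sizing detail worth checking in any change along these lines: every byte
value from 127 to 255 has three decimal digits, so its reference is exactly
six characters ("&#NNN;"). That is why the counting pass can add 5 per such
byte, and it also means a scratch buffer filled by sprintf needs seven bytes
once the trailing NUL is counted. A small standalone sketch of that check,
using a hypothetical helper name (format_numref) and C99 snprintf:

```c
#include <stdio.h>

/* Length of "&#NNN;" for a three-digit value, excluding the trailing NUL. */
enum { NUMREF_LEN = 6 };

/* Hypothetical helper, not taken from the patch: formats byte c, expected to
 * be in 127..255, as a decimal numeric reference. buf must hold at least
 * NUMREF_LEN + 1 bytes. Returns NUMREF_LEN on success, -1 otherwise. */
int format_numref(unsigned char c, char *buf, size_t bufsize)
{
    int n;

    if (c < 127)
        return -1;                  /* outside the range this helper handles */

    /* snprintf never writes past bufsize and reports the length it needed. */
    n = snprintf(buf, bufsize, "&#%u;", (unsigned)c);
    if (n != NUMREF_LEN || (size_t)n >= bufsize)
        return -1;                  /* output was truncated */
    return n;
}
```

With a seven-byte buffer the call succeeds and the buffer holds, for example,
"&#139;". With a six-byte buffer the helper reports failure instead of
writing past the end.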
