Posted to modperl@perl.apache.org by Jeremy Howard <jh...@fastmail.fm> on 2000/04/27 14:26:35 UTC

Security in displaying arbitrary HTML

I'm interested in providing 'HTML email' support for my users (like HotMail, Outlook Express, Eudora 4.0, etc provide), but I'm very nervous about security. Essentially, providing HTML email involves letting any arbitrary HTML get displayed by Apache...

Has anyone done this, or can anyone provide any tips on the minimum amount of HTML laundering I need to do to avoid security holes? I say 'minimum', because I would like to maximise the amount of working HTML users can receive.

I assume I don't have to worry about PHP/EmbPerl/etc tags, since by the time mod_perl is finished with it, it's too late for other handlers to step in (unless I specifically chain them). Is that right? The only potential holes I can think of are 'javascript:' URLs, which I could just filter out, and cross-site scripting URLs (does anyone have any code that recognises hrefs with potential cross-site scripting problems?)

TIA,
  Jeremy

Re: Security in displaying arbitrary HTML

Posted by "Jeffrey W. Baker" <jw...@acm.org>.
On Thu, 27 Apr 2000, John M Vinopal wrote:

> I am a bad hacker and watching your line.  I see cookies A and B go to you.
> I set cookies A and B in my web browser.  I am now you.  You can try to 
> permute the cookies with IP# (breaks on proxies) or Browser type, but all
> cookie based approaches believe in the value of something sent cleartext.
> Or use SSL.

Well, uh, duh.  Any authentication sent in plain text (telnet, rsh,
ftp) is insecure.  NO authentication scheme is secure unless it uses
encryption on the line.  SSL is necessary (but not sufficient) for secure
authentication by any means.

Those of us who have implemented secure web sites know that it is hard but
not impossible to do.  There are headaches associated with just about
every aspect of it.  The funny thing is, the major websites almost always
get it wrong!  Many of them do not escape HTML in form input.  They do not
validate form or URI input.  They do all sorts of wanky things with
cookies.

My personal philosophy is to spend a lot of cycles on site security and
get rid of all problems that are legitimately on the server
side.  Anything that triggers bugs in the browsers I leave up to the
browser vendors.  In my current work, we have built in a system for
allowing and disallowing browser revisions as new stuff comes across on
BugTraq. 

-jwb


Re: Security in displaying arbitrary HTML

Posted by John M Vinopal <jv...@abattoir.com>.
I am a bad hacker and watching your line.  I see cookies A and B go to you.
I set cookies A and B in my web browser.  I am now you.  You can try to 
permute the cookies with IP# (breaks on proxies) or Browser type, but all
cookie based approaches believe in the value of something sent cleartext.
Or use SSL.

-j

On Thu, Apr 27, 2000 at 12:34:30PM -0700, Nick Tonkin wrote:
> On Thu, 27 Apr 2000, Marc Slemko wrote:
> 
> > Cookies are not secure and will never be secure.  They may be "good
> > enough", and you may not have much choice, but they are still simply not
> > secure when you put everything together.
> 
> Can you be more specific about why you say that? If I set an encrypted,
> short-lived cookie upon validated authentication, why is that any less secure than any
> of the other approaches you mentioned?
> 
> 
> - nick
> 

RE: Security in displaying arbitrary HTML

Posted by Matt Sergeant <ma...@sergeant.org>.
On Fri, 28 Apr 2000, Gerald Richter wrote:

> >
> > Gerald, what about Embperl, does it escape \x8b?
> >
> 
> No, there is no HTML escape for \x8b that I know of (and I guess the other
> one Matt mentioned is \x8d for >), so Embperl will not escape it, but this
> could simply be changed by an entry in epchar.c. Any suggestion as to what
> this should be escaped to? Then I will make a patch.

"&#" . ord("\x8b") . ";"

Whatever that produces. Same for \x8d.
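
A minimal sketch of the escaping suggested above, written as a standalone Perl routine. The name escape_html_bytes is hypothetical; this is not the actual code in Embperl's epchar.c or in CGI.pm. The thread is unsure which byte pairs with ">"; the sketch assumes the Windows-1252 pair 0x8B/0x9B (the single angle quotes) and assumes the page is served as a single-byte Latin-1 style charset.

```perl
use strict;
use warnings;

# Hypothetical helper: escape HTML metacharacters plus the two
# angle-quote bytes (assumed here to be 0x8B and 0x9B).
sub escape_html_bytes {
    my ($text) = @_;
    return '' unless defined $text;
    $text =~ s/&/&amp;/g;     # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    # Emit a decimal character reference, i.e. "&#" . ord(...) . ";"
    $text =~ s/([\x8b\x9b])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print escape_html_bytes("\x8bscript\x9balert(1)\x8b/script\x9b"), "\n";
# prints: &#139;script&#155;alert(1)&#139;/script&#155;
```

Sending an explicit charset in the Content-Type header, as discussed elsewhere in this thread, is still needed; the escaping only covers the case the sketch assumes.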

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org


RE: Security in displaying arbitrary HTML

Posted by Gerald Richter <ri...@ecos.de>.
>
> Gerald, what about Embperl, does it escape \x8b?
>

No, there is no HTML escape for \x8b that I know of (and I guess the other
one Matt mentioned is \x8d for >), so Embperl will not escape it, but this
could simply be changed by an entry in epchar.c. Any suggestion as to what
this should be escaped to? Then I will make a patch.

Gerald


-------------------------------------------------------------
Gerald Richter    ecos electronic communication services gmbh
Internetconnect * Webserver/-design/-datenbanken * Consulting

Post:       Tulpenstrasse 5         D-55276 Dienheim b. Mainz
E-Mail:     richter@ecos.de         Voice:    +49 6133 925151
WWW:        http://www.ecos.de      Fax:      +49 6133 925152
-------------------------------------------------------------


Re: Security in displaying arbitrary HTML

Posted by Dirk Lutzebaeck <lu...@aeccom.com>.
Matt Sergeant writes:

 > Unfortunately there's also a browser bug to contend with. They treat \x8b
 > (I think that's the right code) as < and there's a similar code for
 > >. Since most web developers are just doing s/</&lt;/g; they are open to
 > attacks based on character sets like this. Sad, but true. Even our loved
 > CGI.pm was (is?) open to this bug - I think Lincoln has fixed the
 > HTMLEncode function now though.

Gerald, what about Embperl, does it escape \x8b?

Dirk


Re: Security in displaying arbitrary HTML

Posted by Gunther Birznieks <gu...@extropia.com>.
At 10:25 AM 4/28/00 +0100, Matt Sergeant wrote:
>On Fri, 28 Apr 2000, Marc Slemko wrote:
>
> > On Thu, 27 Apr 2000, Matt Sergeant wrote:
> >
> > > Unfortunately there's also a browser bug to contend with. They treat \x8b
> > > (I think that's the right code) as < and there's a similar code for
> > > >. Since most web developers are just doing s/</&lt;/g; they are open to
> > > attacks based on character sets like this. Sad, but true. Even our loved
> > > CGI.pm was (is?) open to this bug - I think Lincoln has fixed the
> > > HTMLEncode function now though.
> >
> > Mmm?  Which browsers?  Do they have to be configured for any particular
> > character set?  And can you provide an example that demonstrates it?
> >
> > I can't reproduce it...
>
>Well if you have Apache 1.3.12, it implicitly sets the Content-Encoding,
>or the character set, so this bug is minimised. But only on static
>content. If there's no character set in the Content-type or
>Content-Encoding the browser sniffer comes into play, and Netscape
>(IIRC) picks it up as Latin-1, or US-ASCII, I can't recall which. The
>details are all available over the web. Tom Christiansen had an excellent
>informative discussion about it on p5p - search the archives.

The latest version of CGI.pm also mitigates this for scripts or apps that 
use it to print the header.

perl -MCGI -e '$q = new CGI(); print $q->header();'
Content-Type: text/html; charset=ISO-8859-1

I assume Apache's request object send_http_header does whatever Apache does 
to get the header printed out with the charset stuff when the static pages 
are displayed?

Later,
    Gunther


__________________________________________________
Gunther Birznieks (gunther.birznieks@extropia.com)
Extropia - The Web Technology Company
http://www.extropia.com/


Re: Security in displaying arbitrary HTML

Posted by Matt Sergeant <ma...@sergeant.org>.
On Fri, 28 Apr 2000, Marc Slemko wrote:

> On Thu, 27 Apr 2000, Matt Sergeant wrote:
> 
> > Unfortunately there's also a browser bug to contend with. They treat \x8b
> > (I think that's the right code) as < and there's a similar code for
> > >. Since most web developers are just doing s/</&lt;/g; they are open to
> > attacks based on character sets like this. Sad, but true. Even our loved
> > CGI.pm was (is?) open to this bug - I think Lincoln has fixed the
> > HTMLEncode function now though.
> 
> Mmm?  Which browsers?  Do they have to be configured for any particular
> character set?  And can you provide an example that demonstrates it?
> 
> I can't reproduce it...

Well if you have Apache 1.3.12, it implicitly sets the Content-Encoding,
or the character set, so this bug is minimised. But only on static
content. If there's no character set in the Content-type or
Content-Encoding the browser sniffer comes into play, and Netscape
(IIRC) picks it up as Latin-1, or US-ASCII, I can't recall which. The
details are all available over the web. Tom Christiansen had an excellent
informative discussion about it on p5p - search the archives.

-- 
<Matt/>



Re: Security in displaying arbitrary HTML

Posted by Marc Slemko <ma...@znep.com>.
On Thu, 27 Apr 2000, Matt Sergeant wrote:

> Unfortunately there's also a browser bug to contend with. They treat \x8b
> (I think that's the right code) as < and there's a similar code for
> >. Since most web developers are just doing s/</&lt;/g; they are open to
> attacks based on character sets like this. Sad, but true. Even our loved
> CGI.pm was (is?) open to this bug - I think Lincoln has fixed the
> HTMLEncode function now though.

Mmm?  Which browsers?  Do they have to be configured for any particular
character set?  And can you provide an example that demonstrates it?

I can't reproduce it...


Re: Security in displaying arbitrary HTML

Posted by Matt Sergeant <ma...@sergeant.org>.
On Thu, 27 Apr 2000, Vivek Khera wrote:

> >>>>> "SC" == Steven Champeon <sc...@hesketh.com> writes:
> 
> SC> developers and designers) for Webmonkey:
> 
> SC>  http://hotwired.lycos.com/webmonkey/00/18/index3a.html
> 
> SC> If you want to see what sort of stuff the XSS problem opens you up for,
> SC> just try appending ?tw=<script>alert("aha!");</script> to the URL above.
> 
> Why on earth would you take user input and output it verbatim to your
> pages?  Rule number 1 of developing a web site is to never trust the
> user's input values.  *Always* validate it against what you're
> expecting.

Unfortunately there's also a browser bug to contend with. They treat \x8b
(I think that's the right code) as < and there's a similar code for
>. Since most web developers are just doing s/</&lt;/g; they are open to
attacks based on character sets like this. Sad, but true. Even our loved
CGI.pm was (is?) open to this bug - I think Lincoln has fixed the
HTMLEncode function now though.

-- 
<Matt/>



Re: Security in displaying arbitrary HTML

Posted by Steven Champeon <sc...@hesketh.com>.
On Thu, 27 Apr 2000, Vivek Khera wrote:
> Why on earth would you take user input and output it verbatim to your
> pages?  Rule number 1 of developing a web site is to never trust the
> user's input values.  *Always* validate it against what you're
> expecting.

I guess someone had better tell the folks at Vignette that. Well, and
the folks at all the major search engines. And the portals. And anyone
using a search box that redisplays the query on every results page...

Before I wrote the article, I checked out about fifty major portals
and search engines, and only about a third did any filtering on the
input; some did filtering, but poorly; and a good third of them just
redisplayed the query verbatim. Try searching for 

 <script>alert("aha");</script>

at your favorite search engine. The problem is, well, pretty widespread.

Steve

-- 
tired of being an underappreciated functionary in a soulless machine?
hesketh.com is hiring: <http://hesketh.com/careers/>


Re: Security in displaying arbitrary HTML

Posted by Marc Slemko <ma...@znep.com>.
On Thu, 27 Apr 2000, Vivek Khera wrote:

> >>>>> "SC" == Steven Champeon <sc...@hesketh.com> writes:
> 
> SC> developers and designers) for Webmonkey:
> 
> SC>  http://hotwired.lycos.com/webmonkey/00/18/index3a.html
> 
> SC> If you want to see what sort of stuff the XSS problem opens you up for,
> SC> just try appending ?tw=<script>alert("aha!");</script> to the URL above.
> 
> Why on earth would you take user input and output it verbatim to your
> pages?  Rule number 1 of developing a web site is to never trust the
> user's input values.  *Always* validate it against what you're
> expecting.

Remember, this isn't just cases where user A writes something that user B
sees.  This includes cases where user A creates something that only user A
sees.  Traditionally, it was often assumed that "hey, who cares?  It is
only the user that will see it, no one else will".

It is really a very hard problem to ensure it is 100% correct everywhere,
and there are a lot of gotchas involved in trying to do it properly in
anything but the simplest situation.

Every major web site has places where they don't do this properly.


Re: Security in displaying arbitrary HTML

Posted by Vivek Khera <kh...@kciLink.com>.
>>>>> "SC" == Steven Champeon <sc...@hesketh.com> writes:

SC> developers and designers) for Webmonkey:

SC>  http://hotwired.lycos.com/webmonkey/00/18/index3a.html

SC> If you want to see what sort of stuff the XSS problem opens you up for,
SC> just try appending ?tw=<script>alert("aha!");</script> to the URL above.

Why on earth would you take user input and output it verbatim to your
pages?  Rule number 1 of developing a web site is to never trust the
user's input values.  *Always* validate it against what you're
expecting.
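
A minimal sketch of "validate it against what you're expecting" for something like a search box. The function name validate_query, the allowed character set and the 64-character limit are all assumptions chosen for illustration; a real site would pick its own rules.

```perl
use strict;
use warnings;

# Hypothetical validator: accept only the characters we expect in a
# search term, and reject everything else instead of trying to repair it.
sub validate_query {
    my ($raw) = @_;
    return unless defined $raw;
    # Letters, digits, underscore, space and hyphen; 1 to 64 characters.
    # \A and \z anchor the whole string (unlike $, \z ignores no trailing newline).
    return $raw =~ /\A([A-Za-z0-9_ -]{1,64})\z/ ? $1 : undef;
}

for my $input ('mod_perl security', '<script>alert("aha");</script>') {
    my $ok = validate_query($input);
    print defined $ok ? "accepted: $ok\n" : "rejected\n";
}
# prints:
# accepted: mod_perl security
# rejected
```

Anything that passes should still be HTML-escaped when it is written back into a page.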

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-301-545-6996
PGP & MIME spoken here            http://www.kciLink.com/home/khera/

Re: Security in displaying arbitrary HTML

Posted by Steven Champeon <sc...@hesketh.com>.
On Thu, 27 Apr 2000, Marc Slemko wrote:
> > Can you be more specific about why you say that? If I set an encrypted,
> > short-lived cookie upon validated authentication, why is that any less
> > secure than any of the other approaches you mentioned?
> 
> It isn't necessarily any "less secure", but you just have to understand
> and properly manage what it opens you up to.  I'm not suggesting
> alternatives because they are very limited.

I just wrote an article on XSS (Cross-Site Scripting -- I use "XSS"
instead of "CSS" because CSS means Cascading Style Sheets to most Web
developers and designers) for Webmonkey:

 http://hotwired.lycos.com/webmonkey/00/18/index3a.html

If you want to see what sort of stuff the XSS problem opens you up for,
just try appending ?tw=<script>alert("aha!");</script> to the URL above.
Both the Apache folks and Microsoft security have detailed several ways
in which this attack can be much, much, worse than a single Javascript
popup. There are links to resources at the end of the article.

Myself, I've been amusing myself with this:

 http://hotwired.lycos.com/webmonkey/00/18/index3a_page6.html?tw=%3CIMG%20SRC%3Dhttp%3A%2F%2Fbarneyonline.com%2Fimages%2Fbab3.gif%3E

Cheers,
Steve



Re: Security in displaying arbitrary HTML

Posted by Marc Slemko <ma...@znep.com>.
On Thu, 27 Apr 2000, Nick Tonkin wrote:

> On Thu, 27 Apr 2000, Marc Slemko wrote:
> 
> > Cookies are not secure and will never be secure.  They may be "good
> > enough", and you may not have much choice, but they are still simply not
> > secure when you put everything together.
> 
> Can you be more specific about why you say that? If I set an encrypted,
> short-lived cookie upon validated authentication, why is that any less secure than any
> of the other approaches you mentioned?

It isn't necessarily any "less secure", but you just have to understand
and properly manage what it opens you up to.  I'm not suggesting
alternatives because they are very limited.

What it means is that if anyone can make a normal user (eg. javascript
enabled, etc.) follow an arbitrary link while they have that cookie, then
the cookie can be stolen, either through "cross site scripting" type
attacks (and I can guarantee you that if you have a site with any real
amount of dynamic content, you are almost certain to be vulnerable) or
browser specific bugs that have been or will be made known.

Sure, there is a limited time period during which that risk may be open.
Compare that to a crazy site like barnesandnoble.com: if you enable
their fast checkout, a cookie is stored that gives full access to your
account forever without entering any more information; you can even
change your password, etc. without having to know the current one.
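
A rough sketch of what a short-lived cookie value might look like. It is signed with an HMAC rather than encrypted, which is a different technique from the "encrypted" cookie mentioned in this thread. The names make_token and check_token, the token layout and the secret are all assumptions. It only bounds how long a stolen cookie is useful; it does nothing to stop the theft itself, and it still needs SSL on the wire.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Assumption: a long random secret that never leaves the server.
my $SECRET = 'replace-with-a-long-random-server-side-secret';

# Build "user:expiry:mac". $now is optional and exists only to make testing easy.
# Assumes $user contains only letters, digits and underscores.
sub make_token {
    my ($user, $ttl, $now) = @_;
    $now = time unless defined $now;
    my $payload = join ':', $user, $now + $ttl;
    return join ':', $payload, hmac_sha256_hex($payload, $SECRET);
}

# Return the user name if the token is well formed, untampered and unexpired.
sub check_token {
    my ($token, $now) = @_;
    $now = time unless defined $now;
    my ($user, $expires, $mac) =
        $token =~ /\A([A-Za-z0-9_]+):(\d+):([0-9a-f]{64})\z/
        or return;
    # Note: eq is not a constant-time comparison; fine for a sketch only.
    return unless hmac_sha256_hex("$user:$expires", $SECRET) eq $mac;
    return unless $now < $expires;
    return $user;
}

my $token = make_token('alice', 600);   # valid for ten minutes
print "user: ", check_token($token), "\n";
```

Sensitive actions such as changing a password should still ask for the current password, whatever the cookie says.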

Sure, you have to get the user to follow an arbitrary link, but that is
downright easy in many cases.  Granted, a lot easier for a site like
amazon.com than joescornergrocerystore.com.

But the risks are very real and very poorly understood.  However, they
will be problems for years to come.  The only way they will stop being a
problem is if people deploy and use a real authentication method designed
for authentication with very controlled access, instead of tacking it onto
cookies that are wide open.


Re: Security in displaying arbitrary HTML

Posted by Nick Tonkin <ni...@valueclick.com>.
On Thu, 27 Apr 2000, Marc Slemko wrote:

> Cookies are not secure and will never be secure.  They may be "good
> enough", and you may not have much choice, but they are still simply not
> secure when you put everything together.

Can you be more specific about why you say that? If I set an encrypted,
short-lived cookie upon validated authentication, why is that any less secure than any
of the other approaches you mentioned?


- nick



Re: Security in displaying arbitrary HTML

Posted by Marc Slemko <ma...@znep.com>.
On Thu, 27 Apr 2000, Jeremy Howard wrote:

> I'm interested in providing 'HTML email' support for my users (like
> HotMail, Outlook Express, Eudora 4.0, etc provide), but I'm very
> nervous about security. Essentially, providing HTML email involves
> letting any arbitrary HTML get displayed by Apache...
> 
> Has anyone done this, or can anyone provide any tips on the
> minimum amount of HTML laundering I need to do to avoid security
> holes? I say 'minimum', because I would like to maximise the amount of
> working HTML users can receive.
> 
> I assume I don't have to worry about PHP/EmbPerl/etc tags, since by
> the time mod_perl is finished with it, it's too late for other
> handlers to step in (unless I specifically chain them). Is that right?

Yes, as long as you never write temporary files containing user-supplied
content to a web-accessible location on disk and then point the user to
that temp file directly.

> The only potential holes I can think of are 'javascript:' URLs, which
> I could just filter out, and cross-site scripting URLs (does anyone
> have any code that recognises hrefs with potential cross-site
> scripting problems?)

Sorry, you are out of luck.  You either won't have the full HTML
functionality you want, or you won't be secure.

The very fact that Microsoft is still running into new issues in Hotmail
(even when considering only IE) should attest to that.

Your efforts need to be focused on risk management.  How do you do your
authentication?  Is it persistent?  Is it completely cookie based,
completely URL based, HTTP basic auth based, or some combination?  How
long is authentication good for?  What do you let users do without
re-entering their password (eg. to change their password, they should have
to enter the existing one even if they have authenticated)?  etc.

You should also really allow only specific things rather than trying to
filter out bad things.  You still won't catch everything, but that will
catch things like about: URLs in IE (yup, you can inject javascript using
them), "mocha:" URLs in Navigator (guess "javascript:" wasn't enough),
etc.
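
A minimal sketch of the "allow only specific things" approach applied to link targets. The helper name href_is_allowed and the list of schemes are assumptions; it also assumes the attribute value has already been extracted from the HTML and entity-decoded. It covers URL schemes only, not the rest of the HTML laundering problem.

```perl
use strict;
use warnings;

# Assumption: the only schemes we are willing to let through.
my %ALLOWED_SCHEME = map { $_ => 1 } qw(http https ftp mailto);

# Hypothetical helper: true only for URLs whose scheme is on the list.
sub href_is_allowed {
    my ($href) = @_;
    return 0 unless defined $href;
    # Some browsers ignore whitespace and control characters inside the
    # scheme, so strip them before looking at it.
    (my $clean = $href) =~ s/[\x00-\x20]+//g;
    # A scheme is a letter followed by letters, digits, "+", "." or "-".
    if ($clean =~ /\A([A-Za-z][A-Za-z0-9+.\-]*):/) {
        return $ALLOWED_SCHEME{ lc $1 } ? 1 : 0;
    }
    # No scheme at all: a relative URL, rejected in this sketch.
    return 0;
}

for my $href ('http://www.apache.org/', 'javascript:alert(1)',
              'about:blank', 'mocha:foo', "java\tscript:alert(1)") {
    printf "%-8s %s\n", (href_is_allowed($href) ? 'allow' : 'reject'), $href;
}
```

With this shape, about:, mocha: and whatever scheme turns up next are rejected without anyone having to know about them in advance.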

Cookies are not secure and will never be secure.  They may be "good
enough", and you may not have much choice, but they are still simply not
secure when you put everything together.