You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Nick Kew <ni...@webthing.com> on 2008/09/10 11:41:27 UTC

Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

On Wed, 10 Sep 2008 03:54:00 -0000
ianh@apache.org wrote:

> Author: ianh
> Date: Tue Sep  9 20:53:59 2008
> New Revision: 693697
> 
> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> Log:
> initial check in.
> this filter validates that the incoming request contains valid UTF8
> characters.

Why?

Last time I looked, incoming charsets were indeed a problem area,
but a browser submitting an HTML form would de-facto use the
same charset as the form.  Not necessarily utf-8.

Not to mention the many other use cases for sending non-utf8 data.

> +static char* validate_buffer( apr_pool_t *p,  const char* inbuf,
> apr_size_t* length) +{

Looks like a potential util_ function.  Cousin to both apr_uri
and apr_xlate.

> +        if (inbuf[i] == '%' ) {
> +            if ((i < len -2 ) && ishexnumber( inbuf[i+1]) &&
> ishexnumber( inbuf[i+2])) {

... is all very well, but

> +            else {
> +                buffer[j++]=inbuf[i++];
> +            }

Shouldn't that at least be marked /* FIXME */
where you potentially let through the chars you're
supposed to block ?


> +    ap_hook_pre_connection(utf8_pre_conn, NULL, NULL,
> APR_HOOK_MIDDLE); +
> +    ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
> +                             AP_FTYPE_NETWORK - 1);

Huh?  Isn't that before mod_ssl, let alone mod_deflate, mod_charset?
And no excape path for binary data either.  It'll cripple any server
with it loaded!


-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

Posted by Ian Holsman <li...@holsman.net>.
Thanks for the feedback.

I'll fix the code soon.

Nick Kew wrote:
> On Wed, 10 Sep 2008 03:54:00 -0000
> ianh@apache.org wrote:
>
>   
>> Author: ianh
>> Date: Tue Sep  9 20:53:59 2008
>> New Revision: 693697
>>
>> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
>> Log:
>> initial check in.
>> this filter validates that the incoming request contains valid UTF8
>> characters.
>>     
>
> Why?
>
> Last time I looked, incoming charsets were indeed a problem area,
> but a browser submitting an HTML form would de-facto use the
> same charset as the form.  Not necessarily utf-8.
>
> Not to mention the many other use cases for sending non-utf8 data.
>
>   
>> +static char* validate_buffer( apr_pool_t *p,  const char* inbuf,
>> apr_size_t* length) +{
>>     
>
> Looks like a potential util_ function.  Cousin to both apr_uri
> and apr_xlate.
>
>   
>> +        if (inbuf[i] == '%' ) {
>> +            if ((i < len -2 ) && ishexnumber( inbuf[i+1]) &&
>> ishexnumber( inbuf[i+2])) {
>>     
>
> ... is all very well, but
>
>   
>> +            else {
>> +                buffer[j++]=inbuf[i++];
>> +            }
>>     
>
> Shouldn't that at least be marked /* FIXME */
> where you potentially let through the chars you're
> supposed to block ?
>
>
>   
>> +    ap_hook_pre_connection(utf8_pre_conn, NULL, NULL,
>> APR_HOOK_MIDDLE); +
>> +    ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
>> +                             AP_FTYPE_NETWORK - 1);
>>     
>
> Huh?  Isn't that before mod_ssl, let alone mod_deflate, mod_charset?
> And no excape path for binary data either.  It'll cripple any server
> with it loaded!
>
>
>   


Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Nick Kew wrote:
> 
> Looks like a potential util_ function.  Cousin to both apr_uri
> and apr_xlate.

xlate() rejects such nonsense, so apr shouldn't need it.  It's only
an issue in other ecosystems behind httpd, such as dodging the recent
Tomcat vulnerability.

Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Nick Kew wrote:
> 
> Last time I looked, incoming charsets were indeed a problem area,
> but a browser submitting an HTML form would de-facto use the
> same charset as the form.  Not necessarily utf-8.

It's worthwhile to add some logic to 'step out of the way' when the
charset is inferred as other-than utf-8.