Posted to modperl@perl.apache.org by Brian Hirt <bh...@mobygames.com> on 2002/11/21 05:53:04 UTC

problems with characters being added to a request.

I'm running into a problem with some characters being added during a
mod_perl request.  An Â character is getting added when I print the
document to STDOUT.  When I print that exact same variable to STDERR the Â
is not added.  Here are two hex dumps (via od -hc).  The first is what
is sent to the web browser, and the second is what is sent to STDERR.  The
extra byte shows up in front of the 0x92 and 0x96 characters.

This sounds like some kind of encoding issue (UTF-8 vs. ISO-8859-1?), but
I can't make heads or tails of it.  Any ideas?
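
For reference, a minimal sketch of why the extra byte looks like an
encoding problem.  It assumes only the core Encode module; the helper
name utf8_hex is made up for this illustration.  It shows that when the
Latin-1 code points 0x92 and 0x96 are encoded as UTF-8, each becomes two
bytes, the first of which is 0xC2 (displayed as Â in ISO-8859-1).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 encoding of a string as
# space-separated hex bytes.
sub utf8_hex {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

print utf8_hex("\x92"), "\n";   # c2 92
print utf8_hex("\x96"), "\n";   # c2 96
```

This only demonstrates that the observed byte pairs match UTF-8; it does
not by itself say where in the request the conversion happens.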

Here is a dump (od -hc) of the document that is getting sent to the web
browser:

0002740 6967 3d6e 3e32 6863 7261 6361 6574 c272
          g   i   n   =   2   >   c   h   a   r   a   c   t   e   r   Â
0002760 7392 7720 7469 2068 96c2 6520 6576 686e
        222   s       w   i   t   h       Â 226       e   v   e   n   h

And here is a partial dump of the exact same variable being sent to
STDERR:
0002520 3130 302c 640a 636f 3343 635b 6168 6172
          0   1   ,   0  \n   d   o   c   C   3   [   c   h   a   r   a
0002540 7463 7265 7392 7720 7469 2068 2096 7665
          c   t   e   r 222   s       w   i   t   h     226       e   v


Notice that when the variable is sent to STDOUT, "r   Â   222" is
printed, and when the variable is sent to STDERR, "r 222" is
sent.  I get the added character with IE, Netscape, telnet
localhost 80 and wget, so I don't think the problem is with the browser.
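
A note on reading the dumps: od -h prints 16-bit words, so on a
little-endian machine (an assumption about the host here) the two bytes
of each word appear swapped.  The sketch below puts a few words from each
dump back into stream order; the function name od_words_to_bytes is
invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: convert `od -h` words into bytes in stream order.
# Assumes the dump was taken on a little-endian host.
sub od_words_to_bytes {
    my (@words) = @_;
    my $packed = pack('v*', map { hex($_) } @words);
    return map { sprintf '%02x', $_ } unpack('C*', $packed);
}

# Words taken from the STDOUT dump: t e r 0xc2 0x92 s
print join(' ', od_words_to_bytes(qw(6574 c272 7392))), "\n";

# Words taken from the STDERR dump: e r 0x92 s
print join(' ', od_words_to_bytes(qw(7265 7392))), "\n";
```

Read this way, the only difference between the two streams at this
point is the 0xc2 byte in front of 0x92.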


And here are the exact two lines in my handler that are printing this
variable:

      print STDERR "docC3[$docContents]\n";
      print $docContents;
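
As an illustration of how one variable can come out as different bytes
on two handles, here is a small sketch using in-memory filehandles.  It
is not a claim about what mod_perl 1.27 does with STDOUT; the helper
bytes_written and the choice of the :utf8 layer are assumptions made
only to show the effect in plain Perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory filehandle opened
# with the given mode and return the bytes written, as hex.
sub bytes_written {
    my ($mode, $text) = @_;
    my $buffer = '';
    open(my $fh, $mode, \$buffer) or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh);
    return join ' ', map { sprintf '%02x', ord($_) } split //, $buffer;
}

my $text = "r\x92s";

# A handle with no encoding layer writes one byte per character.
print bytes_written('>', $text), "\n";        # 72 92 73

# A handle with a :utf8 layer writes 0x92 as the pair c2 92.
print bytes_written('>:utf8', $text), "\n";   # 72 c2 92 73

# A string held internally as UTF-8 is still written as single bytes
# on a handle with no layer, as long as every character is below 0x100.
my $upgraded = $text;
utf8::upgrade($upgraded);
print bytes_written('>', $upgraded), "\n";    # 72 92 73
```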


I'm using perl 5.8.0, Apache/1.3.26 and mod_perl/1.27


-- 
Brian Hirt <bh...@mobygames.com>


Re: problems with characters being added to a request.

Posted by Stas Bekman <st...@stason.org>.
Brian Hirt wrote:
> Okay, I've been able to create a simple test case that reproduces the
> problem I'm having.  Hopefully some perl/mod_perl guru out there will be
> able to tell me what the deal is.   The program basically prints out
> three strings, concatenates a few strings and prints them out too.  I've
> attached two files: the handler, and the output of a page it created.

I can't reproduce it here. Were you able to isolate the problem outside
mod_perl, so the test case can be sent to p5p?

> Any help would really be appreciated. 
> 
> 
>>------------------------------------------------------------------------
>>
>>package WierdHandler;
>>
>>use MIME::Base64;
>>use Storable qw(freeze thaw);
>>use Apache::Constants 'OK';
>>use strict;
>>
>><<EOF;
>>Add the handler to your httpd.conf file, and make sure the module is in your search path.
>>
>><Location "/wierd">
>>SetHandler perl-script
>>PerlHandler WierdHandler
>></Location>
>>EOF
>>
>>sub handler
>>{
>>  my $apache = shift;
>>  
>>  $apache->content_type('text/html; charset=ISO-8859-1');
>>  $apache->send_http_header;
>>
>>  print "<html><head><title>blah</title></head><body>";
>>
>>  # notice that the frozen/base64 encoded strings are slightly different.
>>  my $string1 = thaw(decode_base64('BAUEMTIzNAQEBAgXhWh0dHA6Ly93d3cuYW1hem9uLmNvbS9leGVjL29iaWRvcy9yZWRpcmVjdD90YWc9bW9ieWdhbWVzJTI2Y3JlYXRpdmU9RDEySkVYSDg3QjRWTEslMjZjYW1wPTIwMjUlMjZsaW5rX2NvZGU9c3AxJTI2cGF0aD1BU0lOL0IwMDAwMlNVUVY='));
>>#                                               ^AgXh^
>>
>>  my $string2 = thaw(decode_base64('BAUEMTIzNAQEBAgKhWh0dHA6Ly93d3cuYW1hem9uLmNvbS9leGVjL29iaWRvcy9yZWRpcmVjdD90YWc9bW9ieWdhbWVzJTI2Y3JlYXRpdmU9RDEySkVYSDg3QjRWTEslMjZjYW1wPTIwMjUlMjZsaW5rX2NvZGU9c3AxJTI2cGF0aD1BU0lOL0IwMDAwMlNVUVY='));
>>#                                               ^AgKh^
>>
>>  # the raw bytes 0x92 and 0x96 (see the hex dumps) are written as escapes
>>  my $string3 = "character\x92s with \x96 even";
>>
>>  print "<h1>perl says string 1 and 2 are identical</h1>\n" if $$string1 eq $$string2;
>>
>>  print '$string1 is: <ul>',$$string1,"</ul>";
>>  print '$string2 is: <ul>',$$string2,"</ul>";
>>  print '$string3 is: <ul>',$string3,"</ul>";
>>  print "<hr>";
>>
>>  my $test1 = $string3 . $$string1;
>>  print '$test1 = $string3 . $string1 is: <ul>',$test1,"</ul>";
>>
>>  my $test2 = $string3 . $$string2;
>>  print '$test2 = $string3 . $string2 is: <ul>',$test2,"</ul>";
>>
>>  print "<hr>Wierd, huh?  If you look in the error log, you'll see that both test1 and test2 printed out without that funny looking character.<br>";
>>
>>  print "</body></html>";
>>
>>  print STDERR "test1:[$test1]\n";
>>  print STDERR "test2:[$test2]\n";
>>
>>  return OK;
>>}
>>
>>1;
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>   perl says string 1 and 2 are identical
>>
>> $string1 is:
>>
>> http://www.amazon.com/exec/obidos/redirect?tag=mobygames%26creative=D12JEXH87B4VLK%26camp=2025%26link_code=sp1%26path=ASIN/B00002SUQV
>>
>> $string2 is:
>>
>> http://www.amazon.com/exec/obidos/redirect?tag=mobygames%26creative=D12JEXH87B4VLK%26camp=2025%26link_code=sp1%26path=ASIN/B00002SUQV
>>
>> $string3 is:
>>
>> characters with even
>>
>> ------------------------------------------------------------------------
>> $test1 = $string3 . $string1 is:
>>
>>       characterÂs with Â
>> evenhttp://www.amazon.com/exec/obidos/redirect?tag=mobygames%26creative=D12JEXH87B4VLK%26camp=2025%26link_code=sp1%26path=ASIN/B00002SUQV
>>
>> $test2 = $string3 . $string2 is:
>>
>>       characters with
>> evenhttp://www.amazon.com/exec/obidos/redirect?tag=mobygames%26creative=D12JEXH87B4VLK%26camp=2025%26link_code=sp1%26path=ASIN/B00002SUQV
>>
>> ------------------------------------------------------------------------
>> Wierd, huh? If you look in the error log, you'll see that both test1 
>> and test2 printed out without that funny looking character.
> 


-- 


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: problems with characters being added to a request.

Posted by Brian Hirt <bh...@mobygames.com>.
Okay, I've been able to create a simple test case that reproduces the
problem I'm having.  Hopefully some perl/mod_perl guru out there will be
able to tell me what the deal is.   The program basically prints out
three strings, concatenates a few strings and prints them out too.  I've
attached two files: the handler, and the output of a page it created.

Any help would really be appreciated. 
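
For anyone reading the test case without running it, here is a rough
sketch of two things it exercises.  First, it decodes the part of the
two base64 strings that differs (AgXh vs. AgKh) to show which byte
changes.  Reading that byte as a Storable type tag that marks one string
as UTF-8 is an assumption, not something verified against the Storable
source.  Second, it shows that concatenating a byte string with a string
Perl holds internally as UTF-8 yields a UTF-8 result, using
utf8::upgrade as a stand-in for whatever thaw returns.  The helper names
and the shortened URL are made up for the example.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode a base64 fragment and return its bytes as hex.
sub prefix_hex {
    my ($b64) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, decode_base64($b64);
}

# Leading 16 base64 characters of the two frozen strings in the handler.
print prefix_hex('BAUEMTIzNAQEBAgX'), "\n";   # string1, last byte 17
print prefix_hex('BAUEMTIzNAQEBAgK'), "\n";   # string2, last byte 0a

# Hypothetical helper: concatenate a byte string with a URL, optionally
# upgrading the URL to Perl's internal UTF-8 form first, and report
# whether the result carries the UTF-8 flag (1) or not (0).
sub concat_is_utf8 {
    my ($upgrade) = @_;
    my $bytes = "character\x92s with \x96 even";
    my $url   = 'http://www.amazon.com/';   # shortened stand-in URL
    utf8::upgrade($url) if $upgrade;
    my $joined = $bytes . $url;
    return utf8::is_utf8($joined) ? 1 : 0;
}

print concat_is_utf8(1), "\n";   # 1
print concat_is_utf8(0), "\n";   # 0
```

The two URLs still compare equal with eq either way, which would be
consistent with the handler reporting string 1 and 2 as identical.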


-- 
Brian Hirt <bh...@mobygames.com>