Posted to modperl@perl.apache.org by Matthew Darwin <ma...@mdarwin.ca> on 2004/02/25 17:56:55 UTC

[MP2] "here" documents and UTF-8 and output filters

I'm not sure if anyone has noticed this, so I thought I'd post.

If I create a string using here-document syntax:


my $string <<EOF;
...
...
EOF
print $string;


and $string contains UTF-8 characters, they get mangled somehow when they 
go through the output chain.

However, if I build the same document using

my $string;
$string .= "...";
$string .= "...";
$string .= "...";
print $string;

Then everything is good.

Thoughts?

Linux Perl 5.8.1 (RedHat 9)


Note that this program generates the same results (modulo the extra \n):


#!/usr/bin/perl -w

my $fh;
open ($fh, "<", "utf8french.txt") || die "Cannot open utf8french.txt: $!";
my $stuff = <$fh>;   # scalar context: reads only the first line
close ($fh);

print $stuff;
print <<EOF;
$stuff
EOF


-- 
Matthew Darwin
matthew@mdarwin.ca
http://www.mdarwin.ca

-- 
Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html


Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Stas Bekman <st...@stason.org>.
Please post a proper bug report, Matthew. We don't even know which mp2 version 
you are using. A complete, *real*, short example is expected as well.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com



Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Matthew Darwin <ma...@mdarwin.ca>.

Stas Bekman wrote:
>
> Ah, I suppose this is a typing error:
> 
>  > my $string <<EOF;
>  > ...
>  > ...
>  > EOF
>  > print $string;
> 
> You are missing the '='.

Sorry... writing up examples has inherent dangers.

> Also try adding utf8::encode($string);

Tried that already.  No help.




Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Stas Bekman <st...@stason.org>.
Matthew Darwin wrote:
> 
> Stas Bekman wrote:

Ah, I suppose this is a typing error:

 > my $string <<EOF;
 > ...
 > ...
 > EOF
 > print $string;

You are missing the '='.

Also try adding utf8::encode($string);

Finally, your bug report is missing. Please submit one, including a real 
example that we can use to try to reproduce the problem:
http://perl.apache.org/bugs/
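
As a minimal sketch of what the utf8::encode() suggestion above does, assuming a plain Perl 5.8 script outside mod_perl (the sample string is made up for illustration): the call converts a character string in place into its UTF-8 octets and clears Perl's internal UTF8 flag, so anything downstream sees plain bytes.

```perl
use strict;
use warnings;

# Hypothetical sample data: "cafe" with e-acute, a space, and U+263A.
# Written with escapes so the source file's own encoding does not matter.
my $string = "caf\x{e9} \x{263a}";

# Before encoding, length() counts characters.
my $chars_before = length($string);

# utf8::encode() works in place: it replaces the characters with their
# UTF-8 octets and turns the internal UTF8 flag off.
utf8::encode($string);

# After encoding, length() counts octets.
my $octets_after = length($string);
my $flag_after   = utf8::is_utf8($string) ? 1 : 0;

printf "characters before: %d\n", $chars_before;
printf "octets after:      %d\n", $octets_after;
printf "UTF8 flag after:   %d\n", $flag_after;
```

If the string already holds UTF-8 octets rather than characters, calling utf8::encode() on it encodes those octets a second time, so it only helps when the data really is a character string.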




Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Matthew Darwin <ma...@mdarwin.ca>.
Stas Bekman wrote:
> 
> Can you reproduce this problem outside of mp2, in just a plain Perl program?

No.

> Any difference if you add:
> 
> use Apache::RequestIO ();
> binmode(STDOUT, ':utf8'); # Apache::RequestRec::BINMODE()

Yes, I get different garbage.

> before you do the print. Or if you use $r->print() instead?

No difference.
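
One possible reading of the "different garbage" result, offered as an assumption rather than a diagnosis: if the data is already UTF-8 octets, a :utf8 layer encodes it a second time. A minimal plain-Perl sketch, using in-memory filehandles in place of the mod_perl output chain (the helper names bytes_written and hex_dump, and the sample text, are hypothetical):

```perl
use strict;
use warnings;

# Print $data to an in-memory handle opened with the given layer and
# return the raw octets that ended up in the buffer.
sub bytes_written {
    my ($layer, $data) = @_;
    my $buf = '';
    open(my $fh, ">$layer", \$buf) or die "Cannot open in-memory handle: $!";
    print {$fh} $data;
    close($fh) or die "Cannot close in-memory handle: $!";
    return $buf;
}

# Render a byte string as space-separated hex pairs.
sub hex_dump {
    return join ' ', map { sprintf '%02x', ord } split //, $_[0];
}

my $chars = "\x{e9}t\x{e9}";   # three characters: e-acute, t, e-acute
my $bytes = $chars;
utf8::encode($bytes);          # the same text as five UTF-8 octets

my $good   = bytes_written(':utf8', $chars);  # characters encoded once (5 octets)
my $double = bytes_written(':utf8', $bytes);  # octets encoded again (9 octets)
my $raw    = bytes_written(':raw',  $bytes);  # octets passed through untouched

printf "characters via :utf8 -> %s\n", hex_dump($good);
printf "octets via :utf8     -> %s\n", hex_dump($double);
printf "octets via :raw      -> %s\n", hex_dump($raw);
```

Comparing a hex dump of what the browser actually receives against these three patterns would show whether the string is being encoded zero, one, or two times on its way out.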




Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Stas Bekman <st...@stason.org>.
Matthew Darwin wrote:
> 
> I'm not sure if anyone has noticed this, so I thought I'd post.
> 
> If I create a string using here syntax:
> 
> 
> my $string <<EOF;
> ...
> ...
> EOF
> print $string;
> 
> 
> And $string contains UTF-8 characters they get mangled somehow when they 
> go through the output chain.
> 
> However, if I build the same document using
> 
> my $string;
> $string .= "..."
> $string .= "..."
> $string .= "..."
> print $string;
> 
> Then everything is good.

Can you reproduce this problem outside of mp2, in just a plain Perl program?

Any difference if you add:

use Apache::RequestIO ();
binmode(STDOUT, ':utf8'); # Apache::RequestRec::BINMODE()

before you do the print. Or if you use $r->print() instead?
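
The plain-Perl check asked about above could be sketched roughly as follows. It builds the same text once with a here-document and once by concatenation, then compares both the content and Perl's internal UTF8 flag. The sample text is made up, and the UTF-8 octets are written as escapes so the result does not depend on how the script file itself is saved.

```perl
use strict;
use warnings;

# Hypothetical sample: the UTF-8 octets for "francais" with a cedilla.
my $word = "fran\xc3\xa7ais";

# Variant 1: here-document (note the '=' before <<EOF).
my $heredoc = <<EOF;
Bonjour $word
Au revoir
EOF

# Variant 2: the same text built by concatenation.
my $concat = '';
$concat .= "Bonjour $word\n";
$concat .= "Au revoir\n";

# Compare the content and the internal UTF8 flag of both strings.
my $same_content = ($heredoc eq $concat) ? 1 : 0;
my $same_flag    = ((utf8::is_utf8($heredoc) ? 1 : 0)
                 == (utf8::is_utf8($concat)  ? 1 : 0)) ? 1 : 0;

printf "same content:   %s\n", $same_content ? 'yes' : 'no';
printf "same UTF8 flag: %s\n", $same_flag    ? 'yes' : 'no';
```

If both lines report "yes" in a standalone script but the output still differs under mp2, that points at the output path rather than at how the string was built.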




Re: [MP2] "here" documents and UTF-8 and output filters

Posted by Ged Haywood <ge...@jubileegroup.co.uk>.
Hello there,

On Wed, 25 Feb 2004, Matthew Darwin wrote:

> If I create a string using here syntax:
> my $string <<EOF;
> ...
> EOF
> print $string;
>
> And $string contains UTF-8 characters they get mangled somehow when they
> go through the output chain.
>
> However, if I build the same document using
>
> my $string;
> $string .= "..."
> print $string;
>
> Then everything is good.
>
> Thoughts?
>
> Linux Perl 5.8.1 (RedHat 9)

There have been troubles with UTF-8 on 5.8.1, especially on Red Hat,
which had an interesting idea for the locale setting.

I'd suggest trying 5.8.3 to see if it's fixed.

73,
Ged.

