Posted to modperl@perl.apache.org by "Randal L. Schwartz" <me...@stonehenge.com> on 2017/06/01 15:34:48 UTC

Re: mod_perl and utf8 and CGI->param

>>>>> "Randal" == Randal L Schwartz <me...@stonehenge.com> writes:
Randal> Getting really frustrated with mod_perl2's apparent inability to
Randal> properly read UTF8 input.

Randal> Here's my mod_perl2 setup:

Randal>   Apache 2.2.[something]
Randal>   mod_perl 2.0.7 (or nearly that)
Randal>   ModPerl::Registry
Randal>   Perl "script" with CGI.pm

Randal> Very early in my app:

Randal>   ## ensure utf8 CGI params:
Randal>   $CGI::PARAM_UTF8 = 1;

Randal>   binmode STDIN, ":utf8";
Randal>   binmode STDOUT, ":utf8";
Randal>   binmode STDERR, ":utf8";

Randal> This works fine in CGI mode: when I ask for $foo = $cgi->param('foo'),
Randal> DBI::data_string_desc($foo) shows a UTF8 string with the proper
Randal> discrepancy between bytes and chars.

Randal> But when I try to run it under mod_perl, the returned string appears
Randal> to be the raw bytes, and definitely not utf8.  Of course, when I
Randal> store that in the database (using DBD::Pg), the "latin-1" is encoded
Randal> to "utf-8", and I get a bunch of weird chars on the output.

Randal> Has anyone managed to round-trip UTF8 from form to database and back
Randal> using a setup similar to this?

Randal> I suspect part of the problem is this in CGI.pm:

Randal>     'read_from_client' => <<'END_OF_FUNC',
Randal>     # Read data from a file handle
Randal>     sub read_from_client {
Randal>     my($self, $buff, $len, $offset) = @_;
Randal>     local $^W=0;                # prevent a warning
Randal>     return $MOD_PERL
Randal>         ? $self->r->read($$buff, $len, $offset)
Randal>             : read(\*STDIN, $$buff, $len, $offset);
Randal>     }
Randal>     END_OF_FUNC

Randal> Since I binmode STDIN, the non-$MOD_PERL works ok here.  What's the
Randal> equivalent of $r->read() that marks the incoming stream as UTF8, so I
Randal> get chars instead of bytes?  Or can I just read(\*STDIN) in mod_perl2
Randal> as well? (I know that was supported at one point...)
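
The symptom described in the quoted message (undecoded bytes being treated as
Latin-1 and then encoded to UTF-8 on the way out) can be reproduced without
CGI.pm or mod_perl. This is only a minimal sketch using the core Encode
module; the sample string and the variable names are illustrative assumptions,
not taken from the application above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 bytes for "café", as they might arrive in a request body (assumed sample).
my $bytes = "caf\xC3\xA9";

# Decoding turns the five bytes into four characters.
my $chars = decode('UTF-8', $bytes);

# Without the decode step, each byte is taken as one Latin-1 character,
# so encoding for output double-encodes the two bytes of the accented letter.
my $double = encode('UTF-8', $bytes);

printf "bytes=%d chars=%d double=%d\n",
    length($bytes), length($chars), length($double);
# prints "bytes=5 chars=4 double=7"
```

The difference between the first two lengths is the bytes-versus-chars
discrepancy that DBI::data_string_desc reports for a properly decoded string.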

I realized that I never posted my ultimate solution.  I monkey patch
CGI.pm:

require CGI;
{
  my $orig = \&CGI::param;
  no warnings 'redefine';
  *CGI::param = sub {
    $CGI::LIST_CONTEXT_WARN = 0; # workaround for backward compatibility
    $CGI::PARAM_UTF8 = 1;
    goto &$orig;
  };
}

And this has been working just fine for both CGI and mod_perl.  Just for the
record.
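
One effect of the goto &$orig in the wrapper, as opposed to a plain
$orig->(@_) call, is that the wrapper's own frame is dropped from the call
stack, so anything in the wrapped function that looks at caller() sees the
original call site. The following is a small sketch of that behaviour with
hypothetical subs (depth, via_call, via_goto); it does not touch CGI.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the frames on the call stack above it.
sub depth {
    my $d = 0;
    $d++ while caller($d);
    return $d;
}

# Plain call: the wrapper's own frame stays on the stack.
sub via_call { return depth(@_) }

# goto &sub: the wrapper's frame is replaced by the target's, @_ is passed as-is.
sub via_goto { goto &depth }

my ($direct, $called, $jumped) = (depth(), via_call(), via_goto());
print "direct=$direct call=$called goto=$jumped\n";
```

The goto variant should report the same depth as the direct call, and the
plain-call variant one frame more.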

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<me...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig

Re: mod_perl and utf8 and CGI->param

Posted by "Randal L. Schwartz" <me...@stonehenge.com>.
>>>>> "Peng" == Peng Yonghua <py...@vodafonemail.de> writes:

Peng> And can I override any method of a class this way? Is this a general
Peng> trick? Thanks.

Yes, and your downstream will hate you for it.  The Ruby people do this
all the time, and it makes their code brittle.  I did this in my app,
and would never think of putting that into the core CGI::Prototype where
this gets used, even though it would solve the problem for everyone.
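
As a rough illustration of the general trick, and of one way to limit the
damage, here is a sketch that wraps a method on a made-up class (Greeter and
greet are hypothetical names). It uses local on the glob so the override
only lasts for the enclosing block, which is an assumption about what might
suit other code, not what the patch above does.

```perl
use strict;
use warnings;

# Hypothetical class standing in for any third-party module.
package Greeter;
sub new   { return bless {}, shift }
sub greet { my ($self, $name) = @_; return "hello $name" }

package main;

my $g = Greeter->new;

{
    my $orig = \&Greeter::greet;
    no warnings 'redefine';
    # local restores the original glob when this block exits,
    # so code elsewhere never sees the override.
    local *Greeter::greet = sub {
        my ($self, $name) = @_;
        return uc $orig->($self, $name);
    };
    print $g->greet('world'), "\n";   # overridden inside the block
}

print $g->greet('world'), "\n";       # original behaviour again
```

A global override like the one in the patch drops the local and changes the
method for every caller in the process, which is where the brittleness
comes from.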


Re: mod_perl and utf8 and CGI->param

Posted by Peng Yonghua <py...@vodafonemail.de>.
And can I override any method of a class this way? Is this a
general trick? Thanks.

Re: mod_perl and utf8 and CGI->param

Posted by Peng Yonghua <py...@vodafonemail.de>.
Good patch. Thanks for sharing.
