Posted to modperl@perl.apache.org by Tamer Embaby <Ta...@itworx.com> on 2007/04/04 10:59:37 UTC

UTF-8 encoding problems under Apache 2 with mod_perl 2.

All,

I have a character encoding problem in my environment:

$ uname -a
SunOS vulcano 5.10 Generic_118844-26 i86pc i386 i86pc

Server: Apache/2.0.58 (Unix) mod_perl/2.0.3 Perl/v5.8.4

I'm hosting a commercial application using mod_perl. The site we are
dealing with has Arabic characters, so I changed the following in Apache
to add support for the UTF-8 charset:

AddDefaultCharset UTF-8

The application itself doesn't handle character set encoding. I verified
with the vendor that it does nothing with character encoding, and they
confirmed that the application works fine with the same settings, so the
problem is with my environment.

Somehow something is transforming characters with code values above 0x7f
into HTML character entities (&#XX;), so that documents with Arabic
letters arrive at the browser corrupted.
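
To make the symptom concrete, here is a minimal sketch, in Python purely for
illustration (the application is Perl, and nothing below is taken from it).
It shows how a correct UTF-8 byte stream differs from one where every
character above 0x7f has been replaced by a numeric character reference.
The function name to_char_refs and the sample word are assumptions made for
this example.

```python
def to_char_refs(text: str) -> str:
    """Replace every character above 0x7f with a decimal reference such as &#1587;."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Arabic sample word, written with escapes so the source stays ASCII.
sample = "\u0633\u0644\u0627\u0645"

as_utf8 = sample.encode("utf-8")  # bytes a browser should receive
as_refs = to_char_refs(sample)    # what the symptom described above looks like

print(as_utf8)
print(as_refs)
```

If the page source in the browser looks like the second form, the conversion
happened before the bytes left the server, not in the browser.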

I started to suspect that either Apache or mod_perl is doing that.
Apache itself is capable of serving static files with UTF-8 encoding
correctly (without transforming UTF-8 characters to HTML character
entities).
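
One way to compare the static and dynamic cases is to save both response
bodies and inspect them byte by byte. The following is a rough sketch under
stated assumptions: describe_body is a hypothetical helper, and the two
sample bodies are made up to stand in for a saved static file response and a
saved mod_perl response.

```python
import re

# Matches decimal (&#1587;) and hexadecimal (&#x633;) character references.
CHAR_REF = re.compile(rb"&#(?:[0-9]+|[xX][0-9a-fA-F]+);")


def describe_body(body: bytes) -> dict:
    """Report raw non-ASCII bytes and character references in a response body."""
    try:
        body.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "valid_utf8": valid_utf8,
        "raw_non_ascii": any(b > 0x7F for b in body),
        "char_refs": len(CHAR_REF.findall(body)),
    }


# Hypothetical stand-ins for the two saved responses.
static_body = "<p>\u0633\u0644\u0627\u0645</p>".encode("utf-8")
dynamic_body = b"<p>&#1587;&#1604;&#1575;&#1605;</p>"

print(describe_body(static_body))
print(describe_body(dynamic_body))
```

A body with raw non-ASCII bytes and no references matches the static file
behaviour described above; a body with references and no raw non-ASCII bytes
matches the corrupted output.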

Below is additional info about my server.

Does anyone have an idea what might be causing this, and how to
correct it?

I have a hunch that it has something to do with the locale passed to
mod_perl, and that I should be using "PerlPassEnv LANG" or something
similar.

Any pointers are appreciated.

Thanks,
Tamer

----- INFO BEGIN -----
$ ../../bin/apachectl -l
Compiled in modules:
  core.c
  mod_access.c
  mod_auth.c
  mod_include.c
  mod_log_config.c
  mod_env.c
  mod_setenvif.c
  prefork.c
  http_core.c
  mod_mime.c
  mod_status.c
  mod_autoindex.c
  mod_asis.c
  mod_cgi.c
  mod_negotiation.c
  mod_dir.c
  mod_imap.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_so.c

$ locale -a
C
POSIX
de
es
fi
fr
iso_8859_1
nl
ru
sl

$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

$ perl -V
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
  Platform:
    osname=solaris, osvers=2.10, archname=i86pc-solaris-64int
    uname='sunos localhost 5.10 i86pc i386 i86pc'
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TS_ERRNO',
    optimize='-O2 -fno-strict-aliasing',
    cppflags=''
    ccversion='GNU gcc', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =''
    libpth=/lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -ldl -lm -lc
    perllibs=-lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-R /usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE'
    cccdlflags='-fPIC', lddlflags='-G'


Characteristics of this binary (from libperl):
  Compile-time options: USE_64_BIT_INT USE_LARGE_FILES
  Locally applied patches:
        22667 The optree builder was looping when constructing the ops ...
        22715 Upgrade to FileCache 1.04
        22733 Missing copyright in the README.
        22746 fix a coredump caused by rv2gv not fully converting a PV ...
        22755 Fix 29149 - another UTF8 cache bug hit by substr.
        22774 [perl #28938] split could leave an array without ...
        22775 [perl #29127] scalar delete of empty slice returned garbage
        22776 [perl #28986] perl -e "open m" crashes Perl
        22777 add test for change #22776 ("open m" crashes Perl)
        22778 add test for change #22746 ([perl #29102] Crash on assign ...
        22781 [perl #29340] Bizarre copy of ARRAY make sure a pad op's ...
        22796 [perl #29346] Double warning for int(undef) and abs(undef) ...
        22818 BOM-marked and (BOMless) UTF-16 scripts not working
        22823 [perl #29581] glob() misses a lot of matches
        22827 Smoke [5.9.2] 22818 FAIL(F) MSWin32 WinXP/.Net SP1 (x86/1 cpu)
        22830 [perl #29637] Thread creation time is hypersensitive
        22831 improve hashing algorithm for ptr tables in perl_clone: ...
        22839 [perl #29790] Optimization busted: '@a = "b", sort @a' ...
        22850 [PATCH] 'perl -v' fails if local_patches contains code snippets
        22852 TEST needs to ignore SCM files
        22886 Pod::Find should ignore SCM files and dirs
        22888 Remove redundant %SIG assignments from FileCache
        23006 [perl #30509] use encoding and "eq" cause memory leak
        23074 Segfault using HTML::Entities
        23106 Numeric comparison operators mustn't compare addresses of ...
        23320 [perl #30066] Memory leak in nested shared data structures ...
        23321 [perl #31459] Bug in read()
  Built under solaris
  Compiled at Jan 21 2005 15:48:11
  @INC:
    /usr/perl5/5.8.4/lib/i86pc-solaris-64int
    /usr/perl5/5.8.4/lib
    /usr/perl5/site_perl/5.8.4/i86pc-solaris-64int
    /usr/perl5/site_perl/5.8.4
    /usr/perl5/site_perl
    /usr/perl5/vendor_perl/5.8.4/i86pc-solaris-64int
    /usr/perl5/vendor_perl/5.8.4
    /usr/perl5/vendor_perl

----- INFO END -----

--

Tamer Embaby <ta...@itworx.com>

" f u cn rd ths, u cn gt a gd jb n cmptr prgrmmng. "



getting vhost ServerAliases used in conf

Posted by Iosif Fettich <if...@netsoft.ro>.
Hello,

quick question:

Is there currently a way to get the ServerAliases used for a virtual host
in the config?

My Apache2::ServerRec man page offers the "names" method for that, but it 
seems to be just a placeholder for code to come, as the function would 
return an APR::ArrayHeader object and this falls in the

---
  since: 2.0.00

        META: we don't have "APR::ArrayHeader" yet

---

category. If that's not working yet, is there another way to reach
these values?

Many thanks,

Iosif Fettich



Re: UTF-8 encoding problems under Apache 2 with mod_perl 2.

Posted by Perrin Harkins <ph...@gmail.com>.
On 4/4/07, Tamer Embaby <Ta...@itworx.com> wrote:
> Server: Apache/2.0.58 (Unix) mod_perl/2.0.3 Perl/v5.8.4

You probably want a later Perl than that.  Many Unicode bugs have been
fixed since then.

> I have a hunch that it's something to do with the Locale passed to the
> mod_perl that I should be using "PerlPassEnv LANG" or something.

I don't know that much about Unicode, but I do remember that Perl does
some automatic encoding in certain situations.  There was that problem
in Red Hat a few years ago when they set LANG to a UTF-8 locale and it
broke all kinds of CPAN module tests when Perl tried to read all files as
UTF-8.  Why don't you try setting LANG to a UTF-8 locale and see if it
helps your situation?
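
Before changing LANG, it may be worth checking that a UTF-8 locale is
installed at all. The sketch below, again in Python for illustration only,
filters a `locale -a` listing for names that mention UTF-8. The helper name
utf8_locales is hypothetical; the listing is the one posted in the original
message.

```python
def utf8_locales(locale_listing: str) -> list:
    """Return the locale names in a `locale -a` listing that mention UTF-8."""
    names = [line.strip() for line in locale_listing.splitlines() if line.strip()]
    return [n for n in names if "utf-8" in n.lower() or "utf8" in n.lower()]


# Output of `locale -a` as posted in the original message.
posted = """C
POSIX
de
es
fi
fr
iso_8859_1
nl
ru
sl"""

print(utf8_locales(posted))  # prints []
```

The posted listing contains no UTF-8 locale, so one would probably have to be
installed before LANG could usefully be set to it.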

- Perrin