Posted to users@httpd.apache.org by Jens Stutte <st...@email.it> on 2005/03/22 18:52:25 UTC
[users@httpd] mod_proxy_html and URL encoded parameters in form
Hello,
I have a problem with a reverse proxy that uses mod_proxy_html.
The application on the backend server the reverse proxy points to serves a
form whose parameters are then posted back. By default the application
delivers all content in ISO-8859-1. If I connect directly with the browser,
the form parameters are URL-encoded according to ISO-8859-1; for example,
an Italian à becomes %E0.
If I put the reverse proxy in between (configuration below),
all content is delivered by mod_proxy_html as UTF-8. The browser
therefore encodes an à as %C3%A0 in the form parameters. This
parameter is passed to the backend web server unchanged,
but it seems that the application expects ISO-8859-1 as
input, and the HTTP header of the request has been changed to ISO-8859-1
by mod_proxy_html.
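To make the mismatch concrete, here is a minimal Python sketch (not part of the original setup, purely an illustration) of how the same character is percent-encoded under the two charsets, and what a backend would see if it decoded the UTF-8 bytes as ISO-8859-1:

```python
from urllib.parse import quote, unquote

char = "à"

# Percent-encode the same character under each charset.
latin1_encoded = quote(char, encoding="iso-8859-1")
utf8_encoded = quote(char, encoding="utf-8")

print(latin1_encoded)  # %E0
print(utf8_encoded)    # %C3%A0

# Decoding the UTF-8 escape sequence as ISO-8859-1 yields two
# characters (Ã plus a non-breaking space) instead of one à.
misread = unquote(utf8_encoded, encoding="iso-8859-1")
print(len(misread))    # 2
```

The two-character result is the typical symptom of this kind of charset mismatch in form parameters.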
Is this a bug in mod_proxy_html, or am I missing something?
I had a quick look at the module's code, but I wasn't able
to find where the encoding used towards the backend server
is determined.
Any help would be much appreciated. Thanks in advance!
Appendix: The config
ProxyRequests off
#----------------------------------------
# The standard ProxyPass directives
ProxyPass /www/ http://portal01.tuconti.telecomitalia.it:7001/
ProxyPass /cal/ http://ldap01.tuconti.telecomitalia.it:8080/
ProxyPass /aut/ http://ldap01.tuconti.telecomitalia.it:80/
ProxyPassReverse /www/ http://portal01.tuconti.telecomitalia.it:7001/
ProxyPassReverse /cal/ http://ldap01.tuconti.telecomitalia.it:8080/
ProxyPassReverse /aut/ http://ldap01.tuconti.telecomitalia.it:80/
#----------------------------------------
# Global mod_proxy_html configuration
ProxyHTMLExtended On
ProxyHTMLStripComments Off
ProxyHTMLURLMap http://portal01.tuconti.telecomitalia.it:7001/ /www/
ProxyHTMLURLMap http://ldap01.tuconti.telecomitalia.it:8080/ /cal/
ProxyHTMLURLMap http://ldap01.tuconti.telecomitalia.it:80/ /aut/
<Location /www/>
SetOutputFilter proxy-html
ProxyHTMLURLMap / /www/ ec
ProxyHTMLURLMap /www /www ec
ProxyHTMLURLMap (['"])/ $1/www/ Rh
RequestHeader unset Accept-Encoding
</Location>
<Location /cal/>
SetOutputFilter proxy-html
ProxyHTMLURLMap / /cal/ ec
ProxyHTMLURLMap /cal /cal ec
$1/cal/$2 Rh
RequestHeader unset Accept-Encoding
</Location>
<Location /aut/>
SetOutputFilter proxy-html
ProxyHTMLURLMap / /aut/ ec
ProxyHTMLURLMap /aut /aut ec
ProxyHTMLURLMap (['"])/ $1/aut/ Rh
RequestHeader unset Accept-Encoding
</Location>
ProxyHTMLLogVerbose On
LogLevel Info
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
" from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org
[users@httpd] Re: mod_proxy_html and URL encoded parameters in form
Posted by Jens Stutte <st...@email.it>.
Just to complete the scenario: any special character contained in the HTML
body is shown correctly in the browser; only the form parameters are wrong.
[users@httpd] Re: mod_proxy_html and URL encoded parameters in form
Posted by Jens Stutte <st...@email.it>.
Nick Kew <nick <at> webthing.com> writes:
> libxml2 uses utf-8 internally and transcodes on input, so mod_proxy_html
> determines encoding and tells libxml2. To re-transcode on output would
> be an additional overhead, which mod_proxy_html doesn't incur. Since
> transcoding filters (like mod_charset_lite) are available, it would be
> superfluous for mod_proxy_html to do that too.
>
> You could insert a transcoding filter after mod_proxy_html to get back
> your iso-8859-1. Or you insert a transcoding input filter when you
> accept the form data.
>
Thank you for this valuable hint! Do you know whether, with this approach, I can
determine the backend server's original encoding so that everything is re-encoded
as it was? I will have to integrate several apps and cannot guarantee that they
all use the same encoding.
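As a rough illustration of what "determining the original encoding" could mean, the sketch below reads the charset from a backend's Content-Type header value. The function name and the fallback value are assumptions made for this example; it is not something mod_proxy_html or mod_charset_lite exposes.

```python
from email.message import Message


def charset_from_content_type(value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, lowercased.

    `default` is a hypothetical fallback for responses that declare none.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset(default)


print(charset_from_content_type("text/html; charset=ISO-8859-1"))  # iso-8859-1
print(charset_from_content_type("text/html; charset=UTF-8"))       # utf-8
print(charset_from_content_type("text/html"))                      # iso-8859-1
```

A per-application lookup like this only helps if each backend declares its charset correctly in the response headers.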
Best regards,
Jens
[users@httpd] Re: mod_proxy_html and URL encoded parameters in form
Posted by Jens Stutte <st...@email.it>.
After a bit of digging I have modified mod_charset_lite, as it currently does not
work on proxied requests. See my post on the devel list.
Re: [users@httpd] mod_proxy_html and URL encoded parameters in form
Posted by Nick Kew <ni...@webthing.com>.
Jens Stutte wrote:
> If i put in between the reverse proxy (configuration see below),
> all content is delivered by mod_proxy_html as UTF-8. Therefore the
> browser will encode an à as %C3%A0 in the form parameters. This
> parameter is passed to the backend web server without being
> changed, but it seems that the application expects ISO-8859-1 as
> input and the http header of the request has been changed to ISO-8859-1
> by mod_proxy_html.
>
> Is this an error of mod_proxy_html or am i missing something?
> I had a short look at the code of the module, but i wasn't able
> to find the point in which is determined the encoding used against
> the backend server.
>
> Any help would be very appreciated and thanx in advance!
libxml2 uses utf-8 internally and transcodes on input, so mod_proxy_html
determines encoding and tells libxml2. To re-transcode on output would
be an additional overhead, which mod_proxy_html doesn't incur. Since
transcoding filters (like mod_charset_lite) are available, it would be
superfluous for mod_proxy_html to do that too.
You could insert a transcoding filter after mod_proxy_html to get back
your iso-8859-1. Or you insert a transcoding input filter when you
accept the form data.
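The second option, transcoding the form data on the way in, can be sketched at application level as follows. This is only an illustration of the idea in Python, assuming a URL-encoded form body; it is not how mod_charset_lite or any httpd filter is implemented, and the function and field names are made up for the example.

```python
from urllib.parse import parse_qsl, urlencode


def transcode_form_body(body, source="utf-8", target="iso-8859-1"):
    """Re-encode a URL-encoded form body from `source` to `target` charset."""
    # Decode the percent-escapes using the charset the browser used.
    pairs = parse_qsl(body, keep_blank_values=True,
                      encoding=source, errors="strict")
    # Re-encode using the charset the backend expects. Characters that
    # the target charset cannot represent raise UnicodeEncodeError.
    return urlencode(pairs, encoding=target, errors="strict")


print(transcode_form_body("citta=Citt%C3%A0&nome=Jens"))
# citta=Citt%E0&nome=Jens
```

Using strict error handling makes values that do not fit into ISO-8859-1 fail loudly instead of being silently replaced.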
--
Nick Kew