Posted to users@tapestry.apache.org by joseph panico <jo...@panmachine.biz> on 2004/10/04 04:41:06 UTC

Unicode chars in URL params-- a one way trip?

Folks,

I have a Link service which builds URLs via
EngineServiceLink.getAbsoluteUrl(), and the service parameters, all Strings,
which are passed to the Link constructor might have Unicode (that is,
non-ASCII) chars in them. When that URL is triggered, the service() method is
invoked:

	public void service(
		IEngineServiceView engine_,
		IRequestCycle cycle_,
		ResponseOutputStream output_)
		throws ServletException, IOException
	{
		Object[] parameters = this.getParameters(cycle_);
		...

The Strings in 'parameters' returned by getParameters() are not the same
Strings that were passed to the Link constructor in the first place. So it
seems that non-ASCII chars in service params do not survive the round trip.

EngineServiceLink.constructURL() calls _urlCodec.encode(_parameters[i],
encoding) for each service parameter, which looks pretty reasonable to me. But
I don't see anywhere in Tapestry 3.0 where this encoding is reversed for
incoming request parameters.

For example, if the default output encoding is UTF-8, during URL construction
service params get converted to UTF-8, and then each byte of the UTF-8 is URL
"escape encoded" (percent-encoded). E.g., if a service param in the URL
construction contains U+0308 (combining diaeresis), this first gets converted
to the two bytes 0xCC 0x88 (UTF-8) and then escape encoded as %CC%88. However,
on the return trip, when that service param String is requested in the body of
the service() method, instead of containing the single Unicode char U+0308, it
now contains two Unicode chars (U+00CC, U+0088), because the original encoding
process was not reversed.
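
The round trip described above can be reproduced outside Tapestry with the
standard java.net classes. This is only a sketch: the class and method names
are made up for illustration, and it does not use Tapestry's own _urlCodec. It
encodes U+0308 as UTF-8 and then decodes the result once as ISO-8859-1 (the
mismatch) and once as UTF-8 (the correct reversal).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Tapestry.
class RoundTripDemo {

    // Percent-encode a parameter value using the given charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Reverse the percent-encoding, interpreting the bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Render each char of the String as U+XXXX for easy comparison.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u0308"; // combining diaeresis
        String encoded = encode(original, "UTF-8");

        System.out.println(encoded);                                 // %CC%88
        System.out.println(describe(decode(encoded, "ISO-8859-1"))); // U+00CC U+0088
        System.out.println(describe(decode(encoded, "UTF-8")));      // U+0308
    }
}
```

Decoding with the same charset that was used for encoding restores the original
String; decoding with ISO-8859-1 turns each UTF-8 byte into its own char.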

Am I missing a step in my service implementation? Has anyone else bumped into
this problem?

thanks,

joe





---------------------------------------------------------------------
To unsubscribe, e-mail: tapestry-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tapestry-user-help@jakarta.apache.org


Re: Unicode chars in URL params-- a one way trip?

Posted by Mind Bridge <mi...@yahoo.com>.
> That's not entirely true. As I pointed out in my original post, Tapestry is
> *explicitly* performing encoding on service parameters.

Yes, because the Servlet API does not provide an implicit way to do that.

> Tapestry is using the "output-encoding", which in my case is set to UTF-8.

It also uses it to do Request.setCharacterEncoding() before decoding the
parameters.

If the servlet container conforms to the specification, it will use that
encoding to do the decoding. Most containers work in exactly that way.

There is no asymmetry here.
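
To illustrate the point about setCharacterEncoding(), here is a simplified
model of how a spec-conforming container is expected to behave. It is not
Jetty's or Tomcat's actual code: ModelRequest and the "sp" parameter name are
placeholders, and the assumption that the request encoding is applied to the
query string is exactly the behaviour that varies between containers. The
model parses parameters lazily, uses the request encoding if one was set
before the first read, and falls back to ISO-8859-1 otherwise.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a container's request object; not the Servlet API.
final class ModelRequest {
    private final String rawQuery;
    private String characterEncoding;       // null until explicitly set
    private Map<String, String> parameters; // parsed on first access

    ModelRequest(String rawQuery) {
        this.rawQuery = rawQuery;
    }

    // Per the servlet spec, this has no effect once parameters have been read.
    void setCharacterEncoding(String encoding) {
        if (parameters == null) {
            characterEncoding = encoding;
        }
    }

    String getParameter(String name) {
        if (parameters == null) {
            parameters = parse();
        }
        return parameters.get(name);
    }

    private Map<String, String> parse() {
        // ISO-8859-1 is the spec's default when no encoding is known.
        String charset = characterEncoding != null ? characterEncoding : "ISO-8859-1";
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            try {
                result.put(URLDecoder.decode(key, charset),
                           URLDecoder.decode(value, charset));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown charset: " + charset, e);
            }
        }
        return result;
    }
}
```

With the encoding set to UTF-8 before the first getParameter() call, the query
string sp=%CC%88 yields the single char U+0308; without it, the same query
string yields the two chars U+00CC and U+0088 from the original report.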



---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.771 / Virus Database: 518 - Release Date: 9/28/2004




Re: Unicode chars in URL params-- a one way trip?

Posted by joseph panico <jo...@panmachine.biz>.
Marcus Brito <mbrito <at> gmail.com> writes:

> 
> URI parameters encoding/decoding is done by the servlet container, not
> by Tapestry. Please refer to the Jetty documentation on the issue.
> 

Marcus,

That's not entirely true. As I pointed out in my original post, Tapestry is
*explicitly* performing encoding on service parameters. Look at line 185 in
EngineServiceLink.java.

                    String encoded = _urlCodec.encode(_parameters[i], encoding);

Tapestry is using the "output-encoding", which in my case is set to UTF-8. Since
the Servlet container is doing the decoding of those same parameters, and it has
no idea that Tapestry has used a UTF-8 encoding, I don't see how it could work.
The encoding of service params seems to be asymmetric.


regards,

joe







Re: Unicode chars in URL params-- a one way trip?

Posted by Marcus Brito <mb...@gmail.com>.
URI parameters encoding/decoding is done by the servlet container, not
by Tapestry. Please refer to the Jetty documentation on the issue.

-- MB <mb...@gmail.com>

On Mon, 4 Oct 2004 12:10:13 +0000 (UTC), joseph panico
<jo...@panmachine.biz> wrote:
> 
> I'm using Jetty 4.1.1.
> 
> joe



Re: Unicode chars in URL params-- a one way trip?

Posted by joseph panico <jo...@panmachine.biz>.
I'm using Jetty 4.1.1.

joe 





Re: Unicode chars in URL params-- a one way trip?

Posted by Mind Bridge <mi...@yahoo.com>.
You are probably using Tomcat 5.

Please look at the "DirectLink bug?" thread from yesterday -- Marius Siegas
explains the cause of the problem nicely.

In short: What you describe should be problematic only in Tomcat 5 due to a
strange change in its behaviour. It will work well in all other servlet
containers, such as Jetty, Tomcat 3, Tomcat 4, etc.

In Tomcat 5, please add useBodyEncodingForURI="true" in the server
configuration, and the issue will be resolved -- everything will work as
desired.
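
For reference, the attribute goes on the HTTP connector element in Tomcat's
conf/server.xml. A minimal sketch, assuming a default connector; the port
value is a placeholder, and any attributes your connector already has should
stay in place:

```xml
<!-- conf/server.xml: decode the query string with the request's encoding -->
<Connector port="8080"
           useBodyEncodingForURI="true" />
```

Tomcat's connector documentation also describes a URIEncoding attribute that
fixes the URI charset directly; whether that fits better depends on your
setup.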


