Posted to users@tapestry.apache.org by Martin Grotzke <ma...@javakaffee.de> on 2007/06/21 11:32:45 UTC

T5 Encoding issue (repost)

Hi,

I just want to pick up this topic in a new thread, to make sure
it's noticed - thx to Uli's suggestion in the previous thread :)

First, a short summary again:
- T5 (the PageRenderDispatcher) tries to decode activation context
  arguments (in convertActivationContext).
- The activation context arguments are read from request.getPath()
  (httpServletRequest.getServletPath()).
- The httpServletRequest.getServletPath() provides the _decoded_
  part of the url, according to the servlet specification [1].
  For example, getServletPath() may return special characters such as
  German umlauts, so the URL-encoded "%C3%BCbel" would be returned
  as "übel".
- PageRenderDispatcher.convertActivationContext then tries to decode
  the already decoded string (by invoking
  TapestryInternalUtils.urlDecode, which itself invokes commons
  URLCodec.decode) and either fails with a
  "org.apache.commons.codec.DecoderException: Invalid URL encoding"
  (e.g. for "tr%b") or returns the wrong value (e.g. "?bel" for "übel").
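
The effect of decoding twice can be reproduced outside Tapestry. The
following is only a minimal sketch: it uses java.net.URLDecoder from the
JDK as a stand-in for commons URLCodec, and the class and method names
are made up for illustration. URLDecoder passes non-ASCII characters
through unchanged, so it does not reproduce the "?bel" result, but it
does show the failure on a stray "%" and the silent loss of a "+".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Tapestry.
class DoubleDecodeDemo {

    // Decodes a percent-encoded string as UTF-8.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Returns true if decoding the given string is rejected.
    static boolean decodeFails(String s) {
        try {
            decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // One decoding step, as done by the servlet container.
        String once = decode("%C3%BCbel");
        System.out.println(once.equals("\u00fcbel"));

        // "tr%25b" is decoded to "tr%b"; decoding that again is rejected.
        System.out.println(decodeFails(decode("tr%25b")));

        // "a%2Bb" is decoded to "a+b"; decoding again turns "+" into a space.
        System.out.println(decode(decode("a%2Bb")).equals("a b"));
    }
}
```

All three lines printed by main are "true": the first decoding step is
correct, and it is only the second one that breaks the value.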

Our encoding is UTF-8 btw.

My question is: why does PageRenderDispatcher.convertActivationContext
try to decode the already decoded string again? I assume there's *some*
reason for this ;)

Otherwise I'd like to submit an issue with a patch for this.

Thanx && cheers,
Martin


[1] An excerpt from the servlet spec 2.4 p. 243:

getServletPath()
[...]
Returns: a String containing the name or path of the servlet being
called, as specified in the request URL, decoded, or an empty string
if the servlet used to process the request is matched using the “/*”
pattern.



Re: T5 Encoding issue (repost)

Posted by Martin Grotzke <ma...@javakaffee.de>.
On Mon, 2008-01-28 at 18:54 +0100, Francois Armand wrote:
> I spoke too fast: it seems that in T5.0.9, even "slashes" are handled
> correctly with the UTF-8 filter activated.
This would be really great - then we should make the effort and upgrade...

Good luck to you,
cheers,
Martin




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tapestry.apache.org
For additional commands, e-mail: users-help@tapestry.apache.org


Re: T5 Encoding issue (repost)

Posted by Francois Armand <fa...@linagora.com>.
Francois Armand wrote:
>> Are you sure this issue is solved in the latest version of T5? So that
>> you can even have slashes in your activation parameters?
>>   
> Well, you are right: slashes are not supported. But spaces, "+", and
> accented letters are encoded/decoded correctly.
>

I spoke too fast: it seems that in T5.0.9, even "slashes" are handled
correctly with the UTF-8 filter activated.

My test (I hope it's relevant):
* I created a page link with a context containing all these nasty chars,
and clicking it writes the correct output.
* I typed all these chars except "/" directly into the URL, and the
output is as expected.

All in all, I can't switch to this version; there are far too many other
behavior changes between the two :/

-- 
Francois Armand
Etudes & Développements J2EE
Groupe Linagora - http://www.linagora.com
Tél.: +33 (0)1 58 18 68 28
-----------
InterLDAP - http://interldap.org 
FederID - http://www.federid.org/
Open Source identities management and federation




Re: T5 Encoding issue (repost)

Posted by Francois Armand <fa...@linagora.com>.
Martin Grotzke wrote:
> Hi Francois,
>
> we're currently living with a really ugly hack: we use a patched version
> of TapestryInternalUtils, with the methods urlEncode and urlDecode
> changed [1].
>   
Oh. Well, I tested almost everything I could think of, and the last item is
"patch Tapestry 5.0.6". I really don't like that, but I have to get it
working this evening... So...
> For us this is still an issue we want to investigate. I believe this is
> an issue in combination with mod_jk. But my memory is really bad, so I
> will have to start investigating again.
>   
I don't use mod_jk, and I don't really understand where it comes from.
> Are you sure this issue is solved in the latest version of T5? So that
> you can even have slashes in your activation parameters?
>   
Well, you are right: slashes are not supported. But spaces, "+", and
accented letters are encoded/decoded correctly.

-- 
Francois Armand
Etudes & Développements J2EE
Groupe Linagora - http://www.linagora.com
Tél.: +33 (0)1 58 18 68 28
-----------
InterLDAP - http://interldap.org 
FederID - http://www.federid.org/
Open Source identities management and federation




Re: T5 Encoding issue (repost)

Posted by Martin Grotzke <ma...@javakaffee.de>.
Hi Francois,

we're currently living with a really ugly hack: we use a patched version
of TapestryInternalUtils, with the methods urlEncode and urlDecode
changed [1].

For us this is still an issue we want to investigate. I believe this is
an issue in combination with mod_jk. But my memory is really bad, so I
will have to start investigating again.

Are you sure this issue is solved in the latest version of T5? So that
you can even have slashes in your activation parameters?

Cheers,
Martin


[1] 

    // The DEL character (code point 127) stands in for '/'.
    private static final String SLASH_REPLACEMENT_CHAR =
            String.valueOf((char) 127);
    private static final String SLASH_REPLACEMENT_CHAR_ENC =
            UrlUtf8Encoder.encode(SLASH_REPLACEMENT_CHAR);

    public static String urlEncode(String input) {
        try {
            String res = CODEC.encode(input);
            if (StringUtils.isNotEmpty(res)) {
                // spaces become "%20" instead of "+", encoded slashes are hidden
                res = res.replace("+", "%20");
                res = res.replace("%2F", SLASH_REPLACEMENT_CHAR_ENC);
            }
            return res;
        } catch (EncoderException e) {
            LOG.error("Could not encode URL: " + input, e);
            return input;
        }
    }

    public static String urlDecode(String input) {
        // only decode slashes; the container has already decoded the rest
        return input.replace(SLASH_REPLACEMENT_CHAR, "/");
    }
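
To make the round trip of this hack easier to follow, here is a
self-contained sketch. It is not the patched TapestryInternalUtils
itself: java.net.URLEncoder and java.net.URLDecoder stand in for CODEC,
UrlUtf8Encoder and the servlet container, the class and the
containerDecode method are hypothetical, and it assumes the replacement
is meant to be the DEL character (code point 127), encoded as "%7F".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tapestry.
class SlashReplacementDemo {

    // Assumption: DEL (code point 127) stands in for '/'.
    static final String SLASH_REPLACEMENT = String.valueOf((char) 127);
    static final String SLASH_REPLACEMENT_ENC = "%7F";

    // Encodes a context value so that it contains neither "+" nor "%2F".
    static String urlEncode(String input) {
        try {
            String res = URLEncoder.encode(input, "UTF-8");
            return res.replace("+", "%20")
                      .replace("%2F", SLASH_REPLACEMENT_ENC);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Stand-in for the single decoding step done by the servlet container.
    static String containerDecode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only restores slashes; everything else is already decoded.
    static String urlDecode(String input) {
        return input.replace(SLASH_REPLACEMENT, "/");
    }

    public static void main(String[] args) {
        String value = "a/b c+d";
        String encoded = urlEncode(value);
        String restored = urlDecode(containerDecode(encoded));
        System.out.println(encoded);
        System.out.println(restored.equals(value));
    }
}
```

The obvious limitation is that a value which itself contains the
replacement character comes back with a slash in its place.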




-- 
Martin Grotzke
http://www.javakaffee.de/blog/




Re: T5 Encoding issue (repost)

Posted by Francois Armand <fa...@linagora.com>.
Martin Grotzke wrote:
> Hi,
>   

Hi Martin,

> I just want to pick up this topic in a new thread, to make sure
> it's noticed - thx to Uli's suggestion in the previous thread :)
>
> At first a short summary again:
> - T5 (the PageRenderDispatcher) tries to decode activation context
>   arguments (in convertActivationContext).
> [...]
> Our encoding is UTF-8 btw.
>
> My question is: why does PageRenderDispatcher.convertActivationContext
> try to decode the already decoded string again? I assume there's *some*
> reason for this ;)
>   
Sorry to resurrect this old post, but I am encountering the very same bug.
I know it is fixed in recent versions of T5 (after 5.0.8, I believe),
but for now I'm stuck with 5.0.6.

So, to bypass it, I contributed to the master dispatcher a
PageRenderDispatcher without the double decoding, but now it seems to be
worse:
- it almost works, but sometimes (I think it's when I call
ComponentResources#createPageLink or similar methods) spaces are
encoded as "+", but the "+" are not decoded. So the returned links are
not understood by Tapestry, but if I replace "+" with "%20" or " ",
everything works.
- changing the order of the utf8filter (with "after:*" or "before:*" in
the configuration) doesn't seem to do anything
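
The "+" versus "%20" mismatch in the first point can be shown with the
JDK alone. This is only a sketch under assumptions: the class and method
names are invented, java.net.URLEncoder stands in for whatever encoder
builds the link, and java.net.URI stands in for the path decoding done
on the way in. Form-style encoding writes a space as "+", while path
decoding only understands percent escapes and leaves "+" as it is.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tapestry.
class PlusVersusPercent20 {

    // Form-style encoding: a space becomes "+".
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Form-style encoding adjusted for use in a path: a space becomes "%20".
    static String pathEncode(String s) {
        return formEncode(s).replace("+", "%20");
    }

    // Path-style decoding: percent escapes are decoded, "+" stays "+".
    static String pathDecode(String rawPath) {
        return URI.create(rawPath).getPath();
    }

    public static void main(String[] args) {
        String value = "hello world";

        // The "+" survives path decoding, so the value does not come back.
        System.out.println(pathDecode("/page/" + formEncode(value)));

        // With "%20" the original value is restored.
        System.out.println(pathDecode("/page/" + pathEncode(value)));
    }
}
```

A literal "+" in the value is not affected by the replacement, because
the encoder has already turned it into "%2B" before "+" is replaced.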


I believe I forgot to switch a configuration to UTF-8 somewhere, but
I don't know where :/

So, Martin, have you found a way to get it to work? Do you have any idea?

It's a really important bug for us :/

Thanks,

-- 
Francois Armand
Etudes & Développements J2EE
Groupe Linagora - http://www.linagora.com
Tél.: +33 (0)1 58 18 68 28
-----------
InterLDAP - http://interldap.org 
FederID - http://www.federid.org/
Open Source identities management and federation

