Posted to dev@abdera.apache.org by "Todd Wells (JIRA)" <ji...@apache.org> on 2008/04/19 08:08:21 UTC

[jira] Created: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

EncodingUtil.sanitize() behavior has changed
--------------------------------------------

                 Key: ABDERA-150
                 URL: https://issues.apache.org/jira/browse/ABDERA-150
             Project: Abdera
          Issue Type: Bug
    Affects Versions: 0.4.0
            Reporter: Todd Wells


In the 0.3 client, EncodingUtil.sanitize() escaped a space in a String correctly, as "%20".  Now it replaces the space with an underscore ("_").
Sanitizer.sanitize() does the same thing, so existing code that depended on this method is now broken.

For example, Mule Galaxy's default URL for its Atom feeds includes a space ("Default Workspace"), so the Abdera sanitizer can't be used reliably with it: it turns this into "Default_Workspace".  Looking at the code, sanitize() only lets you specify a single replacement string for all undesired characters, so passing "%20" would replace every undesired character with "%20" rather than with its proper percent-encoding.
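
To make that last point concrete, here is a minimal sketch in plain Java. It does not use Abdera at all: the class PercentEncodeSketch and its methods are hypothetical names invented for this illustration, and the set of "safe" characters is assumed to be the RFC 3986 unreserved set, which may differ from what Abdera's sanitizer treats as safe.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Abdera.
class PercentEncodeSketch {

    // RFC 3986 "unreserved" characters, which never need escaping.
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Blanket replacement: every undesired character becomes the same string.
    static String replaceAllWith(String s, String replacement) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isUnreserved(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Percent-encoding: each undesired UTF-8 byte becomes its own %XX escape.
    static String percentEncode(String s) {
        final String hex = "0123456789ABCDEF";
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (isUnreserved(v)) {
                out.append((char) v);
            } else {
                out.append('%').append(hex.charAt(v >> 4)).append(hex.charAt(v & 0x0F));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "Default Workspace/a&b";
        System.out.println(replaceAllWith(input, "%20")); // Default%20Workspace%20a%20b
        System.out.println(percentEncode(input));         // Default%20Workspace%2Fa%26b
    }
}
```

With the blanket replacement, the "/" and "&" both come out as "%20" and can no longer be told apart from the space.  Note that this percentEncode() also escapes "/", so it is only suitable for a single path segment, not a whole path.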


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "Todd Wells (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591110#action_12591110 ] 

Todd Wells commented on ABDERA-150:
-----------------------------------

I still can't get this to work -- Sanitizer doesn't have an encode() method and EncodingUtil.encode() doesn't do the right thing either.



[jira] Resolved: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "James M Snell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James M Snell resolved ABDERA-150.
----------------------------------

    Resolution: Won't Fix

Current behavior is preferable to the old behavior.



[jira] Commented: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "James M Snell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591115#action_12591115 ] 

James M Snell commented on ABDERA-150:
--------------------------------------

The Sanitizer is not, and was never, intended to provide percent-encoding.  The point of the sanitizer is to take an input string and make it reasonably suitable for use as a segment of a URL.  The goal is to produce more user-friendly URLs, and "foo_bar" is more friendly than "foo%20bar".  If you need "foo%20bar" then use the EncodingUtil class or the java.net.URLEncoder to produce a properly percent-encoded string.
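
As a rough illustration of that trade-off, the sketch below uses a hypothetical stand-in for a slug-style sanitizer.  SlugSketch and slug() are invented names, and the regular expression is an assumption rather than the rule set Abdera's Sanitizer actually applies.  The output is readable, but the mapping is lossy: different inputs can collapse to the same slug, so a slug is fine for minting a new URL and unsuitable for addressing an existing resource whose name contains a space.

```java
import java.util.regex.Matcher;

// Hypothetical stand-in for a slug-style sanitizer; Abdera's Sanitizer applies its own rules.
class SlugSketch {

    // Replace every character outside a conservative safe set with one replacement string.
    static String slug(String s, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement from being read as regex syntax.
        return s.replaceAll("[^A-Za-z0-9._~-]", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(slug("foo bar", "_"));           // foo_bar
        System.out.println(slug("foo/bar", "_"));           // foo_bar (same slug, different input)
        System.out.println(slug("Default Workspace", "_")); // Default_Workspace
    }
}
```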



[jira] Commented: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "James M Snell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596119#action_12596119 ] 

James M Snell commented on ABDERA-150:
--------------------------------------

Sorry, I meant UrlEncoding, not EncodingUtil, e.g.

    String t = UrlEncoding.encode("foo bar", CharUtils.Profile.PATH.filter());
    System.out.println(t);

Ultimately, however, the current behavior, while not backwards compatible, does produce a much more reasonable output for the intended purpose and should not be changed back.



[jira] Commented: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "Todd Wells (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591120#action_12591120 ] 

Todd Wells commented on ABDERA-150:
-----------------------------------

java.net.URLEncoder encodes a space as '+'.  The change in behavior of sanitize(), along with http://mule.mulesource.org/jira/browse/GALAXY-199, is giving me a headache.  My point is that the behavior of sanitize() (which was EncodingUtil.sanitize() in 0.3 before it was deprecated in 0.4) changed between 0.3 and 0.4.  And yes, it actually did support percent-encoding in version 0.3, which is why my existing code that relied on that behavior is now broken.  EncodingUtil.sanitize() now does the same thing as Sanitizer.sanitize(), and EncodingUtil.encode() appears to be intended for MIME encoding, not URL encoding.
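
For reference, here is a small JDK-only sketch of the '+' behavior and two possible ways around it.  The class and method names are hypothetical; only java.net.URLEncoder and java.net.URI are real APIs.  URLEncoder implements form encoding (application/x-www-form-urlencoded), so patching its output is a workaround rather than true path encoding, and it may still treat a few other characters differently than a path encoder would.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not part of Abdera.
class SpaceEncodingSketch {

    // Form encoding: a space becomes '+'.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Workaround: a literal '+' in the input has already been escaped as %2B,
    // so any '+' left in the output stands for a space.
    static String formEncodeWithPercent20(String s) {
        return formEncode(s).replace("+", "%20");
    }

    // The multi-argument URI constructor quotes illegal characters in the path.
    static String encodePath(String absolutePath) {
        try {
            return new URI(null, null, absolutePath, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Default Workspace"));              // Default+Workspace
        System.out.println(formEncodeWithPercent20("Default Workspace")); // Default%20Workspace
        System.out.println(encodePath("/Default Workspace"));             // /Default%20Workspace
    }
}
```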



[jira] Commented: (ABDERA-150) EncodingUtil.sanitize() behavior has changed

Posted by "Dan Diephouse (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ABDERA-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590682#action_12590682 ] 

Dan Diephouse commented on ABDERA-150:
--------------------------------------

I think this is because the method was perhaps improperly named in 0.3.0. In 0.4, sanitize() converts a String into a human-readable URL, that is, one with no invalid characters and no encoded characters; it replaces them with something like "_". In 0.4 you want to use the encode() method instead, which will actually encode the invalid URL characters for you.
