Posted to dev@roller.apache.org by Susanne Gladén <su...@gmail.com> on 2011/04/04 14:25:58 UTC
RC5 tests - encoding error in comments
Hi,
I think I have found a bug in the code concerning weblogentry comments.
If I add a comment to a weblog entry: "Fint väder idag"
Then the comment is displayed as: "Fint v&auml;der idag"
In WeblogEntryCommentWrapper.java in method getContent()
public String getContent() {
    String content = this.pojo.getContent();
    // escape content if content-type is text/plain
    if ("text/plain".equals(this.pojo.getContentType())) {
        content = StringEscapeUtils.escapeHtml(content);
    }
    // apply plugins
    PluginManager pmgr = WebloggerFactory.getWeblogger().getPluginManager();
    content = pmgr.applyCommentPlugins(this.pojo, content);
    // always add rel=nofollow for links
    content = Utilities.addNofollow(content);
    return content;
}
First the content is transformed in:
if ("text/plain".equals(this.pojo.getContentType())) {
    content = StringEscapeUtils.escapeHtml(content);
}
Then the content is transformed once again in HTMLSubsetPlugin.java
(content = pmgr.applyCommentPlugins(this.pojo, content);)
This causes the string to be escaped twice.
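To illustrate the effect, here is a minimal sketch. It does not use the
Roller or commons-lang code; escapeEntities() is a hypothetical stand-in
that handles only a few characters, and the class name is made up for
this example. It only shows what happens when an entity-escaping step
runs twice on the same string.

```java
class DoubleEscapeDemo {

    // Hypothetical stand-in for an HTML escaping step: escapes the
    // markup characters and maps 'ä' (U+00E4) to its named entity.
    static String escapeEntities(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\u00e4': out.append("&auml;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = escapeEntities("Fint v\u00e4der idag");
        String twice = escapeEntities(once);
        // First pass: the browser renders the entity as the letter again.
        System.out.println(once);   // Fint v&auml;der idag
        // Second pass: the '&' itself is escaped, so the browser shows
        // the entity text literally instead of the letter.
        System.out.println(twice);  // Fint v&amp;auml;der idag
    }
}
```

After the second pass the ampersand of the entity has itself been
escaped, which is why the entity text shows up in the rendered comment.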
I found in Utilities.transformToHTMLSubset(String s) that you try to
work around this problem for some characters by calling s.replace(
... )
But it's difficult to list all Latin-1 characters ...
/Susanne
Re: RC5 tests - encoding error in comments
Posted by Dave <sn...@gmail.com>.
On Mon, Apr 4, 2011 at 8:25 AM, Susanne Gladén <su...@gmail.com> wrote:
> I think I have found a bug in the code concerning weblogentry comments.
>
> If I add a comment to a weblog entry: "Fint väder idag"
> Then the comment is displayed as: "Fint v&auml;der idag"
>
>
> In WeblogEntryCommentWrapper.java in method getContent()
>
> public String getContent() {
>
> String content = this.pojo.getContent();
>
> // escape content if content-type is text/plain
> if("text/plain".equals(this.pojo.getContentType())) {
> content = StringEscapeUtils.escapeHtml(content);
> }
>
> // apply plugins
> PluginManager pmgr = WebloggerFactory.getWeblogger().getPluginManager();
> content = pmgr.applyCommentPlugins(this.pojo, content);
>
> // always add rel=nofollow for links
> content = Utilities.addNofollow(content);
>
> return content;
> }
>
> First the content is transformed in:
>
> if("text/plain".equals(this.pojo.getContentType())) {
> content = StringEscapeUtils.escapeHtml(content);
> }
>
> Then the content is transformed once again in HTMLSubsetPlugin.java
> (content = pmgr.applyCommentPlugins(this.pojo, content);)
>
>
> This makes the string escaped twice.
>
>
> I found in Utilities.transformToHTMLSubset(String s) that you try to
> make a fix for this problem for some characters by calling s.replace(
> ... )
The transformToSafeHTMLSubset() method is designed to unescape only the
HTML tags that are considered safe.
In WeblogEntryCommentWrapper we only escape content if the content is
text/plain, meaning that HTML in comments is disabled.
I believe the fix is to change transformToSafeHTMLSubset() to act only
when the comment is text/html and therefore needs safe subsetting.
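A rough sketch of that idea follows. This is not the actual Roller code:
the class and method names (CommentRenderSketch, escapeHtml, toSafeSubset,
render) are hypothetical, and the "safe subset" here restores only <b>
tags to keep the example short. The point is only that each content type
goes through exactly one escaping path.

```java
class CommentRenderSketch {

    // Hypothetical escaping step (stand-in, handles a few characters only).
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '\u00e4': out.append("&auml;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical safe subsetting: escape everything, then restore
    // the tags considered safe (only <b> in this sketch).
    static String toSafeSubset(String s) {
        return escapeHtml(s)
                .replace("&lt;b&gt;", "<b>")
                .replace("&lt;/b&gt;", "</b>");
    }

    // Choose exactly one path per content type, so plain-text
    // comments are never escaped a second time.
    static String render(String content, String contentType) {
        if ("text/html".equals(contentType)) {
            return toSafeSubset(content);
        }
        return escapeHtml(content);
    }

    public static void main(String[] args) {
        System.out.println(render("Fint v\u00e4der idag", "text/plain"));
        System.out.println(render("<b>hi</b> <script>", "text/html"));
    }
}
```

With this split, a text/plain comment is escaped once and a text/html
comment is subset once, under the assumptions stated above.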
Thanks,
Dave