Posted to dev@sling.apache.org by "Ben Fortuna (JIRA)" <ji...@apache.org> on 2016/08/18 07:23:20 UTC

[jira] [Updated] (SLING-5973) HTMLSerializer not handling some unicode characters (emoji, etc.)

     [ https://issues.apache.org/jira/browse/SLING-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Fortuna updated SLING-5973:
-------------------------------
    Description: 
I've noticed that when my Sling content contains special Unicode characters (e.g. emoji) and the Sling rewriter is enabled, the characters are not output correctly to the browser. For example:

{code}&#x1F601;{code} becomes {code}&#xD83C;&#xDE01;{code}

If I disable the rewriter pipeline the output is as expected.

I've looked at the code and suspect the issue is in the HTMLSerializer from the Cocoon library. However, I'm not sure why, as it should be using the default output encoding (UTF-8). My rewriter pipeline uses the default html-generator and html-serializer provided by Sling.
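
A minimal sketch of one way output like this can arise, assuming (this is not confirmed against the Cocoon source) that a serializer writes one numeric entity per UTF-16 char rather than one per code point. The class and both helper methods are hypothetical and exist only for illustration. Note that the standard UTF-16 surrogate pair for U+1F601 is D83D/DE01, so the per-char variant prints a high surrogate that differs from the D83C quoted above.

```java
public class SurrogateEntityDemo {

    // Hypothetical helper: emits one numeric entity per UTF-16 char.
    // A supplementary character (above U+FFFF) occupies two chars in a
    // Java String, so it comes out as two surrogate entities.
    static String escapePerChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                out.append("&#x").append(Integer.toHexString(c).toUpperCase()).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical helper: emits one numeric entity per Unicode code point,
    // stepping over both halves of a surrogate pair at once.
    static String escapePerCodePoint(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F601, built from its code point to avoid source-encoding issues.
        String grin = new String(Character.toChars(0x1F601));
        System.out.println(escapePerChar(grin));
        System.out.println(escapePerCodePoint(grin));
    }
}
```

Comparing the two printed lines shows the difference between iterating over chars and iterating over code points; whether the rewriter pipeline actually takes the per-char path would still need to be checked in the serializer itself.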

My code is available on GitHub here:

https://github.com/Whistlepost/emojistrip

It provides a very simple app/content project pair with some emoji characters in the content (see src/main/resources/SLING-INF/content/phrases.json). Many thanks.


  was:
I've noticed that when my Sling content contains special Unicode characters (e.g. emoji) and the Sling rewriter is enabled, the characters are not output correctly to the browser. For example:

&#x1F601; becomes &#xD83C;&#xDE01;

If I disable the rewriter pipeline the output is as expected.

I've looked at the code and suspect the issue is in the HTMLSerializer from the Cocoon library. However, I'm not sure why, as it should be using the default output encoding (UTF-8). My rewriter pipeline uses the default html-generator and html-serializer provided by Sling.

My code is available on GitHub here:

https://github.com/Whistlepost/emojistrip

It provides a very simple app/content project pair with some emoji characters in the content (see src/main/resources/SLING-INF/content/phrases.json). Many thanks.



> HTMLSerializer not handling some unicode characters (emoji, etc.)
> -----------------------------------------------------------------
>
>                 Key: SLING-5973
>                 URL: https://issues.apache.org/jira/browse/SLING-5973
>             Project: Sling
>          Issue Type: Bug
>            Reporter: Ben Fortuna
>
> I've noticed that when my Sling content contains special Unicode characters (e.g. emoji) and the Sling rewriter is enabled, the characters are not output correctly to the browser. For example:
> {code}&#x1F601;{code} becomes {code}&#xD83C;&#xDE01;{code}
> If I disable the rewriter pipeline the output is as expected.
> I've looked at the code and suspect the issue is in the HTMLSerializer from the Cocoon library. However, I'm not sure why, as it should be using the default output encoding (UTF-8). My rewriter pipeline uses the default html-generator and html-serializer provided by Sling.
> My code is available on GitHub here:
> https://github.com/Whistlepost/emojistrip
> It provides a very simple app/content project pair with some emoji characters in the content (see src/main/resources/SLING-INF/content/phrases.json). Many thanks.


