Posted to dev@myfaces.apache.org by "Tomas Fischer (JIRA)" <de...@myfaces.apache.org> on 2006/08/31 15:55:24 UTC

[jira] Created: (MYFACES-1396) Too much escaping

Too much escaping
-----------------

                 Key: MYFACES-1396
                 URL: http://issues.apache.org/jira/browse/MYFACES-1396
             Project: MyFaces Core
          Issue Type: Bug
          Components: General
            Reporter: Tomas Fischer
            Priority: Blocker


HTMLOutputText (which delegates to HTMLEncoder) escapes not only the XML-special characters (<, >, &), but also German umlauts. This is OK when generating (X)HTML, but not OK when generating XML. However, according to the official documentation of the outputText tag, German umlauts should not be escaped at all: "If the "escape" attribute is not present, or it is present and its value is "true", all angle brackets should be converted to the ampersand xx semicolon syntax when rendering the value of the "value" attribute as the value of the component."

There is automatic XML detection, but it is broken, because only a fixed set of MIME types is recognized (application/xhtml+xml, application/xml, text/xml).

This bug prevents using JSF for generating other content (e.g. SVG, MIME-type image/svg+xml).
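
To illustrate the detection problem, here is a minimal sketch of a check that also accepts any MIME type with a "+xml" suffix, so that image/svg+xml is treated as XML. The class and method names (XmlContentTypes, isXml) are hypothetical and are not taken from the MyFaces source; this only shows one possible approach, not how MyFaces does or should implement it.

```java
import java.util.Locale;

// Hypothetical helper, not part of MyFaces.
final class XmlContentTypes {
    private XmlContentTypes() {}

    /** Returns true if the content type looks like an XML-based format. */
    static boolean isXml(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" and normalize case.
        int semi = contentType.indexOf(';');
        String type = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return type.equals("text/xml")
                || type.equals("application/xml")
                || type.endsWith("+xml"); // application/xhtml+xml, image/svg+xml, ...
    }

    public static void main(String[] args) {
        System.out.println(isXml("image/svg+xml"));
        System.out.println(isXml("text/html; charset=UTF-8"));
    }
}
```

The suffix test is an assumption about what "XML content" should mean here; a stricter list could be used instead if some "+xml" types need HTML-style output.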


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (MYFACES-1396) International characters are not properly encoded to Mnemonic/Numeric values (Was: Too much escaping)

Posted by "Martin Marinschek (JIRA)" <de...@myfaces.apache.org>.
    [ https://issues.apache.org/jira/browse/MYFACES-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464900 ] 

Martin Marinschek commented on MYFACES-1396:
--------------------------------------------

Hi Paul,

I believe that MyFaces is indeed misbehaving here - it's probably the HtmlResponseWriter which is encoding too much. 

regards,

Martin

> International characters are not properly encoded to Mnemonic/Numeric values  (Was: Too much escaping)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: MYFACES-1396
>                 URL: https://issues.apache.org/jira/browse/MYFACES-1396
>             Project: MyFaces Core
>          Issue Type: Bug
>          Components: General
>    Affects Versions: 1.1.5-SNAPSHOT
>            Reporter: Tomas Fischer
>         Assigned To: Martin Marinschek
>            Priority: Blocker
>         Attachments: test.jsf
>

[jira] Commented: (MYFACES-1396) International characters are not properly encoded to Mnemonic/Numeric values (Was: Too much escaping)

Posted by "Nick Belaevski (JIRA)" <de...@myfaces.apache.org>.
    [ https://issues.apache.org/jira/browse/MYFACES-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600213#action_12600213 ] 

Nick Belaevski commented on MYFACES-1396:
-----------------------------------------

Escaping should be done conditionally, depending on whether we're outputting script/style text or not.

E.g.:

\u00a0 should be represented as &#160; in ordinary text, but as \u00a0 in the body of style/script tags
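
A minimal sketch of this kind of conditional escaping, assuming the writer knows whether it is currently inside a script or style element. The class name, method name and boolean flag are hypothetical, not the HtmlResponseWriterImpl API, and the sketch simply passes script/style text through unchanged.

```java
// Hypothetical helper, not part of MyFaces.
final class ConditionalEscaper {
    private ConditionalEscaper() {}

    /**
     * Escapes text for element content. Inside script/style bodies the text
     * is returned unchanged, because character references are not decoded there.
     */
    static String escape(String text, boolean insideScriptOrStyle) {
        if (insideScriptOrStyle) {
            return text;
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '\u00a0': out.append("&#160;"); break; // no-break space
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("1 < 2 & 3", false));
        System.out.println(escape("if (a < b) { run(); }", true));
    }
}
```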


[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459938 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Paul,
Please submit a patch, or describe in more detail what you meant by "I just fixed it by converting "UTF8" to "UTF-8" in HtmlResponseWriterImpl class."  I just checked org.apache.myfaces.shared.renderkit.html.HtmlResponseWriterImpl in the shared project.  It has used UTF-8 for a while, so I am not sure what you fixed.

Paul Spencer




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460259 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Tomas,

I am not sure what the JSR says about escaping international characters. Martin Marinschek may be able to answer this question.  We now have a test case that Martin, and the build process, can use to determine when this issue is resolved.


Paul Spencer




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460211 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Paul,
The JSP, test.jsp,  includes "/base/taglibInclude.jsp", but that file is not attached to this issue.  Is it needed for the test?

Paul Spencer



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459981 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

1) No.  I'm too tired now to find how to do it.  2) Firefox 2.0.  3) English (US).


[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12451598 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

Also, I took all suggested measures to generate UTF-8 contents, but this didn't help.



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459946 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

I didn't create a patch since I didn't feel it was a proper fix for the upstream version.  And now I can't create one since the SVN checkout commands on your site are broken.

Anyway, let me describe it in more detail:
- when an instance of org.apache.myfaces.shared_tomahawk.renderkit.html.HtmlResponseWriterImpl is created (note: shared_tomahawk, not shared_impl!), it is passed "UTF8", without a hyphen, as `characterEncoding';
- in all (or at least all relevant) cases before that, the charset is "UTF-8", with a hyphen, as expected; in particular this is true for org.apache.myfaces.shared_tomahawk.renderkit.html.HtmlResponseWriterImpl (note: shared_impl, not shared_tomahawk);
- I fixed it by converting the "UTF8" string to "UTF-8" in the HtmlResponseWriterImpl constructor;
- a proper fix would be to find out why the charset becomes "UTF8", without a hyphen, in the first place; the ad-hoc fix above could be included too, as a way to make HtmlResponseWriterImpl more robust.
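
For illustration, one way to make such a constructor tolerant of charset aliases is to resolve the name through java.nio.charset.Charset, which maps aliases like "UTF8" to the canonical name "UTF-8". This is only a sketch of the idea; the class and method names are hypothetical and this is not the change described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of MyFaces.
final class CharsetNames {
    private CharsetNames() {}

    /**
     * Returns the canonical name for a charset name or alias,
     * or the input unchanged if the JVM does not know the charset.
     */
    static String canonicalName(String name) {
        if (name == null) {
            return null;
        }
        try {
            return Charset.forName(name).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return name; // unknown or malformed: leave it to the caller
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("UTF8"));
    }
}
```

Comparing canonical names (or Charset objects) instead of raw strings avoids depending on how the encoding happened to be spelled upstream.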



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12451596 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

I also stumbled into this bug.  Besides XML, it also makes it very difficult to use generated strings in JavaScript, since JavaScript does not convert entities to characters automatically (you have to do it manually, which is very inconvenient to do for each string, not to mention improper.)

Please don't escape encodable characters by default, or add an option somewhere in the configuration file to not escape them.

I attach a simple test page with Cyrillic characters generated in various ways.  All are converted to entities here (MyFaces 1.1.4.)
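
To make the request concrete, here is a rough sketch of escaping that leaves alone every character the response charset can encode and falls back to numeric character references only for the rest. The class and method names are hypothetical; it uses the standard CharsetEncoder.canEncode check and is not how HTMLEncoder is implemented.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of MyFaces.
final class EncodingAwareEscaper {
    private EncodingAwareEscaper() {}

    /** Escapes markup characters; other characters only if the charset cannot encode them. */
    static String escape(String text, Charset responseCharset) {
        CharsetEncoder encoder = responseCharset.newEncoder();
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '"') {
                out.append("&quot;");
            } else if (encoder.canEncode(text.subSequence(i, i + len))) {
                out.append(text, i, i + len); // encodable: write as is
            } else {
                out.append("&#").append(cp).append(';'); // numeric reference
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("\u00e4\u00fc\u00f6\u00df <b>", StandardCharsets.UTF_8));
        System.out.println(escape("\u00e4\u00fc\u00f6\u00df <b>", StandardCharsets.US_ASCII));
    }
}
```

With a UTF-8 response nothing beyond the markup characters is touched, so Cyrillic or German text embedded in JavaScript stays usable; with a narrower charset the numeric references keep the output valid.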



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459985 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

I have used the tcpmon from the Apache Axis project
  http://ws.apache.org/axis/java/user-guide.html#AppendixUsingTheAxisTCPMonitorTcpmon

Paul Spencer


[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460231 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

When I add the following to the outputText test case:
      <h:outputText id="escape" escape="true" value="10 &gt; 5" />
      <h:outputText value=" | " />
      <h:outputText id="notEscape" escape="false" value="10 &gt; 5" />
      <h:outputText value=" | " />
      <h:outputText id="utf8charEscaped" value="äüöß" escape="true" /> 
      <h:outputText value=" | " />
      <h:outputText id="utf8charNotEscaped" value="äüöß" escape="false" /> 
      <h:outputText value=" | " />
      <h:outputText id="utf8char" value="äüöß" /> 
      <h:outputText value=" | " />
      <h:outputText id="utf8charInEscapedFormat" value="&#228;&#252;&#246;&#223;" escape="false" /> 

I get the following output running MyFaces 1.1.5-SNAPSHOT
      <span id="escape">10 &amp;gt; 5</span>
       | 
      <span id="notEscape">10 &gt; 5</span>
       | 
      <span id="utf8charEscaped">����</span> 
       | 
      <span id="utf8charNotEscaped">����</span> 
       | 
      <span id="utf8char">����</span> 
       | 
      <span id="utf8charInEscapedFormat">&#228;&#252;&#246;&#223;</span> 

I get the following output running Sun's RI
       <span id="escape">10 &amp;gt; 5</span>
       | 
      <span id="notEscape">10 &gt; 5</span>
       | 
     <span id="utf8charEscaped">&#65533;&#65533;&#65533;&#65533;</span> 
       | 
      <span id="utf8charNotEscaped">����</span> 
       | 
      <span id="utf8char">&#65533;&#65533;&#65533;&#65533;</span> 
       | 
      <span id="utf8charInEscapedFormat">&#228;&#252;&#246;&#223;</span> 


So I see the following problem:
 Escaping, or not defining the escape attribute, incorrectly converts international characters to their numeric or mnemonic value.



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Mario Ivankovits (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12431881 ] 
            
Mario Ivankovits commented on MYFACES-1396:
-------------------------------------------

Is it possible to deliver the content as UTF-8? As far as I remember, no escaping of German umlauts takes place in that case.

Ciao,
Mario


[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459975 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

Not from the code; there's nothing like that (I actually grepped the whole source tree.)

It may get it from the browser, I don't know.  However, in this case it is very wrong: 1) at the very least, MyFaces must handle "UTF8" just like "UTF-8"; 2) the browser must not determine the encoding of pages, since it is impossible to re-encode pages robustly; in particular, inserting HTML entities into unsuspecting JavaScript (as in my case) will break things.



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459948 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

Eh, I meant org.apache.myfaces.shared_impl.renderkit.html.HtmlResponseWriterImpl for the second list item.  Anyway, that is mentioned in the parentheses.


[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459986 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

Let me also clarify why I think the browser should have absolutely no say in the resulting encoding.

Nowadays every browser should be able to handle any encoding just fine, as long as it is stated in the HTML page header.  If a browser fails to handle a particular encoding, you should upgrade it or throw it away.  _Nothing_ I know of re-encodes pages on the fly, so MyFaces seems to be reinventing a wheel that is absolutely unneeded.  In fact, you seem to encourage browser behaviour that will not work with other server-side solutions, especially if a server just hosts a number of static HTML files.  I can also confirm that it works perfectly without re-encoding in the JSP parts of the site.




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459978 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Paul,
Can you verify what your browser is sending, i.e. UTF8 or UTF-8?

What is the browser?

What is the default language?

Paul Spencer



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459974 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Paul,
I have searched the source code for "UTF8", but found nothing.  As you said, the charset is being passed into HtmlResponseWriterImpl.  Have you verified whether MyFaces is getting the charset "UTF8" from your browser or from source code, including JSP?

Paul Spencer  



[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Spencer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460002 ] 
            
Paul Spencer commented on MYFACES-1396:
---------------------------------------

Paul,
I would like to recreate the problem so it can be addressed.  There are many places where the charset can be set, including the browser and MyFaces tags.  At this point I do not know where the charset is set to UTF8.

Are you seeing UTF8 in any of the pages generated by MyFaces, and if so, where?

Paul Spencer





[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460214 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

Paul: sorry, no, it is not needed.  It was just accidentally left over from a real page.

Tomas: yes, I started a somewhat different discussion, but I believe your problem is caused by UTF-8 characters being replaced with HTML entities because the charset is "UTF8", not "UTF-8".




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Tomas Fischer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12431884 ] 
            
Tomas Fischer commented on MYFACES-1396:
----------------------------------------

You are right that the UnicodeEncoder escapes all characters >= 0x80 as numeric character references &#xxx; (however, it doesn't encode <, > and &, which is probably another bug), but this encoder is used only inside script or style (see HtmlResponseWriterImpl). Otherwise the HTMLEncoder is used.
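
To make the behaviour described above concrete, here is a minimal sketch of an encoder that turns every character >= 0x80 into a decimal numeric reference and leaves <, > and & alone. It is not the MyFaces UnicodeEncoder source; the class and method names are hypothetical and only illustrate the description in this comment.

```java
/** Hypothetical illustration; not taken from the MyFaces sources. */
final class NumericEntitySketch {

    /**
     * Replaces every code point >= 0x80 with a decimal numeric character
     * reference. ASCII characters, including '<', '>' and '&', are copied
     * unchanged, which mirrors the gap pointed out in the comment above.
     */
    static String encodeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp >= 0x80) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one or two chars depending on surrogate pairs.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "<b>&#228; & &#252;</b>"
        System.out.println(encodeNonAscii("<b>\u00e4 & \u00fc</b>"));
    }
}
```

Because the markup characters pass through untouched, an encoder like this would not be safe on its own for arbitrary text content.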




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Tomas Fischer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460212 ] 
            
Tomas Fischer commented on MYFACES-1396:
----------------------------------------

I didn't complain that UTF-8 characters are passed incorrectly; I complained that HTMLOutputText doesn't work properly.

Documentation states: If the "escape" attribute is not present, or it is present and its value is "true" all angle brackets should be converted to the ampersand xx semicolon syntax when rendering the value of the "value" attribute as the value of the component. If the "escape" attribute is present and is "false" the value of the component should be rendered as text without escaping.

<h:outputText value="äüöß" escape="true" /> outputs &auml;&uuml;&ouml;&szlig;
<h:outputText value="äüöß" escape="false" /> outputs &#228;&#252;&#246;&#223;

Both are incorrect according to the documentation.
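
One reading of the documentation quoted above, as a minimal sketch: with escape="true" only markup characters are converted, and with escape="false" the value is written verbatim, so umlauts stay as they are in both cases. The class and method names are hypothetical, and escaping & alongside the angle brackets is an assumption taken from the issue description rather than from the quoted sentence.

```java
/** Hypothetical illustration of the documented escape attribute. */
final class OutputTextSketch {

    /**
     * Returns the text as it would be written to the response.
     * escape == false: the value is returned unchanged.
     * escape == true: only '<', '>' and '&' are replaced; every other
     * character, including non-ASCII ones, is copied as-is.
     */
    static String render(String value, boolean escape) {
        if (!escape) {
            return value;
        }
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "a &lt; b"
        System.out.println(render("a < b", true));
    }
}
```

Under this reading, render("äüöß", true) and render("äüöß", false) would both return "äüöß" unchanged.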




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12459919 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

I have fixed this locally since we have a release soon and it is not fixed in upstream (stable) versions.

I spent 1.5 (!) days debugging this.  It all came down to the UTF-8 charset suddenly being spelled "UTF8" (it had the hyphen at earlier stages).  I have no idea why the change happened; I just fixed it by converting "UTF8" to "UTF-8" in the HtmlResponseWriterImpl class.

BTW, you have a really weird package structure.  For some reason, there is a shared_tomahawk package, but it seems identical to shared_impl and is not even present in the sources.  This made my debugging three times longer than it could have been...




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Paul Pogonyshev (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460203 ] 
            
Paul Pogonyshev commented on MYFACES-1396:
------------------------------------------

I don't actively see it anywhere.  However, I did see it in the org.apache.myfaces.shared_tomahawk.renderkit.html.HtmlResponseWriterImpl constructor, and it caused all non-ASCII characters to be replaced with HTML entities.  The entities were also seen by my work neighbor and my client, but AFAIK we all use Firefox.  Replacing the "UTF8" string with "UTF-8" in this function did solve the problem, so it was indeed an (indirect) cause.

I suggest that you try the small test page I attached.  You can also make it available somewhere on a (test) MyFaces server and then I can test it with my browser.

I searched the whole source tree for "UTF8" again.  It is not present in any configuration, Java or JSP/JSF files, except in Java comments on a few occasions.  And in my local copy of HtmlResponseWriterImpl.java, of course.
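
As a sketch of why the spelling matters and how a comparison could tolerate both forms: the JDK treats "UTF8" as an alias of UTF-8, so resolving the name through java.nio.charset.Charset yields the canonical "UTF-8" either way. This is not the local fix described above and not MyFaces code; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

/** Hypothetical helper; shows alias resolution, not the MyFaces code. */
final class CharsetNameSketch {

    /**
     * Resolves a charset name or alias to its canonical name.
     * Charset.forName throws an unchecked exception for names that are
     * illegal or not supported by the running JVM.
     */
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    /** True for "UTF-8", "utf-8", "UTF8" and any other alias of UTF-8. */
    static boolean isUtf8(String name) {
        return "UTF-8".equals(canonicalName(name));
    }

    public static void main(String[] args) {
        // prints "UTF-8"
        System.out.println(canonicalName("UTF8"));
    }
}
```

A raw string comparison such as "UTF-8".equals(charsetName) would reject "UTF8", whereas a check on the canonical name accepts both spellings.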




[jira] Commented: (MYFACES-1396) Too much escaping

Posted by "Tomas Fischer (JIRA)" <de...@myfaces.apache.org>.
    [ http://issues.apache.org/jira/browse/MYFACES-1396?page=comments#action_12460244 ] 
            
Tomas Fischer commented on MYFACES-1396:
----------------------------------------

The main problem (for our project) is that the escaping occurs at all. We need exactly the described behaviour: either no escaping at all or escaping of the XML special characters only.

For generating HTML content, escaping international characters to numeric values might be OK; for generating XML content (MIME type xxx/yyy+xml) it may be unacceptable and should be disabled. Escaping international characters to named entities is not needed at all (if the former is available) and is dangerous, as not every browser understands every named entity.
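
The xxx/yyy+xml rule mentioned above could be checked along these lines. This is a minimal sketch, not the detection code in MyFaces: the class and method names are hypothetical, and treating any subtype ending in "+xml" as XML is the assumption being illustrated.

```java
import java.util.Locale;

/** Hypothetical illustration of suffix-based XML content type detection. */
final class XmlContentTypeSketch {

    /**
     * Returns true for text/xml, application/xml and any media type whose
     * subtype ends in "+xml" (for example image/svg+xml or
     * application/xhtml+xml). Parameters such as "; charset=UTF-8" are
     * ignored, and the comparison is case-insensitive.
     */
    static boolean isXmlContentType(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String type = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return type.equals("text/xml")
                || type.equals("application/xml")
                || type.endsWith("+xml");
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(isXmlContentType("image/svg+xml; charset=UTF-8"));
    }
}
```

With a check like this, SVG output would be recognized as XML without adding each media type to a fixed list.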


