Posted to dev@xalan.apache.org by "Brian Minchau (JIRA)" <xa...@xml.apache.org> on 2005/04/20 19:35:33 UTC

[jira] Created: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

 \r\n  in an HTML attribute is incorrectly output as \r\r\n
-----------------------------------------------------------

         Key: XALANJ-2109
         URL: http://issues.apache.org/jira/browse/XALANJ-2109
     Project: XalanJ2
        Type: Bug
    Reporter: Brian Minchau


The serializer assumes that a single \n should be expanded to the system's end-of-line sequence. This is correct for text nodes, but not for HTML attributes, for the reasons that follow.

Input XML document:

<?xml version="1.0"?>
<input 
  data="xxx&#13;&#10;yyy" 
  type="hidden" 
  name="data.stuff" />

Stylesheet:

<?xml version="1.0"?> 
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0" >
  <xsl:output method="html" />
  <xsl:template match="input|br">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

There are four stages of processing:
A) what is in the input XML document
B) what is presented to Xalan by the XML parser
C) what is written out by the Xalan processor
D) what is interpreted by a browser or user agent.
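
The step from stage A) to stage B) can be checked with any conforming XML parser. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in for the parser that feeds Xalan (an assumption made for illustration; Xalan itself would normally sit behind a Java parser). It shows that the character references in the 'data' attribute reach the processor as a literal carriage return followed by a line feed.

```python
import xml.etree.ElementTree as ET

# Stage A: the document text, with the character references still intact.
STAGE_A = (
    '<?xml version="1.0"?>\n'
    '<input\n'
    '  data="xxx&#13;&#10;yyy"\n'
    '  type="hidden"\n'
    '  name="data.stuff" />'
)

# Stage B: the attribute value as the parser presents it to the processor.
element = ET.fromstring(STAGE_A)
stage_b_value = element.get("data")

# repr() makes the control characters visible.
print(repr(stage_b_value))
```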

The output produced by stage C) by Xalan is this:
<input data="xxx

yyy" type="hidden" name="data.stuff">

To show this more clearly, the value of the 'data' attribute
written out on Windows is this:
"xxx\r\r\nyyy"
and on other operating systems the value written out is this:
"xxx\r\nyyy"


Current processing of the attribute by Xalan is this:
 - write out the \r as is
 - treat the \n as a normalized end-of-line sequence produced by
   the XML parser from stage A), and write it out in stage C)
   as the system end-of-line sequence, either \r\n or just \n
   depending on the operating system.
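
The two steps above can be sketched as follows. This is only a model of the behavior described in this report, not the code in ToHTMLStream.java; the function name and the line_separator parameter are hypothetical.

```python
def write_attribute_value_old(value: str, line_separator: str) -> str:
    """Model of the reported behavior: every \\n becomes the system line separator."""
    out = []
    for ch in value:
        if ch == "\n":
            # The \n is assumed to be a normalized end of line and is expanded.
            out.append(line_separator)
        else:
            # Everything else, including \r, is written as is.
            out.append(ch)
    return "".join(out)


# Stage B value of the 'data' attribute from the example input.
STAGE_B = "xxx\r\nyyy"

windows_output = write_attribute_value_old(STAGE_B, "\r\n")
unix_output = write_attribute_value_old(STAGE_B, "\n")

print(repr(windows_output))
print(repr(unix_output))
```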

The HTML recommendation, at 
  http://www.w3.org/TR/html401/types.html#h-6.2
says this about stage D) :
<<
User agents should interpret attribute values as follows: 
 1. Replace character entities with characters, 
 2. Ignore line feeds, 
 3. Replace each carriage return or tab with a single space. 
>>

Xalan's stage C) output on Windows, "xxx\r\r\nyyy", would be interpreted
as "xxx  yyy" by a browser at stage D). Bullet 2 means that the
browser would ignore the \n, and bullet 3 means that it would
interpret \r\r as two spaces.

Xalan's stage C) output on other operating systems, "xxx\r\nyyy",
would be interpreted as "xxx yyy" by a browser at stage D).
That is one fewer space between "xxx" and "yyy".
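
The interpretation at stage D) can be sketched as below, assuming bullets 2 and 3 are applied one after the other to the raw attribute text. The function name is hypothetical, and bullet 1 is left out because neither Xalan output contains character entities.

```python
def interpret_attribute_value(raw: str) -> str:
    """Apply bullets 2 and 3 of the quoted HTML 4.01 text to a raw attribute value."""
    # Bullet 2: ignore line feeds.
    without_line_feeds = raw.replace("\n", "")
    # Bullet 3: replace each carriage return or tab with a single space.
    return without_line_feeds.replace("\r", " ").replace("\t", " ")


windows_view = interpret_attribute_value("xxx\r\r\nyyy")
other_view = interpret_attribute_value("xxx\r\nyyy")

print(repr(windows_view))
print(repr(other_view))
```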

Since the browser's interpretation differs depending on which OS
Xalan runs on, this is a bug. We shouldn't normalize
the \n in the attribute value to the system end-of-line sequence.
We should leave it alone, thus producing this output at stage C) on all operating systems:
"xxx\r\nyyy"
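
To make the comparison concrete, here is a small end-to-end sketch of stages C) and D) under both behaviors. It is a model built only from the description in this report; the function names and the expand_newlines flag are hypothetical and do not correspond to anything in the Xalan source.

```python
def serialize(value: str, line_separator: str, expand_newlines: bool) -> str:
    """Stage C: optionally expand \\n to the system line separator."""
    if expand_newlines:
        return value.replace("\n", line_separator)
    return value  # proposed behavior: leave the attribute value alone


def browser_view(raw: str) -> str:
    """Stage D: ignore line feeds, turn each carriage return or tab into a space."""
    return raw.replace("\n", "").replace("\r", " ").replace("\t", " ")


def browser_views(expand_newlines: bool) -> dict:
    """What a browser would see for the example attribute on each kind of OS."""
    stage_b = "xxx\r\nyyy"
    separators = {"windows": "\r\n", "unix": "\n"}
    return {
        name: browser_view(serialize(stage_b, sep, expand_newlines))
        for name, sep in separators.items()
    }


current = browser_views(expand_newlines=True)
proposed = browser_views(expand_newlines=False)

print("current: ", current)
print("proposed:", proposed)
```

With the expansion turned off, the value a browser sees no longer depends on the line separator of the system that ran the transformation.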


I ran this through Saxon 6.5.3 and its output was:
<input data="xxx&#xA;yyy" type="hidden" name="data.stuff">

When a browser interprets Saxon's output, it would apply
bullet 1 and see a single newline character between "xxx" and "yyy".

It is not clear whether bullets 1, 2 and 3 quoted from the HTML recommendation apply in sequence, or whether just one of them applies. If just one of them applies, the browser might interpret Saxon's 'data' attribute value as "xxx\nyyy". On the other hand, if one applies bullet 1 followed by bullet 2, then Saxon's 'data' attribute value is interpreted as "xxxyyy". Either way, Xalan's output differs from Saxon's in a way that is significant to a browser or user agent.
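
The two readings can be written out as a short sketch. It uses html.unescape from the Python standard library to stand in for bullet 1; which reading a real browser follows is exactly the open question, so neither result should be taken as authoritative.

```python
import html

# The raw 'data' attribute text as written by Saxon 6.5.3.
SAXON_RAW = "xxx&#xA;yyy"

# Reading 1: only bullet 1 applies to the character reference.
only_bullet_1 = html.unescape(SAXON_RAW)

# Reading 2: bullets 1, 2 and 3 are applied in sequence, so the line feed
# produced by bullet 1 is then ignored by bullet 2.
in_sequence = (
    html.unescape(SAXON_RAW)
    .replace("\n", "")
    .replace("\r", " ")
    .replace("\t", " ")
)

print(repr(only_bullet_1))
print(repr(in_sequence))
```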




-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: xalan-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xalan-dev-help@xml.apache.org


[jira] Updated: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau updated XALANJ-2109:
----------------------------------

    Xalan info: [PatchAvailable]
      reviewer: ytalwar@ca.ibm.com

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Updated: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau updated XALANJ-2109:
----------------------------------

    Attachment: ToHTMLStream.2109.patch.txt

Patch attached to ToHTMLStream.java so that a '\n' (newline) in an HTML attribute is not converted to "\r\n" on output when running on a Windows OS. The fix is not OS specific; it just leaves the newline as-is.

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>     Reporter: Brian Minchau
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Updated: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau updated XALANJ-2109:
----------------------------------

    Fix Version: 2.7
                     (was: CurrentCVS)

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>      Fix For: 2.7
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Commented: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Yash Talwar (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=comments#action_63504 ]
     
Yash Talwar commented on XALANJ-2109:
-------------------------------------

I have reviewed the patch ToHTMLStream.2109.patch.txt submitted by Brian.
The patch looks good to me.  I approve this patch.

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Updated: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau updated XALANJ-2109:
----------------------------------

    Component: Serialization

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Closed: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]
     
Brian Minchau closed XALANJ-2109:
---------------------------------


>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>      Fix For: 2.7
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Commented: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XALANJ-2109?page=comments#action_12312580 ] 

Brian Minchau commented on XALANJ-2109:
---------------------------------------


Although this issue is resolved, it may be of interest to know that XALANJ-2093, which will be in the Xalan-J 2.7 release, will allow you to specify what \n is normalized to on output. For example, you can pick <xsl:output  xalan:line-separator="&#10;" > and it won't use the runtime library value for the line separator.

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>      Fix For: CurrentCVS
>  Attachments: ToHTMLStream.2109.patch.txt
>
> The serializer assumes that a single \n should be expanded to the system's end-of-line sequence. This is correct for text nodes, but not for HTML attributes. The reasons follow.
> Input XML document:
> <?xml version="1.0"?>
> <input 
>   data="xxx&#13;&#10;yyy" 
>   type="hidden" 
>   name="data.stuff" />
> Stylesheet:
> <?xml version="1.0"?> 
> <xsl:stylesheet 
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   version="1.0" >
>   <xsl:output method="html" />
>   <xsl:template match="input|br">
>     <xsl:copy>
>       <xsl:copy-of select="@*"/>
>       <xsl:apply-templates/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
> There are four stages of processing:
> A) what is in the input XML document
> B) what is presented to Xalan by the XML parser
> C) what is written out by the Xalan processor
> D) what is interpreted by a browser or user agent.
> The output produced by stage C) by Xalan is this:
> <input data="xxx
> yyy" type="hidden" name="data.stuff">
> To show this more clearly, the value of the attribute 'data'
> written out on Windows is this:
> "xxx\r\r\nyyy"
> and on other operating systems the value written out is this:
> "xxx\r\nyyy"
> Current processing of the attribute by Xalan is this:
>  - write out the \r as is
>  - treat the \n as a normalized end-of-line sequence produced by
>    the XML parser from stage A) and write it out
>    in stage C) as the system end-of-line
>    sequence, either \r\n or just \n depending on the operating system.
> The HTML recommendation, at 
>   http://www.w3.org/TR/html401/types.html#h-6.2
> says this about stage D) :
> <<
> User agents should interpret attribute values as follows: 
>  1. Replace character entities with characters, 
>  2. Ignore line feeds, 
>  3. Replace each carriage return or tab with a single space. 
> >>
> Xalan's output on Windows at stage C), "xxx\r\r\nyyy", would be interpreted
> as "xxx  yyy" by a browser at stage D). Bullet 2 means that the
> browser would ignore the \n, and bullet 3 means that it would
> interpret \r\r as two spaces.
> Xalan's output from stage C) on other operating systems,
> "xxx\r\nyyy", would be interpreted as "xxx yyy" by a browser at stage D).
> This is one less space between "xxx" and "yyy".
> Since the browser interpretation differs depending on which OS
> we are running on, this is a bug. We shouldn't normalize
> the \n in the attribute value to the system end-of-line sequence.
> We should leave it alone, thus producing this output at stage C) on all operating systems:
> "xxx\r\nyyy"
> I ran this through Saxon 6.5.3 and its output was:
> <input data="xxx&#xA;yyy" type="hidden" name="data.stuff">
> When a browser interprets Saxon's output it would apply 
> bullet 1 and interpret a single newline character between "xxx" and "yyy".
> It is not clear whether bullets 1, 2 and 3 quoted from the HTML recommendation apply in sequence, or whether just one of them applies. If just one of them applies, the browser might interpret Saxon's 'data' attribute value as "xxx\nyyy". On the other hand, if one applies bullet 1 followed by bullet 2, then Saxon's 'data' attribute value is interpreted as "xxxyyy". Either way, Xalan's output differs from Saxon's in a way that is significant to a browser or user agent.
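
The three bullets quoted from the HTML recommendation can be sketched in Java as follows, under the assumption that they are applied in sequence (the issue notes this is not certain). AttributeValueSketch is a hypothetical helper, not browser or Xalan code, and it only knows the few character references that appear in this thread.

```java
/** Hypothetical helper; not part of Xalan or of any browser. */
final class AttributeValueSketch {

    /** Bullet 1, limited to the character references used in this thread. */
    static String replaceCharacterReferences(String s) {
        return s.replace("&#xA;", "\n").replace("&#10;", "\n")
                .replace("&#xD;", "\r").replace("&#13;", "\r");
    }

    /** Applies bullets 1, 2 and 3 one after the other (an assumption). */
    static String interpret(String rawAttributeValue) {
        String s = replaceCharacterReferences(rawAttributeValue); // 1. replace character entities
        s = s.replace("\n", "");                                  // 2. ignore line feeds
        s = s.replace('\r', ' ').replace('\t', ' ');              // 3. CR or tab becomes one space
        return s;
    }

    public static void main(String[] args) {
        // Xalan output on Windows before the fix: two spaces.
        System.out.println("[" + interpret("xxx\r\r\nyyy") + "]"); // [xxx  yyy]

        // Xalan output on other operating systems: one space.
        System.out.println("[" + interpret("xxx\r\nyyy") + "]");   // [xxx yyy]

        // Saxon 6.5.3 output, with bullet 1 followed by bullet 2: no space.
        System.out.println("[" + interpret("xxx&#xA;yyy") + "]");  // [xxxyyy]
    }
}
```

If only bullet 1 were applied to Saxon's output, the result would instead be "xxx\nyyy", which is the other reading mentioned in the issue.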



[jira] Assigned: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]

Brian Minchau reassigned XALANJ-2109:
-------------------------------------

    Assign To: Brian Minchau

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>  Attachments: ToHTMLStream.2109.patch.txt
>



[jira] Resolved: (XALANJ-2109) \r\n in an HTML attribute is incorrectly output as \r\r\n

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2109?page=all ]
     
Brian Minchau resolved XALANJ-2109:
-----------------------------------

     Resolution: Fixed
    Fix Version: CurrentCVS

I applied the patch to current CVS, the issue is resolved.

>  \r\n  in an HTML attribute is incorrectly output as \r\r\n
> -----------------------------------------------------------
>
>          Key: XALANJ-2109
>          URL: http://issues.apache.org/jira/browse/XALANJ-2109
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Reporter: Brian Minchau
>     Assignee: Brian Minchau
>      Fix For: CurrentCVS
>  Attachments: ToHTMLStream.2109.patch.txt
>
