Posted to dev@xalan.apache.org by "Michael Glavassevich (JIRA)" <xa...@xml.apache.org> on 2005/03/16 22:10:20 UTC

[jira] Commented: (XALANJ-2081) XSLTC does ignores non-whitespaces as if they were whitespaces.

     [ http://issues.apache.org/jira/browse/XALANJ-2081?page=comments#action_61003 ]
     
Michael Glavassevich commented on XALANJ-2081:
----------------------------------------------

The only characters which are considered to be spaces in XML are 0x09 (TAB), 0x0A (LF), 0x0D (CR) and 0x20 (SPACE). Character.isWhitespace() will return true for many characters which aren't XML spaces such as 0x1F.
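
To make the difference concrete, here is a minimal sketch that compares the four XML space characters against Character.isWhitespace(). The class name XmlSpaceDemo and the helper isXmlSpace are hypothetical names chosen for illustration; they are not taken from the Xalan sources.

```java
// Sketch only: XmlSpaceDemo and isXmlSpace are hypothetical names, not Xalan code.
final class XmlSpaceDemo {

    // True only for the four XML space characters: SPACE, TAB, LF, CR.
    static boolean isXmlSpace(char c) {
        return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
    }

    public static void main(String[] args) {
        // Control characters from the report plus the four XML spaces.
        char[] samples = { 0x08, 0x09, 0x0A, 0x0D, 0x1F, 0x20 };
        for (char c : samples) {
            System.out.printf("0x%02X  isXmlSpace=%b  Character.isWhitespace=%b%n",
                    (int) c, isXmlSpace(c), Character.isWhitespace(c));
        }
    }
}
```

For 0x1F the two predicates disagree: Character.isWhitespace() accepts it, while the XML-only check does not.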

> XSLTC does ignores non-whitespaces as if they were whitespaces.
> ---------------------------------------------------------------
>
>          Key: XALANJ-2081
>          URL: http://issues.apache.org/jira/browse/XALANJ-2081
>      Project: XalanJ2
>         Type: Bug
>   Components: XSLTC
>     Versions: CurrentCVS
>  Environment: Windows 2000
>     Reporter: Yash Talwar
>     Assignee: Yash Talwar
>      Fix For: CurrentCVS
>  Attachments: XalanJ2081_Patch.txt
>
> Given the following stylesheet and input document, Xalan Interpretive and XSLTC produce different output.
> Example 1:
> ==========
> Stylesheet:
> ------------
> <?xml version="1.1" ?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
>   <xsl:output method="xml" version="1.1" encoding="UTF-8" />
>   <xsl:template match="/">
>     <out>&#x08;&#x1F;</out>
>   </xsl:template>
> </xsl:stylesheet>
> Input XML: 
> -------------
> <?xml version="1.1"?>
> <doc />
> Output using Xalan Interpretive:
> --------------------------------
> <?xml version="1.1" encoding="UTF-8"?><out>&#8;&#31;</out>
> Output using XSLTC:
> -------------------
> <?xml version="1.1" encoding="UTF-8"?><out/>
> Here, Xalan Interpretive has the correct output.
> Here #x08 is a backspace character.  XSLTC treats both characters as whitespace characters.
> Just look at the following snippet of Java code:
> 		System.out.println(Character.isWhitespace((char)0x08));
> 		System.out.println(Character.isWhitespace((char)0x1f));
> 		String s = "\u0008\u001f";
> 		System.out.println(s.trim().length());
> 		
> The output for this code snippet is:
>                false
>                true
>                0       // The length here should not be 0.
> This shows that 0x08 is not a whitespace character, but the trim() method of java.lang.String treats it as one.
> I believe the problem lies in the class org.apache.xalan.xsltc.compiler.Text.
> The following line of code is problematic:
>  if (_text.trim().length() == 0) _ignore = true;
>  
> In the above sample, it can be seen that the trim() method is the problem.
>  
> I will attach a patch to fix this problem.
> Yash Talwar
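
As an illustration of the check discussed above, the following sketch tests a string against the four XML space characters directly instead of relying on trim(). It is an assumption about what such a check could look like, not the contents of XalanJ2081_Patch.txt; IgnorableTextDemo and isAllXmlSpace are hypothetical names.

```java
// Sketch only: hypothetical helper, not the attached patch or Xalan code.
final class IgnorableTextDemo {

    // True if every character is SPACE, TAB, LF or CR.
    // An empty string also counts as all-whitespace, matching trim().length() == 0.
    static boolean isAllXmlSpace(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != 0x20 && c != 0x09 && c != 0x0A && c != 0x0D) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String controls = "\u0008\u001f";  // backspace and unit separator
        String spaces = " \t\r\n";         // the four XML space characters

        // trim() strips every char up to and including U+0020,
        // so both strings look empty to it.
        System.out.println(controls.trim().length() == 0);
        System.out.println(spaces.trim().length() == 0);

        // The explicit check accepts only the second string.
        System.out.println(isAllXmlSpace(controls));
        System.out.println(isAllXmlSpace(spaces));
    }
}
```

With a check of this kind, a text node containing only &#x08;&#x1F; would not be flagged as ignorable, while one containing only spaces, tabs and line breaks still would be.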
