Posted to dev@xalan.apache.org by "Piotr Wielgolaski (JIRA)" <xa...@xml.apache.org> on 2004/11/10 11:34:25 UTC

[jira] Commented: (XALANJ-1710) Incorrect SAXException about bad integral value of a character to be written out.

     [ http://nagoya.apache.org/jira/browse/XALANJ-1710?page=comments#action_55287 ]
     
Piotr Wielgolaski commented on XALANJ-1710:
-------------------------------------------

The main problem is in how the text output method encodes characters in ToTextStream. This class iterates over all the chars to be written and checks that the value of each char is not greater than M_MAXCHARACTER, but a value can be valid and still be greater than M_MAXCHARACTER.

If we use a 1-byte encoding and the value can be converted from the 2-byte format (the internal representation of the Java char type), then it should be written; otherwise an exception should be raised.

I propose using the escapingNotNeeded function from the ToStream class, or something along those lines, which checks whether a specific char can be converted in the current encoding.
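
To illustrate the kind of check I mean, here is a minimal sketch. It is not Xalan code: the class EncodableCheckSketch and the method writeChecked are hypothetical names, it uses java.nio.charset.CharsetEncoder.canEncode from the JDK in place of escapingNotNeeded, and it throws IOException rather than SAXException so that it compiles without any Xalan or SAX classes. It also assumes the encoding name is actually set, which is not the case in the reported stack trace.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical sketch, not Xalan code: class and method names are invented.
final class EncodableCheckSketch {

    // Writes ch[start .. start+length) to writer, rejecting only characters
    // that the named encoding cannot represent (instead of testing c > 127).
    static void writeChecked(Writer writer, char[] ch, int start, int length,
                             String encoding) throws IOException {
        CharsetEncoder encoder = Charset.forName(encoding).newEncoder();
        int end = start + length;
        for (int i = start; i < end; i++) {
            char c = ch[i];
            if (Character.isHighSurrogate(c) && i + 1 < end
                    && Character.isLowSurrogate(ch[i + 1])) {
                // A surrogate pair is checked and written as one unit.
                String pair = new String(ch, i, 2);
                if (!encoder.canEncode(pair)) {
                    throw unencodable(Character.toCodePoint(c, ch[i + 1]), encoding);
                }
                writer.write(pair);
                i++; // two input characters processed
            } else if (encoder.canEncode(c)) {
                writer.write(c);
            } else {
                throw unencodable(c, encoding);
            }
        }
    }

    // Builds an exception whose message mirrors the one in the bug report.
    private static IOException unencodable(int value, String encoding) {
        return new IOException("Attempt to output character of integral value "
                + value + " that is not represented in specified output encoding of "
                + encoding);
    }

    public static void main(String[] args) throws IOException {
        char[] text = {'a', (char) 233, (char) 338};
        StringWriter out = new StringWriter();
        // All three characters are encodable in UTF-8, so nothing is thrown.
        writeChecked(out, text, 0, text.length, "UTF-8");
        try {
            // Character 233 fits in ISO-8859-1, but 338 does not.
            writeChecked(new StringWriter(), text, 0, text.length, "ISO-8859-1");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, character 338 from the report would be written when the encoding is UTF-8 and rejected only for an encoding such as ISO-8859-1 that really cannot represent it, while a character such as 233 would pass for ISO-8859-1 even though it is greater than 127.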





> Incorrect SAXException about bad integral value of a character to be written out.
> ---------------------------------------------------------------------------------
>
>          Key: XALANJ-1710
>          URL: http://nagoya.apache.org/jira/browse/XALANJ-1710
>      Project: XalanJ2
>         Type: Bug
>   Components: Serialization
>     Versions: CurrentCVS
>  Environment: Operating System: Other
> Platform: Other
>     Reporter: Brian Minchau
>     Assignee: Xalan Developers Mailing List
>  Attachments: TransformerImplPatch.txt, apache.patch.24278.txt, apache.patch.24279.txt, bug4.xml, bug4.xsl
>
> I got this exception for Xalan-J interpretive (note that this works
> just fine with XSLTC):
> org.xml.sax.SAXException: Attempt to output character of integral value 338 that is not represented in specified output encoding of .
> 	at org.apache.xml.serializer.ToTextStream.writeNormalizedChars(ToTextStream.java:393)
> 	at org.apache.xml.serializer.ToTextStream.characters(ToTextStream.java:237)
> 	at org.apache.xml.utils.FastStringBuffer.sendSAXcharacters(FastStringBuffer.java:1024)
> 	at org.apache.xml.dtm.ref.sax2dtm.SAX2DTM.dispatchCharactersEvents(SAX2DTM.java:599)
> . . .
> There are two problems. I shouldn't get this message at all, but even if I
> should, it should include the name of the encoding, UTF-8, which it doesn't.
> I'm going to attach a simple XML/XSL pair as a testcase. This problem is in 
> ToTextStream and is due to the fix for bug 795 being applied. The else {...} 
> clause in writing out a character in ToTextStream:
>             if (S_LINEFEED == c && useLineSep)
>             {
>                 writer.write(m_lineSep, 0, m_lineSepLen);
>             }
>             else if (c <= M_MAXCHARACTER)
>             {
>                 writer.write(c);
>             }
>             else if (isUTF16Surrogate(c))
>             {
>                 writeUTF16Surrogate(c, ch, i, end);
>                 i++; // two input characters processed
>             }
>             else
>             {
>                 String encoding = getEncoding();
>                 String integralValue = Integer.toString(c);
>                 throw new SAXException(XMLMessages.createXMLMessage(
>                     XMLErrorResources.ER_ILLEGAL_CHARACTER,
>                     new Object[]{ integralValue, encoding}));
>             }
> now gives a SAXException, but it used to just write out the character anyways.
> The problem is that M_MAXCHARACTER is 127 and the encoding is not set for
> the ToTextStream serializer at all.  Should the encoding be set?  I'm not sure 
> because this is an intermediate, internal use of a serializer to create a value.
> It is not the final serializer, which would be a ToXMLStream one.
> Perhaps we need a way to officially signal to a serializer that it doesn't have
> to do any escaping or worry about character encoding.  We've had trouble like 
> this before where '&' turned into &amp;  then into &amp;amp; because of double 
> processing by an intermediate and then a final serializer.  It would be cleaner 
> to let a serializer know that it is just an intermediate utility one.  I've 
> discussed this with Morris Kwan, but he doesn't think that this is needed in 
> general, probably just for ToTextStream.  
> Still, we've managed to make the serializer independent of Xalan-J interpretive 
> and of XSLTC, I'd like to make the reverse more true and just use the 
> serializer by its interface only....  but I'm digressing.
