Posted to java-dev@axis.apache.org by "Davanum Srinivas (JIRA)" <ax...@ws.apache.org> on 2004/11/23 01:10:24 UTC

[jira] Resolved: (AXIS-1676) SAXParseException for message containing more than 2 bytes UTF-8 chars and DIME Attachment

     [ http://nagoya.apache.org/jira/browse/AXIS-1676?page=history ]
     
Davanum Srinivas resolved AXIS-1676:
------------------------------------

    Resolution: Fixed

Applied patch.

> SAXParseException for message containing more than 2 bytes UTF-8 chars and DIME Attachment
> ------------------------------------------------------------------------------------------
>
>          Key: AXIS-1676
>          URL: http://nagoya.apache.org/jira/browse/AXIS-1676
>      Project: Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: 1.1
>  Environment: Detected on Axis 1.1 under Windows 2000,
> but clearly affects newer Axis versions under any operating system, JDK, etc.
> See the description and patch.
>     Reporter: Damien
>  Attachments: diff.txt
>
> Description from a user point of view
> ---------------------------------------
> On the client side, you may get a SAXParseException for a message that contains UTF-8 characters encoded on 2 or more bytes when the message also carries a DIME attachment.
> The failure is systematic for a given message, but whether it occurs depends on the message content.
> When there is no DIME attachment, or when the message carries a MIME attachment instead, it works fine.
>
> Description from a patch submitter point of view
> -------------------------------------------------
> The read() method of org.apache.axis.attachments.DimeDelimitedInputStream is buggy.
> It should always return a non-negative int value (0 to 255), or -1 when the end of the stream is reached, but it may return other negative values.
> This is due to a "cast error" when converting from byte to int.
>  
> Full analysis
> --------------
> When Xerces parses a message, it first reads a buffer of a given size (2048 bytes) using the UTF8Reader class and the read(byte[], int, int) method of the input stream.
> This byte array is then decoded from UTF-8 into a char array. If the last byte of the buffer is the beginning of a multi-byte UTF-8 character, one or more additional bytes are requested to complete that character.
> This is done through the no-argument read() method. For a message with a DIME attachment, the input stream is a DimeDelimitedInputStream. Because its read() method may return a negative value, the UTF8Reader may conclude that the end of the stream has been reached, which is not the case. As a consequence, the SOAPPart is not fully passed to the parser and parsing fails.
> ---
> A patch is available and will be submitted.
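
The byte-to-int "cast error" described in the report can be illustrated with a minimal, self-contained sketch. This is not the Axis patch (that is in the attached diff.txt, which is not reproduced here); the class names SignExtensionDemo, SignExtendingStream and MaskingStream, and the countUntilEof helper, are hypothetical and exist only for this illustration. The sketch assumes a consumer that treats any negative return value as end of stream, which is the behaviour the report attributes to the parser's reader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class SignExtensionDemo {

    /** Hypothetical stand-in for the faulty stream: widens the byte without masking. */
    static class SignExtendingStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        SignExtendingStream(byte[] data) { this.data = data; }

        @Override
        public int read() {
            if (pos >= data.length) return -1;
            // Bug: a Java byte is signed, so 0x80..0xFF widens to -128..-1.
            return data[pos++];
        }
    }

    /** Same stream with the usual fix: mask to keep the result in 0..255. */
    static class MaskingStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        MaskingStream(byte[] data) { this.data = data; }

        @Override
        public int read() {
            if (pos >= data.length) return -1;
            return data[pos++] & 0xFF;
        }
    }

    /** Assumed consumer: counts bytes until read() returns any negative value. */
    static int countUntilEof(InputStream in) throws IOException {
        int n = 0;
        while (in.read() >= 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // "a", e-acute, "b" encodes to the four bytes 61 C3 A9 62 in UTF-8.
        byte[] utf8 = "a\u00e9b".getBytes(StandardCharsets.UTF_8);

        // Stops at 0xC3, which is returned as -61: prints 1
        System.out.println(countUntilEof(new SignExtendingStream(utf8)));

        // Reads all four bytes: prints 4
        System.out.println(countUntilEof(new MaskingStream(utf8)));
    }
}
```

With the unmasked version, the first byte of any multi-byte UTF-8 sequence (always 0x80 or above) comes back negative, so the consumer stops early even though more data remains. Masking with 0xFF restores the 0 to 255 range that InputStream.read() is documented to return.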
