Posted to c-dev@xerces.apache.org by "Scott Cantor (Jira)" <xe...@xml.apache.org> on 2019/12/09 17:50:00 UTC

[jira] [Updated] (XERCESC-2158) XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData when the resulting surrogate pair does not fit in toFill at the end

     [ https://issues.apache.org/jira/browse/XERCESC-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Cantor updated XERCESC-2158:
----------------------------------
    Fix Version/s: 3.2.3

> XMLUTF8Transcoder: One multibyte UTF8 character is swallowed from the srcData when the resulting surrogate pair does not fit in toFill at the end
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESC-2158
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2158
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 3.1.4, 3.2.2
>         Environment: OS independent: Linux (RedHat 7.5)/Windows 10
> Compiler independent
>            Reporter: Johannes Willnecker
>            Priority: Major
>             Fix For: 3.2.3
>
>         Attachments: UTF8.xml, xerces.patch
>
>
> *Bug found in Xerces-C++ version 3.1.4* (based on code review, newer versions are also affected)
>  
> *How to reproduce:* Run SAX2Print on the attached UTF8.xml file: "SAX2Print UTF8.xml".
> One Chinese character is missing from the name attribute of the second-to-last Instance element.
> *Fix:* The fix for this bug is included in the attached xerces.patch file.
> XMLUTF8Transcoder.cpp already contains a check for this case, but its conclusion
> that the bytes read are only updated at the end of the loop was wrong.
> The bytes-read count (bytesEaten) is calculated from srcPtr, which has already been advanced by the time the check is made.
> Therefore srcPtr needs to be repositioned to the start of that character when the surrogate pair does not fit into the toFill buffer.
>  
> *Contributor related:*
> Author Name of the code being contributed: Johannes Willnecker
> Employer: Siemens AG
> I have the right to grant the copyright licenses for the contribution.
> My employer has rights to the code that I have written. My employer gave me permission to contribute this code on its behalf.
> I am not aware of any third-party license or other restrictions.
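
The failure mode described under "Fix" above can be illustrated with a small standalone sketch. This is not the Xerces-C++ source and not the contents of xerces.patch. The names ChunkResult, transcodeChunk, transcodeAll and the repositionOnOverflow flag are hypothetical, and the decoder assumes well-formed UTF-8. It only shows why srcPtr has to be moved back when a surrogate pair does not fit into the output buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Result of one bounded transcoding call (hypothetical type).
struct ChunkResult {
    std::size_t bytesEaten;   // source bytes consumed
    std::size_t charsWritten; // UTF-16 code units produced
};

// Simplified stand-in for a UTF-8 -> UTF-16 transcoding loop.
// Assumes well-formed UTF-8; no validation is performed.
ChunkResult transcodeChunk(const unsigned char* src, std::size_t srcLen,
                           char16_t* toFill, std::size_t maxChars,
                           bool repositionOnOverflow)
{
    const unsigned char* srcPtr = src;
    const unsigned char* srcEnd = src + srcLen;
    char16_t* outPtr = toFill;
    char16_t* outEnd = toFill + maxChars;

    while (srcPtr < srcEnd && outPtr < outEnd) {
        const unsigned char first = *srcPtr;
        const std::size_t len =
            first < 0x80 ? 1 : first < 0xE0 ? 2 : first < 0xF0 ? 3 : 4;
        if (srcPtr + len > srcEnd)
            break; // incomplete sequence at the end of the source

        // Decode one code point from the lead byte and continuation bytes.
        std::uint32_t cp = (len == 1) ? first : (first & (0xFFu >> (len + 1)));
        for (std::size_t i = 1; i < len; ++i)
            cp = (cp << 6) | (srcPtr[i] & 0x3Fu);
        srcPtr += len; // srcPtr is advanced BEFORE the space check below

        if (cp < 0x10000) {
            *outPtr++ = static_cast<char16_t>(cp);
        } else {
            if (outPtr + 1 >= outEnd) {
                // Only one slot left: the surrogate pair does not fit.
                // Without moving srcPtr back, bytesEaten would count a
                // character that was never written to toFill.
                if (repositionOnOverflow)
                    srcPtr -= len;
                break;
            }
            cp -= 0x10000;
            *outPtr++ = static_cast<char16_t>(0xD800 + (cp >> 10));
            *outPtr++ = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return { static_cast<std::size_t>(srcPtr - src),
             static_cast<std::size_t>(outPtr - toFill) };
}

// Feeds the whole input through a small output buffer, chunk by chunk.
std::u16string transcodeAll(const std::string& utf8, std::size_t bufChars,
                            bool repositionOnOverflow)
{
    assert(bufChars >= 2); // a surrogate pair needs two code units
    std::u16string out;
    std::vector<char16_t> buf(bufChars);
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(utf8.data());
    std::size_t remaining = utf8.size();
    while (remaining > 0) {
        const ChunkResult r = transcodeChunk(p, remaining, buf.data(),
                                             bufChars, repositionOnOverflow);
        if (r.bytesEaten == 0)
            break; // no progress possible
        out.append(buf.data(), r.charsWritten);
        p += r.bytesEaten;
        remaining -= r.bytesEaten;
    }
    return out;
}
```

With a two-unit buffer and the input "a", U+20000, "b", calling transcodeAll with repositionOnOverflow set to false drops the middle character, which matches the symptom reported for SAX2Print. With the flag set to true, the character is carried over to the next chunk and all three characters survive.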



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
