Posted to c-dev@xerces.apache.org by ji...@apache.org on 2004/06/10 18:45:18 UTC
[jira] Updated: (XERCESC-1226) Parser reports bogus content when parsing
The following issue has been updated:
Updater: David Bertoni (mailto:david_n_bertoni@us.ibm.com)
Date: Thu, 10 Jun 2004 9:44 AM
Comment:
XML document to reproduce the problem.
Changes:
Attachment changed to test1.xml
---------------------------------------------------------------------
For a full history of the issue, see:
http://issues.apache.org/jira/browse/XERCESC-1226?page=history
---------------------------------------------------------------------
View the issue:
http://issues.apache.org/jira/browse/XERCESC-1226
Here is an overview of the issue:
---------------------------------------------------------------------
Key: XERCESC-1226
Summary: Parser reports bogus content when parsing
Type: Bug
Status: Unassigned
Priority: Major
Project: Xerces-C++
Components:
SAX/SAX2
Versions:
Nightly build (please specify the date)
Assignee:
Reporter: David Bertoni
Created: Thu, 10 Jun 2004 9:42 AM
Updated: Thu, 10 Jun 2004 9:44 AM
Environment: All platforms
Description:
When parsing the following document, the parser reports garbage characters.
<?xml version="1.0"?>
<subject>Research [𝔸]rticle</subject>
I traced this down to this function in XMLReader, starting on line 612:
inline bool XMLReader::isPlainContentChar(const XMLCh toCheck)
{
    return ((fgCharCharsTable[toCheck] & gPlainContentCharMask) != 0);
}
Apparently, for the character "]" (U+005D RIGHT SQUARE BRACKET), the flags in fgCharCharsTable indicate that it is not plain content. This causes the parser to misbehave badly and deliver broken character data, including unpaired low surrogates.
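To make the table lookup concrete, here is a toy model of a per-character flag table tested with a mask. It is only a sketch: the names kPlainContentMask, buildToyTable and plainRunLength are hypothetical, the flag value is invented, and the set of characters whose flag is cleared is an assumption for illustration (the report only establishes that "]" is not flagged as plain content). It does not reproduce the real contents of fgCharCharsTable, and the idea that a scanner consumes runs of plain characters is likewise assumed here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flag bit; the real gPlainContentCharMask value is not shown in the report.
const unsigned char kPlainContentMask = 0x01;

// Build a toy table with one flag byte per UTF-16 code unit.
// Every unit starts out as "plain content"; a few are then cleared.
std::vector<unsigned char> buildToyTable()
{
    std::vector<unsigned char> table(0x10000, kPlainContentMask);
    // Assumption: characters that need special handling lose the flag.
    // Only ']' is confirmed by the report above.
    const char16_t special[] = { u'<', u'&', u']', u'\n', u'\r' };
    for (char16_t c : special)
        table[c] = 0;
    return table;
}

// Same shape as the function quoted above: index the table, test the mask.
inline bool isPlainContentChar(const std::vector<unsigned char>& table, char16_t c)
{
    return (table[c] & kPlainContentMask) != 0;
}

// Length of the leading run of plain-content units in text.
std::size_t plainRunLength(const std::vector<unsigned char>& table,
                           const char16_t* text, std::size_t length)
{
    std::size_t i = 0;
    while (i < length && isPlainContentChar(table, text[i]))
        ++i;
    return i;
}
```

In this toy model, scanning the text of the test document stops at "]", immediately after the surrogate pair, which is the position where the misbehavior described above would begin.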
When I used the debugger to force this function to return "true" rather than "false", the parser delivered the correct character data.
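One way to check the delivered character data mechanically, instead of inspecting it in the debugger, is to scan each chunk for unpaired surrogates. The helper below is a minimal sketch and not part of Xerces-C++: the name findUnpairedSurrogate is hypothetical, and char16_t is used as a stand-in for XMLCh on the assumption that the data is UTF-16.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the parser's UTF-16 code unit type (an assumption of this sketch).
typedef char16_t Utf16Unit;

inline bool isHighSurrogate(Utf16Unit c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(Utf16Unit c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Returns the index of the first unpaired surrogate in text,
// or length if every surrogate is part of a high/low pair.
inline std::size_t findUnpairedSurrogate(const Utf16Unit* text, std::size_t length)
{
    assert(text != nullptr || length == 0);
    std::size_t i = 0;
    while (i < length) {
        if (isHighSurrogate(text[i])) {
            // A high surrogate must be followed immediately by a low one.
            if (i + 1 >= length || !isLowSurrogate(text[i + 1]))
                return i;
            i += 2;
        } else if (isLowSurrogate(text[i])) {
            // A low surrogate with no preceding high surrogate.
            return i;
        } else {
            ++i;
        }
    }
    return length;
}
```

Called on each buffer passed to a characters() callback, a result smaller than the buffer length marks the first broken unit. A pair that happens to be split across two callbacks would also be flagged, so a more careful check would carry a trailing high surrogate over to the next chunk.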