Posted to c-dev@xerces.apache.org by xe...@xml.apache.org on 2004/10/15 10:35:51 UTC
[jira] Created: (XERCESC-1288) Wrong line/column number in UTFDataFormatException
Message:
A new issue has been created in JIRA.
---------------------------------------------------------------------
View the issue:
http://issues.apache.org/jira/browse/XERCESC-1288
Here is an overview of the issue:
---------------------------------------------------------------------
Key: XERCESC-1288
Summary: Wrong line/column number in UTFDataFormatException
Type: Bug
Status: Unassigned
Priority: Minor
Project: Xerces-C++
Components:
DOM
Non-Validating Parser
SAX/SAX2
Versions:
2.5.0
2.6.0
Assignee:
Reporter: Valerio Gionco
Created: Fri, 15 Oct 2004 1:34 AM
Updated: Fri, 15 Oct 2004 1:34 AM
Environment: Linux (SUSE 9.1, Fedora core 2, Redhat 9) on Intel, Solaris 7 on SPARC, various gcc versions.
Description:
I have the following (bad) XML file:
--------------- bad.xml ----------------------------
<?xml version="1.0" encoding="UTF-8"?>
<block>
<field>Blah blah</field>
<field>Blah blah ò blah blah</field>
<field>Blah blah</field>
</block>
----------------------------------------------------
(note the accented 'o' in the 2nd "field" line - hope it won't be
destroyed...)
The file is bad because it declares UTF-8, but the accented 'o' is
stored as a single byte, 0xf2. This is the hex dump:
3e 42 6c 61 68 20 62 6c 61 68 20 f2 20 62 6c 61 |>Blah blah . bla|
The problem is that when I run "SAXPrint bad.xml", I get the following error:
Fatal Error at file /users/valerio/tmp/bad.xml, line 1, char 39
Message: An exception occurred! Type:UTFDataFormatException, Message:invalid byte 2 ( ) of a 4-byte sequence.
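The wording of the message follows from how UTF-8 lead bytes are laid out. The sketch below is not the Xerces-C++ transcoder code; the function names utf8SequenceLength and isContinuationByte are hypothetical and only illustrate the standard UTF-8 bit patterns. It shows that 0xf2 announces a 4-byte sequence, and that the following byte, 0x20 (the space), is not a valid continuation byte, which matches "invalid byte 2 ( ) of a 4-byte sequence".

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes a UTF-8 sequence should have,
// judged from its lead byte alone. Returns 0 for a byte that cannot
// start a sequence (a stray continuation byte or 0xF8..0xFF).
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx (0xf2 lands here)
    return 0;
}

// Hypothetical helper: every byte after the lead byte must look
// like 10xxxxxx.
bool isContinuationByte(unsigned char b) {
    return (b & 0xC0) == 0x80;
}
```

With the bytes from the hex dump, utf8SequenceLength(0xf2) yields 4 and isContinuationByte(0x20) yields false, so a decoder fails on the second byte of the sequence.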
The line and column reported by SAXParseException::getLineNumber()
and SAXParseException::getColumnNumber() are wrong: the offending
byte is on line 4 of the file, not line 1. I seem to recall this was
not the case with older versions of Xerces-C (2.0 or 2.2?), but I'm
not sure.
I first noticed the issue with 2.5.0, then tried 2.6.0 and saw no
apparent difference. Can somebody take care of this? We often
have big XML files to parse, and not knowing the real location
of the error is a real pain.
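Until the parser reports the right position, a possible workaround is to scan the file for the first malformed UTF-8 sequence before handing it to Xerces-C++. This is a minimal sketch, not part of Xerces-C++: the names Utf8Error, expectedLength and findFirstInvalidUtf8 are hypothetical, the file is assumed to be already loaded into a std::string, and only the lead/continuation byte structure is checked (overlong forms and surrogates are not rejected).

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for the pre-scan.
struct Utf8Error {
    bool found;          // true if a malformed sequence was found
    std::size_t line;    // 1-based line of the malformed sequence
    std::size_t column;  // 1-based column, counted in characters
    std::size_t offset;  // 0-based byte offset of its lead byte
};

// Sequence length implied by the lead byte, or 0 if the byte
// cannot start a UTF-8 sequence.
static std::size_t expectedLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

// Walks the buffer one character at a time, tracking line and column,
// and stops at the first sequence whose structure is invalid.
Utf8Error findFirstInvalidUtf8(const std::string& data) {
    std::size_t line = 1, column = 1, i = 0;
    while (i < data.size()) {
        unsigned char lead = static_cast<unsigned char>(data[i]);
        std::size_t len = expectedLength(lead);
        // The whole sequence must fit in the buffer.
        bool ok = len != 0 && i + len <= data.size();
        // Each trailing byte must have the form 10xxxxxx.
        for (std::size_t k = 1; ok && k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(data[i + k]);
            ok = (b & 0xC0) == 0x80;
        }
        if (!ok) return Utf8Error{true, line, column, i};
        if (lead == '\n') { ++line; column = 1; } else { ++column; }
        i += len;
    }
    return Utf8Error{false, 0, 0, 0};
}
```

Run on the contents of bad.xml as shown above (no indentation), this points at line 4, where the 0xf2 byte actually is.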
---------------------------------------------------------------------