Posted to c-dev@xerces.apache.org by Jon Smirl <jo...@mediaone.net> on 2001/03/03 06:51:34 UTC

test case

Is there some more test data for Xerces - what does the encoding test use
for input?

Jon Smirl
jonsmirl@mediaone.net



Re: test case

Posted by Jon Smirl <jo...@mediaone.net>.
These probably should get checked into CVS so that they don't get lost.

Jon Smirl
jonsmirl@mediaone.net



Re: test case

Posted by Andy Heninger <an...@jtcsv.com>.
"Jon Smirl" <jo...@mediaone.net> asks
> Is there some more test data for Xerces - what does the encoding test use
> for input?


Yes, there are more data files associated with the encoding test.
I no longer have them at hand, though.  Maybe someone from Toronto
can post them.

EncodingTest consisted of the C++ program, an equivalent Java
program, a set of test files for several different encodings that
focused on characters that are not common to most encodings,
and a script.   The XML files were parsed with IBM XML4C
(essentially xerces-c + ICU transcoding), and with xerces-J.
The C++ and Java results were then compared.  The intent was
to have some verification that different encodings really were
being handled correctly.

Leaving aside EBCDIC, most encodings include 7-bit ASCII as a
subset, and the bulk of most of the XML tests, and all markup,
are 7-bit ASCII.  So just specifying an unusual encoding
in a test file and not getting parse errors does not
necessarily mean that the file was read correctly, which
is why we did this test.
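
To illustrate the ASCII-subset point, here is a minimal Python sketch.
It is not part of EncodingTest; the sample strings, the encoding list,
and the helper name decodes_identically are all made up for the
illustration.  It encodes a string as ISO-8859-15, decodes the bytes
under three different assumed encodings, and checks whether the
results agree.

```python
# Illustration only: not the original EncodingTest code.
# Sample data and the helper name are hypothetical.

ASCII_ONLY = "<doc>plain text</doc>"
NON_ASCII = "<doc>caf\u00e9 \u20ac</doc>"  # e-acute and the euro sign

# Three single-byte encodings that all contain 7-bit ASCII as a subset.
ENCODINGS = ["iso-8859-15", "iso-8859-1", "cp1252"]


def decodes_identically(text, true_encoding, assumed_encodings):
    """Encode text once, decode the bytes under each assumed encoding,
    and report whether every decoding yields the same string."""
    raw = text.encode(true_encoding)
    decoded = {raw.decode(enc) for enc in assumed_encodings}
    return len(decoded) == 1


if __name__ == "__main__":
    # ASCII-only content cannot tell these encodings apart.
    print(decodes_identically(ASCII_ONLY, "iso-8859-15", ENCODINGS))
    # Content with the euro sign can: byte 0xA4 means different things.
    print(decodes_identically(NON_ASCII, "iso-8859-15", ENCODINGS))
    # Decoding with the wrong encoding raises no error, yet the text differs.
    misread = NON_ASCII.encode("iso-8859-15").decode("iso-8859-1")
    print(misread == NON_ASCII)
```

The ASCII-only string decodes to the same text under all three
encodings, so it says nothing about which one was actually used.  The
string with the euro sign decodes without any error under ISO-8859-1,
but the euro sign silently becomes a different character, which is the
kind of mistake only test data with encoding-specific characters
can catch.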



Andy Heninger
IBM XML Technology Group, Cupertino, CA
heninger@us.ibm.com

----- Original Message -----
From: "Jon Smirl" <jo...@mediaone.net>
To: "xerces" <xe...@xml.apache.org>
Sent: Friday, March 02, 2001 9:51 PM
Subject: test case


> Is there some more test data for Xerces - what does the encoding test use
> for input?