Posted to j-dev@xerces.apache.org by bu...@apache.org on 2002/05/08 01:10:48 UTC
DO NOT REPLY [Bug 8893] New: -
Creation of DOM containing invalid xml characters.
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=8893>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=8893
Summary: Creation of DOM containing invalid xml characters.
Product: Xerces-J
Version: 1.4.3
Platform: All
OS/Version: All
Status: NEW
Severity: Critical
Priority: Other
Component: Core
AssignedTo: xerces-j-dev@xml.apache.org
ReportedBy: abhilash.koneri@bestbuy.com
I am using Xerces to build a DOM document from character data. The
character data contains some characters that are not legal XML characters.
However, this does not cause any exception during the creation of the DOM.
When I serialize the DOM (to a string) and then re-parse it to
obtain the DOM, I get a SAX exception reporting the invalid XML character.
The code used is attached below.
-----
import java.io.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.w3c.dom.*;
import org.apache.xml.serialize.XMLSerializer;
import org.apache.xml.serialize.OutputFormat;
public class ItemTest
{
public static void main(String[] args) throws Throwable
{
String illegalUnicodeString = "BLACK ";
char[] chars = illegalUnicodeString.toCharArray();
for(int i=0;i<chars.length; i++)
{
System.out.println("character "+chars[i]+":"+Character.isLetter(chars[i]));
}
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document dom = builder.newDocument();
Element rootElement= dom.createElement("ITEMRECORD");
rootElement.appendChild(dom.createTextNode(illegalUnicodeString));
dom.appendChild(rootElement);
String domString = getXmlAsString(dom, false);
System.out.println("The serialized dom string is \n"+domString);
StringReader reader = new StringReader(domString);
InputSource is = new InputSource(reader);
dom = builder.parse(is);
}
public static String getXmlAsString(Document dom, boolean supressHeader)
{
String xmlString = null;
try
{
OutputFormat format = new OutputFormat(dom);
format.setPreserveSpace(true);
format.setOmitXMLDeclaration(supressHeader); // skip boilerplate at top of XML document
StringWriter sOut = new StringWriter();
XMLSerializer serializer = new XMLSerializer(sOut,format);
serializer.asDOMSerializer();
serializer.serialize(dom.getDocumentElement());
xmlString = sOut.getBuffer().toString();
}
catch(DOMException domE)
{
domE.printStackTrace();
}
catch(IOException e)
{
e.printStackTrace();
}
return xmlString;
}
}
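Until the DOM or the serializer rejects such data itself, one possible workaround is to check the character data before calling createTextNode. The following is only a minimal sketch, assuming the XML 1.0 definition of a legal character (the Char production) is the rule that applies. The class XmlCharCheck and its methods are hypothetical names and not part of Xerces, the sample string containing U+0001 is a made-up input rather than the string from the report, and String.codePointAt needs a newer JDK than the ones current for Xerces 1.4.3.

```java
public class XmlCharCheck
{
    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static boolean isXmlChar(int cp)
    {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first character that is not a legal
    // XML 1.0 character, or -1 if the whole string is legal.
    // An unpaired surrogate is reported as illegal.
    public static int firstIllegalIndex(String s)
    {
        for (int i = 0; i < s.length(); )
        {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp))
            {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args)
    {
        // Hypothetical input: "BLACK " followed by the control character U+0001.
        String sample = "BLACK \u0001";
        System.out.println(firstIllegalIndex(sample));    // prints 6
        System.out.println(firstIllegalIndex("BLACK "));  // prints -1
    }
}
```

A caller could run firstIllegalIndex on the data and refuse it (or drop the offending characters) before building the text node, so the problem surfaces at DOM creation time instead of at re-parse time.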