Posted to dev@xalan.apache.org by "Miller, Grant" <Gr...@COGNOS.com> on 2002/04/05 18:52:29 UTC
XSLTC: Character escaping (bug?)
Hi,
I am having a problem with xsltc. I do not seem to be able to generate
non-escaped output for xml or html. (Or at least, this is what I think the
problem is).
I want to do this because I need to use the output of the transform as an
input to another process (non-xsltc) before being rendered in a browser.
The second process takes care of preparing the xml for presentation in a
browser and so does any necessary escaping. So having xsltc do escaping
means everything gets double-escaped, so it is impossible to display e.g.
'&' in the browser (I get '&amp;' displayed and the source doc contains
'&amp;amp;').
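To make the double escaping concrete, here is a small sketch. The class DoubleEscapeDemo and its escapeXml helper are hypothetical stand-ins for whatever escaping each stage performs; they are not the xsltc or Cocoon code.

```java
class DoubleEscapeDemo {
    // Hypothetical minimal escaper for XML text content (illustration only).
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "a & b";
        String once = escapeXml(raw);   // what a single escaping stage writes
        String twice = escapeXml(once); // what two escaping stages write
        System.out.println(once);       // prints: a &amp; b
        System.out.println(twice);      // prints: a &amp;amp; b
    }
}
```

A browser decodes one level only, so the twice-escaped string shows up on screen as '&amp;' rather than '&'.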
FYI, in this case the second process is a serializer in cocoon. I have a
custom generator which internally uses xsltc to generate the sax output of
the generate step. So I am not sure if this may raise difficulties. (We have
a deadline next week so I cannot really afford to wait for the official
cocoon2 integration and even then a custom generator may be desirable for
performance reasons).
There are 2 cases where problems arise:
1) the input text contains characters such as '&'.
2) text in an xsl is used to create xml (can be useful for creating what the
xml parser thinks is bad markup but generates valid xml at runtime).
[ In xalan I can do
<xsl:text disable-output-escaping="yes">&lt;b&gt;hello&lt;/b&gt;</xsl:text>
and have this create a bold tag in the output doc - but with xsltc I always
get "<b>hello</b>" displayed in the browser ].
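For reference, case 2 can be tried outside of Cocoon with plain JAXP. This is only a sketch: the class name DoeDemo, the inline stylesheet and the dummy input document are made up for illustration, and it runs against whatever TransformerFactory is the default on the classpath rather than the xsltc runtime classes. The XSLT 1.0 spec allows a processor to ignore disable-output-escaping, so the result depends on the processor and on serializing to a stream rather than handing SAX events to another component.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DoeDemo {
    // Illustrative stylesheet: emits markup from escaped text with escaping disabled.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<out><xsl:text disable-output-escaping='yes'>"
        + "&lt;b&gt;hello&lt;/b&gt;"
        + "</xsl:text></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet over a dummy document and returns the serialized result.
    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<in/>")),
                        new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // If the processor honours disable-output-escaping, the output holds a
        // real <b> element; otherwise it holds the escaped text.
        System.out.println(run());
    }
}
```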
This is the xsltc code I use (mostly copied from examples) - this outputs to
a file for debug purposes:
final Class clazz = Class.forName("pagerender");
final Translet translet = (Translet)clazz.newInstance();
// Create a SAX parser and get the XMLReader object it uses
final SAXParserFactory factory = SAXParserFactory.newInstance();
try {
    factory.setFeature(Constants.NAMESPACE_FEATURE, true);
}
catch (Exception e) {
    factory.setNamespaceAware(true);
}
final SAXParser parser = factory.newSAXParser();
final XMLReader reader = parser.getXMLReader();
// Set the DOM's DOM builder as the XMLReader's SAX2 content handler
final DOMImpl dom = new DOMImpl();
DOMBuilder builder = dom.getBuilder();
reader.setContentHandler(builder);
try {
    String prop = "http://xml.org/sax/properties/lexical-handler";
    reader.setProperty(prop, builder);
}
catch (SAXException e) {
    // quietly ignored
}
// Create a DTD monitor and pass it to the XMLReader object
final DTDMonitor dtdMonitor = new DTDMonitor(reader);
AbstractTranslet _translet = (AbstractTranslet)translet;
// Build the input document by hand through the DOM builder
builder.startDocument();
builder.startElement("", "html", "html", _NullAttributes);
builder.startElement("", "background", "background", _NullAttributes);
// uses 'characters' method to add content
XSPObjectHelper.xspExpr(builder, "a & b");
builder.endElement("", "background", "background");
builder.endElement("", "html", "html");
builder.endDocument();
builder = null;
// If there are any elements with ID attributes, build an index
dtdMonitor.buildIdIndex(dom, 0, _translet);
// Pass unparsed entity descriptions to the translet
_translet.setDTDMonitor(dtdMonitor);
// Transform the document
String encoding = _translet._encoding;
// Create our default SAX/DTD handler
java.io.FileOutputStream out =
    new java.io.FileOutputStream("e:/out.xml");
DefaultSAXOutputHandler saxHandler =
    new DefaultSAXOutputHandler(out, encoding);
TextOutput textOutput =
    new TextOutput((ContentHandler)saxHandler,
                   (LexicalHandler)saxHandler, encoding);
textOutput.setType(TextOutput.XML);
textOutput.setEscaping(false);
// Transform and pass output to the translet output handler
translet.transform(dom, textOutput);
out.close();
This always produces "a &amp; b" in the output file. I notice that the code
for TextOutput always sets escaping to true for xml (although in some places
it appears to have been written to allow a false value too). Even if I
output with type HTML or TEXT I still get "a &amp; b" in the output file, so
I am rather confused.
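As a point of comparison, here is a sketch of how a serializer at the end of a SAX pipeline escapes on its own: the characters() event carries the raw text, and the escaping happens only when the text is written out. It uses JAXP's identity TransformerHandler as a stand-in for a serializer such as the one in Cocoon. That substitution is an assumption, SaxEscapeDemo is a made-up name, and nothing here shows what DefaultSAXOutputHandler or TextOutput do internally.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class SaxEscapeDemo {
    // Feeds one element with raw text into a serializing SAX handler.
    static String serialize(String text) {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                   .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            char[] chars = text.toCharArray();
            handler.startDocument();
            handler.startElement("", "background", "background",
                                 new AttributesImpl());
            handler.characters(chars, 0, chars.length); // raw, unescaped text
            handler.endElement("", "background", "background");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The '&' passed to characters() is written as '&amp;' by the serializer.
        System.out.println(serialize("a & b"));
    }
}
```

If the downstream step behaves like this, then any text that already contains '&amp;' when it reaches characters() will be escaped a second time.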
Any help much appreciated!
Cheers,
Grant