Posted to dev@xalan.apache.org by "Miller, Grant" <Gr...@COGNOS.com> on 2002/04/05 18:52:29 UTC

XSLTC: Character escaping (bug?)

Hi,
I am having a problem with XSLTC: I do not seem to be able to generate
non-escaped output for XML or HTML (or at least, that is what I think the
problem is).
I want to do this because the output of the transform is fed to another
(non-XSLTC) process before being rendered in a browser. That second process
prepares the XML for presentation in a browser and does any necessary
escaping itself. If XSLTC also escapes, everything gets double-escaped, so
it is impossible to display e.g. '&' in the browser (I get '&amp;'
displayed, and the source doc contains '&amp;amp;').
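
To make the double-escaping concrete, here is a minimal sketch. The
escapeXml helper is hypothetical (it is not part of XSLTC or Cocoon); it
only stands in for whatever escaping each stage performs:

```java
public class DoubleEscapeDemo {

    // Hypothetical stand-in for the escaping a serializer applies to text.
    // '&' must be replaced first so the entities added later are not re-escaped
    // within the same pass.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String raw = "a & b";
        String once = escapeXml(raw);    // what a single serializer should emit
        String twice = escapeXml(once);  // what results when two stages both escape

        System.out.println(once);   // a &amp; b
        System.out.println(twice);  // a &amp;amp; b
    }
}
```

A browser decodes one level of entities, so the second string is shown as
'a &amp; b' rather than 'a & b'.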

FYI, in this case the second process is a serializer in Cocoon. I have a
custom generator which internally uses XSLTC to produce the SAX output of
the generate step, and I am not sure whether this raises difficulties. (We
have a deadline next week, so I cannot really afford to wait for the
official Cocoon 2 integration, and even then a custom generator may be
desirable for performance reasons.)

There are 2 cases where problems arise:

1) The input text contains characters such as '&'.
2) Text in an XSL stylesheet is used to create XML markup (this can be
useful for producing what the XML parser would treat as bad markup in the
stylesheet but which yields valid XML at runtime).
[ In Xalan I can do

<xsl:text disable-output-escaping="yes">&lt;b&gt;hello&lt;/b&gt;</xsl:text>

and have this create a bold tag in the output doc, but with XSLTC I always
get "&lt;b&gt;hello&lt;/b&gt;" displayed in the browser ].


This is the xsltc code I use (mostly copied from examples) - this outputs to
a file for debug purposes:

    final Class clazz = Class.forName("pagerender");
    final Translet translet = (Translet)clazz.newInstance();

    // Create a SAX parser and get the XMLReader object it uses
    final SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        factory.setFeature(Constants.NAMESPACE_FEATURE, true);
    }
    catch (Exception e) {
        factory.setNamespaceAware(true);
    }
    final SAXParser parser = factory.newSAXParser();
    final XMLReader reader = parser.getXMLReader();

    // Set the DOM's DOM builder as the XMLReader's SAX2 content handler
    final DOMImpl dom = new DOMImpl();
    DOMBuilder builder = dom.getBuilder();
    reader.setContentHandler(builder);

    try {
        String prop = "http://xml.org/sax/properties/lexical-handler";
        reader.setProperty(prop, builder);
    }
    catch (SAXException e) {
        // quietly ignored
    }

    // Create a DTD monitor and pass it to the XMLReader object
    final DTDMonitor dtdMonitor = new DTDMonitor(reader);
    AbstractTranslet _translet = (AbstractTranslet)translet;

    // Feed SAX events to the builder directly instead of parsing a document
    builder.startDocument();
    builder.startElement("", "html", "html", _NullAttributes);
    builder.startElement("", "background", "background", _NullAttributes);
    // uses 'characters' method to add content
    XSPObjectHelper.xspExpr(builder, "a & b");
    builder.endElement("", "background", "background");
    builder.endElement("", "html", "html");
    builder.endDocument();

    builder = null;

    // If there are any elements with ID attributes, build an index
    dtdMonitor.buildIdIndex(dom, 0, _translet);
    // Pass unparsed entity descriptions to the translet
    _translet.setDTDMonitor(dtdMonitor);

    // Transform the document
    String encoding = _translet._encoding;

    // Create our default SAX/DTD handler
    java.io.FileOutputStream out =
        new java.io.FileOutputStream("e:/out.xml");

    DefaultSAXOutputHandler saxHandler =
        new DefaultSAXOutputHandler(out, encoding);
    TextOutput textOutput =
        new TextOutput((ContentHandler)saxHandler,
                       (LexicalHandler)saxHandler, encoding);
    textOutput.setType(TextOutput.XML);

    textOutput.setEscaping(false);

    // Transform and pass output to the translet output handler
    translet.transform(dom, textOutput);

    out.close();

This always produces "a &amp; b" in the output file. I notice that the code
for TextOutput always sets escaping to true for xml (although in some places
it appears to have been written to allow a false value too). Even if I
output with type HTML or TEXT I still get "a &amp; b" in the output file, so
I am rather confused.
Any help much appreciated!

Cheers,
Grant
