Posted to dev@xalan.apache.org by Gary L Peskin <ga...@firstech.com> on 2000/09/27 08:44:17 UTC

Re: XalanJ2 XMLSerializer problems [3 of 3]

Gary L Peskin wrote:
> The XMLDecl and TextDecl look almost the same except the XMLDecl allows
> for a standalone= pseudoattribute in the XMLDecl.  Xerces currently only
> handles the creation of an XMLDecl, not a TextDecl.  I have composed a
> proposed letter to the Xerces people and sent that as [3 of 3] of this
> message.  We can discuss changes and then forward it on to the Xerces
> people.

Here is my proposed email to the Xerces people:

We are currently using your org.apache.xml.serialize.XMLSerializer class
and its base class, BaseMarkupSerializer.  We're using the SAX
interfaces.

Unless suppressed by a call to OutputFormat.setOmitXMLDeclaration(true),
the XMLSerializer class will automatically emit an XMLDecl
(http://www.w3.org/TR/REC-xml#NT-XMLDecl) upon encountering the first
startElement() (or serializeElement()) call.  This is fine when we are
generating well-formed XML document entities.

However, we also are attempting to use XMLSerializer to output
well-formed XML external general parsed entities.  In this case, we need
XMLSerializer to emit a TextDecl
(http://www.w3.org/TR/REC-xml#NT-TextDecl) at the beginning of the
output.  It currently does not do this.

For example, we attempt to output the following well-formed XML external
general parsed entity:

-------------------------------------------------------------------
This is a test
*abc<h1>dummy element content</h1>def*
-------------------------------------------------------------------

We do this by calling:
startDocument()
characters() for "This is a test\n"
characters() for "*abc"
startElement() for <h1>
characters() for "dummy element content"
endElement() for </h1>
characters() for "def*"
endDocument()
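
For reference, the call sequence above can be sketched in Java roughly
as follows.  This is only an illustration: it assumes the SAX2
ContentHandler interface (the serializer may equally be driven through
the SAX1 DocumentHandler), and EntityEventDemo, emitTestEntity and
EventLog are hypothetical names, not part of Xerces.  EventLog merely
records the events it receives; in real use the serializer's handler
would take its place.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class EntityEventDemo {

    // Fires the event sequence listed above at any SAX2 ContentHandler.
    static void emitTestEntity(ContentHandler handler) throws SAXException {
        handler.startDocument();
        chars(handler, "This is a test\n");
        chars(handler, "*abc");
        handler.startElement("", "h1", "h1", new AttributesImpl());
        chars(handler, "dummy element content");
        handler.endElement("", "h1", "h1");
        chars(handler, "def*");
        handler.endDocument();
    }

    // Convenience wrapper: SAX delivers text as a char[] slice.
    private static void chars(ContentHandler handler, String text)
            throws SAXException {
        char[] buf = text.toCharArray();
        handler.characters(buf, 0, buf.length);
    }

    // Hypothetical stand-in for the serializer: records each event in order.
    static class EventLog extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override public void startDocument() {
            events.add("startDocument");
        }
        @Override public void endDocument() {
            events.add("endDocument");
        }
        @Override public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
            events.add("startElement " + qName);
        }
        @Override public void endElement(String uri, String localName,
                                         String qName) {
            events.add("endElement " + qName);
        }
        @Override public void characters(char[] ch, int start, int length) {
            events.add("characters " + new String(ch, start, length));
        }
    }

    public static void main(String[] args) throws SAXException {
        EventLog log = new EventLog();
        emitTestEntity(log);
        log.events.forEach(System.out::println);
    }
}
```

Note that two characters() calls arrive before the first startElement()
call, which is the case the serializer currently mishandles.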

The output produced by the XMLSerializer looks like this:
-------------------------------------------------------------------
This is a test
*abc<?xml version="1.0" encoding="UTF-8"?>
<h1>dummy element content</h1>def*
-------------------------------------------------------------------

with the XMLDecl immediately preceding the first element (<h1>) in the
output.

What we need is this:
-------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
This is a test
*abc<h1>dummy element content</h1>def*
-------------------------------------------------------------------

In other words, if only comments, processing instructions, whitespace,
and a doctype declaration precede the first element, continue to output
an XMLDecl.  Otherwise, output a TextDecl at the beginning of the
serialized output.
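
A minimal sketch of the rule above, in Java.  This is not the actual
XMLSerializer code; DeclChooser and its method names are hypothetical
and only illustrate the decision.  It assumes the serializer holds back
its output until the first element (or the end of the output) is seen,
since the declaration has to be written before anything else.  Comments,
processing instructions and the doctype declaration never change the
outcome, so only character data and the first element are tracked.

```java
class DeclChooser {

    enum Decl { XML_DECL, TEXT_DECL }

    private boolean sawElement = false;
    private boolean textBeforeFirstElement = false;

    // Call for each chunk of character data, in document order.
    void characters(String text) {
        if (sawElement) {
            return; // text after the first element does not matter
        }
        for (int i = 0; i < text.length(); i++) {
            if (!isXmlWhitespace(text.charAt(i))) {
                textBeforeFirstElement = true;
                return;
            }
        }
    }

    // Call when the first (or any later) element starts.
    void startElement() {
        sawElement = true;
    }

    // Non-whitespace text ahead of the first element means the output
    // cannot be a document entity, so a TextDecl is chosen.
    Decl choose() {
        return textBeforeFirstElement ? Decl.TEXT_DECL : Decl.XML_DECL;
    }

    // Whitespace as defined by the XML 1.0 production S.
    private static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    public static void main(String[] args) {
        DeclChooser chooser = new DeclChooser();
        chooser.characters("This is a test\n");
        chooser.characters("*abc");
        chooser.startElement();
        System.out.println(chooser.choose());
    }
}
```

The main() method replays the leading events of the test case shown
earlier in this message.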

We need this change so that XalanJ2 can be compliant with the XSLT
Recommendation.

Can you please give us your thoughts on this change?  We can supply the
diffs to implement this change for your review if you think that would
be best.

Thanks,