Posted to j-users@xalan.apache.org by s7...@netscape.net on 2008/01/23 12:46:41 UTC
Output a new line after the XML declaration using indent="yes"
[Resending my original message, as it didn't appear on the list;
this is the fourth attempt.]
Using the example serialization code (see the end of this message) and
the JAXP implementation built into Sun's Java 1.4, I get this result file:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<para>foo bar</para>
</doc>
However, when I plug in Xalan 2.7.1, I get this result file:
<?xml version="1.0" encoding="UTF-8"?><doc>
<para>foo bar</para>
</doc>
Is there a way to make the document element appear on a new line
after the XML declaration when using the indent="yes" output option?
-----XMLSerializationTest.java
import java.io.File;
import java.io.FileOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
public class XMLSerializationTest
{
    static final String XALAN_INDENT_AMOUNT =
        "{http://xml.apache.org/xslt}" + "indent-amount";

    public static void main(String[] args) throws Exception
    {
        File resultFile = new File("test.xml");
        SAXTransformerFactory stf = (SAXTransformerFactory)
            TransformerFactory.newInstance();
        TransformerHandler handler = stf.newTransformerHandler();
        Transformer transformer = handler.getTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(XALAN_INDENT_AMOUNT, "2");
        handler.setResult(new StreamResult(
            new FileOutputStream(resultFile)));
        Attributes noAtts = new AttributesImpl();
        String text = "foo bar";
        handler.startDocument();
        handler.startElement("", "", "doc", noAtts);
        handler.startElement("", "", "para", noAtts);
        handler.characters(text.toCharArray(), 0, text.length());
        handler.endElement("", "", "para");
        handler.endElement("", "", "doc");
        handler.endDocument();
        System.out.println("Done.");
    }
}
-----XMLSerializationTest.java--
--
Stanimir
Re: Output a new line after the XML declaration using indent="yes"
Posted by Jörg Hohwiller <jo...@j-hohwiller.de>.
Hi Dave,
thanks for your response...
> >
> > 2. Having a newline after XML-declaration as well
> > as at the end of the file is a common need that
> > JAXP users have. I can NOT accept your point
> > saying that this is generally NOT supported because
> > it could cause some trouble you may have implementing this.
> I find the tone of your post to be offensive. No one on this list owes
> you anything in particular, and I suggest you be more polite in future
> posts.
Sorry for that. Of course nobody owes me anything.
I just ran into this problem and read this thread.
This gave me the impression that a user demand
-that is obvious to me- is not really seen here (I am
talking about my impression).
Besides, English is not my native language, so I do
not know if there is something very offensive in my words.
I just wanted to make my personal position very clear.
Also, I tend to always write "NOT" capitalized,
because I often read over this word and get the wrong
message. A friend explained to me that this is not good style
here because it overemphasizes what I wanted to say...
> > This has a large impact after everybody has to eat this
> > when just using a plain JDK. It is a lot worse if people
> > start adding newlines manually to OutputStream,
> > because this will cause real trouble with encodings, especially
> > if the encoding is NOT 8-bit-wise (e.g. UTF-16) and newlines
> > get broken because hackers of such workarounds do NOT think
> > of such problems.
> Any XML parser that requires a newline between the XML declaration and
> the root element, or after the root element, is non-conforming and
> should be fixed.
Absolutely true. But XML is also read by humans.
I wrote an open-source tool (maven-plugin) that modifies
XML that can be handwritten. Users tell me that there is a
bug in my product because newlines get lost.
I think they are right but my problem is that
I see the main problem in xalan-j.
> >
> > Please provide a solution here.
> You can always post-process the document and
> add the new line characters.
Of course I can do that. I already wrote about this option
before and the pitfalls one can run into with encodings, etc.
I chose Xalan-J because I use JAXP
and did not want to add third-party libs (like XOM, JDOM or
Dom4j). But if I need such a workaround, I'd better use one
of those external libs instead.
Besides, Xalan-J also strips a newline between XML comments
and the root tag. I cannot simply work around this problem without
reimplementing an XML writer or just using another product.
> Dave
Regards
Jörg
--
View this message in context: http://www.nabble.com/Output-a-new-line-after-the-XML-declaration-using-indent%3D%22yes%22-tp15040090p24129270.html
Sent from the Xalan - J - Users mailing list archive at Nabble.com.
Re: Output a new line after the XML declaration using indent="yes"
Posted by David Bertoni <db...@apache.org>.
Jörg Hohwiller wrote:
> Hi there,
>
>
> Brian Minchau wrote:
>> Hi Stanimir.
>>
>> The Xalan
>> serializer doesn't know about whether the serialized XML will
>> be used in the future as an external general parsed
>> entity and included in yet another XML file.
>>
>> It is possible that the XML will be included next to a text node that
>> is not all whitespace and the extra whitespace that we inject after the
>> XML header would be included next to non-whitespace
>> text and become part of that text node, modifying it.
>>
>> Extra whitespace added for indentation is done in ignorable locations,
>> but this particular one (just after the header) might not be ignored.
>>
>> Added indentation or extra whitespace before the document element
>> is not always correct, so Xalan doesn't do it.
>>
>> There is no Xalan specific option to control this behavior.
>>
>> - Brian
>>
>
> 1. Could you please give an example or a link to the
> XML-specification to point out why a newline after
> XML-declaration or at the end of the file should make the
> XML illegal. I am NOT talking about your problems
> to implement this properly, but why
> <?xml ....?>^n
> <root>...</root>^n
> should be illegal!?!?
Brian didn't say it would make the result "illegal." He simply said
that it would modify the content of the result inappropriately. The
processor may be generating an external general parsed entity, and a
newline between the XML declaration and the content that follows it
would introduce whitespace into the content of the entity.
>
> 2. Having a newline after XML-declaration as well
> as at the end of the file is a common need that
> JAXP users have. I can NOT accept your point
> saying that this is generally NOT supported because
> it could cause some trouble you may have implementing this.
I find the tone of your post to be offensive. No one on this list owes
you anything in particular, and I suggest you be more polite in future
posts.
> This has a large impact after everybody has to eat this
> when just using a plain JDK. It is a lot worse if people
> start adding newlines manually to OutputStream,
> because this will cause real trouble with encodings, especially
> if the encoding is NOT 8-bit-wise (e.g. UTF-16) and newlines
> get broken because hackers of such workarounds do NOT think
> of such problems.
Any XML parser that requires a newline between the XML declaration and
the root element, or after the root element, is non-conforming and
should be fixed.
>
> Please provide a solution here.
You can always post-process the document and add the new line characters.
Dave
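The post-processing suggested here can be done on characters rather than on bytes, which avoids the encoding problem raised in the quoted text. A minimal sketch, assuming the serialized document is available as a String before it is encoded; the class and method names are hypothetical and not part of Xalan or JAXP:

```java
public class DeclarationNewline {

    /**
     * Returns xml with lineSeparator inserted directly after the XML
     * declaration. The input is returned unchanged when it has no
     * declaration or when a line break already follows the declaration.
     */
    static String insertNewlineAfterDeclaration(String xml, String lineSeparator) {
        if (!xml.startsWith("<?xml ")) {
            return xml; // no XML declaration at the start
        }
        int end = xml.indexOf("?>");
        if (end < 0) {
            return xml; // unterminated declaration: leave it alone
        }
        int after = end + 2;
        if (after < xml.length()
                && (xml.charAt(after) == '\n' || xml.charAt(after) == '\r')) {
            return xml; // already followed by a line break
        }
        return xml.substring(0, after) + lineSeparator + xml.substring(after);
    }

    public static void main(String[] args) {
        String serialized =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc>\n<para>foo bar</para>\n</doc>";
        System.out.println(insertNewlineAfterDeclaration(serialized, "\n"));
    }
}
```

Because the newline is added to the character data, encoding the result afterwards (for example with an OutputStreamWriter created for the target encoding) produces the correct byte sequence for that encoding.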
Re: Output a new line after the XML declaration using indent="yes"
Posted by Jörg Hohwiller <jo...@j-hohwiller.de>.
Hi there,
Brian Minchau wrote:
>
> Hi Stanimir.
>
> The Xalan
> serializer doesn't know about whether the serialized XML will
> be used in the future as an external general parsed
> entity and included in yet another XML file.
>
> It is possible that the XML will be included next to a text node that
> is not all whitespace and the extra whitespace that we inject after the
> XML header would be included next to non-whitespace
> text and become part of that text node, modifying it.
>
> Extra whitespace added for indentation is done in ignorable locations,
> but this particular one (just after the header) might not be ignored.
>
> Added indentation or extra whitespace before the document element
> is not always correct, so Xalan doesn't do it.
>
> There is no Xalan specific option to control this behavior.
>
> - Brian
>
1. Could you please give an example or a link to the
XML-specification to point out why a newline after
XML-declaration or at the end of the file should make the
XML illegal. I am NOT talking about your problems
to implement this properly, but why
<?xml ....?>^n
<root>...</root>^n
should be illegal!?!?
2. Having a newline after XML-declaration as well
as at the end of the file is a common need that
JAXP users have. I can NOT accept your point
saying that this is generally NOT supported because
it could cause some trouble you may have implementing this.
This has a large impact after everybody has to eat this
when just using a plain JDK. It is a lot worse if people
start adding newlines manually to OutputStream,
because this will cause real trouble with encodings, especially
if the encoding is NOT 8-bit-wise (e.g. UTF-16) and newlines
get broken because hackers of such workarounds do NOT think
of such problems.
Please provide a solution here.
Thanks
Jörg
--
View this message in context: http://www.nabble.com/Output-a-new-line-after-the-XML-declaration-using-indent%3D%22yes%22-tp15040090p24017219.html
Sent from the Xalan - J - Users mailing list archive at Nabble.com.
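The encoding pitfall described in this message can be made concrete by looking at how a single newline character is encoded. The sketch below uses only java.nio.charset; the class and method names are hypothetical. It shows why a raw 0x0A byte written into a UTF-16 stream would not be a valid newline there:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NewlineBytes {

    /** Encodes a single line feed in the given charset. */
    static byte[] newlineIn(Charset charset) {
        return "\n".getBytes(charset);
    }

    /** Formats bytes as space-separated two-digit hex values. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE
        };
        for (Charset cs : charsets) {
            System.out.println(cs.name() + ": " + hex(newlineIn(cs)));
        }
        // Prints:
        // US-ASCII: 0A
        // UTF-8: 0A
        // UTF-16BE: 00 0A
        // UTF-16LE: 0A 00
    }
}
```

A workaround that writes "\n".getBytes("US-ASCII") directly to the OutputStream is therefore only safe when the document encoding is ASCII-compatible, such as UTF-8 or ISO-8859-1.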
Re: Output a new line after the XML declaration using indent="yes"
Posted by s7...@netscape.net.
Wed, 23 Jan 2008 11:10:44 -0500, /Brian Minchau/:
> The Xalan
> serializer doesn't know about whether the serialized XML will
> be used in the future as an external general parsed
> entity and included in yet another XML file.
[...]
> There is no Xalan specific option to control this behavior.
I see. Still I think it would be nice if the user could explicitly
control this behavior. As a workaround I've made it output the XML
declaration and a new line manually:
    SAXTransformerFactory stf = (SAXTransformerFactory)
        TransformerFactory.newInstance();
    TransformerHandler handler = stf.newTransformerHandler();
    Transformer transformer = handler.getTransformer();
    ...
    transformer.setOutputProperty(OutputKeys
        .OMIT_XML_DECLARATION, "yes");
    ...
    OutputStream out = ...;
    handler.setResult(new StreamResult(out));
    String xmlDecl =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + System.getProperty("line.separator");
    out.write(xmlDecl.getBytes("US-ASCII"));
    handler.startDocument();
    ...
--
Stanimir
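For readers who want to try this workaround, here is a self-contained variant. It is a sketch under one assumption that differs from the fragment above: the declaration and the document are both written through a single Writer created for the target encoding, so the hand-written declaration does not depend on the encoding being ASCII-compatible. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

public class DeclarationThenDocument {

    /** Serializes the sample document with a hand-written declaration line. */
    static byte[] serialize(String encoding) throws Exception {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = stf.newTransformerHandler();
        Transformer transformer = handler.getTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Suppress the serializer's own declaration; it is written by hand below.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, encoding)) {
            out.write("<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>");
            out.write(System.lineSeparator());
            // The serializer writes characters to the same Writer.
            handler.setResult(new StreamResult(out));

            AttributesImpl noAtts = new AttributesImpl();
            String text = "foo bar";
            handler.startDocument();
            handler.startElement("", "", "doc", noAtts);
            handler.startElement("", "", "para", noAtts);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "", "para");
            handler.endElement("", "", "doc");
            handler.endDocument();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(serialize("UTF-8"), "UTF-8"));
    }
}
```

The trade-off is the one Brian describes: the line break becomes part of the output unconditionally, so this only makes sense when the result is known to be a standalone document.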
Re: Output a new line after the XML declaration using indent="yes"
Posted by Brian Minchau <mi...@ca.ibm.com>.
Hi Stanimir.
The Xalan
serializer doesn't know about whether the serialized XML will
be used in the future as an external general parsed
entity and included in yet another XML file.
It is possible that the XML will be included next to a text node that
is not all whitespace and the extra whitespace that we inject after the
XML header would be included next to non-whitespace
text and become part of that text node, modifying it.
Extra whitespace added for indentation is done in ignorable locations,
but this particular one (just after the header) might not be ignored.
Added indentation or extra whitespace before the document element
is not always correct, so Xalan doesn't do it.
There is no Xalan specific option to control this behavior.
- Brian
- - - - - - - - - - - - - - - - - - - -
Brian Minchau, Ph.D.
XSLT Development, IBM Toronto
(780) 431-2633
e-mail: minchau@ca.ibm.com
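The effect described in this message can be observed with a stock JAXP DOM parser. The sketch below is an illustration only: the class name, the entity name "frag", and the placeholder system identifier are hypothetical, and the entity content is supplied by an EntityResolver so nothing is fetched over the network. It parses the same host document twice, once with the entity serialized without a line break after its declaration and once with one, and compares the resulting text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityWhitespaceDemo {

    // Host document: non-whitespace text sits directly before the entity reference.
    static final String HOST =
        "<!DOCTYPE root ["
        + "<!ENTITY frag SYSTEM \"http://example.invalid/frag.xml\">"
        + "]>"
        + "<root>abc&frag;</root>";

    /** Parses HOST, using entityContent as the external entity, and returns the root's text. */
    static String rootText(final String entityContent) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Hand the parser the entity content instead of resolving the placeholder URL.
        builder.setEntityResolver((publicId, systemId) ->
            new InputSource(new StringReader(entityContent)));
        Document doc = builder.parse(new InputSource(new StringReader(HOST)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String decl = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        String without = rootText(decl + "<doc>foo bar</doc>");
        String with = rootText(decl + "\n" + "<doc>foo bar</doc>");
        System.out.println("same text: " + with.equals(without));
        System.out.println("line break in text: " + with.contains("\n"));
    }
}
```

In the second parse the line break after the declaration is character data of the entity, so it ends up adjacent to "abc" in the including document.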