Posted to user@commons.apache.org by Stefan Burkard <ne...@burknet.ch> on 2005/03/11 10:27:17 UTC

[Digester] Error "Document root element is missing" for files with BOM

hi there

today i encountered a problem with an xml-to-object transformation with 
digester.
i get an exception saying "Document root element is missing" (stack 
trace below), but the xml seemed to be absolutely fine. because i found 
posts in other groups reporting the same error for files saved by 
several editors, i opened the xml files in textwrangler (a macintosh 
text editor).

in textwrangler the encoding of the files was shown (correctly) as 
UTF-8. BUT when i opened other xml files that had been imported fine, i 
saw that these were marked as "UTF-8, no BOM".

as soon as i changed the problematic files to "no BOM", they were 
imported fine by digester.

the textwrangler documentation says that BOM stands for "byte order 
mark" and recommends always saving files with it. but as far as i can 
tell, files saved that way can't be parsed with digester.

is this a known bug? can i configure digester to work with these files too?

thanks and greetings
stefan

full stacktrace:
----------------
545  [main] ERROR org.apache.commons.digester.Digester  - Parse Fatal Error at line 1 column -1: Document root element is missing.
org.xml.sax.SAXParseException: Document root element is missing.
         at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3339)
         at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3327)
         at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:635)
         at org.apache.crimson.parser.Parser2.parse(Parser2.java:333)
         at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
         at org.apache.commons.digester.Digester.parse(Digester.java:1532)
         at

...and so on...


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org


Re: [Digester] Error "Document root element is missing" for files with BOM

Posted by Simon Kitching <sk...@apache.org>.
Hi Stefan,

On Fri, 2005-03-11 at 10:27 +0100, Stefan Burkard wrote:
> hi there
> 
> today i encountered a problem with an xml-to-object transformation with 
> digester.
> i get an exception saying "Document root element is missing"
<snip>
> as soon as i changed the problematic files to "no BOM", they were 
> imported fine by digester.

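16-bit unicode text (ie data where two adjacent bytes indicate character
#0000 through #FFFF from the unicode character set) can be stored
most-significant-byte-first (MSB) or least-significant-byte-first (LSB).
The byte-order-mark is a sequence of bytes at the start of a UTF-16
document that indicates whether MSB or LSB is used to store the
character codes. UTF-8 has no byte-order ambiguity, but some editors
write the same mark (encoded as the three bytes EF BB BF) at the start
of UTF-8 files as an encoding signature, and that is what your
problematic files contain. [that's a rough description].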
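
To make the byte sequences concrete, here is a minimal sketch that
prints how the mark (the character U+FEFF) is encoded in UTF-16 and in
UTF-8. The class name BomBytes and the helper hex() are made up for
this illustration, and it assumes a Java runtime that has
java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class BomBytes {
    // Formats bytes as space-separated uppercase hex, e.g. "FE FF".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String bom = "\uFEFF"; // the byte-order-mark character
        // Most-significant byte first
        System.out.println("UTF-16BE: " + hex(bom.getBytes(StandardCharsets.UTF_16BE)));
        // Least-significant byte first
        System.out.println("UTF-16LE: " + hex(bom.getBytes(StandardCharsets.UTF_16LE)));
        // UTF-8 has only one byte order; here the mark is just a signature
        System.out.println("UTF-8:    " + hex(bom.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A parser that does not recognise the three UTF-8 bytes sees unexpected
content before the first "<" of the document, which would be consistent
with a "Document root element is missing" error at line 1.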

Digester doesn't bother itself with that sort of detail; that's the xml
parser's responsibility. 

So the short answer is: make sure the xml parser that digester is using
is correctly configured to handle whatever input you're feeding into it.

If you still want Digester to create the xml parser instance, then call
  Digester digester = new Digester();
  // getXMLReader() is an instance method and may throw SAXException
  XMLReader xmlReader = digester.getXMLReader();
  // now configure the xmlReader as appropriate

Alternatively, create the xml parser yourself:
  // create the xml parser however you want, for example via JAXP
  SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
  // configure the xml parser however you want
  Digester digester = new Digester(parser);


The exception being thrown is actually generated by the underlying
parser; Digester is just forwarding it on.
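
If the parser cannot be swapped or configured to accept the mark, one
possible workaround is to drop a leading UTF-8 BOM before the parser
sees the bytes. The following is only a sketch under that assumption:
the class BomSkipper and the method skipUtf8Bom() are hypothetical
names, and the JDK's own SAX parser stands in for Digester so that the
example runs without extra jars. The stream it returns could be handed
to digester.parse(InputStream) in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class BomSkipper {
    // Returns a stream positioned after a leading UTF-8 BOM (EF BB BF).
    // If there is no BOM, the bytes already read are pushed back unchanged.
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream before three bytes were available
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            pin.unread(head, 0, n);
        }
        return pin;
    }

    public static void main(String[] args) throws Exception {
        // A tiny document that starts with a UTF-8 BOM
        byte[] xml = "\uFEFF<root/>".getBytes(StandardCharsets.UTF_8);
        InputStream clean = skipUtf8Bom(new ByteArrayInputStream(xml));
        // Any SAX-based consumer can read the cleaned stream
        SAXParserFactory.newInstance().newSAXParser().parse(clean, new DefaultHandler());
        System.out.println("parsed without error");
    }
}
```

This handles only the UTF-8 signature. For UTF-16 input the mark
carries real byte-order information, so it is better left for the
parser to interpret.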

Regards,

Simon

