Posted to j-dev@xerces.apache.org by Glenn Marcy <gm...@us.ibm.com> on 2002/11/11 14:29:16 UTC

Re: BUG14378 . Error parsing XML document with a leading white space character.

Jan Les writes:
> Thank you very much for your explanation. I agree that Bugzilla should not
> be used for lengthy discussions and opening/reopening of bugs.
>
> I have tested the Crimson and Oracle parsers to see how they handle this
> problem. In both cases I got an error with a clear message saying that
> the document must start with the XML declaration.
>
> Thanks again,
> Jan

I wonder what it says for:

"
<?XML HELLO THERE?>
"

or

"
<?xml hello there?>
"

or

"<?xml version='1.0'?>
<?xml version='1.0'?>
"

or

"<?xml version='1.0'?>
<?xml hello there?>
"

or

"<!-- this is first -->
<?xml hello there?>
"

etc., etc.

The answers reflect how complex the code would need to be to special-case
the parsing of processing instructions and track all of the places where
you would say that the markup is a misplaced XML declaration rather than
a processing instruction with an invalid PI target.
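One way to see what a given parser says is to feed it each sample and print
the fatal error it reports. The following is only a minimal sketch using the
standard JAXP API. The class name DeclProbe and the appended <root/> element
are assumptions added for illustration (the samples above have no root
element, so without it every one of them would fail for that reason alone).
The message text depends entirely on which parser JAXP picks up.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Xerces: reports what the default
// JAXP parser says about a document.
final class DeclProbe {

    /** Returns the parser's error message, or null if the input parsed. */
    static String probe(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The five samples from this note, each followed by an assumed
        // <root/> so that a missing root element is not the reported error.
        String[] samples = {
            "\n<?XML HELLO THERE?>\n<root/>",
            "\n<?xml hello there?>\n<root/>",
            "<?xml version='1.0'?>\n<?xml version='1.0'?>\n<root/>",
            "<?xml version='1.0'?>\n<?xml hello there?>\n<root/>",
            "<!-- this is first -->\n<?xml hello there?>\n<root/>",
        };
        for (String sample : samples) {
            System.out.println(probe(sample));
        }
    }
}
```

Running main against different parsers (by changing which JAXP implementation
is on the classpath) gives a side-by-side view of their messages.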

I have re-included the developers list since, in all fairness, this is a
community-developed technology and I, for one, would appreciate hearing
how others feel about this issue.

Regards,
Glenn




---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


Re: BUG14378 . Error parsing XML document with a leading white space character.

Posted by Claude Montpetit <cm...@nertec.com>.
When I used XML only occasionally and was confronted with this error, I once
lost too much time on it. Even if one goes through an XML tutorial, the
requirements and restrictions on the XML declaration are not guaranteed to
stay in long-term memory.

In this situation (when the XML declaration does not start at the first
character), I have personally learned to interpret the Xerces error message,
but I can understand the frustration of users who are not as familiar with XML.

Defining appropriate error messages is an art. While the current error
message is absolutely true, a hint at the possible cause of the error may
save some users research time.
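To make the idea of a hint concrete, here is a rough sketch of the kind of
check an application could run on the start of a document after the parser
has rejected it. Nothing here is Xerces code: the class DeclHint, the method
hintFor, and the wording of the hint are all hypothetical, and the check
covers only the single case of leading white space before the declaration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Xerces: suggests a likely cause when a
// document begins with white space followed by what looks like an XML
// declaration.
final class DeclHint {

    // XML white space (space, tab, CR, LF), then "<?xml" followed by
    // white space. The trailing white space keeps legal targets such as
    // "xml-stylesheet" from matching.
    private static final Pattern LEADING_SPACE_THEN_DECL =
        Pattern.compile("\\A[ \\t\\r\\n]+<\\?xml[ \\t\\r\\n]");

    /** Returns a hint for the user, or null if this case does not apply. */
    static String hintFor(String documentStart) {
        if (LEADING_SPACE_THEN_DECL.matcher(documentStart).find()) {
            return "Hint: the XML declaration must be the very first thing "
                 + "in the document; remove the white space before it.";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(hintFor("  <?xml version='1.0'?><doc/>"));
    }
}
```

An application would print the hint next to the parser's own message rather
than in place of it, so the original error stays visible.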

Claude Montpetit

----- Original Message -----
From: Joseph Kesselman
To: xerces-j-dev@xml.apache.org
Sent: Monday, November 11, 2002 8:45 AM
Subject: Re: BUG14378 . Error parsing XML document with a leading white
space character.


There's not a lot of discussion needed here.

The XML Declaration MUST be the first thing in the file if present, with
the sole exception of the byte order mark. This comes right out of the
grammar. (See productions 1, 22, 23 in the XML 1.0 Recommendation,
available at http://www.w3.org/TR/.)

Processing instructions, which share the <??> syntax with the XML
Declaration, MUST NOT use the target name "XML" in any mixture of upper
and lower case. (See production 17.) That's reserved for use by the W3C,
and so far they have (correctly, in my opinion) not chosen to use it.

Hence there's no question of "a misplaced declaration" -- if it isn't in
the right place, it isn't a declaration and with that name it can't be
anything else. All the examples in Jan's note are quite clearly
ill-formed.


(A good XML tutorial should have covered this point.)

______________________________________
Joe Kesselman  / IBM Research



Re: BUG14378 . Error parsing XML document with a leading white space character.

Posted by Joseph Kesselman <ke...@us.ibm.com>.
There's not a lot of discussion needed here.

The XML Declaration MUST be the first thing in the file if present, with 
the sole exception of the byte order mark. This comes right out of the 
grammar. (See productions 1, 22, 23 in the XML 1.0 Recommendation,
available at http://www.w3.org/TR/.)

Processing instructions, which share the <??> syntax with the XML 
Declaration, MUST NOT use the target name "XML" in any mixture of upper 
and lower case. (See production 17.) That's reserved for use by the W3C, 
and so far they have (correctly, in my opinion) not chosen to use it.

Hence there's no question of "a misplaced declaration" -- if it isn't in 
the right place, it isn't a declaration and with that name it can't be 
anything else. All the examples in Jan's note are quite clearly 
ill-formed.
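For readers who want the production 17 rule spelled out, the restriction on
target names can be sketched as a one-line check. This is only an
illustration; the class and method names are hypothetical and are not taken
from any parser.

```java
// Hypothetical illustration of the PITarget restriction in production 17:
// a target is any Name except the three letters x, m, l in any mixture
// of upper and lower case.
final class PiTargetRule {

    /** True if the target is the reserved name "xml" in any letter case. */
    static boolean isReservedTarget(String target) {
        return target.matches("[Xx][Mm][Ll]");
    }

    public static void main(String[] args) {
        System.out.println(isReservedTarget("XML"));            // true
        System.out.println(isReservedTarget("xml-stylesheet")); // false
    }
}
```

Only the exact three-letter name is excluded here, so a longer target such
as xml-stylesheet passes this check.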


(A good XML tutorial should have covered this point.)

______________________________________
Joe Kesselman  / IBM Research
