Posted to general@xml.apache.org by Vincent Faidherbe <vf...@icogs.com> on 2001/10/12 15:24:19 UTC

Crimson: Bug when document begins with whitespaces?

When a parsed XML document begins with one or more whitespace characters,
Crimson fails when it encounters the XML declaration and returns the
message P-019: "XML declaration may only begin entities". I have this
problem with the RSS documents generated by 10.am.
So I added a line to parseInternal (InputSource input) in Parser2.
Here is the code:

    private void parseInternal (InputSource input)
    throws SAXException, IOException
    {
        if (input == null)
            fatal ("P-000");

        try {

....
            contentHandler.setDocumentLocator (locator);

            contentHandler.startDocument ();

....
            // Added by Vincent Faidherbe (vfaid@icogs.com)
            // because the XML document could start with whitespace.
            // Check the RSS feed from 10.am for an example ;-)
            //
            maybeWhitespace();
            // end of addition

            maybeXmlDecl ();
....


Now it works fine when the document starts with whitespace.
I don't know whether this is a bug, but if it is, I think I've fixed it.
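
If patching Crimson is not an option, roughly the same effect can be had on
the caller's side by dropping the leading whitespace before the parser sees
the input. This is only a minimal sketch, assuming the feed is already
available as a java.io.Reader; the class and method names are made up for
illustration and are not part of Crimson.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;

public class LeadingWhitespaceSkipper {

    // Hypothetical helper: consumes leading XML whitespace (space, tab,
    // CR, LF) and returns a reader positioned at the first other character.
    static Reader skipLeadingWhitespace(Reader in) throws IOException {
        PushbackReader reader = new PushbackReader(in, 1);
        int c;
        while ((c = reader.read()) != -1) {
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                reader.unread(c); // put the first real character back
                break;
            }
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        Reader cleaned = skipLeadingWhitespace(
                new StringReader(" \n<?xml version=\"1.0\"?><rss/>"));
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = cleaned.read()) != -1) {
            out.append((char) c);
        }
        System.out.println(out);
    }
}
```

The cleaned reader can then be wrapped in an InputSource and handed to the
parser. Because this works on characters rather than bytes, the caller has
to pick the encoding, and line numbers reported by the locator will be off
by however many lines were skipped.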
     

-- 
Vincent Faidherbe
icogs

"Do you think C++ is lovable? Unless you're into SM (Software Masochism), probably not." (JLG)



---------------------------------------------------------------------
In case of troubles, e-mail:     webmaster@xml.apache.org
To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org


Re: Crimson: Bug when document begins with whitespaces?

Posted by "Thomas B. Passin" <tp...@mitretek.org>.
[Vincent Faidherbe]

> When a parsed XML document begins with one or more whitespace characters,
> Crimson fails when it encounters the XML declaration and returns the
> message P-019: "XML declaration may only begin entities". I have this
> problem with the RSS documents generated by 10.am.

This is correct behavior and no patch should be applied.  The XML Rec
requires that the XML declaration, when present, come first in the
document, preceded only by a byte order mark (if it is utf-16).  No leading
whitespace is allowed before it.
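
The rule is easy to check against another parser.  A minimal sketch,
assuming the JAXP SAX parser bundled with the JDK rather than Crimson
itself; the class name and the sample documents are made up for
illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class LeadingWhitespaceCheck {

    // Returns true if the default JAXP SAX parser accepts the text as
    // well-formed, false if it reports a fatal error.
    static boolean parses(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)),
                           new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // XML declaration at the very start of the document.
        System.out.println(parses("<?xml version=\"1.0\"?><rss/>"));
        // Same document with one leading space before the declaration.
        System.out.println(parses(" <?xml version=\"1.0\"?><rss/>"));
    }
}
```

The first document should be accepted and the second rejected with a fatal
error, which is the same outcome Crimson reports as P-019.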

Should a processor be tolerant despite the Rec?  That's always a good
question.  Here I would say no.

Cheers,

Tom P

