Posted to j-users@xerces.apache.org by Simon Kitching <si...@ecnetwork.co.nz> on 2004/07/25 06:18:19 UTC

Re: Content is not allowed in prolog error when sax parsing large file

On Sat, 2004-07-24 at 11:06, Duane Jung wrote:
> Hi,
> 
> While parsing a large XML file (1000 items/56 MB), I get the following error:
> 
> [Fatal Error] 1k_CIN.xml:2:1: Content is not allowed in prolog.
> 
> I've chopped this file up into smaller sets of items (50 items/1.5 MB) and can parse those without
> any problems.  It's only when I try to parse the larger file that I encounter the prolog error.
> 
> The prolog is the same in all of the files:
> <?xml version="1.0" encoding="UTF-8"?>
> 
> I'm using xerces-2_6_2, parsing with a SAX XMLReader.  I've tried passing in a system id and
> an InputSource to the parser -- both result in the prolog error with the larger file.  The smaller
> files parse without any problems.
> 
> Does anyone have any ideas?

Normally, this error message indicates that there is some non-XML
content present before the XML starts.

I'm wondering if there are some "invisible" chars in your document and
the way you "chop up" the file is cleaning them out.

If you are using Unix/Linux, have you tried inspecting the file using
something like
  od -cx input.xml | more
to see *exactly* what chars your file starts with?
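
If od isn't handy (on Windows, say), roughly the same check can be
sketched in Java. This is only a minimal sketch and not anything that
ships with Xerces: the class name PrologInspector, its two methods and
the 256-byte window are made up for illustration, and the "suspect byte"
rule (anything other than printable ASCII, tab, CR or LF) is just a
heuristic for the first few lines of a file that declares UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of Xerces): inspects the first bytes of a file.
class PrologInspector {

    // Returns the offset of the first byte within the first `limit` bytes that is
    // neither printable ASCII nor tab/CR/LF, or -1 if no such byte is found.
    static int firstSuspectOffset(byte[] data, int limit) {
        int end = Math.min(limit, data.length);
        for (int i = 0; i < end; i++) {
            int b = data[i] & 0xFF;
            boolean whitespace = b == 0x09 || b == 0x0A || b == 0x0D;
            boolean printable = b >= 0x20 && b <= 0x7E;
            if (!whitespace && !printable) {
                return i;
            }
        }
        return -1;
    }

    // Formats the first `limit` bytes as space-separated lowercase hex.
    static String hexDump(byte[] data, int limit) {
        int end = Math.min(limit, data.length);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < end; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        int limit = 256; // assumed window; large enough to cover a short prolog
        byte[] head = new byte[limit];
        int n = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int r;
            // read() may return fewer bytes than asked for, so loop until full or EOF
            while (n < limit && (r = in.read(head, n, limit - n)) != -1) {
                n += r;
            }
        }
        byte[] seen = Arrays.copyOf(head, n);
        System.out.println(hexDump(seen, limit));
        System.out.println("first suspect byte offset: " + firstSuspectOffset(seen, limit));
    }
}
```

Run it as "java PrologInspector input.xml". An offset of -1 means nothing
odd was seen in that window. Note that a UTF-8 byte order mark (ef bb bf)
at offset 0 would also be flagged here, even though it is not necessarily
the cause of the error.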

Regards,

Simon


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Content is not allowed in prolog error when sax parsing large file

Posted by Duane Jung <du...@yahoo.com>.
Hi Simon,

That was the exact problem!  I had some invisible characters from my dataset creation process. 
Thank you very much!!!

Duane




		