Posted to c-dev@xerces.apache.org by Marat Ruvinov <Ma...@msdw.com> on 2000/07/13 16:04:56 UTC

parsing entities with Xerces-C

I'm having a problem parsing entities.

Simplified XML file:
"<?xml version="1.0"?>
<!DOCTYPE Personnel SYSTEM "person.dtd">
<personnel>
&tst;
</personnel>
"

The simplified DTD file:
"<?xml encoding="US-ASCII"?>
<!ENTITY tst "testing">
<!ELEMENT personnel (person,)+>
<!ELEMENT person (name, email*, url*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT url (#PCDATA)>
"

I get the following exception: "Invalid character (Unicode: 0x0)"

It refers to the line containing &tst; and to the character right after the
semicolon.

I have called setEntityHandler(this) on the parser. Am I not declaring the
entities correctly?

-Marat
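
For reference, the result the question expects is that the declaration
<!ENTITY tst "testing"> makes &tst; expand to the text "testing". The sketch
below illustrates only that substitution. It is not how Xerces-C implements
entities: the function name expandEntities and the map of declarations are
hypothetical, and a real parser also handles nested entities and character
references, and reports an undeclared entity as an error instead of copying it
through.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Illustrative only: replaces each "&name;" in text with the value declared
// for name. Unknown names and stray '&' characters are copied through
// unchanged (a real XML parser would raise an error for them).
std::string expandEntities(const std::string& text,
                           const std::map<std::string, std::string>& entities)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t amp = text.find('&', pos);
        if (amp == std::string::npos) {
            // No more references: copy the rest of the text.
            out.append(text, pos, std::string::npos);
            break;
        }
        // Copy everything up to the '&'.
        out.append(text, pos, amp - pos);
        const std::size_t semi = text.find(';', amp + 1);
        if (semi == std::string::npos) {
            // Unterminated reference: copy it through as-is.
            out.append(text, amp, std::string::npos);
            break;
        }
        const std::string name = text.substr(amp + 1, semi - amp - 1);
        const std::map<std::string, std::string>::const_iterator it =
            entities.find(name);
        if (it != entities.end()) {
            out += it->second;   // declared entity: substitute its value
            pos = semi + 1;
        } else {
            out += '&';          // not declared: keep the '&' and move on
            pos = amp + 1;
        }
    }
    return out;
}
```

With a map holding tst = "testing", the body of the personnel element in the
sample document would come out as the text "testing".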


Re: parsing entities with Xerces-C

Posted by Marat Ruvinov <Ma...@msdw.com>.
The problem has been resolved. It turned out that I was linking with an older
version of the Xerces library which had a bug with entities. After relinking, it
parses fine.
Thanks,

-Marat



Re: parsing entities with Xerces-C

Posted by Dean Roddey <dr...@charmedquark.com>.
What are you doing in the entity handler? There really isn't any need for
one, from what you've said so far. So perhaps you are doing something wrong
in the entity handler?

--------------------------
Dean Roddey
The CIDLib C++ Frameworks
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"You young, and you gotcha health. Whatchoo wanna job fer?"




Re: parsing entities with Xerces-C

Posted by Marat Ruvinov <Ma...@msdw.com>.
I tried running it through SAXPrint, DOMPrint, ... and I get the same error. If
I use a predefined entity like &amp; or &lt; the error goes away. I only get it
when I define the entity myself. The FAQ mentions that there could be a hidden
control character, so I used "od -c" and did not see any control characters.

Puzzled....

Marat
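
The "od -c" check described above can also be done programmatically. The
following is a minimal sketch, assuming the file is in a single-byte or UTF-8
encoding such as the US-ASCII used here (in UTF-16, zero bytes are normal).
The function names readFile and findIllegalControlChars are made up for
illustration and are not part of Xerces-C.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Reads a whole file as raw bytes. Returns an empty string if the file
// cannot be opened.
std::string readFile(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Returns the byte offsets of characters that XML 1.0 does not allow:
// anything below 0x20 except tab (0x09), line feed (0x0A) and carriage
// return (0x0D). A NUL byte (0x00) is the case the error message names.
std::vector<std::size_t> findIllegalControlChars(const std::string& data)
{
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        const bool allowedWhitespace = (c == 0x09 || c == 0x0A || c == 0x0D);
        if (c < 0x20 && !allowedWhitespace) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Calling findIllegalControlChars(readFile("person.dtd")) and getting back an
empty vector would match what "od -c" showed: no stray control characters in
the file itself.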



Re: parsing entities with Xerces-C

Posted by Dean Roddey <dr...@charmedquark.com>.
Looks like you are doing the right thing. Try running this through one of
the sample programs that come with the parser distribution. If it works
there, it's your program, probably the build options. Otherwise, let us know
and we can look at it.


