Posted to c-dev@xerces.apache.org by Magnus Strand <ma...@tim.se> on 2003/04/22 10:31:21 UTC
Problems parsing entities
Hi,
I am having problems parsing entities.
When I print the test XML file below with the DOMPrint sample, entities
(both text and character entities) come out wrong.
The second time I use an entity, it is output twice!
The third time I use an entity, it is output three times, and so on.
I also tested with Xerces-J, which does not show the problem.
I spent a day debugging with VC++7 on Win 2000 and also CodeWarrior 8.3
on Mac.
It seems to be a problem in AbstractDOMParser::parse.
I think it was in the docCharacterData method that the entity's content
got appended to the DOMEntityRefImpl when it was used.
The DOMEntityRefImpl node is changed from read-only to read/write
when the text in the test element is parsed.
I wonder if this is correct?
If the DOMEntityRefImpl were read-only at this time, it wouldn't get
duplicated content.
Does anyone know a solution to this problem?
------------
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE test
[
<!ELEMENT test (#PCDATA)>
<!ENTITY greeting "hi">
]
><test>&greeting;|&greeting;|&greeting;</test>
--------
I get this output:
---------
<?xml version="1.0" encoding="iso-8859-1" standalone="no" ?><!DOCTYPE
test [
<!ELEMENT test (#PCDATA)>
<!ENTITY greeting "hi">
]><test>hi|hihi|hihihi</test>
------------
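To make the suspected mechanism concrete, here is a minimal standard-library C++ sketch. It is a toy model only, not Xerces-C++ code: EntityRefNode, expand, keepReadOnly and checkModel are invented names, and the idea that one node per entity is shared between uses is an assumption made purely to reproduce the symptom. It shows how a reference node that is made writable while character data arrives could grow on each use, and how keeping it read-only would give the expected result.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for an entity reference node (not DOMEntityRefImpl).
struct EntityRefNode {
    std::string content;   // expansion text held by the node
    bool readOnly = true;  // the DOM treats entity reference nodes as read-only

    // Append character data; a read-only node refuses the write.
    bool appendData(const std::string& text) {
        if (readOnly) return false;
        content += text;
        return true;
    }
};

// Expand "&name;" references in `input`.
// keepReadOnly == false models a parser that switches the node to read/write
// whenever character data for the entity is delivered (the suspected fault).
std::string expand(const std::string& input,
                   const std::map<std::string, std::string>& entities,
                   bool keepReadOnly) {
    std::map<std::string, EntityRefNode> nodes;  // ASSUMPTION: one shared node per entity
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        std::size_t semi = std::string::npos;
        if (input[i] == '&') semi = input.find(';', i);
        if (semi == std::string::npos) {  // ordinary character
            out += input[i++];
            continue;
        }
        const std::string name = input.substr(i + 1, semi - i - 1);
        const auto decl = entities.find(name);
        if (decl == entities.end()) {     // undeclared entity: copy it through
            out.append(input, i, semi - i + 1);
            i = semi + 1;
            continue;
        }
        auto it = nodes.find(name);
        if (it == nodes.end()) {
            // First use: build the node, fill it once, then lock it.
            EntityRefNode node;
            node.readOnly = false;
            node.appendData(decl->second);
            node.readOnly = true;
            it = nodes.emplace(name, node).first;
        } else {
            // Later uses: character data is delivered to the existing node again.
            it->second.readOnly = keepReadOnly;
            it->second.appendData(decl->second);
            it->second.readOnly = true;
        }
        out += it->second.content;
        i = semi + 1;
    }
    return out;
}

// Usage: the same input as the test file's <test> element.
void checkModel() {
    std::map<std::string, std::string> entities;
    entities["greeting"] = "hi";
    const std::string text = "&greeting;|&greeting;|&greeting;";
    assert(expand(text, entities, false) == "hi|hihi|hihihi");  // observed output
    assert(expand(text, entities, true) == "hi|hi|hi");         // expected output
}
```

With keepReadOnly set to false the model gives the same text as the DOMPrint output above; with it set to true each reference expands to a single "hi". Whether the real parser shares state in this way is not something this sketch can confirm.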
Regards,
Magnus Strand
System Developer, MSc
Teknik i Media Sverige AB (publ)
Södra Förstadsgatan 2, SE-211 43 Malmö, Sweden
http://www.tim.se
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
Re: Problems parsing entities
Posted by Gareth Reakes <ga...@decisionsoft.com>.
Hi,
This is a bug. I am looking into it further and will produce a
Bugzilla report and/or fix for it.
Gareth
--
Gareth Reakes, Head of Product Development +44-1865-203192
DecisionSoft Limited http://www.decisionsoft.com
XML Development and Services
RE: Problems parsing entities (using DOM parser)
Posted by Gareth Reakes <ga...@decisionsoft.com>.
I'll take a look at this today.
Gareth
On Fri, 25 Apr 2003, Magnus Strand wrote:
> Hi,
>
> I would like to know whether anyone can confirm that the problem described
> in my previous e-mail is a bug.
>
> Many thanks,
> Magnus Strand
>
> PS. Thanks Erik for the code for getTextContent, it works well.
>
> System Developer, MSc
>
> Teknik i Media Sverige AB (publ)
> Södra Förstadsgatan 2, SE-211 43 Malmö, Sweden
> http://www.tim.se
--
Gareth Reakes, Head of Product Development +44-1865-203192
DecisionSoft Limited http://www.decisionsoft.com
XML Development and Services
RE: Problems parsing entities (using DOM parser)
Posted by Magnus Strand <ma...@tim.se>.
Hi,
I would like to know whether anyone can confirm that the problem described
in my previous e-mail is a bug.
Many thanks,
Magnus Strand
PS. Thanks Erik for the code for getTextContent, it works well.
System Developer, MSc
Teknik i Media Sverige AB (publ)
Södra Förstadsgatan 2, SE-211 43 Malmö, Sweden
http://www.tim.se