Posted to j-dev@xerces.apache.org by Sebastien Ponce <se...@cern.ch> on 2000/10/13 18:42:19 UTC

Entity references inside attributes

I have run into some strange behavior with the Xerces DOMParser when I
use entity references inside attributes. These references are replaced
by their values but do not create entity reference nodes in the DOM
tree, whereas entity references outside attributes do.
Here is an example:

Here is my (very simple) DTD:
<!ENTITY helloEntity "hello">
<!ELEMENT DDDB (#PCDATA)>
<!ATTLIST DDDB attr CDATA #REQUIRED>

and here is an XML file using it:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DDDB SYSTEM "test.dtd">
<DDDB attr="&helloEntity; world node">
 &helloEntity; world !
</DDDB>

The DOM tree looks like this:
DDDB
    Attribute attr with value "hello world node"
    Entity Reference helloEntity
        Text node "hello"
    Text node "world"

I would like it to look like this:
DDDB
    Attribute attr with value "hello world node"
        Entity Reference helloEntity
            Text node "hello"
        Text node "world node"
    Entity Reference helloEntity
        Text node "hello"
    Text node "world"

Is this normal behavior for Xerces? If so, is there a simple way to
parse the attribute strings?

Sebastien


Re: Entity references inside attributes

Posted by Sebastien Ponce <se...@cern.ch>.
> Yes, this is normal. Xerces does not put entity references in the
> attribute's child nodes. It currently has no way of reporting the
> occurrence of entity references in attribute values.
>
> The Xerces 2 concept design has the ability to do this but it's
> just not implemented, yet. While the design can support it, I can
> already see a problem looming on the horizon. When the scanner
> reads the attribute value, it can store the occurrence of entity
> references. However, when the validator goes to normalize those
> values based on the attribute's type, the location of entity
> references can (and will) move. I think that it is definitely
> implementable; it'll just be a little challenging. Would you
> like to write this part of the code?

I would like to write it, but I currently have no time to do so. I'll
think about it as soon as I get some time.

Sebastien


Re: Entity references inside attributes

Posted by Andy Clark <an...@apache.org>.
Sebastien Ponce wrote:
> I would like it to look like :
> DDDB
>     Attribute attr with value "hello world node"
>         Entity Reference helloEntity
>             Text node "hello"
>         Text node "world node"
>     Entity Reference helloEntity
>         Text node "hello"
>     Text node "world"
> 
> Is this normal behavior for Xerces? If so, is there a simple
> way to parse the attribute strings?

Yes, this is normal. Xerces does not put entity references in the
attribute's child nodes. It currently has no way of reporting the
occurrence of entity references in attribute values.
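
For anyone who wants to check this against their own parser, here is a
minimal sketch. It is not taken from the Xerces sources. It assumes a
JAXP DocumentBuilder is available (the parser behind it may or may not
be Xerces), the class name AttrEntityDump is made up for illustration,
and the DTD from the original post is inlined as an internal subset so
that no test.dtd file is needed. It asks the parser not to expand
entity references and then prints every node, including the children
of the attribute, so you can see where EntityReference nodes do and do
not show up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class AttrEntityDump {
    // Same content as the example in this thread, with the DTD inlined.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE DDDB [\n"
        + "<!ENTITY helloEntity \"hello\">\n"
        + "<!ELEMENT DDDB (#PCDATA)>\n"
        + "<!ATTLIST DDDB attr CDATA #REQUIRED>\n"
        + "]>\n"
        + "<DDDB attr=\"&helloEntity; world node\">\n"
        + " &helloEntity; world !\n"
        + "</DDDB>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask for EntityReference nodes instead of inlined replacement text.
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Prints a node, then its attributes (if any), then its children.
    static void dump(Node node, String indent) {
        System.out.println(indent + "type=" + node.getNodeType()
                + " name=" + node.getNodeName()
                + " value=" + node.getNodeValue());
        NamedNodeMap attrs = node.getAttributes(); // null unless node is an Element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                dump(attrs.item(i), indent + "  @");
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            dump(c, indent + "  ");
        }
    }

    public static void main(String[] args) {
        dump(parse().getDocumentElement(), "");
    }
}
```

Node type 5 is EntityReference and type 3 is Text. Whatever the parser
does with the attribute's children, the attribute value itself should
always come back as "hello world node".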

The Xerces 2 concept design has the ability to do this but it's
just not implemented, yet. While the design can support it, I can
already see a problem looming on the horizon. When the scanner
reads the attribute value, it can store the occurrence of entity
references. However, when the validator goes to normalize those
values based on the attribute's type, the location of entity
references can (and will) move. I think that it is definitely
implementable; it'll just be a little challenging. Would you
like to write this part of the code?
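
To make the moving-location problem concrete, here is a small sketch.
It is only an illustration, not the Xerces 2 design: the class name
OffsetShift and the offset-map approach are assumptions made for this
example. It applies the extra normalization step that XML 1.0 defines
for non-CDATA attribute types (drop leading and trailing spaces,
collapse runs of spaces to one) and records where each original
character position ends up, so that a span recorded by the scanner can
be translated afterwards.

```java
// Hypothetical helper class, for illustration only.
class OffsetShift {
    /**
     * Collapses spaces as required for non-CDATA attribute types.
     * Assumes tabs and newlines were already turned into spaces by the
     * first normalization pass. On return, map[i] holds the position in
     * the result that corresponds to position i in the input;
     * map must have length value.length() + 1.
     */
    static String collapse(String value, int[] map) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == ' ') {
                map[i] = out.length();
                // Leading spaces are dropped; later runs become one space,
                // emitted only if another non-space character follows.
                if (out.length() > 0) {
                    pendingSpace = true;
                }
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                map[i] = out.length();
                out.append(c);
            }
        }
        map[value.length()] = out.length();
        return out.toString();
    }

    public static void main(String[] args) {
        // Value after entity expansion; the scanner saw the text of
        // helloEntity at positions [2, 7).
        String expanded = "  hello   world  ";
        int start = 2, end = 7;

        int[] map = new int[expanded.length() + 1];
        String normalized = collapse(expanded, map);

        System.out.println("normalized: \"" + normalized + "\"");
        System.out.println("entity span moved from [" + start + ", " + end
                + ") to [" + map[start] + ", " + map[end] + ")");
    }
}
```

With this input the value becomes "hello world" and the recorded span
[2, 7) moves to [0, 5), which is the kind of shift the validator would
have to track.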

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org