Posted to general@xerces.apache.org by Armin Pfarr <ap...@vipsurf.de> on 2000/02/04 12:38:43 UTC

CData-Attributes

Hi,

if I'm using a DTD

<!ELEMENT a
	EMPTY >
<!ATTLIST a
	href CDATA #REQUIRED >

and construct an instance

<a href="www.w3c.org?test=1&test2=5"/>

I get a parsing error:
"The Reference to entity test2 must end with the ";" delimiter."

If I understand the spec correctly, CDATA attributes are final and must not
be processed by the parser. In that case I'd expect no parsing error.

Am I right?

Armin
P.S.
The XHTML 1.0 DTD also describes the href-attributes as

"href CDATA #IMPLIED"


Re: CData-Attributes

Posted by Mike Pogue <mp...@apache.org>.
An attribute containing CDATA is not the same as a "CDATA section".

Use the Annotated XML Spec [was: CData-Attributes]

Posted by Mike Pogue <mp...@apache.org>.
By the way, the Annotated Spec is an EXCELLENT way to look up this kind of thing.
It puts a lot of the esoteric stuff into perspective, and it
uses normal language, instead of "spec language".

You can find it at:

	http://www.xml.com/axml/axml.html

Mike

Mike Pogue wrote:
> 
> Let me try that again...(I hit SEND too soon!)
> 
> An attribute of type "CDATA" is not the same as a "CDATA section".
> 
> From the spec:
> 
> The ampersand character (&) and the left angle bracket (<) may appear in their
> literal form only when used as markup delimiters, or within a comment, a
> processing instruction, or a CDATA section. They are also legal within the literal
> entity value of an internal entity declaration; see "4.3.2 Well-Formed Parsed
> Entities". If they are needed elsewhere, they must be escaped using either numeric
> character references or the strings "&amp;" and "&lt;" respectively.
> 
> From the annotated spec:
> 
> Escaping Delimiters
> 
> The spec goes into a lot of agonizing detail, but it all
> amounts to: always use &lt; for < (the "lt" stands for
> less-than, by the way) and &amp; for &.
> 
> Mike
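
The escaping rule quoted above can be sketched with Python's standard library. This is only an illustration, not something from the thread: the choice of Python and the variable names are assumptions, and the URL is the one from the original question.

```python
from xml.sax.saxutils import escape, quoteattr

# Attribute value from the original question, containing a bare "&".
url = "www.w3c.org?test=1&test2=5"

# escape() replaces "&", "<" and ">" with "&amp;", "&lt;" and "&gt;".
escaped = escape(url)

# quoteattr() escapes the value and wraps it in quotes,
# so the result can be placed directly after "href=".
element = "<a href=%s/>" % quoteattr(url)

print(escaped)
print(element)
```

Building markup through a helper like this, rather than by pasting raw strings together, keeps the "&" in query strings from being read as the start of an entity reference.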

AW: CData-Attributes

Posted by Armin Pfarr <ap...@vipsurf.de>.
Hello Mike,

thanks to your comments, I got the point.

I already had the annotated spec at hand. I looked up 3.3.1 "Attribute
Types" which states StringType::='CDATA'. There is no description there and
also no annotation. Since this IS something new in XML, I'd at least mention
it here.

It is astonishing for somebody who has some knowledge of SGML (like me) that
the CDATA construct in XML differs from the one in SGML at all AND STILL
USES THE SAME NAME. In SGML, a CDATA-Attribute is unparsed and not subject
to any translations. As a result, the typical href's in HTML can contain
something like "http://test.de?id=1&test=2". I can't see any good reason why
this has changed in XML, but maybe the intention was to disturb half of the
world.

The same goes for the treatment of public identifiers in XML. The W3C just
forgot to introduce catalogs and now my telephone starts dialing when I try
to access an XHTML document (since that includes a URL to w3c in its
system identifier). Another annoying "feature" is that SGML allows a "< "
in PCDATA, while XML wants to have "&lt; ". I could easily continue with points
like this.

I don't want to say "why didn't they use SGML" - there is lots of shit in
SGML, too. But these new points do not offer any additional benefits for
end users or parser builders (e.g. the SAX interface allows you to specify an
EntityResolver). All that leads to - in my opinion - unnecessary effort
you have to make in order to get your old content XML-ready.
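
For readers who have not used the SAX entity resolver mentioned above, here is a minimal sketch of a catalog-like resolver using Python's xml.sax. The public identifier, the LOCAL_DTDS table and the CatalogResolver class are hypothetical names made up for this illustration; whether a given parser actually consults the resolver depends on how that parser is configured.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource

# Hypothetical catalog: public identifier -> local copy of the DTD.
LOCAL_DTDS = {
    "-//EXAMPLE//DTD Link 1.0//EN":
        b"<!ELEMENT a EMPTY>\n<!ATTLIST a href CDATA #REQUIRED>\n",
}


class CatalogResolver(EntityResolver):
    """Serve known public identifiers from memory instead of the network."""

    def resolveEntity(self, publicId, systemId):
        dtd = LOCAL_DTDS.get(publicId)
        if dtd is None:
            # Unknown identifier: fall back to the system identifier.
            return systemId
        source = InputSource(systemId)
        source.setPublicId(publicId)
        source.setByteStream(io.BytesIO(dtd))
        return source


resolver = CatalogResolver()
parser = make_parser()
parser.setEntityResolver(resolver)
```

The lookup is keyed on the public identifier, which is roughly what an SGML catalog does; the system identifier is only used when the table has no entry.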

This isn't meant to criticise anybody at xml.apache.org. You are really
doing a great job.

Thanks again

Armin


Re: CData-Attributes

Posted by Mike Pogue <mp...@apache.org>.
Let me try that again...(I hit SEND too soon!)

An attribute of type "CDATA" is not the same as a "CDATA section".

Re: CData-Attributes

Posted by Andy Clark <an...@apache.org>.
In addition to Mike's excellent comments, I'll list the necessary
changes in your document:

> <a href="www.w3c.org?test=1&test2=5"/>

Should be:

<a href="www.w3c.org?test=1&amp;test2=5"/>

> I get a parsing error:
> "The Reference to entity test2 must end with the ";" delimiter."

The parser thinks that you are referencing an entity
called "test2" (the name characters after the ampersand).
When it hits the equals sign it barfs, because it expects
the ";" that ends the entity reference.
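
The difference between the two forms can be checked with a short sketch. It assumes Python's xml.etree.ElementTree (an expat-based parser, not Xerces), and try_parse is a hypothetical helper written for this illustration; the exact error text differs between parsers.

```python
import xml.etree.ElementTree as ET

raw = '<a href="www.w3c.org?test=1&test2=5"/>'
fixed = '<a href="www.w3c.org?test=1&amp;test2=5"/>'


def try_parse(text):
    """Return the href value, or None if the text is not well-formed."""
    try:
        return ET.fromstring(text).get("href")
    except ET.ParseError:
        return None


raw_result = try_parse(raw)      # bare "&" starts an unterminated reference
fixed_result = try_parse(fixed)  # "&amp;" is turned back into "&"

print(raw_result)
print(fixed_result)
```

Note that the application still receives a literal "&" in the attribute value; the escaping only affects how the document is written.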

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org