Posted to general@xerces.apache.org by Pierpaolo Fumagalli <pi...@apache.org> on 2000/02/10 05:18:12 UTC

Small bug in SAXParser implementation (I believe)

I found a small "bug" or a discrepancy between how the SAX parser
behaves and how the DOM parser does:

I have this document:

<document xmlns:test="http://www.betaversion.org/2000/test">
    <test:test>Test</test:test>
</document>

When parsed, the DOM parser reports that the attribute "xmlns:test" has
the following properties:
  - namespace uri: http://www.w3.org/2000/xmlns/
  - prefix:        xmlns
  - local name:    test
  - (raw) name:    xmlns:test
  - value:         http://www.betaversion.org/2000/test

while the SAX parser reports:
  - namespace uri: (none reported)
  - prefix:        (not available in SAX)
  - local name:    test
  - (raw) name:    xmlns:test
  - value:         http://www.betaversion.org/2000/test

In short, for xmlns declarations the DOM parser reports that the
namespace URI is "http://www.w3.org/2000/xmlns/", while the SAX parser
doesn't report any URI.
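
For anyone who wants to reproduce the comparison, here is a minimal sketch.
It is not the code I ran: it assumes the JAXP factories bundled with a
current JDK rather than the Xerces parser classes directly, and the class
and method names (XmlnsUriCheck, domXmlnsUri, saxXmlnsUri) are made up for
illustration. The SAX side turns on the standard
"http://xml.org/sax/features/namespace-prefixes" feature so that the
xmlns:* attributes show up in the Attributes list at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class XmlnsUriCheck {
    static final String XML =
        "<document xmlns:test=\"http://www.betaversion.org/2000/test\">"
        + "<test:test>Test</test:test></document>";

    /** Namespace URI that the DOM reports for the xmlns:test attribute. */
    static String domXmlnsUri(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        Document doc = f.newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
        Attr attr = doc.getDocumentElement().getAttributeNode("xmlns:test");
        return attr.getNamespaceURI();
    }

    /** Namespace URI that SAX reports for the xmlns:test attribute. */
    static String saxXmlnsUri(String xml) throws Exception {
        SAXParserFactory f = SAXParserFactory.newInstance();
        f.setNamespaceAware(true);
        // Keep xmlns:* attributes in the Attributes list.
        f.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
        final String[] uri = new String[1];
        f.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String ns, String local, String raw,
                                         Attributes atts) {
                    // Look the attribute up by its qualified (raw) name.
                    int i = atts.getIndex("xmlns:test");
                    if (i >= 0) {
                        uri[0] = atts.getURI(i);
                    }
                }
            });
        return uri[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DOM: [" + domXmlnsUri(XML) + "]");
        System.out.println("SAX: [" + saxXmlnsUri(XML) + "]");
    }
}
```

Printing the two values in brackets makes an empty SAX URI easy to spot
next to the DOM one.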

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>
--------------------------------------------------------------------
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------

Re: The last bug in SAXParser implementation for tonight

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
Andy Clark wrote:
> 
> Pierpaolo Fumagalli wrote:
> > I see a strange behaviour of the SAXParser when I parse this document:
> 
> How did this message get into this thread? ;)
> 
> Anyway, I'm not surprised that there are problems with SAX2beta
> support. It has *not* been tested, yet. I posted a request on
> the mailing list for someone to test it (and provide the patches)
> but it hasn't happened. Therefore, Ralf has agreed to look into
> testing it. It will be tested before we release.

I am reporting a bug :)

	Pier


Re: The last bug in SAXParser implementation for tonight

Posted by Andy Clark <an...@apache.org>.
Pierpaolo Fumagalli wrote:
> I see a strange behaviour of the SAXParser when I parse this document:

How did this message get into this thread? ;)

Anyway, I'm not surprised that there are problems with SAX2beta
support. It has *not* been tested, yet. I posted a request on
the mailing list for someone to test it (and provide the patches)
but it hasn't happened. Therefore, Ralf has agreed to look into
testing it. It will be tested before we release.

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org

The last bug in SAXParser implementation for tonight

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
Here I come again :) For the last time tonight (I hope) :)

I see a strange behaviour of the SAXParser when I parse this document:

<?xml version="1.0"?>

<!DOCTYPE document [
  <!ENTITY nbsp "&#64;">
]>

<document>
  <test value="&nbsp;"/>
</document>

What happens on the SAX side? Here is a "trace" from the combined content
and lexical handlers I installed:

setDocumentLocator
startDocument
  startDTD name="document" publicId=(null) systemId=(null)
  endDTD
  startElement uri="" loc="document" raw="document"
    characters [
  ]
    startEntity name="nbsp"
    endEntity name="nbsp"
    startElement uri="" loc="test" raw="test"
    + attribute uri="" loc="value" raw="value" typ="CDATA" val="@"
    endElement uri="" loc="test" raw="test"
    characters [
]
  endElement uri="" loc="document" raw="document"
endDocument

You see? Before the notification of the "test" element, I receive a
notification for an entity with no body called "nbsp", which is the entity
referenced in the attribute of the test element.
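
A trace like the one above can be collected with a handler along these
lines. This is only a sketch under assumptions: it uses JAXP and
org.xml.sax.ext.DefaultHandler2 from a current JDK, not the handlers I
actually installed, and the class name EntityTrace and the event strings
are invented for the example. The lexical handler is registered through the
standard "http://xml.org/sax/properties/lexical-handler" property.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical tracing class, for illustration only.
class EntityTrace {
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE document [\n  <!ENTITY nbsp \"&#64;\">\n]>\n"
        + "<document>\n  <test value=\"&nbsp;\"/>\n</document>\n";

    /** Parses the document and returns the recorded events in order. */
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        // DefaultHandler2 is both a ContentHandler and a LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void startDTD(String name, String pub, String sys) {
                events.add("startDTD name=" + name);
            }
            @Override public void endDTD() {
                events.add("endDTD");
            }
            @Override public void startEntity(String name) {
                events.add("startEntity name=" + name);
            }
            @Override public void endEntity(String name) {
                events.add("endEntity name=" + name);
            }
            @Override public void startElement(String uri, String loc,
                                               String raw, Attributes atts) {
                StringBuilder sb = new StringBuilder("startElement raw=" + raw);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(' ').append(atts.getQName(i))
                      .append('=').append(atts.getValue(i));
                }
                events.add(sb.toString());
            }
            @Override public void endElement(String uri, String loc, String raw) {
                events.add("endElement raw=" + raw);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace(XML)) {
            System.out.println(event);
        }
    }
}
```

Whether startEntity/endEntity show up for the reference inside the
attribute value depends on the parser build, which is exactly what the
trace is meant to reveal.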

Done bothering for today (probably!)

	Pier


Re: Small bug in SAXParser implementation (I believe)

Posted by Arnaud Le Hors <le...@us.ibm.com>.
Pierpaolo Fumagalli wrote:
> 
> Yes, but as far as I can see, since the xmlns prefix is bound to
> "http://www.w3.org/2000/xmlns/", why doesn't the SAX parser report
> it? In my code I check for that, but IMVHO, the parser itself should
> specify the URI when it meets an xmlns:* attribute...

Making this change now would mean our SAX parser is no longer compliant
with the current Namespaces in XML specification. Again, it's
unfortunate that there is a discrepancy between the DOM and the SAX
parser, but we'll have to live with it until the Namespaces in XML
spec is revised or an erratum is published.
-- 
Arnaud  Le Hors - IBM Cupertino, XML Technology Group

Re: Small bug in SAXParser implementation (I believe)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
Arnaud Le Hors wrote:
> 
> Pierpaolo Fumagalli wrote:
> >
> > I found a small "bug" or a discrepancy between how the SAX parser
> > behaves and how the DOM parser does:
> 
> That's not a bug. As far as the DOM Level 2 is concerned, every namespace
> declaration attribute is, by definition, bound to
> "http://www.w3.org/2000/xmlns/". These are attributes whose prefix
> is "xmlns" or whose qualified name is "xmlns" (default namespace
> declaration). This was recently decided, so no public document reflects
> this yet, but I'm the editor of the Core section of the DOM Level 2 spec
> so you can trust me on that. ;-)
> 
> This decision has been ratified by the XML Core WG and will be stated in
> the next revision of the Namespaces for XML specification. Until then, I
> think we'll have to live with this discrepancy.

Yes, but as far as I can see, since the xmlns prefix is bound to
"http://www.w3.org/2000/xmlns/", why doesn't the SAX parser report
it? In my code I check for that, but IMVHO, the parser itself should
specify the URI when it meets an xmlns:* attribute...
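
The kind of check I mean looks roughly like the sketch below. The names
(XmlnsAttributes, isNamespaceDeclaration, effectiveUri) are hypothetical
and this is not the actual code from my translators; it just applies the
rule quoted above, that an attribute is a namespace declaration when its
raw name is "xmlns" or starts with "xmlns:".

```java
// Hypothetical helper, for illustration only.
class XmlnsAttributes {
    static final String XMLNS_URI = "http://www.w3.org/2000/xmlns/";

    /** True if the raw (qualified) name denotes a namespace declaration. */
    static boolean isNamespaceDeclaration(String rawName) {
        return rawName.equals("xmlns") || rawName.startsWith("xmlns:");
    }

    /**
     * Returns the URI to hand to DOM-style consumers: the xmlns URI for
     * namespace declarations, otherwise whatever SAX reported.
     */
    static String effectiveUri(String reportedUri, String rawName) {
        if (isNamespaceDeclaration(rawName)) {
            return XMLNS_URI;
        }
        return reportedUri == null ? "" : reportedUri;
    }

    public static void main(String[] args) {
        System.out.println(effectiveUri("", "xmlns:test"));
        System.out.println(effectiveUri("", "value"));
    }
}
```

Calling effectiveUri from startElement for each attribute papers over the
difference on the SAX side without touching the parser.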

	Pier


Re: Small bug in SAXParser implementation (I believe)

Posted by Arnaud Le Hors <le...@us.ibm.com>.
Pierpaolo Fumagalli wrote:
> 
> I found a small "bug" or a discrepancy between how the SAX parser
> behaves and how the DOM parser does:

That's not a bug. As far as the DOM Level 2 is concerned, every namespace
declaration attribute is, by definition, bound to
"http://www.w3.org/2000/xmlns/". These are attributes whose prefix
is "xmlns" or whose qualified name is "xmlns" (default namespace
declaration). This was recently decided, so no public document reflects
this yet, but I'm the editor of the Core section of the DOM Level 2 spec
so you can trust me on that. ;-)

This decision has been ratified by the XML Core WG and will be stated in
the next revision of the Namespaces for XML specification. Until then, I
think we'll have to live with this discrepancy.
-- 
Arnaud  Le Hors - IBM Cupertino, XML Technology Group

Another small bug in DOMParser implementation

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
I'm a pain... I know it :)
But I found another one....
On the DOM parser, if I use:

> parser.setFeature("http://apache.org/xml/features/dom/defer-node-expansion",false);

which is NOT the default, I get the whole document correctly.
If instead I leave the defer-node-expansion feature set to true, then
instead of seeing EntityReferenceImpl nodes, I get an unknown node of type -1
(its class: class org.apache.xerces.dom.DeferredElementDefinitionImpl).
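
A quick way to compare the two settings is to walk the tree and record
every node type, once per setting. The sketch below is an assumption-laden
version of that idea: it goes through JAXP's DocumentBuilderFactory on a
current JDK (whose bundled parser is assumed to recognize the Xerces
feature URI; other implementations may throw ParserConfigurationException),
it uses a document with the entity reference in element content, and the
names DeferCheck and nodeTypes are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical comparison class, for illustration only.
class DeferCheck {
    static final String FEATURE =
        "http://apache.org/xml/features/dom/defer-node-expansion";
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE document [\n  <!ENTITY nbsp \"&#64;\">\n]>\n"
        + "<document>&nbsp;</document>";

    /** Parses with the given defer setting and lists node types in document order. */
    static List<Short> nodeTypes(String xml, boolean defer) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Keep entity references as nodes instead of expanding them away.
        factory.setExpandEntityReferences(false);
        // Assumption: the underlying parser is Xerces-derived and knows this feature.
        factory.setFeature(FEATURE, defer);
        Document doc = factory.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
        List<Short> types = new ArrayList<>();
        collect(doc, types);
        return types;
    }

    private static void collect(Node node, List<Short> out) {
        out.add(node.getNodeType());
        // The children of an entity reference are its expansion; skip them.
        if (node.getNodeType() == Node.ENTITY_REFERENCE_NODE) {
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, out);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("deferred:     " + nodeTypes(XML, true));
        System.out.println("non-deferred: " + nodeTypes(XML, false));
    }
}
```

Any value in the deferred list that is not one of the org.w3c.dom.Node
type constants (such as the -1 described above) would point at the problem.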

I'm building some new SAX2 -> DOM2 and DOM2 -> SAX2 translators, so,
I'm trying out all possible combinations of stuff....

BTW, does anyone know of good test cases???

	Pier
