Posted to general@xerces.apache.org by Ryosuke Nanba <Ry...@justsystem.co.jp> on 2000/02/24 12:31:00 UTC

[BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Hi everyone,

I found that Xerces-J's non-validating parser (the default configuration
of org.apache.xerces.parsers.SAXParser) tries to establish a network
connection in some cases.

#  Version number of Xerces-J: 1.0.2 and 1.0.0 
#  Version number of JDK: 1.2.2 

The following cases cause a network connection:

 <html xmlns="http://www.w3.org/1999/xhtml">
   ...
 </html>

 <x:html xmlns="http://horobi.com/" xmlns:x="http://www.w3.org/1999/xhtml">
   ...
 </x:html>

The following cases are parsed silently:

 <x:html xmlns:x="http://www.w3.org/1999/xhtml">
   ...
 </x:html>

 <x:html xmlns:x="http://www.w3.org/1999/xhtml" xmlns="http://horobi.com/">
   ...
 </x:html>

 <html foo="bar" xmlns="http://www.w3.org/1999/xhtml">
   ...
 </html>

It seems that only when the first attribute of the root element is "xmlns"
does the parser access the URL in its value (to read a schema entity?).

Is this the correct behavior, or a bug?
# I don't know much about XML Schema...
---
	NANBA Ryosuke

Re: [BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Posted by Ryosuke Nanba <Ry...@justsystem.co.jp>.
Hi everyone.

I wrote:
> With the hack applied, the parser still accesses www.w3.org when I
> feed it UTF-16 input.
> # try utf-16le.xml & utf-16be.xml in samples.zip.
> 
> And with the hack applied, the parser can't parse some UTF-8 input
> that the normal SAXParser can parse.
> # try diary.xhtml (written in Japanese) in samples.zip

All cases work fine on Xerces-J 1.0.3, without any hack!
Thanks!
---
	Ryosuke Nanba

Re: [BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Posted by Ryosuke Nanba <Ry...@justsystem.co.jp>.
Hi, everyone.

Pierpaolo Fumagalli wrote:
> One easy hack is to set the "namespace" feature to true...
> It solved that in my environment...

Thank you!
But I'm still unhappy...

With the hack applied, the parser still accesses www.w3.org when I
feed it UTF-16 input.
# try utf-16le.xml & utf-16be.xml in samples.zip.

And with the hack applied, the parser can't parse some UTF-8 input
that the normal SAXParser can parse.
# try diary.xhtml (written in Japanese) in samples.zip 
---
	Ryosuke Nanba

Re: [BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Posted by Pierpaolo Fumagalli <pi...@apache.org>.
Andy Clark wrote:
> 
> Ryosuke Nanba wrote:
> > It seems that only when the first attribute of the root element is "xmlns"
> > does the parser access the URL in its value (to read a schema entity?).
> >
> > Is this the correct behavior, or a bug?
> 
> You are correct. The experimental implementation of Schema
> requires that the first attribute of the document root be
> "xmlns" if you want to load a Schema grammar. This is a
> problem that will be fixed in the future. But for now it
> is a limitation.

One easy hack is to set the "namespace" feature to true...
It solved that in my environment...
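
For readers who want to try this, here is a minimal sketch of what setting
that feature might look like. It assumes the "namespace" feature mentioned
above is the standard SAX2 feature http://xml.org/sax/features/namespaces,
and it uses the JAXP factory that ships with current JDKs rather than
Xerces-J 1.0.x itself. The class and method names are made up for
illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Xerces-J.
final class NamespaceFeatureSketch {
    // Standard SAX2 feature identifier for namespace processing.
    // Assumption: this is the "namespace" feature referred to in the thread.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Parses xml with namespace processing on; returns the root element's namespace URI. */
    static String rootNamespace(String xml) throws Exception {
        // With Xerces-J 1.x one would presumably call setFeature on an
        // org.apache.xerces.parsers.SAXParser instance instead (untested assumption).
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(NAMESPACES, true);

        final String[] root = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Remember only the first (root) element's namespace URI.
                if (root[0] == null) {
                    root[0] = uri;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootNamespace(
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body/></html>"));
    }
}
```

This only shows how the feature is switched on; whether it avoids the
network access depends on the parser version in use.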

	Pier

Re: [BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Posted by Ryosuke Nanba <Ry...@justsystem.co.jp>.
Hi everyone.

Andy Clark wrote:
> You are correct. The experimental implementation of Schema
> requires that the first attribute of the document root be
> "xmlns" if you want to load a Schema grammar.

I'm sorry! I didn't read "Other Limitations" in 
http://xml.apache.org/xerces-j/schema.html

> This is a
> problem that will be fixed in the future. But for now it
> is a limitation.

OK. But why is the first attribute treated specially?
I don't think it's a good idea.

Attributes and namespace declarations are unordered sets.
Tools may not have options to control attribute order.
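
To illustrate the point about ordering, here is a small sketch that parses
two root elements differing only in attribute order and compares what the
parser reports. It uses the JAXP SAX parser from a current JDK with its
default (non-validating) settings, not Xerces-J 1.0.x, and the class and
method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Xerces-J.
final class AttributeOrderSketch {
    /** Returns the root element's attributes keyed by qualified name. */
    static Map<String, String> rootAttributes(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final Map<String, String> attrs = new TreeMap<>();
        final boolean[] seenRoot = {false};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (seenRoot[0]) {
                    return; // only the root element is of interest
                }
                seenRoot[0] = true;
                for (int i = 0; i < atts.getLength(); i++) {
                    attrs.put(atts.getQName(i), atts.getValue(i));
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return attrs;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> a = rootAttributes(
            "<html xmlns=\"http://www.w3.org/1999/xhtml\" foo=\"bar\"/>");
        Map<String, String> b = rootAttributes(
            "<html foo=\"bar\" xmlns=\"http://www.w3.org/1999/xhtml\"/>");
        // The two documents differ only in attribute order.
        System.out.println(a.equals(b));
    }
}
```

An application that compares the two maps sees no difference between the
documents, which is why order-dependent parser behavior is surprising.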
---
	Ryosuke Nanba

Re: [BUG?] first "xmlns" attribute of root element causes network connection on non-validating parsing.

Posted by Andy Clark <an...@apache.org>.
Ryosuke Nanba wrote:
> It seems that only when the first attribute of the root element is "xmlns"
> does the parser access the URL in its value (to read a schema entity?).
> 
> Is this the correct behavior, or a bug?

You are correct. The experimental implementation of Schema
requires that the first attribute of the document root be
"xmlns" if you want to load a Schema grammar. This is a 
problem that will be fixed in the future. But for now it
is a limitation.

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org