Posted to j-users@xerces.apache.org by Simon Kitching <si...@ecnetwork.co.nz> on 2003/04/16 10:09:05 UTC

Re: Can I stop SAX validation from resolving non-approved namespaces?

On Wed, 2003-04-16 at 19:24, Andy Taylor wrote:
> Hi Jeff,
> 
> I understand this, but I would like to stop the validation process
> (which, I believe, will run through the entire document resolving
> entities BEFORE using my SAX ContentHandler Interface) from accessing
> DTDs and schemas on unknown machines.
> 
> The only way I can think of doing this would be to parse the document
> twice - first with validation off so I can spot any "unapproved"
> namespaces. If all namespaces are OK, I can then parse the document
> again with validation on. This seems like a bit of a waste of time!

This should not be necessary.

When a document is being parsed, and a namespace is encountered, then
the registered EntityResolver is *immediately* called, passing the
namespace URI.

The entity resolver can inspect this URI, and do any of the following
(on a per-URI basis):

(a) return null, causing Xerces to fall back to its default behaviour,
which is to attempt to load the document referenced by the URI.

(b) return an input stream for any source of data you wish. This is how
to implement a local "cache" of schemas/DTDs: look up a mapping of
URI->local-file-name and return the contents of the local file. Xerces
will then accept this data *instead of* fetching from the original URI.

(c) throw an exception. This will cancel any further processing of the
document. This, I believe, is what you want to do if an unknown URI is
encountered.
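
As a rough illustration of options (b) and (c), here is a minimal sketch
of such a resolver using the standard org.xml.sax.EntityResolver
interface. The class name ApprovedEntityResolver and its approve()
method are made up for this example, and in-memory strings stand in for
the local files a real cache would read. It never uses option (a), so
nothing is ever fetched from the network by default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * Sketch of an "approved list" resolver: known system IDs are served
 * from local copies, anything else aborts the parse.
 * (Hypothetical class, not part of Xerces.)
 */
final class ApprovedEntityResolver implements EntityResolver {

    // Assumed mapping: system ID -> text of the local schema/DTD copy.
    // A real implementation would more likely map to local file names.
    private final Map<String, String> approved = new HashMap<>();

    void approve(String systemId, String localContent) {
        approved.put(systemId, localContent);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        String local = approved.get(systemId);
        if (local == null) {
            // Option (c): unknown URI, so throw and stop the parse.
            throw new SAXException("Unapproved entity: " + systemId);
        }
        // Option (b): give the parser local data instead of the remote URI.
        InputSource source = new InputSource(new StringReader(local));
        source.setSystemId(systemId);
        return source;
    }
}
```

Returning null for selected URIs instead of throwing would give option
(a) for those URIs only.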


As Jeff said, the xsi:schemaLocation is a hint; basically it sets up an
EntityResolver which looks up the URI in the list of (URI->file)
mappings in the xsi:schemaLocation. If a match is found, it returns the
corresponding file. However, if a match is not found, it returns null,
causing Xerces to try to fetch the remote file - which is not what you
want.

Implementing an EntityResolver to do what you wish is about 50 lines of
code - not a big deal.
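
For what it's worth, attaching a resolver to a validating parser might
look roughly like the sketch below. It goes through the standard JAXP
SAXParserFactory rather than constructing a Xerces parser directly, and
it only exercises the DTD case; the class GuardedParse, its parses()
method and the example URIs are invented for illustration. The resolver
is written inline as a lambda, which assumes Java 8 or later.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: one validating parse with a strict resolver. */
final class GuardedParse {

    /**
     * Returns true if the document parsed, false if parsing was aborted
     * by a SAXException (for example the resolver's veto).
     */
    static boolean parses(String xml, String approvedSystemId, String localDtd)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(new DefaultHandler());

        // Single pass: the resolver is consulted as entities are met.
        reader.setEntityResolver((publicId, systemId) -> {
            if (approvedSystemId.equals(systemId)) {
                // Approved: serve the local copy, no network access.
                InputSource src = new InputSource(new StringReader(localDtd));
                src.setSystemId(systemId);
                return src;
            }
            // Not approved: abort instead of fetching from the remote host.
            throw new SAXException("Unapproved entity: " + systemId);
        });

        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

The point of the sketch is that the check happens during the one
validating parse, so no separate non-validating pass is needed.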

> >From: "Jeff Greif" 
> >Reply-To: xerces-j-user@xml.apache.org 
> >To: 
> >Subject: Re: Can I stop SAX validation from resolving non-approved
> namespaces? 
> >Date: Tue, 15 Apr 2003 09:52:37 -0700 
> > 
> >No. The xsi:schemaLocation attribute is just a hint to the parser
> about where to find the schema. If you attach an EntityResolver to the
> parser (in its constructor, or by some method like setEntityResolver)
> you have complete control of how any URIs are looked up. You can
> resolve particular namespaces to your fixed copy of the schemas in
> question (and can make local copies of common ones found on the
> Internet to avoid redundant network activity). 






Re: Can I stop SAX validation from resolving non-approved namespaces?

Posted by Joseph Kesselman <ke...@us.ibm.com>.
Just a reminder, to maintain context: Namespaces, per se, are never 
dereferenced. They may be bound to schemas, and if so the _schema_ URI may 
be dereferenced. But a namespace URI is just a "magic word"; officially, 
it does not point to anything and is never dereferenced.

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more. 
"may'ron DaroQbe'chugh vaj bIrIQbej"  ("Put down the squeezebox and nobody 
gets hurt.")