Posted to c-dev@xerces.apache.org by Alberto Massari <al...@exceloncorp.com> on 2001/12/13 14:51:38 UTC

Use of EntityResolver in XMLSchema validation (v. 1.6)

Hi everybody,
I am facing a problem with the way XMLSchema validation is done in Xerces, 
that also involves how the EntityResolver interface has been designed, and 
I cannot find an easy solution...

The current situation (that works for DTDs):

I need to install an EntityResolver handler to be able to load files from 
multiple repositories (file system, URLs, and XML databases); to correctly 
handle relative pathnames, I need to know the path of the file currently 
being parsed, but the EntityResolver::resolveEntity callback method is only 
given the public id and the system id of the file to be opened. So I must 
implement the EntityResolver interface in the same class that implements 
the DocumentHandler, so that I can receive the Locator object through the 
setDocumentLocator callback, store it, and call its getSystemId method to 
find out which document is trying to open the external entity.
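
A minimal sketch of this workaround, so the mechanism is concrete. It does not use the real Xerces-C headers: Locator here is a simplified stand-in (std::string instead of XMLCh*), and StubLocator, resolveAgainst and ResolvingHandler are hypothetical names invented for illustration. The real resolveEntity returns an InputSource*; this one only returns the expanded system id, and it does not normalize "../" segments.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for the SAX Locator interface (assumption: the real
// one uses XMLCh* strings and also exposes line/column numbers).
struct Locator {
    virtual ~Locator() = default;
    virtual std::string getSystemId() const = 0;
};

// Hypothetical locator whose system id is set by hand, playing the role of
// the object the parser would pass to setDocumentLocator.
struct StubLocator : Locator {
    std::string systemId;
    explicit StubLocator(std::string id) : systemId(std::move(id)) {}
    std::string getSystemId() const override { return systemId; }
};

// Resolve systemId against the directory part of base. Ids that start
// with '/' or contain "://" are treated as absolute and returned unchanged.
std::string resolveAgainst(const std::string& base, const std::string& systemId) {
    const bool absolute = (!systemId.empty() && systemId[0] == '/') ||
                          systemId.find("://") != std::string::npos;
    const std::string::size_type slash = base.rfind('/');
    if (absolute || slash == std::string::npos) return systemId;
    return base.substr(0, slash + 1) + systemId;
}

// One class plays both roles: it stores the Locator (DocumentHandler side)
// and consults it when asked to resolve an entity (EntityResolver side).
class ResolvingHandler {
public:
    void setDocumentLocator(const Locator* locator) {
        assert(locator != nullptr);
        locator_ = locator;
    }

    // publicId is unused in this sketch; it is kept to mirror the callback.
    std::string resolveEntity(const std::string& /*publicId*/,
                              const std::string& systemId) const {
        if (locator_ == nullptr) return systemId;  // no locator received yet
        return resolveAgainst(locator_->getSystemId(), systemId);
    }

private:
    const Locator* locator_ = nullptr;
};
```

This works only as long as the stored Locator really describes the file that references the entity, which is the case while the main parser is reading DTDs.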

The problem with XMLSchema validation:

XMLSchema validation is done through a temporary DOMParser object, which is 
given the same EntityResolver pointer that the main parser is using. At this 
point, if the XMLSchema uses relative paths inside DTDs, xsd:import or 
xsd:include statements, the resolver gets called, but the Locator object 
still reports the original XML file as the current file (instead of the 
XMLSchema)!
And I cannot even intercept a call to expandSystemId in the parser object, 
because what is created is a plain DOMParser object.
Furthermore, when an error (a schema error, not an XML error) is found in the 
XMLSchema being parsed, the exception reports its location in the main XML 
file, at the line and column that caused the schema to be loaded, instead 
of pointing to the invalid XMLSchema definition.
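
To illustrate the mis-resolution with made-up paths: assume the main document is /data/xml/order.xml, it loads the schema /schemas/order.xsd, and that schema contains an xsd:include of "types.xsd". The sketch below is only an illustration (baseDir, Resolution and resolveInclude are hypothetical helpers, not Xerces code). It computes the path a resolver gets when it trusts the stale Locator, next to the path the schema author intended.

```cpp
#include <cassert>
#include <string>

// Directory part of a system id, including the trailing '/'.
// Returns an empty string when the id has no directory component.
std::string baseDir(const std::string& systemId) {
    const std::string::size_type slash = systemId.rfind('/');
    return slash == std::string::npos ? std::string()
                                      : systemId.substr(0, slash + 1);
}

struct Resolution {
    std::string actual;    // resolved against the main XML file (stale Locator)
    std::string intended;  // resolved against the schema that holds the include
};

// includeHref is assumed to be a relative reference such as "types.xsd".
Resolution resolveInclude(const std::string& mainXml,
                          const std::string& schema,
                          const std::string& includeHref) {
    assert(!includeHref.empty());
    return Resolution{baseDir(mainXml) + includeHref,
                      baseDir(schema) + includeHref};
}
```

With the paths assumed above, the resolver ends up looking for /data/xml/types.xsd, while the file actually sits at /schemas/types.xsd.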

Possible solutions:

a) change the SAX spec so that EntityResolver::resolveEntity takes a third 
parameter specifying the systemId of the current XML file (I know, it 
involves dealing with a spec change)
b) change XMLScanner::resolveSchemaGrammar, 
TraverseSchema::traverseInclude, TraverseSchema::traverseImport, 
TraverseSchema::openRedefinedSchema not to use the plain DOMParser, but to 
use a callback function to build one, so that I can customize it
c) instead of creating a local InputSource object in those 4 functions, ask 
the ReaderMgr to create a reader for it and push it on the stack, so that 
the main parser has the knowledge of what is the current file
d) I don't know: ideas anyone?
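
A rough sketch of the idea behind option (c), under stated assumptions: the code that opens a schema, an include or an import pushes that file's system id on a stack, and the resolver takes the top of the stack as the base instead of asking the Locator. BaseIdStack, ScopedBase and the free function resolveEntity are hypothetical names for illustration only. They do not reflect the actual ReaderMgr interface, and the paths are made up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stack of the system ids of the files currently being read, innermost last.
class BaseIdStack {
public:
    void push(const std::string& systemId) { ids_.push_back(systemId); }
    void pop() {
        assert(!ids_.empty());
        ids_.pop_back();
    }
    const std::string& current() const {
        assert(!ids_.empty());
        return ids_.back();
    }

private:
    std::vector<std::string> ids_;
};

// RAII guard: pushes a system id for the lifetime of a nested parse and
// pops it again even if the nested parse exits early.
class ScopedBase {
public:
    ScopedBase(BaseIdStack& stack, const std::string& systemId) : stack_(stack) {
        stack_.push(systemId);
    }
    ~ScopedBase() { stack_.pop(); }
    ScopedBase(const ScopedBase&) = delete;
    ScopedBase& operator=(const ScopedBase&) = delete;

private:
    BaseIdStack& stack_;
};

// Resolve systemId against whatever file is on top of the stack.
// Ids that start with '/' or contain "://" are returned unchanged.
std::string resolveEntity(const BaseIdStack& stack, const std::string& systemId) {
    const bool absolute = (!systemId.empty() && systemId[0] == '/') ||
                          systemId.find("://") != std::string::npos;
    const std::string& base = stack.current();
    const std::string::size_type slash = base.rfind('/');
    if (absolute || slash == std::string::npos) return systemId;
    return base.substr(0, slash + 1) + systemId;
}
```

The same stack could also supply the file name for error reports, so that a schema error would be attributed to the schema rather than to the main XML file.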

Ciao,

Alberto

-------------------------------
Alberto Massari
eXcelon Corp.
http://www.StylusStudio.com

