Posted to j-dev@xerces.apache.org by "William Eliot Kimber (JIRA)" <xe...@xml.apache.org> on 2005/10/05 18:06:50 UTC

[jira] Created: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Resolution of schemaLocation URIs should be via URI resolution, not entity resolution
-------------------------------------------------------------------------------------

         Key: XERCESJ-1104
         URL: http://issues.apache.org/jira/browse/XERCESJ-1104
     Project: Xerces2-J
        Type: Bug
  Components: XNI  
    Versions: 2.6.0, 2.7.1    
 Environment: All
    Reporter: William Eliot Kimber


The resolveSchema() method of XSDHandler uses the entityResolver to resolve URIs in schema location hints. The effect is that, when a catalog resolver is in use, SYSTEM entries are used to resolve the schema URIs.

However, schemas are not entities, so their URI references should be resolved not via an entity resolver but via a URI resolver, and therefore via URI catalog entries, not SYSTEM entries.

That is, by the OASIS Entity Resolution spec one would expect to declare URI entries to remap schema location URIs but this does not work. 
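
A minimal sketch of the distinction, as a toy model in Java rather than the Xerces or XML Commons Resolver API. ToyCatalog, resolveSystem, resolveUri, and the example URIs are all hypothetical; the two maps merely stand in for a catalog's SYSTEM and URI entries. It shows that a lookup consulting only SYSTEM entries misses a remapping declared as a URI entry:

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for an OASIS XML Catalog: one table per entry type.
// Hypothetical class; not part of Xerces or the XML Commons Resolver.
class ToyCatalog {
    private final Map<String, String> systemEntries = new HashMap<>();
    private final Map<String, String> uriEntries = new HashMap<>();

    void addSystem(String systemId, String target) { systemEntries.put(systemId, target); }
    void addUri(String name, String target) { uriEntries.put(name, target); }

    // Entity-style lookup: consults SYSTEM entries only.
    String resolveSystem(String systemId) { return systemEntries.get(systemId); }

    // URI-style lookup: consults URI entries only.
    String resolveUri(String uri) { return uriEntries.get(uri); }

    // Builds a catalog in which the schema location is remapped by a URI entry,
    // which is what the report argues a catalog author would expect to write.
    static ToyCatalog withSchemaRemappedByUriEntry() {
        ToyCatalog catalog = new ToyCatalog();
        catalog.addUri("http://example.org/schemas/doc.xsd", "file:///local/schemas/doc.xsd");
        return catalog;
    }
}
```

With this catalog, resolveSystem() on the schema location returns null while resolveUri() returns the local copy, which is the mismatch described above.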

I'm happy to develop a fix but it may take me a while to figure out exactly how to go about it. 

Because this behavior has been around for a while (since at least version 2.6) and is documented in at least one tutorial I found, it will probably be necessary to control the use of an entityResolver or URI resolver through a system property.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Resolved: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=all ]
     
Michael Glavassevich resolved XERCESJ-1104:
-------------------------------------------

    Resolution: Won't Fix

There are several subinterfaces of XMLResourceIdentifier which are passed to XMLEntityResolver.resolveEntity(XMLResourceIdentifier). An XMLResourceIdentifier for an entity will also be an instance of org.apache.xerces.impl.XMLEntityDescription. If it's a grammar it will be an org.apache.xerces.xni.grammars.XMLGrammarDescription. If the grammar is a DTD it will also be an org.apache.xerces.xni.grammars.XMLDTDDescription. If the grammar is an XML schema it will also be an org.apache.xerces.xni.grammars.XMLSchemaDescription. With an instanceof check you can tell what type of resource is being resolved, and if there isn't a subinterface for a particular type of resource it's because there hasn't been a specific need for one.
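
The instanceof check can be sketched roughly as follows. To keep the example compilable without the Xerces jar, the interfaces below are simplified stand-ins that reuse the names from the message and mirror only the subtype relationships it describes; the real interfaces declare many more methods. ResourceKind and its of() method are hypothetical names introduced for illustration:

```java
// Simplified stand-ins mirroring only the type hierarchy named in the message.
// The real interfaces live in org.apache.xerces.xni, org.apache.xerces.xni.grammars
// and org.apache.xerces.impl, and declare many methods that are omitted here.
interface XMLResourceIdentifier { String getLiteralSystemId(); }
interface XMLEntityDescription extends XMLResourceIdentifier {}
interface XMLGrammarDescription extends XMLResourceIdentifier {}
interface XMLDTDDescription extends XMLGrammarDescription {}
interface XMLSchemaDescription extends XMLGrammarDescription {}

// Hypothetical helper, not a Xerces class.
class ResourceKind {
    // Classifies a resource by the most specific subinterface it implements.
    // The schema and DTD checks come before XMLGrammarDescription because
    // both of those types are also grammar descriptions.
    static String of(XMLResourceIdentifier id) {
        if (id instanceof XMLSchemaDescription) return "schema";
        if (id instanceof XMLDTDDescription) return "dtd";
        if (id instanceof XMLGrammarDescription) return "grammar";
        if (id instanceof XMLEntityDescription) return "entity";
        return "unknown";
    }
}
```

A resolver written against the real interfaces could branch the same way inside resolveEntity() and pick a different catalog lookup for each kind of resource.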

This leads me back to what I've said from the start. You can already do what you want. The resolveEntity methods on XMLCatalogResolver are just there for convenience. If one of the resolve methods doesn't do what you want/need you can override it by extending the class. I admit that XNI could have had a better design for resource resolution but we're stuck with what we have. The changes you've suggested add methods to interfaces which are already in use and change the circumstances in which the existing ones are called. When you do this (see discussions on the net about DOM Level 3) it causes grief for both established users and implementers. We've made minor changes to XNI in the past but those have all been critical for supporting the latest version of JAXP. I'm against making changes that would break applications which use XNI and other implementers of XNI (yes, there are a few besides Xerces) when the change shuffles around capability which is already there, even if the proposed changes are in principle a better design than the existing one.



[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Norman Walsh (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12332632 ] 

Norman Walsh commented on XERCESJ-1104:
---------------------------------------

If the current "out of the box" configuration uses the EntityResolver in its attempt to locate a schema document for a schema URI, it is manifestly doing the wrong thing. The documentation for resolveEntity() is clear:

   Allow the application to resolve external entities.
   ...
   Application writers can use this method to redirect external system identifiers

Whatever else may be true, neither namespace names nor schema location hints can properly be called "external identifiers".

That said, I'm sympathetic to the position that says that XNI needs to have some flexibility at this level.

Many applications support a command line option that allows the user to identify the name of a class to be used to instantiate an EntityResolver. Is there an analogous object in the XNI paradigm that could be instantiated, and could applications reasonably be expected to support this behavior?

I'd be happy to extend the XML Commons Resolver to support new interfaces to make proper use of XML Catalogs with Apache schema processing easier.




[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "William Eliot Kimber (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331811 ] 

William Eliot Kimber commented on XERCESJ-1104:
-----------------------------------------------

I understand the desire to not change the XNI API.

If it's possible to examine an XML entity description and determine that it is in fact a schema, then that does make it possible to fix the implementation of schema loader.

However, the bug is still a bug: the Xerces implementation of how schemas are resolved via catalogs is wrong and needs to be fixed.

Therefore, I don't think this issue should be closed.

I'm not sure how to be clearer: the code that is part of Xerces is broken. It's not an issue of being able or not being able to extend the core code, it's a function of the code, as implemented, being wrong.

I will look at reworking resolveEntity() in XMLCatalogResolver to see if I can put the logic there to use URI entries for schemas.

There may still be other issues with how Xerces enables the resolution of non-entity resources during parsing (i.e., XIncludes) but I haven't had the time to dig into that code.



Re: can't seem to get to jira, what am i doing wrong?

Posted by db...@qis.net.
it's working now... twas busted for a few days.
thanks.



Re: can't seem to get to jira, what am i doing wrong?

Posted by Brian Minchau <mi...@ca.ibm.com>.
Dave,
your link works for me. But it does seem to redirect to:
   http://issues.apache.org/jira/secure/Dashboard.jspa

It's curious that your HTTP request, http://issues.apache.org/jira/,
somehow is translated into a GET for a particular Xerces issue... weird.

- Brian
- - - - - - - - - - - - - - - - - - - -
Brian Minchau

"Dave Brosius" <dbrosius@qis.net> wrote to <j-...@xerces.apache.org> on 10/08/2005 11:29 AM
(subject: can't seem to get to jira, what am i doing wrong?; please respond to j-dev):

> http://issues.apache.org/jira/
>
> gives me
>
> Proxy Error
>
> The proxy server received an invalid response from an upstream server.
> The proxy server could not handle the request GET /jira/browse/XERCESJ-1104.
>
> Reason: Error reading from remote server






can't seem to get to jira, what am i doing wrong?

Posted by Dave Brosius <db...@qis.net>.
http://issues.apache.org/jira/

gives me

Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /jira/browse/XERCESJ-1104. 

Reason: Error reading from remote server

[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331625 ] 

Michael Glavassevich commented on XERCESJ-1104:
-----------------------------------------------

If you understood what I described I was hoping you would have realized that the single method on XMLEntityResolver works for any kind of resource. Both the method parameter (XMLResourceIdentifier) and return type (XMLInputSource) are extensible. Heck you can even return DOM nodes and SAX parsers (see org.apache.xerces.util.DOMInputSource & org.apache.xerces.util.SAXInputSource) from this method. There's nothing you can't do with this method that would require another one, especially another method with essentially the same signature.

I have a feeling you're just getting hung up on the name of the interface. XNI borrowed many concepts from SAX including the names of its classes, interfaces and methods; XMLEntityResolver being no exception. Perhaps it should have been called XMLResourceResolver since that's essentially what it is (but it's too late now).

XMLEntityResolver evolved far beyond what SAX supports. Perhaps the Javadoc for XMLEntityResolver should be updated to reflect that it has the ability to resolve more than just XML parsed entities since this is how it's always been used. In my opinion that's the only thing that should be fixed here.



[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12332090 ] 

Michael Glavassevich commented on XERCESJ-1104:
-----------------------------------------------

What XMLCatalogResolver does is documented both in the Javadoc and in the documentation which accompanies Xerces.  It was discussed and reviewed with the community when it was under development.  It does what it says it does and has been that way for almost two years now.  We cannot change what the methods do without breaking existing users.

During the initial discussion on the lists and elsewhere it became pretty evident that no one implementation of the resolveEntity() methods will meet the requirements of all applications, so we went with fairly generalized default behaviour which may or may not do the right thing for your application.  This is why the Javadoc clearly states that the methods should be overridden in a subclass if you want them to do something else.
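
The subclassing approach can be sketched with a toy model. Every type below (OverrideSketch, Resource, DefaultResolver, SchemaViaUriResolver) is a hypothetical stand-in, not the XMLCatalogResolver API; the sketch only assumes a generalized default that looks everything up as a system identifier, and a subclass that routes schema documents through URI entries first:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of "override the resolve method in a subclass". All types here are
// hypothetical stand-ins; none of them is the Xerces XMLCatalogResolver API.
class OverrideSketch {
    // Stand-in for a resource description; isSchema marks a schema document.
    static final class Resource {
        final String systemId;
        final boolean isSchema;
        Resource(String systemId, boolean isSchema) {
            this.systemId = systemId;
            this.isSchema = isSchema;
        }
    }

    // Generalized default: every resource is looked up in the SYSTEM table.
    static class DefaultResolver {
        final Map<String, String> systemEntries = new HashMap<>();
        final Map<String, String> uriEntries = new HashMap<>();

        String resolve(Resource r) {
            return systemEntries.get(r.systemId);
        }
    }

    // Application-specific subclass: schema documents go through URI entries
    // first; everything else keeps the inherited behaviour.
    static class SchemaViaUriResolver extends DefaultResolver {
        @Override
        String resolve(Resource r) {
            if (r.isSchema) {
                String viaUri = uriEntries.get(r.systemId);
                if (viaUri != null) return viaUri;
            }
            return super.resolve(r);
        }
    }
}
```

Existing callers of the default class see no change in behaviour, which is the compatibility argument being made here; only applications that opt into the subclass get the URI lookup.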



[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "William Eliot Kimber (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331504 ] 

William Eliot Kimber commented on XERCESJ-1104:
-----------------------------------------------

I may be misunderstanding how the code works but while I understand that the behavior of the resolveEntity() method is as you describe, it is the schema loader, which knows that it is loading a schema, that is calling this method. Therefore, it could just as easily call a different method.

That is, the SAX parser can't be reporting anything (other than a CDATA attribute value) because SAX parsers are not, as far as I know, schema aware. But maybe I'm misunderstanding how the parser works?  I'm basing this assumption on the fact that since a schema reference is not an entity reference but, at the XML level, just an attribute, there would be no way per the current SAX API for a SAX parser to handle this as a SAX event (i.e., startSchemaReference or something) as opposed to handling it as part of the SAX event handler's business logic. 

I've started experimenting with a fix, which is as follows:

1. Extend the XMLEntityResolver to add a new method, resolveResourceByUri()  (named so it doesn't conflict with the resolveURI() method on a related interface).

2. In the XMLSchemaLoader's resolveDocument() method, replace the current line:

   return entityResolver.resolveEntity(desc);

with:
  
  return entityResolver.resolveResourceByUri(desc);

where the systemId value of the XML resource description is the target URI from the schema location hint.

This is somewhat abusing the notion of "entity resolver" as schemas are not entities in the XML sense, but it's the least disruptive change I could think of.

What I haven't been able to do yet (ran out of time yesterday) is build a test case that demonstrates that this works when processing an XML document with validation turned on--I know how to configure a validating Xerces SAX parser (I use one with Saxon) but I haven't figured out what I need to put around that to actually trigger the schema processing. Here's my test case:

    public void testSchemaLoaderViaCatalog() {

        XMLReader reader = new SAXParser();
        URL catalogUrl = this.getClass().getResource("catalog-01.xml");
        String[] catalogs = {catalogUrl.toExternalForm(),};

        // Create catalog resolver and set a catalog list.
        XMLCatalogResolver resolver = new XMLCatalogResolver();
        resolver.setPreferPublic(true);
        resolver.setCatalogList(catalogs);

        // Set the resolver on the parser.
        try {
            reader.setProperty(
              "http://apache.org/xml/properties/internal/entity-resolver", 
              resolver);
        } catch (Throwable e) {            
            e.printStackTrace();
            fail(e.getMessage());
        }
        
        DOMParser dp = new DOMParser();
        dp.setEntityResolver(reader.getEntityResolver());
        URL sourceDoc = this.getClass().getResource("doc-1.xml");
        Document doc = null;
        try {
            dp.setFeature("http://xml.org/sax/features/validation", true);
            dp.setFeature("http://apache.org/xml/features/validation/dynamic", true);
            dp.setFeature("http://apache.org/xml/features/validation/schema", true);
            dp.setFeature("http://apache.org/xml/features/validation/schema/normalized-value", true);
            dp.parse(sourceDoc.toExternalForm());
            doc = dp.getDocument();
        } catch (Throwable e) {            
            e.printStackTrace();
            fail(e.getMessage());
        }
        Element root = doc.getDocumentElement();
        assertEquals("oneElementDocument", root.getNodeName());
        
    }

This runs but it should in fact fail at the moment because the input document isn't valid against the schema (and the test schema may or may not be valid--I'm still figuring that one out--don't ask).







[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331458 ] 

Michael Glavassevich commented on XERCESJ-1104:
-----------------------------------------------

What you're describing is the behaviour of one implementation of XMLEntityResolver, specifically the SAX EntityResolverWrapper which bridges between XNI and a SAX EntityResolver. I agree that schema locations are not system identifiers; however, there's no way to differentiate them with SAX, so they're reported as system ids. This is a limitation of SAX. EntityResolver was not designed for resolving schema documents but it's the only interface available to SAX users, so we have to use what's there regardless of whether it's semantically wrong.
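
The SAX limitation is visible in the signature of org.xml.sax.EntityResolver itself. The sketch below uses the real SAX interface, but MappingEntityResolver, its remapping table, and the example URIs are hypothetical names introduced for illustration:

```java
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical SAX resolver backed by a fixed remapping table.
class MappingEntityResolver implements EntityResolver {
    private final Map<String, String> remap;

    MappingEntityResolver(Map<String, String> remap) { this.remap = remap; }

    // SAX supplies only a public and a system identifier. Nothing in the
    // signature says whether systemId names an external entity, a DTD, or
    // a schema document, so every kind of resource is looked up the same way.
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String target = remap.get(systemId);
        // Returning null asks the parser to open the system identifier itself.
        return target == null ? null : new InputSource(target);
    }
}
```

A resolver of this shape cannot treat a schema location hint differently from an external entity, because both arrive as the systemId argument.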

XMLEntityResolver is an interface. There's nothing restricting its behaviour in the way you describe. If you register your own implementation of XMLEntityResolver with the parser you can map the XMLResourceIdentifier (which has many fields including the target namespace if the resource is a schema document) [1] passed to resolveEntity in any way you feel like to a catalog.  If the resource being resolved is a schema document then the XMLResourceIdentifier will be an instance of XMLSchemaDescription [2][3] which contains even more information like a complete list of location hints. There's no need for another type of resolver.  The current interfaces already support what you need.

[1] http://xml.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/XMLResourceIdentifier.html
[2] http://xml.apache.org/xerces2-j/faq-write.html#faq-5
[3] http://xml.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/grammars/XMLSchemaDescription.html



[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "William Eliot Kimber (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12331645 ] 

William Eliot Kimber commented on XERCESJ-1104:
-----------------------------------------------

I think you're misunderstanding the problem I'm referring to.

The issue, as I see it, is this:

1. In XMLSchemaLoader, in the resolveDocument() method, the loader calls resolveEntity() on the entityResolver.

2. In XMLCatalogResolver, the resolveEntity() method *only* considers the systemId and publicId properties of the resource descriptor:

        if (resolvedId == null) {
            String publicId = resourceIdentifier.getPublicId();
            String systemId = getUseLiteralSystemId()
                ? resourceIdentifier.getLiteralSystemId()
                : resourceIdentifier.getExpandedSystemId();
            if (publicId != null && systemId != null) {
                resolvedId = resolvePublic(publicId, systemId);
            }
            else if (systemId != null) {
                resolvedId = resolveSystem(systemId);
            }
        }
        return resolvedId;

This is the correct implementation for *resolveEntity()* because XML entities (as opposed to non-entity resources) are *only* resolved via SYSTEM and PUBLIC entries. That is, a correct implementation of OASIS Entity Resolution catalog processing *cannot* use URI entries (directly) to resolve entities, so this is the only possible way that a catalog resolver can work.

This means that, with this implementation, *it is impossible* for XMLSchemaLoader to resolve schema references via URI catalog entries.

Thus there needs to be a way to indicate that what you want to resolve is in fact *not an entity* (and therefore is resolved via URI entries and not PUBLIC or SYSTEM entries).
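
The distinction can be illustrated with a deliberately simplified model of a catalog. This is a sketch only, not the XMLCatalogResolver source: the class ToyCatalog, its two maps, and the example identifiers are all hypothetical, and PUBLIC entries, delegation, and rewrite rules are left out. It shows only that a lookup restricted to SYSTEM entries never sees a mapping that was declared as a URI entry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, much-simplified stand-in for an OASIS catalog. */
final class ToyCatalog {
    // Models <system systemId="..." uri="..."/> entries.
    private final Map<String, String> systemEntries = new HashMap<>();
    // Models <uri name="..." uri="..."/> entries.
    private final Map<String, String> uriEntries = new HashMap<>();

    void addSystem(String systemId, String target) {
        systemEntries.put(systemId, target);
    }

    void addUri(String name, String target) {
        uriEntries.put(name, target);
    }

    /** Entity-style lookup: consults SYSTEM entries only. */
    String resolveSystem(String systemId) {
        return systemEntries.get(systemId);
    }

    /** URI-style lookup: consults URI entries only. */
    String resolveURI(String uri) {
        return uriEntries.get(uri);
    }

    public static void main(String[] args) {
        ToyCatalog catalog = new ToyCatalog();
        String hint = "http://example.org/schemas/book.xsd";   // made-up schema location
        catalog.addUri(hint, "file:///local/book.xsd");        // declared as a URI entry

        // The entity-style lookup misses the mapping; the URI-style lookup finds it.
        System.out.println("entity-style: " + catalog.resolveSystem(hint));
        System.out.println("uri-style:    " + catalog.resolveURI(hint));
    }
}
```

With the single URI entry above, the entity-style lookup returns null while the URI-style lookup returns the local file, which is the mismatch this issue is about.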

My engineering practice says that the clearest way to make that distinction is with the method name.

It could also be done by adding another property to the resource descriptor to indicate whether it is an entity or a non-entity resource. This would work but would be much less clear to observers than a clearly named method that only resolves resources. Since entity resolution only happens during parsing, and anything done that isn't parsing *can't be* entity resolution, I don't see that there would ever be ambiguity about which method to call in a given circumstance: you always know whether you're doing XML parsing or not. (For example, resolving XIncludes *is not XML parsing*, and therefore XInclude href= values should be resolved via URI catalog entries, not SYSTEM or PUBLIC, just as for schemas. But I am now sure that this code is incorrectly using resolveEntity() as currently implemented to do this resolution.) Hence my assertion that this is an architectural problem in Xerces.

Progress report: my test case now works to the degree that my schema is loaded when I parse my test document. This required adding an implementation of resolveResourceByUri() to XMLEntityResolver. What I still haven't figured out is how to configure the parser so that the schema loader is initialized with the XMLCatalogResolver rather than the default XMLEntityResolver. I haven't yet found documentation or a sample showing how to set that up.



[jira] Commented: (XERCESJ-1104) Resolution of schemaLocation URIs should be via URI resolution, not entity resolution

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XERCESJ-1104?page=comments#action_12332642 ] 

Michael Glavassevich commented on XERCESJ-1104:
-----------------------------------------------

The "out of the box" behaviour for XNI is to first attempt to locate schema documents using the target namespace. Target namespaces are rightfully treated as URI references, so resolveURI() is called on the catalog to find a matching URI entry. If there is no matching URI entry in the catalog, or the schema has no target namespace, then the location hint is treated as an external identifier. This was done for consistency with the SAX behaviour, where only the schema location hint is available and is passed to the resolver as the systemId parameter. For a SAX application the only interface available is EntityResolver. If you want to allow these applications to resolve non-entity resources using SAX alone, then you have no choice but to use the fields available: systemId or publicId. It's unfortunate that SAX's entity resolver wasn't designed for broader usage, but it's all we've got. DOM Level 3 (LSResourceResolver) and StAX (XMLResolver) are a better fit for schema resolution (though they too aren't general enough for resolving other types of resources).
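
The lookup order described above can be sketched roughly as follows. This is an illustrative model only, not Xerces code: the class SchemaLookup, its maps, and the sample identifiers are hypothetical, and a real catalog has more entry types than the two modelled here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the lookup order for schema documents; not Xerces code. */
final class SchemaLookup {
    private final Map<String, String> uriEntries = new HashMap<>();     // catalog URI entries
    private final Map<String, String> systemEntries = new HashMap<>();  // catalog SYSTEM entries

    void addUri(String name, String target) {
        uriEntries.put(name, target);
    }

    void addSystem(String systemId, String target) {
        systemEntries.put(systemId, target);
    }

    /**
     * Tries the target namespace against the URI entries first. If the schema has
     * no target namespace, or no URI entry matches, the location hint is looked up
     * as a system identifier instead. Returns null when neither lookup matches.
     */
    String resolveSchema(String targetNamespace, String locationHint) {
        if (targetNamespace != null) {
            String byNamespace = uriEntries.get(targetNamespace);
            if (byNamespace != null) {
                return byNamespace;
            }
        }
        return locationHint != null ? systemEntries.get(locationHint) : null;
    }

    public static void main(String[] args) {
        SchemaLookup lookup = new SchemaLookup();
        lookup.addUri("urn:example:book", "file:///local/book.xsd");
        lookup.addSystem("http://example.org/legacy.xsd", "file:///local/legacy.xsd");

        // Namespace match wins, regardless of the hint.
        System.out.println(lookup.resolveSchema("urn:example:book", "http://example.org/book.xsd"));
        // No target namespace: the hint is treated as a system identifier.
        System.out.println(lookup.resolveSchema(null, "http://example.org/legacy.xsd"));
    }
}
```

Note that in this model a location hint is only ever matched against SYSTEM entries, never against URI entries, which is the behaviour the reporter is objecting to.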

An XNI XMLEntityResolver can be set on the parser by calling setProperty() with this property URI: http://apache.org/xml/properties/internal/entity-resolver.  Applications which rely on it are obviously limited to using Xerces.
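
For applications that would rather not depend on a Xerces-specific property, the DOM Level 3 LSResourceResolver mentioned above receives the namespace as a separate argument. The following is a minimal sketch using only the JDK's JAXP validation API, not XNI. The class names, the urn:example:types namespace, the schema text, and the example.org location hint are all made up for illustration, and the imported schema is served from an in-memory table rather than from a catalog.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

final class SchemaImportDemo {
    // Hypothetical namespace and schema text, used only for this illustration.
    static final String TYPES_NS = "urn:example:types";
    static final String TYPES_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='urn:example:types'>"
        + "<xs:simpleType name='Code'><xs:restriction base='xs:string'>"
        + "<xs:pattern value='[A-Z]{3}'/></xs:restriction></xs:simpleType></xs:schema>";
    static final String MAIN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:t='urn:example:types'>"
        + "<xs:import namespace='urn:example:types' schemaLocation='http://example.org/types.xsd'/>"
        + "<xs:element name='code' type='t:Code'/></xs:schema>";

    /** Supplies schema text keyed by namespace and records every request it receives. */
    static final class NamespaceResolver implements LSResourceResolver {
        final List<String> requests = new ArrayList<>();
        private final Map<String, String> schemaTextByNamespace;
        private final DOMImplementationLS ls;

        NamespaceResolver(Map<String, String> schemaTextByNamespace) throws ReflectiveOperationException {
            this.schemaTextByNamespace = schemaTextByNamespace;
            this.ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        }

        @Override
        public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                       String systemId, String baseURI) {
            requests.add(namespaceURI + " @ " + systemId);
            String text = namespaceURI == null ? null : schemaTextByNamespace.get(namespaceURI);
            if (text == null) {
                return null; // no mapping: leave resolution to the processor's default behaviour
            }
            LSInput input = ls.createLSInput();
            input.setStringData(text);   // content comes from memory, not from the location hint
            input.setSystemId(systemId); // keep the hint as the identifier for error messages
            return input;
        }
    }

    /** Compiles MAIN_XSD, letting the resolver supply whatever the schema imports. */
    static Schema compile(LSResourceResolver resolver) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(resolver);
        return factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
    }

    /** Returns true if the instance document validates against the schema. */
    static boolean isValid(Schema schema, String xml) throws IOException {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        NamespaceResolver resolver = new NamespaceResolver(Collections.singletonMap(TYPES_NS, TYPES_XSD));
        Schema schema = compile(resolver);
        System.out.println("resolver requests: " + resolver.requests);
        System.out.println("ABC valid: " + isValid(schema, "<code>ABC</code>"));
        System.out.println("abc valid: " + isValid(schema, "<code>abc</code>"));
    }
}
```

Because the resolver is handed the namespace and the location hint as separate arguments, it can choose which one to look up, which a SAX EntityResolver cannot do.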
