Posted to j-dev@xerces.apache.org by "Tin Pavlinic (JIRA)" <xe...@xml.apache.org> on 2006/10/17 23:25:34 UTC

[jira] Created: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Entity resolution does not work with DTD grammar caching resolved
-----------------------------------------------------------------

                 Key: XERCESJ-1205
                 URL: http://issues.apache.org/jira/browse/XERCESJ-1205
             Project: Xerces2-J
          Issue Type: Bug
          Components: DTD
    Affects Versions: 2.8.1
         Environment: JDK1.5. The issue appears on various machines, Windows, Linux, Mac OSX. I don't believe it is platform specific.
            Reporter: Tin Pavlinic


We have a DTD which defines some entities. We are parsing multiple documents against this DTD. If grammar caching is enabled, the entities are left unresolved whenever the grammar is loaded from the cache instead of from the DTD.

It seems that the entities are cleared every time a document is parsed and are repopulated only when the DTD itself is loaded, not when the grammar comes from the cache.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Radu Coravu (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921279#action_12921279 ] 

Radu Coravu commented on XERCESJ-1205:
--------------------------------------

Also, if you want to reuse the DTD grammar when parsing XML files with different internal subsets, then the entities declared in an internal subset should not be copied from the DTDGrammar to the XMLEntityManager when XMLEntityManager.initFromDTD() gets called.
So the code would be something like:

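    public void initFromDTD(DTDGrammar grammar)
    {
        final XMLEntityDecl entityDecl = new XMLEntityDecl();
        int index = 0;

        // Walk every entity declaration stored in the cached grammar.
        while (grammar.getEntityDecl(index++, entityDecl)) {
            fInExternalSubset = entityDecl.inExternal;
            // Copy only declarations that come from the external subset;
            // declarations from a document's internal subset are skipped.
            if (entityDecl.inExternal) {
                if (entityDecl.publicId != null
                        || entityDecl.systemId != null) {
                    // Entity with a PUBLIC or SYSTEM identifier.
                    try {
                        addExternalEntity(entityDecl.name, entityDecl.publicId, entityDecl.systemId, entityDecl.baseSystemId);
                    }
                    catch (IOException e) {
                        // ignored
                    }
                } else {
                    // Entity with a literal replacement value.
                    addInternalEntity(entityDecl.name, entityDecl.value);
                }
            }
        }
        fInExternalSubset = false;
    }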


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Dannes Wessels (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621259#action_12621259 ] 

Dannes Wessels commented on XERCESJ-1205:
-----------------------------------------

I encountered the same issue during the development of the exist-db project. For performance reasons we cache the SAX parser and use the XMLGrammarPoolImpl object. Example code for triggering the issue can be found at http://existdb-contrib.googlecode.com/svn/trunk/XercesGrammarCache/ in src/grammarcache; I guess I need to do some cleanup there.

Anyway, my workaround for the moment is to remove all cached DTDs from the pool when the reader is returned to the pool. A parser.reset() does not help either.
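
The shape of that workaround can be sketched roughly as follows. This is only a toy model and does not touch Xerces: GrammarCache, Reader, ReaderPool and giveBack are hypothetical names invented for the illustration, and it assumes each pooled reader owns its own grammar cache.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy reader pool; GrammarCache and Reader are hypothetical stand-ins, not Xerces classes. */
final class EvictDtdOnReturnSketch {

    enum GrammarType { DTD, XML_SCHEMA }

    /** Minimal grammar cache keyed by grammar type and identifier. */
    static final class GrammarCache {
        private final Map<GrammarType, Map<String, Object>> grammars = new HashMap<>();

        void put(GrammarType type, String id, Object grammar) {
            grammars.computeIfAbsent(type, t -> new HashMap<>()).put(id, grammar);
        }

        int count(GrammarType type) {
            Map<String, Object> byId = grammars.get(type);
            return byId == null ? 0 : byId.size();
        }

        /** Drops every cached grammar of one type and leaves the other types alone. */
        void evictAll(GrammarType type) {
            grammars.remove(type);
        }
    }

    /** A pooled reader that owns its grammar cache (an assumption of this sketch). */
    static final class Reader {
        final GrammarCache cache = new GrammarCache();
    }

    static final class ReaderPool {
        private final Deque<Reader> idle = new ArrayDeque<>();

        Reader borrow() {
            Reader reader = idle.poll();
            return reader != null ? reader : new Reader();
        }

        /** The workaround: forget all cached DTDs before the reader can be reused. */
        void giveBack(Reader reader) {
            reader.cache.evictAll(GrammarType.DTD);
            idle.push(reader);
        }
    }

    public static void main(String[] args) {
        ReaderPool pool = new ReaderPool();
        Reader reader = pool.borrow();
        reader.cache.put(GrammarType.DTD, "sample.dtd", new Object());
        reader.cache.put(GrammarType.XML_SCHEMA, "sample.xsd", new Object());
        pool.giveBack(reader);

        Reader again = pool.borrow();
        System.out.println("DTDs cached: " + again.cache.count(GrammarType.DTD));
        System.out.println("Schemas cached: " + again.cache.count(GrammarType.XML_SCHEMA));
    }
}
```

The obvious cost is that every DTD has to be scanned again on the next parse, so the caching benefit is kept only for the other grammar types.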




[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Radu Coravu (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920898#action_12920898 ] 

Radu Coravu commented on XERCESJ-1205:
--------------------------------------

Just one small correction to the patch Thomas added.
In the method XMLEntityManager.initFromDTD

the current code is:
..............
            if(entityDecl.publicId != null) {
                try {
                    addExternalEntity(entityDecl.name, entityDecl.publicId, entityDecl.systemId, entityDecl.baseSystemId);
                }
.................................

it should be something like:
..........................
            if(entityDecl.publicId != null
                || entityDecl.systemId != null) {
                try {
                    addExternalEntity(entityDecl.name, entityDecl.publicId, entityDecl.systemId, entityDecl.baseSystemId);
                }
................................

In this way, external entities declared with only a SYSTEM identifier (where publicId is null) will also be preserved.
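
As a small illustration of the difference between the two conditions, here is a self-contained sketch. EntityDecl below is a hypothetical stand-in that mirrors only the publicId and systemId fields used above; it is not the Xerces XMLEntityDecl class, and the entity names and identifiers are made up.

```java
/** Compares the two conditions; EntityDecl is a hypothetical stand-in, not Xerces' XMLEntityDecl. */
final class ExternalEntityCheckSketch {

    /** Holds only the fields the condition looks at. */
    static final class EntityDecl {
        final String name;
        final String publicId;
        final String systemId;

        EntityDecl(String name, String publicId, String systemId) {
            this.name = name;
            this.publicId = publicId;
            this.systemId = systemId;
        }
    }

    /** Condition quoted as the current code: only a PUBLIC identifier counts. */
    static boolean isExternalOld(EntityDecl decl) {
        return decl.publicId != null;
    }

    /** Suggested condition: a PUBLIC or a SYSTEM identifier marks an external entity. */
    static boolean isExternalNew(EntityDecl decl) {
        return decl.publicId != null || decl.systemId != null;
    }

    public static void main(String[] args) {
        // <!ENTITY chap1 SYSTEM "chap1.xml">  has no public identifier
        EntityDecl systemOnly = new EntityDecl("chap1", null, "chap1.xml");
        // <!ENTITY logo PUBLIC "-//Example//Logo//EN" "logo.xml">
        EntityDecl withPublic = new EntityDecl("logo", "-//Example//Logo//EN", "logo.xml");
        // <!ENTITY copy "(c)">  has a literal value and neither identifier
        EntityDecl literal = new EntityDecl("copy", null, null);

        for (EntityDecl decl : new EntityDecl[] {systemOnly, withPublic, literal}) {
            System.out.println(decl.name + ": old=" + isExternalOld(decl)
                    + ", new=" + isExternalNew(decl));
        }
    }
}
```

With the old condition the SYSTEM-only entity falls through to the other branch and is treated as if it had a literal value; with the new one it is handled as external.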




[jira] Assigned: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich reassigned XERCESJ-1205:
---------------------------------------------

    Assignee: Michael Glavassevich




[jira] Updated: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Tin Pavlinic (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESJ-1205?page=all ]

Tin Pavlinic updated XERCESJ-1205:
----------------------------------

    Attachment: bug.zip

The zip file contains a JUnit test case demonstrating the bug, a sample document and a sample DTD demonstrating the behaviour.




[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Dannes Wessels (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720320#action_12720320 ] 

Dannes Wessels commented on XERCESJ-1205:
-----------------------------------------

Will this be fixed in an upcoming version of Xerces?




[jira] Updated: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Thomas Krammer (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Krammer updated XERCESJ-1205:
------------------------------------

    Attachment: XERCESJ-1465.patch

I looked at the source code and it seems the parser maintains two entity tables: one in DTDGrammar and one in XMLEntityManager. The table in the XMLEntityManager is filled by the XMLDTDScannerImpl, which is never triggered when the parsed DTD is retrieved from the XMLGrammarPool.

So the table in the XMLEntityManager is empty and the entities aren't resolved during parsing.

As a quick fix I updated XMLDTDValidator to copy the entities in DTDGrammar to the XMLEntityManager when the grammar is retrieved from the pool.

A patch for this change is attached.
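
To make the diagnosis above easier to follow, here is a minimal, self-contained sketch of the two-table situation. It does not use Xerces at all: Grammar, EntityManager and Parser are hypothetical stand-ins, and copyEntitiesOnCacheHit is an invented switch that only mimics the idea of the patch (copy the entities from the cached grammar into the entity manager on a cache hit).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the two entity tables; none of these classes are Xerces classes. */
final class TwoEntityTablesSketch {

    /** Stand-in for the cached grammar: keeps its own copy of the entity declarations. */
    static final class Grammar {
        final Map<String, String> entities = new LinkedHashMap<>();
    }

    /** Stand-in for the entity manager: its table is cleared before every parse. */
    static final class EntityManager {
        final Map<String, String> entities = new HashMap<>();

        void reset() {
            entities.clear();
        }
    }

    static final class Parser {
        final Map<String, Grammar> pool = new HashMap<>(); // grammar cache keyed by DTD id
        final EntityManager entityManager = new EntityManager();
        final boolean copyEntitiesOnCacheHit;               // hypothetical switch for the fix

        Parser(boolean copyEntitiesOnCacheHit) {
            this.copyEntitiesOnCacheHit = copyEntitiesOnCacheHit;
        }

        /** Resolves one entity reference against the given DTD; returns null if unknown. */
        String parse(String dtdId, Map<String, String> dtdEntities, String entityRef) {
            entityManager.reset();
            Grammar grammar = pool.get(dtdId);
            if (grammar == null) {
                // Cache miss: "scanning" the DTD fills both tables.
                grammar = new Grammar();
                grammar.entities.putAll(dtdEntities);
                entityManager.entities.putAll(dtdEntities);
                pool.put(dtdId, grammar);
            } else if (copyEntitiesOnCacheHit) {
                // Cache hit with the fix: copy the declarations out of the cached grammar.
                entityManager.entities.putAll(grammar.entities);
            }
            // Cache hit without the fix: the entity manager's table stays empty.
            return entityManager.entities.get(entityRef);
        }
    }

    public static void main(String[] args) {
        Map<String, String> dtd = Map.of("copy", "(c)");

        Parser unpatched = new Parser(false);
        System.out.println(unpatched.parse("sample.dtd", dtd, "copy")); // DTD scanned
        System.out.println(unpatched.parse("sample.dtd", dtd, "copy")); // grammar from cache

        Parser patched = new Parser(true);
        patched.parse("sample.dtd", dtd, "copy");
        System.out.println(patched.parse("sample.dtd", dtd, "copy"));   // grammar from cache
    }
}
```

In this model the unpatched parser resolves the entity on the first parse only; the second parse returns null because nothing refills the entity manager's table after the reset.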






[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Michael Knorr (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622129#action_12622129 ] 

Michael Knorr commented on XERCESJ-1205:
----------------------------------------

We have the same problem when validating against the W3C XHTML DTD while using a grammar pool. The first XML document we parse is validated correctly and all defined entities are resolved. But for subsequent XML documents the entities can't be resolved.
When we are not using a grammar pool, the entities are resolved correctly every time, but then parsing takes too much time for our requirements.





[jira] Commented: (XERCESJ-1205) Entity resolution does not work with DTD grammar caching resolved

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722303#action_12722303 ] 

Michael Glavassevich commented on XERCESJ-1205:
-----------------------------------------------

I never predict dates, though it is probably more likely to get done sooner if someone from the community contributes a working patch.

