Posted to j-dev@xerces.apache.org by "Philip Arickx (JIRA)" <xe...@xml.apache.org> on 2010/08/02 14:15:15 UTC

[jira] Reopened: (XERCESJ-1462) When Reusing an XMLGrammarPool and reusing a validating XMLDocumentParser, parsing will fail on consecutive documents using different Schemas

     [ https://issues.apache.org/jira/browse/XERCESJ-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Arickx reopened XERCESJ-1462:
------------------------------------


I don't entirely agree with the assessment. The title of the submitted bug is still accurate, even if my initial analysis at the code level may have been wrong to some extent.

The two documents provided in the case description are both valid XML documents, each with a different schema and no namespace. Reusing the parser and the grammar pool for both results in a failure. I still consider that a bug, since it means that in a general environment (i.e. one where any kind of XML document could be parsed) using vanilla Xerces, it is unsafe to reuse an XMLDocumentParser if it is validating.

The basic reasons are that "no namespace" is insufficient information for the general case (which includes different documents with no namespace and different schemas), and that the grammar pool is being lied to: the second time round, it gets an XSDDescription which contains mostly outdated information (only the namespace info is correct).
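
To make the collision concrete, here is a minimal sketch of a pool that caches on the namespace alone. It is a toy model and not Xerces code: ToyGrammar, NamespaceKeyedPool and validateRoot are hypothetical names invented for this illustration, and "loading" a schema is simulated by passing in the set of elements it declares.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical classes, NOT the Xerces API.
final class ToyGrammar {
    final String schemaLocation;
    final Set<String> declaredElements;

    ToyGrammar(String schemaLocation, Set<String> declaredElements) {
        this.schemaLocation = schemaLocation;
        this.declaredElements = declaredElements;
    }

    boolean declares(String elementName) {
        return declaredElements.contains(elementName);
    }
}

// A pool whose only cache key is the target namespace (null = no namespace).
final class NamespaceKeyedPool {
    private final Map<String, ToyGrammar> cache = new HashMap<>();

    private static String key(String targetNamespace) {
        return targetNamespace == null ? "" : targetNamespace;
    }

    ToyGrammar retrieve(String targetNamespace) {
        return cache.get(key(targetNamespace));
    }

    void put(String targetNamespace, ToyGrammar grammar) {
        cache.put(key(targetNamespace), grammar);
    }
}

final class NamespaceKeyedPoolDemo {
    // Ask the pool first; only "load" the hinted schema on a cache miss.
    static boolean validateRoot(NamespaceKeyedPool pool, String targetNamespace,
                                String locationHint, String rootElement,
                                Set<String> elementsInHintedSchema) {
        ToyGrammar grammar = pool.retrieve(targetNamespace);
        if (grammar == null) {
            grammar = new ToyGrammar(locationHint, elementsInHintedSchema);
            pool.put(targetNamespace, grammar);
        }
        return grammar.declares(rootElement);
    }

    public static void main(String[] args) {
        NamespaceKeyedPool pool = new NamespaceKeyedPool();
        Set<String> schema1 = new HashSet<>(Arrays.asList("root1", "el1"));
        Set<String> schema2 = new HashSet<>(Arrays.asList("root2", "el2"));

        // Both documents have no namespace, so they share one cache slot.
        System.out.println("doc 1 valid: "
                + validateRoot(pool, null, "test1.xsd", "root1", schema1)); // true
        System.out.println("doc 2 valid: "
                + validateRoot(pool, null, "test2.xsd", "root2", schema2)); // false
    }
}
```

The second lookup hits the entry cached for test1.xsd, so root2 is checked against a grammar that never declared it. In this model that plays the role of the "Cannot find the declaration of element" error.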

Anybody rolling their own GrammarPool will run into issues if they base any logic on the rest of the XSDDescription contents.

I see several possible ways to improve this:
- XSDDescription could be reset at the start of the parse, so that the information at least isn't false. It wouldn't solve the issue in Xerces itself (i.e. without a custom GrammarPool), but it would lead less easily to surprises when other fields of XSDDescription are examined.
- The grammar pool method could take only the namespace as parameter. In that case, it is perfectly clear at API level that you can only cache based on the namespace. This, however, precludes caching of schemas with no target namespace; the only workaround for the reuse case is to return null when the namespace is null. This fix reflects the design more accurately.
- The XSDDescription could be fully updated before being passed to the grammar pool. In this case, the grammar pool can detect the null namespace and look at the location hints or other information to retrieve the correct grammar anyway. It would go too far, I think, to actually change XMLGrammarPoolImpl as well (or, for that matter, the equals method of XSDDescription), but at least custom grammar pools could go a little further in caching (in my case the location hints, for instance, would have been sufficient to determine that a new grammar was seen).
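
As a rough sketch of what the last option would allow a custom pool to do, the toy model below keys on the namespace when there is one and falls back to the first location hint when there is not. It assumes the description handed to the pool is already up to date for the current document. ToyDescription and HintAwarePool are hypothetical names and deliberately do not use the real XSDDescription or XMLGrammarPool types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a schema description; hypothetical, NOT XSDDescription.
final class ToyDescription {
    final String targetNamespace;      // null means "no namespace"
    final List<String> locationHints;  // assumed current for this document

    ToyDescription(String targetNamespace, List<String> locationHints) {
        this.targetNamespace = targetNamespace;
        this.locationHints = locationHints;
    }
}

// Caches on the namespace when present, else on the first location hint.
final class HintAwarePool {
    private final Map<String, Set<String>> cache = new HashMap<>();

    static String keyFor(ToyDescription desc) {
        if (desc.targetNamespace != null) {
            return "ns:" + desc.targetNamespace;
        }
        if (!desc.locationHints.isEmpty()) {
            return "loc:" + desc.locationHints.get(0);
        }
        return null; // nothing to distinguish on: do not cache at all
    }

    Set<String> retrieve(ToyDescription desc) {
        String key = keyFor(desc);
        return key == null ? null : cache.get(key);
    }

    void put(ToyDescription desc, Set<String> declaredElements) {
        String key = keyFor(desc);
        if (key != null) {
            cache.put(key, declaredElements);
        }
    }
}

final class HintAwarePoolDemo {
    public static void main(String[] args) {
        HintAwarePool pool = new HintAwarePool();
        ToyDescription d1 = new ToyDescription(null, Collections.singletonList("test1.xsd"));
        ToyDescription d2 = new ToyDescription(null, Collections.singletonList("test2.xsd"));

        pool.put(d1, new HashSet<>(Arrays.asList("root1", "el1")));

        // A different hint under the same (null) namespace is a cache miss,
        // so the caller would go on to load test2.xsd itself.
        System.out.println("d2 cached: " + (pool.retrieve(d2) != null)); // false
        System.out.println("d1 cached: " + (pool.retrieve(d1) != null)); // true
    }
}
```

Returning null when there is neither a namespace nor a hint corresponds to the "return null when the namespace is null" workaround from the second option. A real pool would presumably want to key on the expanded location rather than the raw hint, since two documents in different directories can both say 'test.xsd'.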

> When Reusing an XMLGrammarPool and reusing a validating XMLDocumentParser, parsing will fail on consecutive documents using different Schemas
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1462
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1462
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: JAXP (javax.xml.validation)
>    Affects Versions: 2.10.0
>         Environment: N/A
>            Reporter: Philip Arickx
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> The issue is triggered with the following sequence :
> * Create an XML11Configuration()
> * Set the config to validating
> * Create an XMLGrammarPoolImpl()
> * Set the "http://apache.org/xml/properties/internal/grammar-pool" property to the pool
> * Create a parser with the config
> * Parse a document referring to a schema
> * Parse a second document referring to a different schema, using the same parser
> This will fail, specifying that it cannot find the element.
> The reason is the XMLSchemaValidator findSchemaGrammar() method.
> At line 2632, the GrammarPool, if one is configured, is asked for the schema. The problem is that the fXSDDescription object for a second parse at this point is still identical to what it was at the end of the first parse. In other words, it refers to the schema that was used on the first parse. This schema is retrieved, and the new document is validated against the old schema.
> To fix this, lines 2652 - 2670 should be moved and inserted after line 2629. This ensures that the fXSDDescription is updated with the current schema information before interrogating the GrammarPool.
> ----------------------
> The following test case and XML documents illustrate the issue.
> ====== Test Case Output ======
> Doc 1 Parsed
> :file:/D:/RX/Perforce/arickp_Solor/XDB_10.1/test2.xml:file:/D:/RX/Perforce/arickp_Solor/XDB_10.1/test2.xml:file:/D:/RX/Perforce/arickp_Solor/XDB_10.1/test2.xml:3:44:143:cvc-elt.1.a: Cannot find the declaration of element 'root2'.
> 	at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:374)
> 	at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:325)
> 	at org.apache.xerces.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:282)
> 	[...]
> ====== Test Case Code ======
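> import static org.junit.Assert.assertEquals;
> import static org.junit.Assert.fail;
> import java.io.File;
> import java.io.IOException;
> import java.net.URL;
> import org.apache.xerces.parsers.XML11Configuration;
> import org.apache.xerces.parsers.XMLDocumentParser;
> import org.apache.xerces.util.XMLGrammarPoolImpl;
> import org.apache.xerces.xni.XNIException;
> import org.apache.xerces.xni.grammars.XMLGrammarPool;
> import org.apache.xerces.xni.parser.XMLErrorHandler;
> import org.apache.xerces.xni.parser.XMLInputSource;
> import org.apache.xerces.xni.parser.XMLParseException;
> import org.apache.xerces.xni.parser.XMLParserConfiguration;
> import org.junit.Test;
> public class ParserTest {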
>   @Test
>   public void testParser() {
>     CountingErrorHandler errorHandler = new CountingErrorHandler();
>     XMLParserConfiguration parserConfig = new XML11Configuration();
>     parserConfig.setErrorHandler(errorHandler);
>     XMLDocumentParser parser = new XMLDocumentParser(parserConfig);
>     parserConfig.setFeature("http://xml.org/sax/features/validation", true);
>     parserConfig.setFeature("http://apache.org/xml/features/validation/schema", true);
>     parserConfig.setFeature("http://apache.org/xml/features/validation/schema/normalized-value",
>         false);
>     XMLGrammarPool pool = new XMLGrammarPoolImpl();
>     parserConfig.setProperty("http://apache.org/xml/properties/internal/grammar-pool", pool);
>     try {
>       File file1 = new File("test1.xml");
>       URL url1 = file1.toURI().toURL();
>       XMLInputSource source1 = new XMLInputSource(null, url1.toString(), null);
>       parser.parse(source1);
>       System.out.println("Doc 1 Parsed");
>       File file2 = new File("test2.xml");
>       URL url2 = file2.toURI().toURL();
>       XMLInputSource source2 = new XMLInputSource(null, url2.toString(), null);
>       parser.parse(source2);
>       System.out.println("Doc 2 Parsed");
>       assertEquals(
>           "Found " + errorHandler.getWarningCount() + " warnings, " + errorHandler.getErrorCount()
>               + " errors, " + errorHandler.getFatalErrorCount() + " fatal errors", 0,
>           errorHandler.getTotalCount());
>     } catch (IOException ioe) {
>       ioe.printStackTrace();
>       fail("Test failed with unexpected IO exception : " + ioe.getMessage());
>     } catch (XNIException xnie) {
>       xnie.printStackTrace();
>       fail("Test failed with unexpected XNI exception : " + xnie.getMessage());
>     }
>   }
>   public class CountingErrorHandler implements XMLErrorHandler {
>     private int warningCount;
>     private int errorCount;
>     private int fatalErrorCount;
>     @Override
>     public void warning(String domain, String key, XMLParseException exception) throws XNIException {
>       warningCount++;
>       throw exception;
>     }
>     @Override
>     public void fatalError(String domain, String key, XMLParseException exception)
>         throws XNIException {
>       fatalErrorCount++;
>       throw exception;
>     }
>     @Override
>     public void error(String domain, String key, XMLParseException exception) throws XNIException {
>       errorCount++;
>       throw exception;
>     }
>     public int getTotalCount() {
>       return warningCount + errorCount + fatalErrorCount;
>     }
>     public int getWarningCount() {
>       return warningCount;
>     }
>     public int getErrorCount() {
>       return errorCount;
>     }
>     public int getFatalErrorCount() {
>       return fatalErrorCount;
>     }
>   }
> }
> ====== test1.xml ======
> <?xml version="1.0" encoding="UTF-8"?>
> <root1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> 	xsi:noNamespaceSchemaLocation='test1.xsd'>
> 	<el1>hello</el1>
> 	<el1>world</el1>
> </root1>
> ====== test1.xsd ======
> <?xml version="1.0" encoding="UTF-8"?>
> <xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
> 	<xsd:element name="root1">
> 		<xsd:complexType>
> 			<xsd:sequence>
> 				<xsd:element ref="el1" minOccurs='1' maxOccurs='unbounded' />
> 			</xsd:sequence>
> 		</xsd:complexType>
> 	</xsd:element>
> 	<xsd:element name="el1" type='xsd:string' />
> </xsd:schema>
> ====== test2.xml ======
> <?xml version="1.0" encoding="UTF-8"?>
> <root2 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> 	xsi:noNamespaceSchemaLocation='test2.xsd'>
> 	<el2>hello</el2>
> 	<el2>world</el2>
> </root2>
> ====== test2.xsd ======
> <?xml version="1.0" encoding="UTF-8"?>
> <xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
> 	<xsd:element name="root2">
> 		<xsd:complexType>
> 			<xsd:sequence>
> 				<xsd:element ref="el2" minOccurs='1' maxOccurs='unbounded' />
> 			</xsd:sequence>
> 		</xsd:complexType>
> 	</xsd:element>
> 	<xsd:element name="el2" type='xsd:string' />
> </xsd:schema>
