Posted to j-users@xerces.apache.org by Michael Glavassevich <mr...@ca.ibm.com> on 2008/05/28 19:27:19 UTC

Re: catching large maxOccurs values when validating against XSD

Hi Franck,

The JAXP Validation API [1] also supports in-memory DOM validation as well
as the secure processing feature [2]. You could use this instead of
normalizeDocument(). There are samples [3] included in the binary
distribution which show how to use it. You could also try upgrading to the
latest release (2.9.1) which made significant improvements to the way in
which minOccurs/maxOccurs are processed (constant time and memory for many
cases) and can probably handle your schema with large maxOccurs.

Thanks.

[1]
http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/validation/package-summary.html
[2]
http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/XMLConstants.html#FEATURE_SECURE_PROCESSING
[3] http://xerces.apache.org/xerces2-j/samples-jaxp.html#SourceValidator

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
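
A minimal sketch of the JAXP Validation API route suggested above, assuming
only a standard JAXP implementation on the classpath. The class name
DomValidationSketch, the helpers parse and validate, and the tiny inline
schema (maxOccurs of 2, standing in for the third-party schema) are
hypothetical and are not taken from the Xerces samples. Whether a particular
large maxOccurs value is accepted with secure processing switched on depends
on the parser version and its configured limits.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DomValidationSketch {

    // Hypothetical stand-in for the external schema.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='items'><xs:complexType><xs:sequence>"
      + "<xs:element name='item' type='xs:string' maxOccurs='2'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Build an in-memory DOM without validating it at load time.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // schema validation needs namespace awareness
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Validate an already-built DOM against an external schema and
    // return the warnings and errors that were reported.
    static List<String> validate(Document doc, String xsd) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Ask the implementation to apply its resource limits while loading the schema.
        sf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));

        final List<String> problems = new ArrayList<>();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add("error: " + e.getMessage());
            }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // give up on fatal errors
            }
        });
        validator.validate(new DOMSource(doc)); // in-memory DOM validation
        return problems;
    }

    public static void main(String[] args) throws Exception {
        Document ok = parse("<items><item>a</item></items>");
        Document tooMany =
            parse("<items><item>a</item><item>b</item><item>c</item></items>");
        System.out.println("valid document problems:   " + validate(ok, XSD));
        System.out.println("invalid document problems: " + validate(tooMany, XSD));
    }
}
```

Validation errors are collected through the ErrorHandler rather than thrown,
which plays the same role as the DOMErrorHandler in the normalizeDocument()
approach. In a real application the Schema object would typically be built
once from the local schema files and reused.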

"Schmidlin, Franck" <Fr...@anite.com> wrote on 05/28/2008
12:28:29 PM:

> Hello everyone.
>
> I have just spent the day investigating an issue in my application,
> without any success, and I'd appreciate any help you good people
> could provide me.
>
> The core of this issue is an XSD schema I got from a third party
> which defines several elements with large values of maxOccurs (e.g.
> 10000), instead of unbounded.
> This causes an OutOfMemory exception when validating any document
> against this schema.
>
> A bit of googling has quickly located several mentions of the JAXP
> secure processing feature [1] and SecurityManager class [2].
>
> My problem is that I do not apply validation when loading the
> document, but much later, using external schemas rather than the
> ones listed in the schemaLocation attribute.
> To do this, I use the DOM3 normalizeDocument() method [3].
>
> Having looked at the code for Xerces 2.7.1 (which is my current
> version), I cannot find a way to leverage the SecurityManager when
> using normalizeDocument().
> As far as I can see, I would either need access to the
> ComponentManager, or be able to set or access the DOMConfiguration
> 'parentSettings'.
> I have tried setting the secure processing features when parsing the
> document, but by the time I apply normalizeDocument() I cannot see
> any SecurityManager.
>
> Can you think of a way around this problem? Please spare me the
> obvious 'change the maxOccurs value' :-)
> On the other hand, I would consider any clean alternative to
> normalizeDocument() to validate a fully formed DOMDocument.
> If necessary I can upgrade my xerces libraries to a supported
> version, but I do not want to build my own.
>
> All help and comments will be gratefully appreciated :-)
>
>
> [1] http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/XMLConstants.html#FEATURE_SECURE_PROCESSING
> [2] http://xerces.apache.org/xerces2-j/properties.html#security-manager
>     and http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html
> [3] http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/dom/CoreDocumentImpl.html#normalizeDocument()
>
> Code sample:
>
> // XsdValidator doubles as the DOM error handler and schema holder
> XsdValidator iXsdValidator = new XsdValidator();
> org.w3c.dom.Document document = ...;
> org.w3c.dom.DOMConfiguration config = document.getDomConfig();
> config.setParameter("error-handler", iXsdValidator);
> config.setParameter("validate", Boolean.TRUE);
> config.setParameter("schema-type", "http://www.w3.org/2001/XMLSchema");
> config.setParameter("schema-location", iXsdValidator.getSchemas());
> // validation happens here, against the external schemas set above
> document.normalizeDocument();
>
> where XsdValidator is declared as:
>
> public class XsdValidator extends org.xml.sax.helpers.DefaultHandler
>         implements org.w3c.dom.DOMErrorHandler ...
> ______________________________
> Franck Schmidlin
> Corporate Integration Consultant
> Anite Connect Technical Architect
>
> Anite Public Sector
> Transformation
> ______________________________

RE: catching large maxOccurs values when validating against XSD

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Franck,

The limit which you can specify on the
org.apache.xerces.util.SecurityManager is not defined in terms of maxOccurs
but rather the number of content model nodes which get generated internally
for the tree which represents the complex type. With the improvements to
minOccurs/maxOccurs we can often represent those particles with a single
node in the tree so the default is now quite large compared to the numbers
of nodes which are typically generated.

There are still plenty of "loopholes" [1]. The improvements were only for
some simple cases (which also happen to be the common cases for large
maxOccurs), so it is still worth setting the secure processing feature if
your application is open to accepting schemas from arbitrary sources.

Thanks.

[1] http://issues.apache.org/jira/browse/XERCESJ-1227

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Schmidlin, Franck" <Fr...@anite.com> wrote on 05/29/2008
05:00:16 AM:

> Michael,
>
> once again, you save the day :-)
>
> Release 2.9.1 does indeed handle these schemas with maxOccurs=10000
> without any problem.
>
> The JAXP validation API also seems to be a nicer alternative to
> Document.normalizeDocument(), although I'll need to refactor a few
> classes before I can use it.
>
> Now, a follow-up question regarding the secure processing feature:
> I understand that this is intended as a protection against the
> malicious usage of known bugs/limitations in a parser implementation.
> Shouldn't the default values in
> org.apache.xerces.util.SecurityManager have been updated in 2.9.1 to
> reflect the fact that large maxOccurs are not a limitation anymore?
> Or are there still enough loopholes to warrant the default value of
> maxOccurs = 3000?
>
> I haven't tested it yet, but because of this, I suspect that
> implementing the secure processing feature would cause my schemas to
> be rejected by the SecurityManager.
> This is not critical to my application, because I only use local
> copies of schemas and therefore there is no security risk, but
> still, I am curious.
>
> Thanks
> ______________________________
> Franck Schmidlin
> Corporate Integration Consultant
> Anite Connect Technical Architect
>
> Anite Public Sector
> Transformation
> ______________________________

RE: catching large maxOccurs values when validating against XSD

Posted by "Schmidlin, Franck" <Fr...@anite.com>.
Michael,
 
once again, you save the day :-)
 
Release 2.9.1 does indeed handle these schemas with maxOccurs=10000
without any problem.
 
The JAXP validation API also seems to be a nicer alternative to
Document.normalizeDocument(), although I'll need to refactor a few
classes before I can use it.
 
Now, a follow-up question regarding the secure processing feature:
I understand that this is intended as a protection against the malicious
usage of known bugs/limitations in a parser implementation.
Shouldn't the default values in org.apache.xerces.util.SecurityManager
have been updated in 2.9.1 to reflect the fact that large maxOccurs are
not a limitation anymore?
Or are there still enough loopholes to warrant the default value of
maxOccurs = 3000?
 
I haven't tested it yet, but because of this, I suspect that
implementing the secure processing feature would cause my schemas to be
rejected by the SecurityManager.
This is not critical to my application, because I only use local copies
of schemas and therefore there is no security risk, but still, I am
curious.
 
Thanks

______________________________
Franck Schmidlin
Corporate Integration Consultant
Anite Connect Technical Architect

Anite Public Sector
Transformation
______________________________ 
