Posted to j-dev@xerces.apache.org by "Michael Glavassevich (Updated) (JIRA)" <xe...@xml.apache.org> on 2012/03/11 01:32:56 UTC

[jira] [Updated] (XERCESJ-1276) Improve performance of XML Schema Identity-constraint validation --- XMLSchemaValidator$ValueStoreBase.contains() is painfully slow.

     [ https://issues.apache.org/jira/browse/XERCESJ-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich updated XERCESJ-1276:
------------------------------------------

    Summary: Improve performance of XML Schema Identity-constraint validation --- XMLSchemaValidator$ValueStoreBase.contains() is painfully slow.  (was: XMLSchemaValidator$ValueStoreBase.contains() is painfully slow)
    
> Improve performance of XML Schema Identity-constraint validation --- XMLSchemaValidator$ValueStoreBase.contains() is painfully slow.
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1276
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1276
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: XML Schema 1.0 Structures
>    Affects Versions: 2.6.2, 2.9.1
>            Reporter: Kenny MacLeod
>              Labels: gsoc, gsoc2012
>         Attachments: XMLSchemaValidator.java
>
>
> Under certain conditions, the contains() method in XMLSchemaValidator$ValueStoreBase can cripple the performance of parsing and validation.
> I'm not sure what those conditions are, but as a guideline figure: I was using JAXB2 to deserialize a 22 MB XML file.  Without schema validation, it took 5 seconds.  With validation, it took over 3 minutes (JDK 1.5.0_10 on win32). My profiler pointed the finger squarely at that method.
> Suspicions were aroused further when seeing this comment in the source:
> public boolean contains() {
>             // REVISIT: we can improve performance by using hash codes, instead of
>             // traversing global vector that could be quite large.
> This comment is present in the Xerces 2.6.2 bundled with JDK 1.5.0_10, and also in the source for 2.9.1.
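
To illustrate the idea in the REVISIT comment quoted above, here is a minimal sketch contrasting a linear contains() with a hash-based one. It is not the Xerces implementation: the class names (ValueStoreSketch, LinearStore, HashedStore) are hypothetical, and it assumes each identity-constraint value can be modelled as a list of strings compared with plain equals(). Real identity-constraint comparison also has to take schema types into account.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Xerces code: two ways to detect duplicate key tuples.
final class ValueStoreSketch {

    /** Linear lookup: each contains() call walks every stored tuple, O(n). */
    static final class LinearStore {
        private final List<List<String>> tuples = new ArrayList<>();

        boolean contains(List<String> tuple) {
            for (List<String> stored : tuples) {
                if (stored.equals(tuple)) {
                    return true;
                }
            }
            return false;
        }

        /** Returns false when the tuple was already stored (a duplicate). */
        boolean add(List<String> tuple) {
            if (contains(tuple)) {
                return false;
            }
            tuples.add(new ArrayList<>(tuple)); // defensive copy
            return true;
        }
    }

    /** Hash lookup: contains() is expected O(1) via List.hashCode()/equals(). */
    static final class HashedStore {
        private final Set<List<String>> tuples = new HashSet<>();

        boolean contains(List<String> tuple) {
            return tuples.contains(tuple);
        }

        /** Returns false when the tuple was already stored (a duplicate). */
        boolean add(List<String> tuple) {
            return tuples.add(new ArrayList<>(tuple)); // defensive copy
        }
    }

    public static void main(String[] args) {
        LinearStore linear = new LinearStore();
        HashedStore hashed = new HashedStore();
        int n = 20_000; // arbitrary size chosen for the demonstration

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            linear.add(Arrays.asList("item", Integer.toString(i)));
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            hashed.add(Arrays.asList("item", Integer.toString(i)));
        }
        long t2 = System.nanoTime();

        // Timings vary by machine; the linear store does O(n^2) comparisons overall.
        System.out.println("linear ms: " + (t1 - t0) / 1_000_000);
        System.out.println("hashed ms: " + (t2 - t1) / 1_000_000);
    }
}
```

With a linear scan, validating n unique values costs on the order of n^2 comparisons in total, which would be consistent with the slowdown reported for a large document. A hash-based store keeps the total close to linear, provided hashCode() and equals() agree on what counts as the same value.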
