Posted to j-dev@xerces.apache.org by "yiguang hu (JIRA)" <xe...@xml.apache.org> on 2005/08/01 16:54:37 UTC

[jira] Created: (XERCESJ-1091) xml validation performance bad when maxOccurs is big

xml validation performance bad when maxOccurs is big
----------------------------------------------------

         Key: XERCESJ-1091
         URL: http://issues.apache.org/jira/browse/XERCESJ-1091
     Project: Xerces2-J
        Type: Improvement
  Components: SAX  
    Versions: 2.0.0 [beta 4]    
 Environment: Solaris 9
    Reporter: yiguang hu


For the following schema, validating an XML file (with only one record) took several minutes. But if maxOccurs="9999" is changed to maxOccurs="999", the validation takes 2 seconds. I was able to get around it by using "unbounded". I don't understand why 9999 could cause such a big difference for validation.

....
        <complexType name="MatListType">
                <sequence>
                        <element name="MatItem" minOccurs="0" maxOccurs="9999">
                                <complexType>
                                        <sequence>
                                                <element name="SequenceNumber" type="wvd:A082_LIN_ITM_NO"/>
                                                <element name="Material" type="sg:Material"/>
                                                <element name="PONumber" type="sg:ReferenceNumber" minOccurs="0"/>
                                                <element name="POItemNumber" type="wvd:A154_REF_NO" minOccurs="0"/>
                                                <element name="Quantity" type="sg:QuantityC" minOccurs="0"/>
                                                <element name="PackingQuantity" type="sg:QuantityC" minOccurs="0"/>
                                                <element name="ContainerNumber" type="sg:ContainerInfo" minOccurs="0"/>
                                                <element name="InvoiceNumber" type="sg:ReferenceNumber" minOccurs="0"/>
                                        </sequence>
                                </complexType>
                        </element>
                </sequence>
        </complexType>
.....
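
For anyone who wants to time this themselves, here is a rough harness using the standard javax.xml.validation API. It is only a sketch under several assumptions: the schema is a cut-down version of the fragment above, xs:string stands in for the wvd: and sg: types (their definitions are not shown in this report), and the MatList root element, the class name and the method names are made up for illustration. Timings will depend on the parser build and its configured limits, so this may not reproduce the slowdown on a newer parser.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public class MaxOccursTiming {

    // A document with a single record, as in the report.
    // The MatList root element is a hypothetical name.
    static final String ONE_RECORD =
        "<MatList><MatItem>"
        + "<SequenceNumber>1</SequenceNumber><Material>M</Material>"
        + "</MatItem></MatList>";

    /**
     * Cut-down schema; xs:string replaces the wvd:/sg: types (assumption),
     * and only the first three child elements are kept.
     */
    static String schemaFor(String maxOccurs) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='MatList' type='MatListType'/>"
            + "<xs:complexType name='MatListType'><xs:sequence>"
            + "<xs:element name='MatItem' minOccurs='0' maxOccurs='" + maxOccurs + "'>"
            + "<xs:complexType><xs:sequence>"
            + "<xs:element name='SequenceNumber' type='xs:string'/>"
            + "<xs:element name='Material' type='xs:string'/>"
            + "<xs:element name='PONumber' type='xs:string' minOccurs='0'/>"
            + "</xs:sequence></xs:complexType>"
            + "</xs:element></xs:sequence></xs:complexType></xs:schema>";
    }

    /** Compiles the schema, validates the one-record document, returns elapsed milliseconds. */
    static long validateOnce(String maxOccurs) {
        try {
            long start = System.nanoTime();
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                new StreamSource(new StringReader(schemaFor(maxOccurs))));
            schema.newValidator().validate(
                new StreamSource(new StringReader(ONE_RECORD)));
            return (System.nanoTime() - start) / 1_000_000;
        } catch (Exception e) {
            throw new IllegalStateException("validation failed for maxOccurs=" + maxOccurs, e);
        }
    }

    public static void main(String[] args) {
        // Pass other values (e.g. 9999) on the command line to compare.
        String[] values = args.length > 0 ? args : new String[] {"99", "999", "unbounded"};
        for (String value : values) {
            System.out.println("maxOccurs=" + value + ": " + validateOnce(value) + " ms");
        }
    }
}
```

Running it with 999, 9999 and unbounded as arguments gives the three cases compared in this report. The first call also pays for JVM and parser warm-up, so the numbers are only indicative.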

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Resolved: (XERCESJ-1091) xml validation performance bad when maxOccurs is big

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESJ-1091?page=all ]
     
Michael Glavassevich resolved XERCESJ-1091:
-------------------------------------------

    Resolution: Duplicate

This is a known limitation that was reported before as XERCESJ-773. It is due to the algorithm [1] for building the DFA which represents the complex type. This algorithm performs very poorly on bounded repetition because it generates many states: the number of states grows with the value of maxOccurs, whereas "unbounded" does not have this problem.

[1] http://lists.xml.org/archives/xml-dev/200506/msg00074.html
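
To illustrate why bounded repetition produces so many states, here is a toy sketch. It is not the Xerces code and not the algorithm in [1]; it simply unrolls MatItem{0,n} into one state per allowed occurrence and compares that with the single looping state that "unbounded" needs. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BoundedRepetitionSketch {

    /**
     * Naive DFA for name{0,maxOccurs}: state i means "i occurrences read".
     * Each list entry maps an element name to the next state index.
     */
    static List<Map<String, Integer>> unroll(String name, int maxOccurs) {
        List<Map<String, Integer>> states = new ArrayList<>();
        states.add(new HashMap<>());                 // state 0: nothing read yet
        for (int i = 0; i < maxOccurs; i++) {
            states.add(new HashMap<>());             // state i + 1
            states.get(i).put(name, i + 1);          // one more occurrence allowed
        }
        return states;                               // last state has no outgoing edge
    }

    /** DFA for name{0,unbounded}: a single state that loops on itself. */
    static List<Map<String, Integer>> loop(String name) {
        List<Map<String, Integer>> states = new ArrayList<>();
        states.add(new HashMap<>());
        states.get(0).put(name, 0);
        return states;
    }

    /** Every state is accepting (minOccurs is 0), so only a missing edge rejects. */
    static boolean accepts(List<Map<String, Integer>> dfa, String name, int count) {
        int state = 0;
        for (int i = 0; i < count; i++) {
            Integer next = dfa.get(state).get(name);
            if (next == null) {
                return false;
            }
            state = next;
        }
        return true;
    }

    public static void main(String[] args) {
        for (int max : new int[] {999, 9999}) {
            System.out.println("maxOccurs=" + max + " -> "
                + unroll("MatItem", max).size() + " states");
        }
        System.out.println("unbounded -> " + loop("MatItem").size() + " state");
    }
}
```

In this toy model maxOccurs="999" needs 1000 states, maxOccurs="9999" needs 10000, and "unbounded" needs one. The state count here is only linear in maxOccurs; the reported timings (about 2 seconds versus several minutes for a tenfold increase) suggest that the cost of the real construction grows faster than that.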
