You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@xmlbeans.apache.org by "Bryce Alcock (JIRA)" <xm...@xml.apache.org> on 2010/05/10 20:42:37 UTC

[jira] Created: (XMLBEANS-438) O(n^2) operation in xmlbeans would like to make O(n log n) or better

O(n^2) operation in xmlbeans would like to make O(n log n) or better
--------------------------------------------------------------------

                 Key: XMLBEANS-438
                 URL: https://issues.apache.org/jira/browse/XMLBEANS-438
             Project: XMLBeans
          Issue Type: Improvement
    Affects Versions: Version 2.4 
         Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
            Reporter: Bryce Alcock


Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
Specifically:


Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.

Just making Xobj.find_element_user a O(log n) would completely solve the problem.


ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
-----------------------------------------------------------
 //   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)


 for ( i++, j++; i < sources.length; i++, j++)
 {
 long t1 = System.currentTimeMillis();
    XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
 getCursorTime += (System.currentTimeMillis()-t1);
 long t2=System.currentTimeMillis();
    if (c != null && c.toParent() && c.getObject() == this)
    {
 ifCheckTime += (System.currentTimeMillis()-t2);
 long t3 = System.currentTimeMillis();
        c.dispose();
 long t4 = System.currentTimeMillis();

 //  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
 //  THIS IS THE INEFFICIENCY in XMLBeans...

        current = store.find_element_user( elemName, j );


 findUserElement += (System.currentTimeMillis() - t4);
        if (current == sources[i])
              ; // advance
        else
        {
             // Fall back to the general case
             break;
        }
 insideIfTime += (System.currentTimeMillis()-t3);
   }
   else
   {
 ifCheckTime += (System.currentTimeMillis()-t2);
       c.dispose();
       // Insert before the current element
       TypeStoreUser user = store.insert_element_user( elemName, j );
       ((XmlObjectBase) user).set( sources[i] );
    }
 }





 ---------------- Method from store.find_element_user(QName,int i)---------

 //  THIS METHOD IS 0(n)
 //  but this is called n times by XmlComplexContentImpl.java around line 1153....
 public TypeStoreUser find_element_user ( QName name, int i )
  {
     for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
        if (x.isElem() && x._name.equals( name ) && --i < 0)
            return x.getUser();
        return null;
  }


I will add a sample code showing exactly how to reproduce this issue.




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


[jira] Updated: (XMLBEANS-438) O(n^2) operation in xmlbeans would like to make O(n log n) or better

Posted by "Bryce Alcock (JIRA)" <xm...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryce Alcock updated XMLBEANS-438:
----------------------------------

    Attachment: Main.java
                ResponseTimeVsRowCount.png


This is a main Test.
To run it with the xmlbeans (xlsx format)  type:

#  java poitest.Main true

you may have to modify your classpath.
I can upload an executable jar if that would help.


About the PNG file:

this is an image from excel that shows the results of running the above main test.
You will see that at the rows increase the processing time/row (red dots) increase linearly O(n). 
This should be a constant time operation.

This causes the total processing time to increase as O(n^2) the blue dots.





> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
>                 Key: XMLBEANS-438
>                 URL: https://issues.apache.org/jira/browse/XMLBEANS-438
>             Project: XMLBeans
>          Issue Type: Improvement
>    Affects Versions: Version 2.4 
>         Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
>            Reporter: Bryce Alcock
>         Attachments: Main.java, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
>  //   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)
>  for ( i++, j++; i < sources.length; i++, j++)
>  {
>  long t1 = System.currentTimeMillis();
>     XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
>  getCursorTime += (System.currentTimeMillis()-t1);
>  long t2=System.currentTimeMillis();
>     if (c != null && c.toParent() && c.getObject() == this)
>     {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>  long t3 = System.currentTimeMillis();
>         c.dispose();
>  long t4 = System.currentTimeMillis();
>  //  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
>  //  THIS IS THE INEFFICIENCY in XMLBeans...
>         current = store.find_element_user( elemName, j );
>  findUserElement += (System.currentTimeMillis() - t4);
>         if (current == sources[i])
>               ; // advance
>         else
>         {
>              // Fall back to the general case
>              break;
>         }
>  insideIfTime += (System.currentTimeMillis()-t3);
>    }
>    else
>    {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>        c.dispose();
>        // Insert before the current element
>        TypeStoreUser user = store.insert_element_user( elemName, j );
>        ((XmlObjectBase) user).set( sources[i] );
>     }
>  }
>  ---------------- Method from store.find_element_user(QName,int i)---------
>  //  THIS METHOD IS 0(n)
>  //  but this is called n times by XmlComplexContentImpl.java around line 1153....
>  public TypeStoreUser find_element_user ( QName name, int i )
>   {
>      for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
>         if (x.isElem() && x._name.equals( name ) && --i < 0)
>             return x.getUser();
>         return null;
>   }
> I will add a sample code showing exactly how to reproduce this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


[jira] Updated: (XMLBEANS-438) O(n^2) operation in xmlbeans would like to make O(n log n) or better

Posted by "Bryce Alcock (JIRA)" <xm...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryce Alcock updated XMLBEANS-438:
----------------------------------

    Attachment: PoiTest.jar

For Convenience,
I have included an executable Jar File which will allow anyone to run the following test:

the format is:
each of the 3 lines represent "Sheet Time", Worksheet Save Time:
Worksheet save Time:
File  Rows  "Total Time"  "Time Per Row"  "used Mem...."


laptop% java -jar PoiTest.jar true 

Sheet Time : 2151
Worksheet save Time : 501
./20000_true.xls  	20000	 8957 	0.44785	 used mem 	 123527376
Sheet Time : 3448
Worksheet save Time : 400
./24000_true.xls  	24000	 6641 	0.27670833333333333	 used mem 	 162906032
Sheet Time : 4139
Worksheet save Time : 435
./28800_true.xls  	28800	 7208 	0.25027777777777777	 used mem 	 103586704
Sheet Time : 8019
Worksheet save Time : 541
./34560_true.xls  	34560	 11118 	0.3217013888888889	 used mem 	 202799568
Sheet Time : 12960
Worksheet save Time : 637
./41472_true.xls  	41472	 16987 	0.4096016589506173	 used mem 	 236485200
Sheet Time : 20555
Worksheet save Time : 767
./49766_true.xls  	49766	 25380 	0.5099867379335289	 used mem 	 205574344
Sheet Time : 39619
Worksheet save Time : 880
./59719_true.xls  	59719	 45568 	0.7630402384500745	 used mem 	 314282784
Sheet Time : 58512
Worksheet save Time : 1067
./71662_true.xls  	71662	 65863 	0.9190784516201055	 used mem 	 234697232
Sheet Time : 94646
Worksheet save Time : 1408
./85994_true.xls  	85994	 103034 	1.1981533595367118	 used mem 	 370101736
Sheet Time : 141941
Worksheet save Time : 1509
./103192_true.xls 	103192	 153521 	1.48772191642763	 used mem 	 317937848



> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
>                 Key: XMLBEANS-438
>                 URL: https://issues.apache.org/jira/browse/XMLBEANS-438
>             Project: XMLBeans
>          Issue Type: Improvement
>    Affects Versions: Version 2.4 
>         Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
>            Reporter: Bryce Alcock
>         Attachments: Main.java, PoiTest.jar, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
>  //   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)
>  for ( i++, j++; i < sources.length; i++, j++)
>  {
>  long t1 = System.currentTimeMillis();
>     XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
>  getCursorTime += (System.currentTimeMillis()-t1);
>  long t2=System.currentTimeMillis();
>     if (c != null && c.toParent() && c.getObject() == this)
>     {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>  long t3 = System.currentTimeMillis();
>         c.dispose();
>  long t4 = System.currentTimeMillis();
>  //  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
>  //  THIS IS THE INEFFICIENCY in XMLBeans...
>         current = store.find_element_user( elemName, j );
>  findUserElement += (System.currentTimeMillis() - t4);
>         if (current == sources[i])
>               ; // advance
>         else
>         {
>              // Fall back to the general case
>              break;
>         }
>  insideIfTime += (System.currentTimeMillis()-t3);
>    }
>    else
>    {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>        c.dispose();
>        // Insert before the current element
>        TypeStoreUser user = store.insert_element_user( elemName, j );
>        ((XmlObjectBase) user).set( sources[i] );
>     }
>  }
>  ---------------- Method from store.find_element_user(QName,int i)---------
>  //  THIS METHOD IS 0(n)
>  //  but this is called n times by XmlComplexContentImpl.java around line 1153....
>  public TypeStoreUser find_element_user ( QName name, int i )
>   {
>      for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
>         if (x.isElem() && x._name.equals( name ) && --i < 0)
>             return x.getUser();
>         return null;
>   }
> I will add a sample code showing exactly how to reproduce this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


[jira] Commented: (XMLBEANS-438) O(n^2) operation in xmlbeans would like to make O(n log n) or better

Posted by "David Fisher (JIRA)" <xm...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920613#action_12920613 ] 

David Fisher commented on XMLBEANS-438:
---------------------------------------

This is a problem for Apache POI. We would really like to have a solution.

Any comments on array setters and making these efficient?

> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
>                 Key: XMLBEANS-438
>                 URL: https://issues.apache.org/jira/browse/XMLBEANS-438
>             Project: XMLBeans
>          Issue Type: Improvement
>    Affects Versions: Version 2.4 
>         Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
>            Reporter: Bryce Alcock
>         Attachments: Main.java, PoiTest.jar, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
>  //   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)
>  for ( i++, j++; i < sources.length; i++, j++)
>  {
>  long t1 = System.currentTimeMillis();
>     XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
>  getCursorTime += (System.currentTimeMillis()-t1);
>  long t2=System.currentTimeMillis();
>     if (c != null && c.toParent() && c.getObject() == this)
>     {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>  long t3 = System.currentTimeMillis();
>         c.dispose();
>  long t4 = System.currentTimeMillis();
>  //  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
>  //  THIS IS THE INEFFICIENCY in XMLBeans...
>         current = store.find_element_user( elemName, j );
>  findUserElement += (System.currentTimeMillis() - t4);
>         if (current == sources[i])
>               ; // advance
>         else
>         {
>              // Fall back to the general case
>              break;
>         }
>  insideIfTime += (System.currentTimeMillis()-t3);
>    }
>    else
>    {
>  ifCheckTime += (System.currentTimeMillis()-t2);
>        c.dispose();
>        // Insert before the current element
>        TypeStoreUser user = store.insert_element_user( elemName, j );
>        ((XmlObjectBase) user).set( sources[i] );
>     }
>  }
>  ---------------- Method from store.find_element_user(QName,int i)---------
>  //  THIS METHOD IS 0(n)
>  //  but this is called n times by XmlComplexContentImpl.java around line 1153....
>  public TypeStoreUser find_element_user ( QName name, int i )
>   {
>      for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
>         if (x.isElem() && x._name.equals( name ) && --i < 0)
>             return x.getUser();
>         return null;
>   }
> I will add a sample code showing exactly how to reproduce this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org