You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@xmlbeans.apache.org by "Bryce Alcock (JIRA)" <xm...@xml.apache.org> on 2010/05/10 20:42:37 UTC
[jira] Created: (XMLBEANS-438) O(n^2) operation in xmlbeans would
like to make O(n log n) or better
O(n^2) operation in xmlbeans would like to make O(n log n) or better
--------------------------------------------------------------------
Key: XMLBEANS-438
URL: https://issues.apache.org/jira/browse/XMLBEANS-438
Project: XMLBeans
Issue Type: Improvement
Affects Versions: Version 2.4
Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
Reporter: Bryce Alcock
Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
Specifically:
Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
Just making Xobj.find_element_user a O(log n) would completely solve the problem.
ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
-----------------------------------------------------------
// Order N (this is invoked 1 time for each element in sources (~120,000 records in my case)
for ( i++, j++; i < sources.length; i++, j++)
{
long t1 = System.currentTimeMillis();
XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
getCursorTime += (System.currentTimeMillis()-t1);
long t2=System.currentTimeMillis();
if (c != null && c.toParent() && c.getObject() == this)
{
ifCheckTime += (System.currentTimeMillis()-t2);
long t3 = System.currentTimeMillis();
c.dispose();
long t4 = System.currentTimeMillis();
// THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
// THIS IS THE INEFFICIENCY in XMLBeans...
current = store.find_element_user( elemName, j );
findUserElement += (System.currentTimeMillis() - t4);
if (current == sources[i])
; // advance
else
{
// Fall back to the general case
break;
}
insideIfTime += (System.currentTimeMillis()-t3);
}
else
{
ifCheckTime += (System.currentTimeMillis()-t2);
c.dispose();
// Insert before the current element
TypeStoreUser user = store.insert_element_user( elemName, j );
((XmlObjectBase) user).set( sources[i] );
}
}
---------------- Method from store.find_element_user(QName,int i)---------
// THIS METHOD IS 0(n)
// but this is called n times by XmlComplexContentImpl.java around line 1153....
public TypeStoreUser find_element_user ( QName name, int i )
{
for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
if (x.isElem() && x._name.equals( name ) && --i < 0)
return x.getUser();
return null;
}
I will add a sample code showing exactly how to reproduce this issue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org
[jira] Updated: (XMLBEANS-438) O(n^2) operation in xmlbeans would
like to make O(n log n) or better
Posted by "Bryce Alcock (JIRA)" <xm...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryce Alcock updated XMLBEANS-438:
----------------------------------
Attachment: Main.java
ResponseTimeVsRowCount.png
This is a main Test.
To run it with the xmlbeans (xlsx format) type:
# java poitest.Main true
you may have to modify your classpath.
I can upload an executable jar if that would help.
About the PNG file:
this is an image from excel that shows the results of running the above main test.
You will see that at the rows increase the processing time/row (red dots) increase linearly O(n).
This should be a constant time operation.
This causes the total processing time to increase as O(n^2) the blue dots.
> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
> Key: XMLBEANS-438
> URL: https://issues.apache.org/jira/browse/XMLBEANS-438
> Project: XMLBeans
> Issue Type: Improvement
> Affects Versions: Version 2.4
> Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
> Reporter: Bryce Alcock
> Attachments: Main.java, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
> // Order N (this is invoked 1 time for each element in sources (~120,000 records in my case)
> for ( i++, j++; i < sources.length; i++, j++)
> {
> long t1 = System.currentTimeMillis();
> XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
> getCursorTime += (System.currentTimeMillis()-t1);
> long t2=System.currentTimeMillis();
> if (c != null && c.toParent() && c.getObject() == this)
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> long t3 = System.currentTimeMillis();
> c.dispose();
> long t4 = System.currentTimeMillis();
> // THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
> // THIS IS THE INEFFICIENCY in XMLBeans...
> current = store.find_element_user( elemName, j );
> findUserElement += (System.currentTimeMillis() - t4);
> if (current == sources[i])
> ; // advance
> else
> {
> // Fall back to the general case
> break;
> }
> insideIfTime += (System.currentTimeMillis()-t3);
> }
> else
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> c.dispose();
> // Insert before the current element
> TypeStoreUser user = store.insert_element_user( elemName, j );
> ((XmlObjectBase) user).set( sources[i] );
> }
> }
> ---------------- Method from store.find_element_user(QName,int i)---------
> // THIS METHOD IS 0(n)
> // but this is called n times by XmlComplexContentImpl.java around line 1153....
> public TypeStoreUser find_element_user ( QName name, int i )
> {
> for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
> if (x.isElem() && x._name.equals( name ) && --i < 0)
> return x.getUser();
> return null;
> }
> I will add a sample code showing exactly how to reproduce this issue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org
[jira] Updated: (XMLBEANS-438) O(n^2) operation in xmlbeans would
like to make O(n log n) or better
Posted by "Bryce Alcock (JIRA)" <xm...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryce Alcock updated XMLBEANS-438:
----------------------------------
Attachment: PoiTest.jar
For Convenience,
I have included an executable Jar File which will allow anyone to run the following test:
the format is:
each of the 3 lines represent "Sheet Time", Worksheet Save Time:
Worksheet save Time:
File Rows "Total Time" "Time Per Row" "used Mem...."
laptop% java -jar PoiTest.jar true
Sheet Time : 2151
Worksheet save Time : 501
./20000_true.xls 20000 8957 0.44785 used mem 123527376
Sheet Time : 3448
Worksheet save Time : 400
./24000_true.xls 24000 6641 0.27670833333333333 used mem 162906032
Sheet Time : 4139
Worksheet save Time : 435
./28800_true.xls 28800 7208 0.25027777777777777 used mem 103586704
Sheet Time : 8019
Worksheet save Time : 541
./34560_true.xls 34560 11118 0.3217013888888889 used mem 202799568
Sheet Time : 12960
Worksheet save Time : 637
./41472_true.xls 41472 16987 0.4096016589506173 used mem 236485200
Sheet Time : 20555
Worksheet save Time : 767
./49766_true.xls 49766 25380 0.5099867379335289 used mem 205574344
Sheet Time : 39619
Worksheet save Time : 880
./59719_true.xls 59719 45568 0.7630402384500745 used mem 314282784
Sheet Time : 58512
Worksheet save Time : 1067
./71662_true.xls 71662 65863 0.9190784516201055 used mem 234697232
Sheet Time : 94646
Worksheet save Time : 1408
./85994_true.xls 85994 103034 1.1981533595367118 used mem 370101736
Sheet Time : 141941
Worksheet save Time : 1509
./103192_true.xls 103192 153521 1.48772191642763 used mem 317937848
> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
> Key: XMLBEANS-438
> URL: https://issues.apache.org/jira/browse/XMLBEANS-438
> Project: XMLBeans
> Issue Type: Improvement
> Affects Versions: Version 2.4
> Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
> Reporter: Bryce Alcock
> Attachments: Main.java, PoiTest.jar, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
> // Order N (this is invoked 1 time for each element in sources (~120,000 records in my case)
> for ( i++, j++; i < sources.length; i++, j++)
> {
> long t1 = System.currentTimeMillis();
> XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
> getCursorTime += (System.currentTimeMillis()-t1);
> long t2=System.currentTimeMillis();
> if (c != null && c.toParent() && c.getObject() == this)
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> long t3 = System.currentTimeMillis();
> c.dispose();
> long t4 = System.currentTimeMillis();
> // THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
> // THIS IS THE INEFFICIENCY in XMLBeans...
> current = store.find_element_user( elemName, j );
> findUserElement += (System.currentTimeMillis() - t4);
> if (current == sources[i])
> ; // advance
> else
> {
> // Fall back to the general case
> break;
> }
> insideIfTime += (System.currentTimeMillis()-t3);
> }
> else
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> c.dispose();
> // Insert before the current element
> TypeStoreUser user = store.insert_element_user( elemName, j );
> ((XmlObjectBase) user).set( sources[i] );
> }
> }
> ---------------- Method from store.find_element_user(QName,int i)---------
> // THIS METHOD IS 0(n)
> // but this is called n times by XmlComplexContentImpl.java around line 1153....
> public TypeStoreUser find_element_user ( QName name, int i )
> {
> for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
> if (x.isElem() && x._name.equals( name ) && --i < 0)
> return x.getUser();
> return null;
> }
> I will add a sample code showing exactly how to reproduce this issue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org
[jira] Commented: (XMLBEANS-438) O(n^2) operation in xmlbeans would
like to make O(n log n) or better
Posted by "David Fisher (JIRA)" <xm...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XMLBEANS-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920613#action_12920613 ]
David Fisher commented on XMLBEANS-438:
---------------------------------------
This is a problem for Apache POI. We would really like to have a solution.
Any comments on array setters and making these efficient?
> O(n^2) operation in xmlbeans would like to make O(n log n) or better
> --------------------------------------------------------------------
>
> Key: XMLBEANS-438
> URL: https://issues.apache.org/jira/browse/XMLBEANS-438
> Project: XMLBeans
> Issue Type: Improvement
> Affects Versions: Version 2.4
> Environment: Using XMLbeans inside of POI when saving spreadsheets in the XLSX file format.
> Reporter: Bryce Alcock
> Attachments: Main.java, PoiTest.jar, ResponseTimeVsRowCount.png
>
>
> Using POI to save a spread sheet in XLSX format causes an O(n^2) operation to occur.
> Specifically:
> Just a Quick overview
> it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
> then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.
> Just making Xobj.find_element_user a O(log n) would completely solve the problem.
> ANALYSIS: O(n^2) in XML Beans: (Around Line 1153 or so pf XmlComplexContentImpl.java method arraySetterHelper)
> -----------------------------------------------------------
> // Order N (this is invoked 1 time for each element in sources (~120,000 records in my case)
> for ( i++, j++; i < sources.length; i++, j++)
> {
> long t1 = System.currentTimeMillis();
> XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
> getCursorTime += (System.currentTimeMillis()-t1);
> long t2=System.currentTimeMillis();
> if (c != null && c.toParent() && c.getObject() == this)
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> long t3 = System.currentTimeMillis();
> c.dispose();
> long t4 = System.currentTimeMillis();
> // THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
> // THIS IS THE INEFFICIENCY in XMLBeans...
> current = store.find_element_user( elemName, j );
> findUserElement += (System.currentTimeMillis() - t4);
> if (current == sources[i])
> ; // advance
> else
> {
> // Fall back to the general case
> break;
> }
> insideIfTime += (System.currentTimeMillis()-t3);
> }
> else
> {
> ifCheckTime += (System.currentTimeMillis()-t2);
> c.dispose();
> // Insert before the current element
> TypeStoreUser user = store.insert_element_user( elemName, j );
> ((XmlObjectBase) user).set( sources[i] );
> }
> }
> ---------------- Method from store.find_element_user(QName,int i)---------
> // THIS METHOD IS 0(n)
> // but this is called n times by XmlComplexContentImpl.java around line 1153....
> public TypeStoreUser find_element_user ( QName name, int i )
> {
> for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
> if (x.isElem() && x._name.equals( name ) && --i < 0)
> return x.getUser();
> return null;
> }
> I will add a sample code showing exactly how to reproduce this issue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org