You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@xmlbeans.apache.org by Br...@sungard.com on 2010/05/07 20:17:28 UTC

O(n^2) operation in xmlbeans would like to make O(n log n)

POI uses XMLBeans,
This is a problem to help make POI faster:
There has been on going discussion in the dev.at.poi.apache.org list,
However I will include back ground here.
This issue is significant.




Took your suggestion and pulled XMLBeans 2.4 into POI,
it worked but contained the O(n^2) issue.

I have spent some time further investigating this, and it is
fairly simple to analyse and understand why it is O(n^2).

The real question is what can we do.
NOTE:  My line numbers won't match with SVN because I have modified the code to put
some timers and system outs in place.


Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.

Just making Xobj.find_element_user a O(log n) would completely solve the problem.



ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so....)
-----------------------------------------------------------
//   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)


for ( i++, j++; i < sources.length; i++, j++)
{
long t1 = System.currentTimeMillis();
    XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
getCursorTime += (System.currentTimeMillis()-t1);
long t2=System.currentTimeMillis();
    if (c != null && c.toParent() && c.getObject() == this)
    {
ifCheckTime += (System.currentTimeMillis()-t2);
long t3 = System.currentTimeMillis();
        c.dispose();
long t4 = System.currentTimeMillis();

//  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
//  THIS IS THE INEFFICIENCY in XMLBeans...

        current = store.find_element_user( elemName, j );


findUserElement += (System.currentTimeMillis() - t4);
        if (current == sources[i])
              ; // advance
        else
        {
             // Fall back to the general case
             break;
        }
insideIfTime += (System.currentTimeMillis()-t3);
   }
   else
   {
ifCheckTime += (System.currentTimeMillis()-t2);
       c.dispose();
       // Insert before the current element
       TypeStoreUser user = store.insert_element_user( elemName, j );
       ((XmlObjectBase) user).set( sources[i] );
    }
}





---------------- Method from store.find_element_user(QName,int i)---------

//  THIS METHOD IS 0(n)
//  but this is called n times by XmlComplexContentImpl.java around line 1153....
public TypeStoreUser find_element_user ( QName name, int i )
 {
     for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
        if (x.isElem() && x._name.equals( name ) && --i < 0)
            return x.getUser();
        return null;
 }




Best regards
Bryce



------------------------------------------------------------------------------------------

Bryce Alcock * Performance Architect * SunGard *
Asset Arena * 377 E. Butterfield Road Suite 800, Lombard, IL 60148 *
Tel 630-986-3006 *
www.sungard.com/assetarena

..........................................................................................
http://www.brycealcock.com/
------------------------------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


RE: O(n^2) operation in xmlbeans would like to make O(n log n)

Posted by Br...@sungard.com.
Forwarding from XMLBEANS mailing list to POI mailing list.

Cezar,

Thanks for the response.
I think this may help formulate a strategy going forward.
I am forwarding this to the POI Dev List,
so that the POI Team can read your comments.  


Bryce Alcock * Performance Architect * SunGard * Asset Arena * 
377 E. Butterfield Road Suite 800, Lombard, IL 60148 * 
Think before you print


-----Original Message-----
From: Cezar Andrei [mailto:cezar.andrei@oracle.com] 
Sent: Wednesday, June 09, 2010 1:02 PM
To: dev@xmlbeans.apache.org
Subject: RE: O(n^2) operation in xmlbeans would like to make O(n log n)

Bryce,

I don't think it's possible to make store.find_element_user() of O(log
N), but if you have the elements in the right order you can try using
the XmlCursor interface to insert new elements in O(1). This way you
avoid searching for the right place to insert a new element.

Cezar

-----Original Message-----
From: Bryce.Alcock@sungard.com [mailto:Bryce.Alcock@sungard.com] 
Sent: Friday, May 07, 2010 1:17 PM
To: dev@xmlbeans.apache.org
Subject: O(n^2) operation in xmlbeans would like to make O(n log n)


POI uses XMLBeans,
This is a problem to help make POI faster:
There has been on going discussion in the dev.at.poi.apache.org list,
However I will include back ground here.
This issue is significant.




Took your suggestion and pulled XMLBeans 2.4 into POI,
it worked but contained the O(n^2) issue.

I have spent some time further investigating this, and it is
fairly simple to analyse and understand why it is O(n^2).

The real question is what can we do.
NOTE:  My line numbers won't match with SVN because I have modified the
code to put
some timers and system outs in place.


Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all
elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is
an Order n operation.

Just making Xobj.find_element_user a O(log n) would completely solve the
problem.



ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so....)
-----------------------------------------------------------
//   Order N  (this is invoked 1 time for each element in sources
(~120,000 records in my case)


for ( i++, j++; i < sources.length; i++, j++)
{
long t1 = System.currentTimeMillis();
    XmlCursor c = sources[i].isImmutable() ? null :
sources[i].newCursor();
getCursorTime += (System.currentTimeMillis()-t1);
long t2=System.currentTimeMillis();
    if (c != null && c.toParent() && c.getObject() == this)
    {
ifCheckTime += (System.currentTimeMillis()-t2);
long t3 = System.currentTimeMillis();
        c.dispose();
long t4 = System.currentTimeMillis();

//  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
//  THIS IS THE INEFFICIENCY in XMLBeans...

        current = store.find_element_user( elemName, j );


findUserElement += (System.currentTimeMillis() - t4);
        if (current == sources[i])
              ; // advance
        else
        {
             // Fall back to the general case
             break;
        }
insideIfTime += (System.currentTimeMillis()-t3);
   }
   else
   {
ifCheckTime += (System.currentTimeMillis()-t2);
       c.dispose();
       // Insert before the current element
       TypeStoreUser user = store.insert_element_user( elemName, j );
       ((XmlObjectBase) user).set( sources[i] );
    }
}





---------------- Method from store.find_element_user(QName,int
i)---------

//  THIS METHOD IS 0(n)
//  but this is called n times by XmlComplexContentImpl.java around line
1153....
public TypeStoreUser find_element_user ( QName name, int i )
 {
     for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
        if (x.isElem() && x._name.equals( name ) && --i < 0)
            return x.getUser();
        return null;
 }




Best regards
Bryce



------------------------------------------------------------------------
------------------

Bryce Alcock * Performance Architect * SunGard *
Asset Arena * 377 E. Butterfield Road Suite 800, Lombard, IL 60148 *
Tel 630-986-3006 *
www.sungard.com/assetarena

........................................................................
..................
http://www.brycealcock.com/
------------------------------------------------------------------------
------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


RE: O(n^2) operation in xmlbeans would like to make O(n log n)

Posted by Br...@sungard.com.
Forwarding from XMLBEANS mailing list to POI mailing list.

Cezar,

Thanks for the response.
I think this may help formulate a strategy going forward.
I am forwarding this to the POI Dev List,
so that the POI Team can read your comments.  


Bryce Alcock * Performance Architect * SunGard * Asset Arena * 
377 E. Butterfield Road Suite 800, Lombard, IL 60148 * 
Think before you print


-----Original Message-----
From: Cezar Andrei [mailto:cezar.andrei@oracle.com] 
Sent: Wednesday, June 09, 2010 1:02 PM
To: dev@xmlbeans.apache.org
Subject: RE: O(n^2) operation in xmlbeans would like to make O(n log n)

Bryce,

I don't think it's possible to make store.find_element_user() of O(log
N), but if you have the elements in the right order you can try using
the XmlCursor interface to insert new elements in O(1). This way you
avoid searching for the right place to insert a new element.

Cezar

-----Original Message-----
From: Bryce.Alcock@sungard.com [mailto:Bryce.Alcock@sungard.com] 
Sent: Friday, May 07, 2010 1:17 PM
To: dev@xmlbeans.apache.org
Subject: O(n^2) operation in xmlbeans would like to make O(n log n)


POI uses XMLBeans,
This is a problem to help make POI faster:
There has been on going discussion in the dev.at.poi.apache.org list,
However I will include back ground here.
This issue is significant.




Took your suggestion and pulled XMLBeans 2.4 into POI,
it worked but contained the O(n^2) issue.

I have spent some time further investigating this, and it is
fairly simple to analyse and understand why it is O(n^2).

The real question is what can we do.
NOTE:  My line numbers won't match with SVN because I have modified the
code to put
some timers and system outs in place.


Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all
elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is
an Order n operation.

Just making Xobj.find_element_user a O(log n) would completely solve the
problem.



ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so....)
-----------------------------------------------------------
//   Order N  (this is invoked 1 time for each element in sources
(~120,000 records in my case)


for ( i++, j++; i < sources.length; i++, j++)
{
long t1 = System.currentTimeMillis();
    XmlCursor c = sources[i].isImmutable() ? null :
sources[i].newCursor();
getCursorTime += (System.currentTimeMillis()-t1);
long t2=System.currentTimeMillis();
    if (c != null && c.toParent() && c.getObject() == this)
    {
ifCheckTime += (System.currentTimeMillis()-t2);
long t3 = System.currentTimeMillis();
        c.dispose();
long t4 = System.currentTimeMillis();

//  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
//  THIS IS THE INEFFICIENCY in XMLBeans...

        current = store.find_element_user( elemName, j );


findUserElement += (System.currentTimeMillis() - t4);
        if (current == sources[i])
              ; // advance
        else
        {
             // Fall back to the general case
             break;
        }
insideIfTime += (System.currentTimeMillis()-t3);
   }
   else
   {
ifCheckTime += (System.currentTimeMillis()-t2);
       c.dispose();
       // Insert before the current element
       TypeStoreUser user = store.insert_element_user( elemName, j );
       ((XmlObjectBase) user).set( sources[i] );
    }
}





---------------- Method from store.find_element_user(QName,int
i)---------

//  THIS METHOD IS 0(n)
//  but this is called n times by XmlComplexContentImpl.java around line
1153....
public TypeStoreUser find_element_user ( QName name, int i )
 {
     for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
        if (x.isElem() && x._name.equals( name ) && --i < 0)
            return x.getUser();
        return null;
 }




Best regards
Bryce



------------------------------------------------------------------------
------------------

Bryce Alcock * Performance Architect * SunGard *
Asset Arena * 377 E. Butterfield Road Suite 800, Lombard, IL 60148 *
Tel 630-986-3006 *
www.sungard.com/assetarena

........................................................................
..................
http://www.brycealcock.com/
------------------------------------------------------------------------
------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


RE: O(n^2) operation in xmlbeans would like to make O(n log n)

Posted by Cezar Andrei <ce...@oracle.com>.
Bryce,

I don't think it's possible to make store.find_element_user() of O(log N), but if you have the elements in the right order you can try using the XmlCursor interface to insert new elements in O(1). This way you avoid searching for the right place to insert a new element.

Cezar

-----Original Message-----
From: Bryce.Alcock@sungard.com [mailto:Bryce.Alcock@sungard.com] 
Sent: Friday, May 07, 2010 1:17 PM
To: dev@xmlbeans.apache.org
Subject: O(n^2) operation in xmlbeans would like to make O(n log n)


POI uses XMLBeans,
This is a problem to help make POI faster:
There has been on going discussion in the dev.at.poi.apache.org list,
However I will include back ground here.
This issue is significant.




Took your suggestion and pulled XMLBeans 2.4 into POI,
it worked but contained the O(n^2) issue.

I have spent some time further investigating this, and it is
fairly simple to analyse and understand why it is O(n^2).

The real question is what can we do.
NOTE:  My line numbers won't match with SVN because I have modified the code to put
some timers and system outs in place.


Just a Quick overview
it seems like XmlComplexContentImpl.arraySetterHelper iterates over all elements in the spread sheet.
then in side of that iteration, the Xobj.find_element_user(Qname,int) is an Order n operation.

Just making Xobj.find_element_user a O(log n) would completely solve the problem.



ANALYSIS:   O(n^2) in XML Beans:   (Around Line 1153 or so....)
-----------------------------------------------------------
//   Order N  (this is invoked 1 time for each element in sources (~120,000 records in my case)


for ( i++, j++; i < sources.length; i++, j++)
{
long t1 = System.currentTimeMillis();
    XmlCursor c = sources[i].isImmutable() ? null : sources[i].newCursor();
getCursorTime += (System.currentTimeMillis()-t1);
long t2=System.currentTimeMillis();
    if (c != null && c.toParent() && c.getObject() == this)
    {
ifCheckTime += (System.currentTimeMillis()-t2);
long t3 = System.currentTimeMillis();
        c.dispose();
long t4 = System.currentTimeMillis();

//  THIS CAUSES O(n^2) SEE CODE BELOW, but this on O(n) for each record.
//  THIS IS THE INEFFICIENCY in XMLBeans...

        current = store.find_element_user( elemName, j );


findUserElement += (System.currentTimeMillis() - t4);
        if (current == sources[i])
              ; // advance
        else
        {
             // Fall back to the general case
             break;
        }
insideIfTime += (System.currentTimeMillis()-t3);
   }
   else
   {
ifCheckTime += (System.currentTimeMillis()-t2);
       c.dispose();
       // Insert before the current element
       TypeStoreUser user = store.insert_element_user( elemName, j );
       ((XmlObjectBase) user).set( sources[i] );
    }
}





---------------- Method from store.find_element_user(QName,int i)---------

//  THIS METHOD IS 0(n)
//  but this is called n times by XmlComplexContentImpl.java around line 1153....
public TypeStoreUser find_element_user ( QName name, int i )
 {
     for ( Xobj x = _firstChild ; x != null ; x = x._nextSibling )
        if (x.isElem() && x._name.equals( name ) && --i < 0)
            return x.getUser();
        return null;
 }




Best regards
Bryce



------------------------------------------------------------------------------------------

Bryce Alcock * Performance Architect * SunGard *
Asset Arena * 377 E. Butterfield Road Suite 800, Lombard, IL 60148 *
Tel 630-986-3006 *
www.sungard.com/assetarena

..........................................................................................
http://www.brycealcock.com/
------------------------------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org