You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Adrian Nistor (JIRA)" <ji...@apache.org> on 2012/06/01 19:56:23 UTC

[jira] [Created] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Adrian Nistor created COLLECTIONS-407:
-----------------------------------------

             Summary: ListOrderedSet.removeAll() is slow
                 Key: COLLECTIONS-407
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
             Project: Commons Collections
          Issue Type: Bug
    Affects Versions: 3.2.1
         Environment: java 1.6.0_24
Ubuntu 11.10
            Reporter: Adrian Nistor
         Attachments: Test.java, patch.diff

Hi,

I am encountering a performance problem in ListOrderedSet.removeAll().
It appears in version 3.2.1 and also in revision 1344775 (31 May
2012).  I have attached a test that exposes this problem and a
one-line patch that fixes it.  On my machine, for this test, the patch
provides a 317X speedup.

To run the test, just do:

$ java Test

The output for the un-patched version is:
Time is 3812

The output for the patched version is:
Time is 12

As the patch shows, the problem is in 
ListOrderedSet.remove(Object object).  The code for this method is:

boolean result = collection.remove(object);
setOrder.remove(object);
return result;

The patch changes it to :

boolean result = collection.remove(object);
if (result) setOrder.remove(object);
return result;

The "setOrder.remove(object)" is not necessary if 
"collection.remove(object)" did not find the object.

This small change speeds up
ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
over all the elements in "coll" and calls 
ListOrderedSet.remove(Object object).  So the un-patched code has
quadratic complexity, while the patched code has linear complexity.

Is this truly a bug, or am I missing something here?  If so, can you
please confirm if the patch is correct?

Thanks,

Adrian


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Adrian Nistor (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated COLLECTIONS-407:
--------------------------------------

    Attachment: Test.java
                patch.diff
    
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396990#comment-13396990 ] 

Hudson commented on COLLECTIONS-407:
------------------------------------

Integrated in commons-collections #21 (See [https://builds.apache.org/job/commons-collections/21/])
    [COLLECTIONS-407] improve performance of remove method by taking method result from underlying collection into account. Thanks to Adrian Nistor for reporting and providing a patch. (Revision 1351800)

     Result = SUCCESS
tn : http://svn.apache.org/viewvc/?view=rev&rev=1351800
Files : 
* /commons/proper/collections/trunk/src/main/java/org/apache/commons/collections/set/ListOrderedSet.java

                
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>             Fix For: 4.0
>
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396899#comment-13396899 ] 

Thomas Neidhart commented on COLLECTIONS-407:
---------------------------------------------

You are right, in fact other methods (in ListOrderedSet, e.g. retainAll) already use similar logic as proposed in your patch, so I guess it is safe to do it also for the remove method.
                
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Adrian Nistor (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396860#comment-13396860 ] 

Adrian Nistor commented on COLLECTIONS-407:
-------------------------------------------

Hi Thomas,

Thanks, it's great that the patch works.

java.util, Apache Commons Collections, Guava Libraries, all use on a
regular basis methods from the collections passed as parameters (over
which these three libraries have no control), just like in this case.
It's the normal thing to do, there must be hundreds of such examples,
and nobody tries to avoid them.

In fact, how can one use the collections passes as parameters
otherwise?  For example, the implementation of ListOrderedSet alone
(not to mention the other classes in Commons Collections, java.util,
and Guava Libraries) uses methods from "collection" 9 different times,
including the return value of "collection.retainAll" in the
implementation of "ListOrderedSet.retainAll(Collection<?> coll)".  It
would not be possible to implement Commons Collections otherwise.

> a bit slower

Quadratic vs linear complexity is a big difference.  This can get
really slow (e.g., two orders of magnitude) even for medium size data
sets.

Best,

Adrian


                
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart resolved COLLECTIONS-407.
-----------------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

Applied patch in r1351800.

Thanks for reporting!
                
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>             Fix For: 4.0
>
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COLLECTIONS-407) ListOrderedSet.removeAll() is slow

Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COLLECTIONS-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396574#comment-13396574 ] 

Thomas Neidhart commented on COLLECTIONS-407:
---------------------------------------------

The patch looks correct. The only thing that comes to my mind is that there may be collections which wrongly implement the methods in question and do not correctly return the change status.

So the way it is implemented right now is a bit slower but will work in all cases.
                
> ListOrderedSet.removeAll() is slow
> ----------------------------------
>
>                 Key: COLLECTIONS-407
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-407
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  On my machine, for this test, the patch
> provides a 317X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3812
> The output for the patched version is:
> Time is 12
> As the patch shows, the problem is in 
> ListOrderedSet.remove(Object object).  The code for this method is:
> boolean result = collection.remove(object);
> setOrder.remove(object);
> return result;
> The patch changes it to :
> boolean result = collection.remove(object);
> if (result) setOrder.remove(object);
> return result;
> The "setOrder.remove(object)" is not necessary if 
> "collection.remove(object)" did not find the object.
> This small change speeds up
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) because
> ListOrderedSet.ListOrderedSet.removeAll(Collection<?> coll) iterates
> over all the elements in "coll" and calls 
> ListOrderedSet.remove(Object object).  So the un-patched code has
> quadratic complexity, while the patched code has linear complexity.
> Is this truly a bug, or am I missing something here?  If so, can you
> please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira