You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2012/06/19 21:34:42 UTC
[jira] [Commented] (COLLECTIONS-408) performance problem in
SetUniqueList.removeAll()
[ https://issues.apache.org/jira/browse/COLLECTIONS-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397003#comment-13397003 ]
Hudson commented on COLLECTIONS-408:
------------------------------------
Integrated in commons-collections #22 (See [https://builds.apache.org/job/commons-collections/22/])
[COLLECTIONS-408] improve performance of remove and removeAll methods, add missing javadoc. Thanks to Adrian Nistor for reporting. (Revision 1351804)
Result = SUCCESS
tn : http://svn.apache.org/viewvc/?view=rev&rev=1351804
Files :
* /commons/proper/collections/trunk/src/main/java/org/apache/commons/collections/list/SetUniqueList.java
> performance problem in SetUniqueList.removeAll()
> ------------------------------------------------
>
> Key: COLLECTIONS-408
> URL: https://issues.apache.org/jira/browse/COLLECTIONS-408
> Project: Commons Collections
> Issue Type: Bug
> Environment: java 1.6.0_24
> Ubuntu 11.10
> Reporter: Adrian Nistor
> Fix For: 4.0
>
> Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in SetUniqueList.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012). I have attached a test that exposes this problem and a
> one-line patch that fixes it. The patch makes the code two times
> faster for this test.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is: 5027
> The output for the patched version is:
> Time is: 2554
> The one-line patch I attached changes the
> SetUniqueList.removeAll(Collection<?> coll) code from:
> boolean result = super.removeAll(coll);
> set.removeAll(coll);
> return result;
> to:
> boolean result = super.removeAll(coll);
> if (result) set.removeAll(coll);
> return result;
> If "super.removeAll(coll)" did not change the collection, there is no
> need to call "set.removeAll(coll)", because we already know there is
> nothing to remove.
> As one may expect "set.removeAll(coll)" (on a set) to be faster than
> "super.removeAll(coll)" (on a list), one may have expected the speedup
> gained by avoiding "set.removeAll(coll)" to be smaller than 2X
> achieved for the attached test. However, the speedup is 2X because
> "java.util.HashSet.removeAll(Collection<?> collection)" has quadratic
> (not linear) complexity if "this.size() <= collection.size()" and the
> "collection" is a list. Thus, "set.removeAll(coll)" is about as slow
> as "super.removeAll(coll)" in this case, and not executing
> "set.removeAll(coll)" reduces the work done by half. The quadratic
> behavior of "java.util.HashSet.removeAll(Collection<?> collection)"
> comes from "java.util.AbstractSet.removeAll(Collection<?> c)" and is
> discussed for example here:
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-July/007148.html
> (The link is for OpenJDK, but Oracle JDK has the same problem.)
> In many other cases "set.removeAll(coll)" is actually faster than
> "super.removeAll(coll)", so one can get even more speedup by
> reordering those two checks:
> boolean result = set.removeAll(coll);
> if (result) super.removeAll(coll);
> return result;
> Is this a bug, or am I misunderstanding the intended behavior? If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira