Posted to issues@commons.apache.org by "Adrian Nistor (JIRA)" <ji...@apache.org> on 2012/06/30 00:16:44 UTC
[jira] [Created] (COLLECTIONS-418) ListUtils.retainAll() is very slow
Adrian Nistor created COLLECTIONS-418:
-----------------------------------------
Summary: ListUtils.retainAll() is very slow
Key: COLLECTIONS-418
URL: https://issues.apache.org/jira/browse/COLLECTIONS-418
Project: Commons Collections
Issue Type: Bug
Affects Versions: 3.2.1
Environment: java 1.6.0_24
Ubuntu 11.10
Reporter: Adrian Nistor
Hi,
I am encountering a performance problem in ListUtils.retainAll(). It
appears in version 3.2.1 and also in revision 1355448. I attached a
test that exposes this problem and a one-line patch that fixes it. On
my machine, for this test, the patch provides a 238X speedup.
To run the test, just do:
$ java Test
The output for the un-patched version is:
Time is 5485
The output for the patched version is:
Time is 23
As the patch shows, the problem is that
"ListUtils.retainAll(Collection<E> collection, Collection<?> retain)"
performs "retain.contains(obj)" for each element in "collection",
which can be very expensive if "retain.contains(obj)" is expensive,
e.g., when "retain" is a list.
The one-line patch I attached puts the elements of "retain" in a
HashSet (which has very fast "contains()"), if "retain" is not already
a set:
"if (!(retain instanceof java.util.Set<?>)) retain = new HashSet<Object>(retain);"
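To make the described change concrete, here is a minimal, self-contained sketch. It is not the attached patch.diff or the actual Commons Collections source: the class RetainAllSketch and the method body are assumptions, reconstructed only from the description above (one "retain.contains(obj)" call per element of "collection", plus the quoted one-line conversion).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RetainAllSketch {

    // Hypothetical stand-in for the method discussed in this issue: returns the
    // elements of "collection" that are also contained in "retain", in order.
    static <E> List<E> retainAll(Collection<E> collection, Collection<?> retain) {
        // The proposed one-line change: copy "retain" into a HashSet unless it
        // is already a Set, so each contains() below is a hash lookup.
        if (!(retain instanceof Set<?>)) {
            retain = new HashSet<Object>(retain);
        }

        List<E> result = new ArrayList<E>(Math.min(collection.size(), retain.size()));
        for (E obj : collection) {
            // Without the conversion above, this call scans the whole of
            // "retain" when "retain" is a list.
            if (retain.contains(obj)) {
                result.add(obj);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> collection = Arrays.asList(1, 2, 3, 4, 5, 2);
        List<Integer> retain = Arrays.asList(2, 4, 6);
        System.out.println(retainAll(collection, retain)); // prints [2, 4, 2]
    }
}
```

Two trade-offs of this approach are worth keeping in mind: the copy needs extra memory proportional to the size of "retain", and a HashSet relies on equals() and hashCode() of the elements, so results could differ for element types where those two methods are not consistent with each other.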
Is this a bug, or am I misunderstanding the intended behavior? If it is
a bug, can you please confirm that the patch is correct?
Thanks,
Adrian
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (COLLECTIONS-418) ListUtils.retainAll() is very slow
Posted by "Adrian Nistor (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/COLLECTIONS-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrian Nistor updated COLLECTIONS-418:
--------------------------------------
Attachment: Test.java
patch.diff
[jira] [Resolved] (COLLECTIONS-418) ListUtils.retainAll() is very slow
Posted by "Thomas Neidhart (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/COLLECTIONS-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Neidhart resolved COLLECTIONS-418.
-----------------------------------------
Resolution: Won't Fix
Added a clarification to the javadoc on the runtime complexity of the method. Users should pass the elements to be retained in a data structure that supports a fast implementation of contains.
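With this resolution the conversion becomes the caller's job. A rough sketch of what that might look like follows; CallerSideRetain and retainAllUnpatched are hypothetical names, and the method is only a stand-in with the behavior described in the issue, not the library code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CallerSideRetain {

    // Hypothetical stand-in for an unpatched retainAll: one contains() call
    // on "retain" per element of "collection", with no internal conversion.
    static <E> List<E> retainAllUnpatched(Collection<E> collection, Collection<?> retain) {
        List<E> result = new ArrayList<E>();
        for (E obj : collection) {
            if (retain.contains(obj)) {
                result.add(obj);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> collection = Arrays.asList("a", "b", "c", "d");
        List<String> retainList = Arrays.asList("d", "b");

        // The caller converts the list to a set once, before the call, so
        // every contains() inside the method is a hash lookup.
        Set<String> retainSet = new HashSet<String>(retainList);

        System.out.println(retainAllUnpatched(collection, retainSet)); // prints [b, d]
    }
}
```

Leaving the choice to the caller avoids an unconditional copy when "retain" is small or when it is already a collection with a fast contains().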