You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Adrian Nistor (JIRA)" <ji...@apache.org> on 2012/06/20 22:45:43 UTC

[jira] [Updated] (COLLECTIONS-412) CollectionUtils.subtract() is very slow

     [ https://issues.apache.org/jira/browse/COLLECTIONS-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated COLLECTIONS-412:
--------------------------------------

    Attachment: Test.java
                patch.diff
    
> CollectionUtils.subtract() is very slow
> ---------------------------------------
>
>                 Key: COLLECTIONS-412
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-412
>             Project: Commons Collections
>          Issue Type: Bug
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in CollectionUtils.subtract().
> It appears in version 3.2.1 and also in revision 1352300 (20 June
> 2012).  I attached a test that exposes this problem and a patch that
> fixes it.  On my machine, for this test, the patch provides a 204X
> speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 11036
> The output for the patched version is:
> Time is 54
> The root cause of this problem is similar to the root cause of the
> previously fixed COLLECTIONS-406 in ListUtils.subtract(), i.e.,
> quadratic complexity instead of linear complexity.  This problem
> affects two methods:
> CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b)
> and
> CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b, final Predicate<O> p)
> because the former just calls the latter.
> Currently, the code for
> "CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b, final Predicate<O> p)"
> is:
> ArrayList<O> list = new ArrayList<O>();
> addAll(list, a);
> for (O element : b) {
>     if (p.evaluate(element)) {
>         list.remove(element);
>     }
> }
> which is quadratic, because "list.remove(element)" has linear
> complexity for "ArrayList<O> list = new ArrayList<O>()".
> The attached patch makes the remove() be constant complexity by
> removing from an org.apache.commons.collections.bag.HashBag.  The
> attached patch is very similar to the patch of COLLECTIONS-406, so I
> assume the risk of applying this patch is minimal.  Just like in the
> patch for COLLECTIONS-406, this patch uses a HashBag (and not a
> HashSet) to respect cardinality when there are repeated objects.
> Can you please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira