You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Thomas Neidhart (JIRA)" <ji...@apache.org> on 2012/06/23 13:34:42 UTC

[jira] [Resolved] (COLLECTIONS-412) CollectionUtils.subtract() is very slow

     [ https://issues.apache.org/jira/browse/COLLECTIONS-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart resolved COLLECTIONS-412.
-----------------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

Fixed in r1353111.

I did not apply the patch, but rather changed the method in a similar way as outlined in the patch. Instead of evaluating the predicate on all elements of A, only elements of B that satisfy the predicate are added to a bag. The rest is similar to the patch.

Also added more information in the javadoc how the resulting collection will look like depending on the predicate evaluation.

Thanks for reporting and the patch!
                
> CollectionUtils.subtract() is very slow
> ---------------------------------------
>
>                 Key: COLLECTIONS-412
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-412
>             Project: Commons Collections
>          Issue Type: Bug
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>             Fix For: 4.0
>
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in CollectionUtils.subtract().
> It appears in version 3.2.1 and also in revision 1352300 (20 June
> 2012).  I attached a test that exposes this problem and a patch that
> fixes it.  On my machine, for this test, the patch provides a 204X
> speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 11036
> The output for the patched version is:
> Time is 54
> The root cause of this problem is similar to the root cause of the
> previously fixed COLLECTIONS-406 in ListUtils.subtract(), i.e.,
> quadratic complexity instead of linear complexity.  This problem
> affects two methods:
> CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b)
> and
> CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b, final Predicate<O> p)
> because the former just calls the latter.
> Currently, the code for
> "CollectionUtils.subtract(final Iterable<? extends O> a, final Iterable<? extends O> b, final Predicate<O> p)"
> is:
> ArrayList<O> list = new ArrayList<O>();
> addAll(list, a);
> for (O element : b) {
>     if (p.evaluate(element)) {
>         list.remove(element);
>     }
> }
> which is quadratic, because "list.remove(element)" has linear
> complexity for "ArrayList<O> list = new ArrayList<O>()".
> The attached patch makes the remove() be constant complexity by
> removing from an org.apache.commons.collections.bag.HashBag.  The
> attached patch is very similar to the patch of COLLECTIONS-406, so I
> assume the risk of applying this patch is minimal.  Just like in the
> patch for COLLECTIONS-406, this patch uses a HashBag (and not a
> HashSet) to respect cardinality when there are repeated objects.
> Can you please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira