You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Thomas Neidhart (JIRA)" <ji...@apache.org> on 2015/11/27 22:12:13 UTC

[jira] [Closed] (COLLECTIONS-545) Undocumented performance issue in the removeAll method in CollectionUtils

     [ https://issues.apache.org/jira/browse/COLLECTIONS-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart closed COLLECTIONS-545.
---------------------------------------

> Undocumented performance issue in the removeAll method in CollectionUtils
> -------------------------------------------------------------------------
>
>                 Key: COLLECTIONS-545
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-545
>             Project: Commons Collections
>          Issue Type: Bug
>          Components: Collection
>    Affects Versions: 4.0
>         Environment: Ubuntu 14.04
>            Reporter: Oswaldo Olivo
>            Priority: Trivial
>              Labels: Collections, documentaion, performance
>             Fix For: 4.1
>
>
> This bug is analogous to https://issues.apache.org/jira/browse/COLLECTIONS-544
> The method removeAll in CollectionUtils is inefficient when the second parameter collection has a slow containment method.
> The following is the current implementation with its documentation:
> ============================
>      /**
>      * Removes the elements in <code>remove</code> from <code>collection</code>. That is, this
>      * method returns a collection containing all the elements in <code>c</code>
>      * that are not in <code>remove</code>. The cardinality of an element <code>e</code>
>      * in the returned collection is the same as the cardinality of <code>e</code>
>      * in <code>collection</code> unless <code>remove</code> contains <code>e</code>, in which
>      * case the cardinality is zero. This method is useful if you do not wish to modify
>      * the collection <code>c</code> and thus cannot call <code>collection.removeAll(remove);</code>.
>      *
>      * @param <E>  the type of object the {@link Collection} contains
>      * @param collection  the collection from which items are removed (in the returned collection)
>      * @param remove  the items to be removed from the returned <code>collection</code>
>      * @return a <code>Collection</code> containing all the elements of <code>collection</code> except
>      * any elements that also occur in <code>remove</code>.
>      * @throws NullPointerException if either parameter is null
>      * @since 4.0 (method existed in 3.2 but was completely broken)
>      */
>     public static <E> Collection<E> removeAll(final Collection<E> collection, final Collection<?> remove) {
>         return ListUtils.removeAll(collection, remove);
>     }
> =======================================
> We can notice the inefficiency by looking at the removeAll method in ListUtils.
> The removeAll method from ListUtils is implemented and documented as follows:
> =======================================
>      /**
>      * Removes the elements in <code>remove</code> from <code>collection</code>. That is, this
>      * method returns a list containing all the elements in <code>collection</code>
>      * that are not in <code>remove</code>. The cardinality of an element <code>e</code>
>      * in the returned collection is the same as the cardinality of <code>e</code>
>      * in <code>collection</code> unless <code>remove</code> contains <code>e</code>, in which
>      * case the cardinality is zero. This method is useful if you do not wish to modify
>      * <code>collection</code> and thus cannot call <code>collection.removeAll(remove);</code>.
>      * <p>
>      * This implementation iterates over <code>collection</code>, checking each element in
>      * turn to see if it's contained in <code>remove</code>. If it's not contained, it's added
>      * to the returned list. As a consequence, it is advised to use a collection type for
>      * <code>remove</code> that provides a fast (e.g. O(1)) implementation of
>      * {@link Collection#contains(Object)}.
>      *
>      * @param <E>  the element type
>      * @param collection  the collection from which items are removed (in the returned collection)
>      * @param remove  the items to be removed from the returned <code>collection</code>
>      * @return a <code>List</code> containing all the elements of <code>c</code> except
>      * any elements that also occur in <code>remove</code>.
>      * @throws NullPointerException if either parameter is null
>      * @since 3.2
>      */
>     public static <E> List<E> removeAll(final Collection<E> collection, final Collection<?> remove) {
>         final List<E> list = new ArrayList<E>();
>         for (final E obj : collection) {
>             if (!remove.contains(obj)) {
>                 list.add(obj);
>             }
>         }
>         return list;
>     }
> =======================================
> In the case of ListUtils:removeAll, the inefficiency is properly documented.
> Perhaps the disclaimer about potential inefficiencies depending on the type 
> of the parameter collection in ListUtils:removeAll should also be included in CollectionUtils:removeAll.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)