You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Bruno P. Kinoshita (Jira)" <ji...@apache.org> on 2020/01/12 00:47:00 UTC

[jira] [Comment Edited] (COLLECTIONS-588) Redundant computation in CollectionBag and CollectionSortedBag

    [ https://issues.apache.org/jira/browse/COLLECTIONS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013625#comment-17013625 ] 

Bruno P. Kinoshita edited comment on COLLECTIONS-588 at 1/12/20 12:46 AM:
--------------------------------------------------------------------------

The code you provided definitely runs much faster with your patch [~monikadhok] . But I think the javadocs already states that the containsAll method will be faster if the Collection provided is a Set

{quote}{color:#212121}This implementation iterates over the elements of this bag, checking each element in turn to see if it's contained in {color}{{coll}}{color:#212121}. If it's not contained, it's removed from this bag. As a consequence, it is advised to use a collection type for {color}{{coll}}{color:#212121} that provides a fast (e.g. O(1)) implementation of {color}{{[Collection.contains(Object)|eclipse-javadoc:%E2%98%82=commons-collections/src%5C/main%5C/java=/optional=/true=/=/maven.pomderived=/true=/%3Corg.apache.commons.collections4.bag%7BCollectionBag.java%E2%98%83CollectionBag~retainAll~QCollection%5C%3C*%3E;%E2%98%82Collection%E2%98%82contains%E2%98%82Object]}}{color:#212121}.\{quote}{color}

I think it should be fine to let the user provide a collection that it responsible for the .contains. So that if there is any special logic in the implementation, it would still work — even if not as performant as with a Set.

 


was (Author: kinow):
The code you provided definitely runs much faster with your patch [~monikadhok] . But I think the javadocs already states that the containsAll method will be faster if the Collection provided is a Set

>{color:#212121}This implementation iterates over the elements of this bag, checking each element in turn to see if it's contained in {color}{{coll}}{color:#212121}. If it's not contained, it's removed from this bag. As a consequence, it is advised to use a collection type for {color}{{coll}}{color:#212121} that provides a fast (e.g. O(1)) implementation of {color}{{[Collection.contains(Object)|eclipse-javadoc:%E2%98%82=commons-collections/src%5C/main%5C/java=/optional=/true=/=/maven.pomderived=/true=/%3Corg.apache.commons.collections4.bag%7BCollectionBag.java%E2%98%83CollectionBag~retainAll~QCollection%5C%3C*%3E;%E2%98%82Collection%E2%98%82contains%E2%98%82Object]}}{color:#212121}.{color}

I think it should be fine to let the user provide a collection that it responsible for the .contains. So that if there is any special logic in the implementation, it would still work — even if not as performant as with a Set.

 

> Redundant computation in CollectionBag and CollectionSortedBag
> --------------------------------------------------------------
>
>                 Key: COLLECTIONS-588
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-588
>             Project: Commons Collections
>          Issue Type: Bug
>            Reporter: Monika Dhok
>            Priority: Major
>         Attachments: Test1.java, file.patch
>
>
> There appears to be redundant computations in "CollectionBag.retainAll"
> method in the version 4.4.1. I have attached a test and proposed a small 
> patch which ensures that "contains" method is called on hashset of
> input collection. This patch gives 892X speed up on my 
> machine for the provided test.
> Similar patches can be applied for, "CollectionSortedBag.retainAll".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)