You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Ismael Juma (Jira)" <ji...@apache.org> on 2020/03/09 06:46:00 UTC
[jira] [Commented] (KAFKA-9685) Solve Set concatenation perf issue in AclAuthorizer

    [ https://issues.apache.org/jira/browse/KAFKA-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054702#comment-17054702 ] 

Ismael Juma commented on KAFKA-9685:
------------------------------------

Thanks for the report. Are you planning to submit a pull request? cc [~omkreddy] [~lbradstreet]

> Solve Set concatenation perf issue in AclAuthorizer
> ---------------------------------------------------
>
>                 Key: KAFKA-9685
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9685
>             Project: Kafka
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 1.1.0
>            Reporter: Jiao Zhang
>            Priority: Minor
>
> In version 1.1, [https://github.com/apache/kafka/blob/71b1e19fc60b5e1f9bba33025737ec2b7fb1c2aa/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala#L110]
>  the logic for checking acls is preparing a merged acl Set with
> {code:java}
> acls = getAcls(new Resource(resource.resourceType, Resource.WildCardResource)) ++ getAcls(resource);{code}
> and then pass it as aclMatch's parameter.
>  We found scala's Set ++ operation is very slow for example in the case that the Set on right hand of ++ has more than 100 entries.
>  And the bad performance of ++ is due to iterating every entry of the Set on right hand of ++, in which the calculation of HashCode seems heavy.
>  The performance of 'authorize' is important as each request delivered to broker goes through the logic, that's the reason we can't leave it as-is although the change for this proposal seems trivial.
> Here is the approach. We propose to solve this issue by introducing a new class 'AclSets' which takes multiple Sets as parameters and do 'find' against them one by one.
> {code:java}
> class AclSets(sets: Set[Acl]*){
>   def find(p: Acl => Boolean): Option[Acl] = sets.flatMap(_.find(p)).headOption       
>   def isEmpty: Boolean = !sets.exists(_.nonEmpty) 
> }
> {code}
> This approach avoid the Set ++ operation like following,
> {code:java}
> val acls = new AclSets(getAcls(new Resource(resource.resourceType, Resource.WildCardResource)), getAcls(resource)){code}
> and thus outperforms a lot compared to old logic.
> The benchmark result(we did the test with kafka version 1.1) shows notable difference under the condition:
>  1. set on left consists of 60 entries
>  2. set of right consists of 30 entries
>  3. search for absent entry (so that all entries are iterated)
> Benchmark Results is as following.
> Mode                                                     Cnt    Score         Error   Units
>  ScalaSetConcatination.Set thrpt          3   281.974  ± 140.029  ops/ms
>  ScalaSetConcatination.AclSets thrpt   3   887.426 ± 40.261    ops/ms
> As the upstream also use the similar ++ operation, [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L360]
>  we think it's necessary to fix this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)