Posted to dev@kafka.apache.org by "Manikumar (JIRA)" <ji...@apache.org> on 2018/06/12 16:25:00 UTC
[jira] [Resolved] (KAFKA-2550) [Kafka][0.8.2.1][Performance]When there are a lot of partition under a Topic, there are serious performance degradation.

     [ https://issues.apache.org/jira/browse/KAFKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar resolved KAFKA-2550.
------------------------------
    Resolution: Auto Closed

Closing inactive issue. Old clients are deprecated. Please reopen if you think the issue still exists in newer versions.
 

> [Kafka][0.8.2.1][Performance]When there are a lot of partition under a Topic, there are serious performance degradation.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2550
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2550
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer, producer 
>    Affects Versions: 0.8.2.1
>            Reporter: yanwei
>            Assignee: Neha Narkhede
>            Priority: Major
>
> Because of a business need to create a large number of partitions, I tested how many partitions can be supported.
> I found that when there are many partitions under one topic, performance degrades seriously.
> My analysis shows that, in addition to the hard disk, the client is also a bottleneck.
> I profiled with JProfiler while producing and consuming 1,000,000 messages (message size: 500 bytes).
> 1. Consumer high-level API (I could not upload the pictures):
>      ZookeeperConsumerConnector.scala-->rebalance
> -->val assignmentContext = new AssignmentContext(group, consumerIdString, config.excludeInternalTopics, zkClient)
> -->ZkUtils.getPartitionsForTopics(zkClient, myTopicThreadIds.keySet.toSeq)
> -->getPartitionAssignmentForTopics
> -->Json.parseFull(jsonPartitionMap) 
>      1) one topic, 400 partitions:
>          JProfiler: 48.6% CPU run time
>      2) one topic, 3000 partitions:
>          JProfiler: 97.8% CPU run time
>   The file (jsonPartitionMap) may be so large that parsing it is very slow.
>   But this function is executed only once, so the problem should not be too serious.
> 2. Producer Scala API:
>     BrokerPartitionInfo.scala--->getBrokerPartitionInfo:
>     partitionMetadata.map { m =>
>       m.leader match {
>         case Some(leader) =>
>           //y00163442 delete log print
>           debug("Partition [%s,%d] has leader %d".format(topic, m.partitionId, leader.id))
>           new PartitionAndLeader(topic, m.partitionId, Some(leader.id))
>         case None =>
>           //y00163442 delete log print
>           //debug("Partition [%s,%d] does not have a leader yet".format(topic, m.partitionId))
>           new PartitionAndLeader(topic, m.partitionId, None)
>       }
>     }.sortWith((s, t) => s.partitionId < t.partitionId) 
>          
>       When the number of partitions is greater than 25, the 'format' function accounts for 44.8% of CPU run time.
>       Nearly half of the time is spent in the format function. Whether or not log printing is enabled, this format is executed. This reduced the TPS by a factor of five (25000 ---> 5000).
>       
> 3. Producer Java client (clients module):
>       function: org.apache.kafka.clients.producer.KafkaProducer.send
>       I found that the CPU run time of 'send' rises with the number of partitions; with 5000 partitions, its CPU run time is 60.8%.
>       The CPU, memory, disk, and network on the Kafka broker side did not reach a bottleneck, and the results are similar whether request.required.acks is set to 0 or 1, so I suspect there may be some bottleneck in send.
>       
> Unfortunately, uploading the pictures did not succeed, so the results cannot be shown.
> In my tests on a single server, a single hard disk can support 1000 partitions and 7 hard disks can support 3000 partitions. If the client bottleneck can be solved, I estimate that seven hard disks can support more partitions.
> In an actual production configuration, the partitions would probably be spread across more than one topic, so things could be better.
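
The cost described in item 2 (a message being formatted even when the log level would discard it) can be illustrated with a minimal, self-contained sketch. This is not Kafka code: it uses java.util.logging as a stand-in for Kafka's logging helper, and the class name LogFormatCost, the method names, and the topic name "test-topic" are hypothetical. Whether the 0.8.2.1 producer actually formats eagerly depends on how its debug method takes its argument, which this sketch does not verify; it only shows the difference between the two styles.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class; java.util.logging stands in for Kafka's logging helper.
final class LogFormatCost {
    // Counts how many times a message string was actually built.
    static int formatCalls = 0;

    static String describe(String topic, int partitionId, int leaderId) {
        formatCalls++;
        return String.format("Partition [%s,%d] has leader %d", topic, partitionId, leaderId);
    }

    // Eager style: the string is built before the logger checks its level.
    static int eager(Logger log, int partitions) {
        formatCalls = 0;
        for (int p = 0; p < partitions; p++) {
            log.fine(describe("test-topic", p, 1));
        }
        return formatCalls;
    }

    // Lazy style: the Supplier is invoked only if FINE is enabled.
    static int lazy(Logger log, int partitions) {
        formatCalls = 0;
        for (int p = 0; p < partitions; p++) {
            final int id = p;
            Supplier<String> message = () -> describe("test-topic", id, 1);
            log.fine(message);
        }
        return formatCalls;
    }

    // Builds a logger at the given level that does not print to the console.
    static Logger loggerAt(Level level) {
        Logger log = Logger.getLogger("LogFormatCost." + level.getName());
        log.setUseParentHandlers(false);
        log.setLevel(level);
        return log;
    }

    public static void main(String[] args) {
        Logger info = loggerAt(Level.INFO); // FINE (debug-level) output is disabled
        System.out.println("eager formats: " + eager(info, 3000));
        System.out.println("lazy formats:  " + lazy(info, 3000));
    }
}
```

With debug-level output disabled, the eager variant still builds one string per partition, while the lazy variant builds none. An explicit level check around the log call would have the same effect as the Supplier.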



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)