Posted to dev@kafka.apache.org by "Marouane RAJI (JIRA)" <ji...@apache.org> on 2019/08/13 09:32:00 UTC

[jira] [Created] (KAFKA-8796) A broker joining the cluster should be able to replicate without impacting the cluster

Marouane RAJI created KAFKA-8796:
------------------------------------

             Summary: A broker joining the cluster should be able to replicate without impacting the cluster
                 Key: KAFKA-8796
                 URL: https://issues.apache.org/jira/browse/KAFKA-8796
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 1.1.0
            Reporter: Marouane RAJI
         Attachments: image-2019-08-13-10-26-19-282.png, image-2019-08-13-10-28-42-337.png

Hi, 

We run a 50-broker cluster on AWS, peaking at about 1.4M msgs/sec. We were using m4.2xlarge instances and are now moving to m5.2xlarge. Every time we replace a broker from scratch (the EBS volumes are tied to the EC2 instance), the bytes sent on the replaced broker increase significantly, and that seems to impact the whole cluster, increasing produce and fetch times.

This is our configuration per broker :

 

 
{code:java}
broker.id=11

############################# Socket Server Settings #############################
# The port the socket server listens on
port=9092
advertised.host.name=ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com
# The number of threads handling network requests
num.network.threads=32
# The number of threads doing disk I/O
num.io.threads=16
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
# The max time a connection can be idle
connections.max.idle.ms=60000

num.partitions=2
default.replication.factor=2
auto.leader.rebalance.enable=true
delete.topic.enable=true
compression.type=producer
log.message.format.version=0.9.0.1
message.max.bytes=8000000

# The minimum age of a log file to be eligible for deletion
log.retention.hours=48
log.retention.bytes=3000000000
log.segment.bytes=268435456
log.retention.check.interval.ms=60000
log.cleaner.enable=true
log.cleaner.dedupe.buffer.size=268435456

replica.fetch.max.bytes=8388608
replica.fetch.wait.max.ms=500
replica.lag.time.max.ms=10000
num.replica.fetchers=3

# Auto creation of topics on the server
auto.create.topics.enable=true
controlled.shutdown.enable=true
inter.broker.protocol.version=0.10.2
unclean.leader.election.enable=true
{code}
 

This is what we notice during replication: a sharp increase in bytes received on the replaced broker.

 

!image-2019-08-13-10-26-19-282.png!

!image-2019-08-13-10-28-42-337.png!

You can't see it in the graphs above, but the increase in produce time stayed high for about 20 minutes.

We didn't see anything out of the ordinary in the logs.

Please let us know if there is anything wrong in our configuration, or if this is a potential issue in Kafka that needs fixing.
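For what it's worth, one mitigation we could try while waiting for feedback is the replication quota mechanism from KIP-73, which caps catch-up traffic so a rebuilding broker does not saturate its peers. A rough sketch only, not something we have validated on this cluster; the ZooKeeper address, the 10 MB/s rate, and the choice of broker id 11 are placeholders:

{code:java}
# Sketch: cap replication traffic while the replaced broker (id 11) rebuilds.
# Rates are in bytes/sec. The leader-side throttle applies to the brokers
# serving the fetches; the follower-side throttle applies to broker 11 itself.
bin/kafka-configs.sh --zookeeper zk-host:2181 --alter \
  --entity-type brokers --entity-name 11 \
  --add-config 'leader.replication.throttled.rate=10000000,follower.replication.throttled.rate=10000000'

# Remove the throttle once the broker is back in the ISR, otherwise it will
# also slow down normal replication:
bin/kafka-configs.sh --zookeeper zk-host:2181 --alter \
  --entity-type brokers --entity-name 11 \
  --delete-config 'leader.replication.throttled.rate,follower.replication.throttled.rate'
{code}

Note that the rate alone is not enough: the throttle only applies to replicas listed in the topic-level leader.replication.throttled.replicas / follower.replication.throttled.replicas configs (kafka-reassign-partitions.sh --throttle sets these automatically during reassignments, but a from-scratch broker replacement would need them set explicitly).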

Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)