Posted to dev@kafka.apache.org by Varun Kumar <va...@hotmail.com> on 2019/06/20 03:50:30 UTC

Partition Reassignment in Cloud

Hi

I have been running a small experiment with partition reassignment in the cloud. Instead of copying data between brokers over the network, I moved the disk between the two brokers and then ran the partition reassignment. This sped up the reassignment significantly, since the new replica only had to catch up on (fetch) the data produced during the downtime.


I tried this experiment on Kafka 2.2.1 and it worked. I validated data consistency using the "kafka-replica-verification.sh" script.

A few more details of the experiment:

  *   Both brokers, the one the partitions are moving from and the one they are moving to, had to be shut down.
  *   All the partitions on the disk are moved to the new broker at once.
  *   I had to update the broker.id property in the meta.properties file of the moved log directory before restarting the broker.
  *   I had to rebalance leaders after the brokers restarted.
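
The meta.properties step above can be scripted. The following is a minimal sketch, assuming the file is a plain key=value properties file containing a broker.id line; the function name set_broker_id and the sample file contents are hypothetical, and it should only be run while the broker is stopped.

```python
import tempfile
from pathlib import Path


def set_broker_id(meta_path, new_broker_id):
    """Rewrite the broker.id entry of a meta.properties file.

    All other lines (comments, version, etc.) are written back unchanged.
    """
    path = Path(meta_path)
    updated = []
    found = False
    for line in path.read_text().splitlines():
        key = line.split("=", 1)[0].strip()
        if not line.lstrip().startswith("#") and key == "broker.id":
            updated.append(f"broker.id={new_broker_id}")
            found = True
        else:
            updated.append(line)
    if not found:
        raise ValueError(f"no broker.id entry found in {path}")
    path.write_text("\n".join(updated) + "\n")


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than a real log directory.
    with tempfile.TemporaryDirectory() as tmp:
        meta = Path(tmp) / "meta.properties"
        meta.write_text("#\nversion=0\nbroker.id=1\n")
        set_broker_id(meta, 2)
        print(meta.read_text())
```

Raising an error when no broker.id line is present is a deliberate choice in this sketch, so that a wrong path or an unexpected file layout is noticed before the broker is restarted.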

Can you please let me know whether this approach will work in production? Is there any scenario where Kafka might truncate/delete all the data on the moved disk and copy the complete partition over the network?

Thanks
Varun



Re: Partition Reassignment in Cloud

Posted by Varun Kumar <va...@hotmail.com>.
The idea is to expand the cluster, i.e. increase the number of brokers and reassign partitions to distribute the load. So at the end of the process both brokers need to be up and running.
I would like to know whether moving a disk/log.dir from one broker to another and updating the metadata (partition assignment) will work in all cases. I tried this in Kafka 2.2 and it worked, but I am not sure whether there are any special cases where it might fail.
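
For illustration, the partition assignment update could be prepared as a JSON plan in the format accepted by kafka-reassign-partitions.sh (--reassignment-json-file). This is only a sketch: the function name build_reassignment, the topic name "events" and the broker ids are hypothetical, and the input is assumed to list just the partitions that live on the moved disk.

```python
import json


def build_reassignment(current, old_broker, new_broker):
    """Build a reassignment plan that swaps old_broker for new_broker.

    `current` maps (topic, partition) to the present replica list.
    Partitions without a replica on old_broker are left out of the plan.
    """
    partitions = []
    for (topic, partition), replicas in sorted(current.items()):
        if old_broker not in replicas:
            continue
        if new_broker in replicas:
            raise ValueError(
                f"{topic}-{partition} already has a replica on broker {new_broker}"
            )
        # Keep replica order so the preferred leader position is preserved.
        new_replicas = [new_broker if b == old_broker else b for b in replicas]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": new_replicas}
        )
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical assignment: broker 1 owns the disk being moved to broker 4.
    current = {
        ("events", 0): [1, 2],
        ("events", 1): [2, 3],
        ("events", 2): [3, 1],
    }
    plan = build_reassignment(current, old_broker=1, new_broker=4)
    print(json.dumps(plan, indent=2))
```

The sketch only produces the plan; whether the broker reuses the data already on the moved disk or fetches it again is exactly the open question in this thread.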
________________________________
From: George Li <sq...@yahoo.com.INVALID>
Sent: Friday, June 21, 2019 2:17 AM
To: dev
Subject: Re: Partition Reassignment in Cloud

The new broker host's meta.properties file can have broker.id set to the original broker_id (with the original host shut down/decommissioned), and the new host has the storage of the original host (either by copying or by changing the network storage mount from the original host to the new one). This way, it saves the time of running reassignments to change old_broker_id => new_broker_id?




Re: Partition Reassignment in Cloud

Posted by George Li <sq...@yahoo.com.INVALID>.
The new broker host's meta.properties file can have broker.id set to the original broker_id (with the original host shut down/decommissioned), and the new host has the storage of the original host (either by copying or by changing the network storage mount from the original host to the new one). This way, it saves the time of running reassignments to change old_broker_id => new_broker_id?
