Posted to user@zookeeper.apache.org by "Vazhenin, Maksim" <Ma...@dell.com> on 2016/10/21 08:11:26 UTC

rolling upgrade with wiped disk

Hi,

Is this a supported scenario for a rolling upgrade of ZooKeeper (v3.4.5) to a later version (say v3.4.9)?

1. Shut down server A (v3.4.5)
2. Wipe the disk holding the ZooKeeper data on server A
3. Start server A with the new zk version (v3.4.9)
4. Wait until reconstruction completes on server A (what is the indicator for completion?)
5. Go to server B and repeat 1-4.

Thanks,
Maksim

RE: rolling upgrade with wiped disk

Posted by "Vazhenin, Maksim" <Ma...@dell.com>.
Thanks a lot for the information, Flavio.

-Maksim


-------- Original message --------
From: Flavio Junqueira <fp...@apache.org>
Date: 21.10.2016 16:59 (GMT+03:00)
To: user@zookeeper.apache.org
Subject: Re: rolling upgrade with wiped disk




Re: rolling upgrade with wiped disk

Posted by Flavio Junqueira <fp...@apache.org>.
> On 21 Oct 2016, at 14:35, Vazhenin, Maksim <Ma...@dell.com> wrote:
> 
> Thanks for the reply, Flavio,
> 
> "You have to be careful when you do this. You may end up losing quorum on txns if you wipe out the disk of a server prematurely."
> What steps can be taken to avoid losing quorum? If quorum is lost, what steps are needed to fix it?
> Is there a chance of losing data stored in zk in this scenario?


If you wipe out the next server before a quorum is formed, you could end up losing transactions. It is critical to make sure there is a leader and enough followers before the next iteration.
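
As a rough illustration of that check, here is a minimal Python sketch. It assumes the reply of each server to the `srvr` four-letter command has already been collected from its client port (for example with `echo srvr | nc <host> 2181`), and that a serving server includes a `Mode:` line in that reply. The names `parse_mode` and `quorum_formed`, the server labels and the abbreviated sample replies are made up for this example and are not part of ZooKeeper.

```python
def parse_mode(srvr_reply):
    """Return the text after 'Mode:' in a `srvr` reply, or None if absent."""
    if not srvr_reply:
        return None
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def quorum_formed(replies, ensemble_size, just_restarted=None):
    """Check for exactly one leader plus enough followers for a majority.

    replies maps a server label to its `srvr` reply text (None if unreachable).
    If just_restarted is given, that server must also be serving.
    """
    modes = {name: parse_mode(text) for name, text in replies.items()}
    leaders = [name for name, mode in modes.items() if mode == "leader"]
    followers = [name for name, mode in modes.items() if mode == "follower"]
    majority = ensemble_size // 2 + 1
    if len(leaders) != 1:
        return False
    if len(leaders) + len(followers) < majority:
        return False
    # Stricter, optional gate: the server that was just wiped must be back.
    if just_restarted is not None:
        if modes.get(just_restarted) not in ("leader", "follower"):
            return False
    return True


# Abbreviated, hypothetical replies for a three-server ensemble.
replies = {
    "A": "Zookeeper version: 3.4.9\nMode: follower\n",
    "B": "Zookeeper version: 3.4.5\nMode: leader\n",
    "C": None,  # unreachable
}
print(quorum_formed(replies, ensemble_size=3, just_restarted="A"))
```

This is only a necessary condition, not a guarantee of safety. A more cautious gate would wait until every server in the ensemble reports a mode before wiping the next one.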

> On node failure / node replacement it is inevitable that the disk of one server gets wiped (zk should handle this, as it will eventually happen on every system running zk).
> Does upgrading the zk version on a node with a wiped disk create a situation not supported by zk?
> 
> "On your step 4, you can tell that a server is ready once clients are able to connect to it successfully."
> Do you mean the client should have only this server in its connection string, and once the client gets a response to any 'get' operation it is safe to go ahead?

That's one way of testing, yes. To check manually, use zkCli.sh under bin with the address of the server that you're bringing up. To automate it, you probably want some Java code that waits until a session reaches the SyncConnected state, with the server you're interested in as the only address in the connect string.
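
One possible way to automate the wait is sketched below in Python rather than Java. Note the substitution: instead of opening a client session and waiting for SyncConnected, it polls the single server with the `srvr` four-letter command over a plain socket and treats a `Mode:` line in the reply as "serving". The function names, default timeouts and the host name in the usage note are assumptions for illustration only.

```python
import socket
import time


def srvr_reports_serving(host, port, timeout=2.0):
    """Return True if `srvr` sent to host:port is answered with a 'Mode:' line."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"srvr")
            reply = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # server closes the connection after replying
                    break
                reply += chunk
    except OSError:
        return False  # refused, timed out, or reset: not ready yet
    return any(line.startswith(b"Mode:") for line in reply.splitlines())


def wait_until_ready(probe, deadline_seconds=300.0, interval_seconds=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() repeatedly until it returns True or the deadline passes."""
    give_up_at = clock() + deadline_seconds
    while True:
        if probe():
            return True
        if clock() >= give_up_at:
            return False
        sleep(interval_seconds)
```

A call could look like `wait_until_ready(lambda: srvr_reports_serving("server-a.example.com", 2181))`, where the host name is hypothetical. `ruok` is deliberately not used here, because an `imok` answer only shows that the process is running and bound to its client port, not that the server has joined the quorum.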

Thanks,
-Flavio 



RE: rolling upgrade with wiped disk

Posted by "Vazhenin, Maksim" <Ma...@dell.com>.
Thanks for the reply, Flavio,

"You have to be careful when you do this. You may end up losing quorum on txns if you wipe out the disk of a server prematurely."
What steps can be taken to avoid losing quorum? If quorum is lost, what steps are needed to fix it?
Is there a chance of losing data stored in zk in this scenario?
On node failure / node replacement it is inevitable that the disk of one server gets wiped (zk should handle this, as it will eventually happen on every system running zk).
Does upgrading the zk version on a node with a wiped disk create a situation not supported by zk?

"On your step 4, you can tell that a server is ready once clients are able to connect to it successfully."
Do you mean the client should have only this server in its connection string, and once the client gets a response to any 'get' operation it is safe to go ahead?

Thanks,
Maksim



Re: rolling upgrade with wiped disk

Posted by Flavio Junqueira <fp...@apache.org>.
Hi Maksim,

You have to be careful when you do this. You may end up losing quorum on txns if you wipe out the disk of a server prematurely.

On your step 4, you can tell that a server is ready once clients are able to connect to it successfully.

Thanks,
-Flavio
