Posted to user@ambari.apache.org by Clark Breyman <cl...@breyman.com> on 2015/06/27 02:10:44 UTC

Ambari data corruption/recovery process

I'm wondering if anyone can share pointers/procedures/best practices to
handle the scenarios where:

a) The SQL database becomes corrupt (bugs, ...)
b) The Ambari service host is lost (e.g. EC2 instance termination, physical
hardware loss, ...)

Re: Ambari data corruption/recovery process

Posted by Jeff Sposetti <je...@hortonworks.com>.
Hi,

(Others... please add/correct if I missed something).

I believe the keys are unrelated to whether the agent was bootstrapped via SSH or registered manually. There will be keys on the agents only if Ambari server-agent communication was set up for two-way SSL. This is not enabled by default in the Ambari Server ambari.properties file. If it is enabled, you will have this line in ambari.properties:

security.server.two_way_ssl=true

So if two-way SSL is not enabled, the keys folder is empty on the agent hosts (and there is nothing to delete). If it is enabled, then yes, you have to clear that folder so that when the agent checks in with the replacement Ambari Server, the keys get re-created to work with the new Ambari Server.
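
To see which case applies before touching the agents, something like the following Python sketch could be run against the contents of the server's ambari.properties. The function name is made up for illustration and is not part of Ambari, and the parsing is deliberately simple (key=value lines, with # or ! comments skipped), so treat it as a starting point rather than a full properties parser.

```python
def two_way_ssl_enabled(properties_text: str) -> bool:
    """Return True if security.server.two_way_ssl is set to true.

    Hypothetical helper: handles only simple key=value lines.
    If the key appears more than once, the last occurrence wins.
    """
    enabled = False  # the thread says two-way SSL is not set by default
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "security.server.two_way_ssl":
            enabled = value.strip().lower() == "true"
    return enabled
```

If this returns False, the agent keys folder should have nothing in it to clear.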

Cheers,

Jeff

From: Alex Kaplan <ak...@ifwe.co>
Reply-To: "user@ambari.apache.org" <us...@ambari.apache.org>
Date: Saturday, June 27, 2015 at 3:16 AM
To: "user@ambari.apache.org" <us...@ambari.apache.org>
Subject: Re: Ambari data corruption/recovery process


Is removing that directory necessary for agents that registered without ssh?

On Jun 26, 2015 5:53 PM, "Yusaku Sako" <yu...@hortonworks.com> wrote:
Yes, if you are talking about corruption, then you would need snapshots to go back to.
Recovery would be simpler if the Ambari Server hostname does not change (IP address changes should not matter).

One more step that I forgot to mention...  you would need to delete /var/lib/ambari-agent/keys/* from each agent before restarting it.

Yusaku

From: Clark Breyman <cl...@breyman.com>
Reply-To: "user@ambari.apache.org" <us...@ambari.apache.org>
Date: Friday, June 26, 2015 5:22 PM
To: "user@ambari.apache.org" <us...@ambari.apache.org>
Subject: Re: Ambari data corruption/recovery process

Thanks Yusaku for the quick response.

For our production systems, we're planning on using Postgres replication to ensure backups, though that doesn't defend against data corruption. Perhaps snapshots will be required.
Is there any documentation on restoring to a newly provisioned host? Is there any reason to use a DNS A record instead of a CNAME alias to simplify the recovery process?


On Fri, Jun 26, 2015 at 5:14 PM, Yusaku Sako <yu...@hortonworks.com> wrote:
Ambari DB should be backed up on a regular basis.  This is the most important piece of information.
It is also advisable to back up /etc/ambari-server/conf/ambari.properties.
If you have these two, you can restore Ambari Server back to a running condition on a different host.
If the hostname of the Ambari Server changes, then you would have to update /etc/ambari-agent/conf/ambari-agent.ini to point to the new Ambari Server hostname and restart the agent.

Yusaku

From: Clark Breyman <cl...@breyman.com>
Reply-To: "user@ambari.apache.org" <us...@ambari.apache.org>
Date: Friday, June 26, 2015 5:10 PM
To: "user@ambari.apache.org" <us...@ambari.apache.org>
Subject: Ambari data corruption/recovery process

I'm wondering if anyone can share pointers/procedures/best practices to handle the scenarios where:

a) The SQL database becomes corrupt (bugs, ...)
b) The Ambari service host is lost (e.g. EC2 instance termination, physical hardware loss, ...)



Re: Ambari data corruption/recovery process

Posted by Alex Kaplan <ak...@ifwe.co>.
Is removing that directory necessary for agents that registered without
ssh?

Re: Ambari data corruption/recovery process

Posted by Yusaku Sako <yu...@hortonworks.com>.
Yes, if you are talking about corruption, then you would need snapshots to go back to.
Recovery would be simpler if the Ambari Server hostname does not change (IP address changes should not matter).

One more step that I forgot to mention...  you would need to delete /var/lib/ambari-agent/keys/* from each agent before restarting it.
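
As a rough illustration of that cleanup step, here is a small Python sketch. The function name is hypothetical and not part of Ambari. It removes the files directly inside the given directory and leaves the directory itself (and any subdirectories) in place. Stopping and restarting the agent is left out.

```python
from pathlib import Path
from typing import List


def clear_agent_keys(keys_dir: str) -> List[str]:
    """Delete the files directly inside keys_dir and return their names.

    Hypothetical helper: subdirectories and keys_dir itself are not removed.
    """
    removed = []
    for entry in sorted(Path(keys_dir).iterdir()):
        # Only plain files and symlinks are deleted; directories are skipped.
        if entry.is_file() or entry.is_symlink():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

On an agent host this would be called as clear_agent_keys("/var/lib/ambari-agent/keys") by a user with permission to delete those files. An empty list back means there was nothing to clear.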

Yusaku


Re: Ambari data corruption/recovery process

Posted by Clark Breyman <cl...@breyman.com>.
Thanks Yusaku for the quick response.

For our production systems, we're planning on using Postgres replication to
ensure backups, though that doesn't defend against data corruption. Perhaps
snapshots will be required.
Is there any documentation on restoring to a newly provisioned host? Is
there any reason to use a DNS A record instead of a CNAME alias to
simplify the recovery process?



Re: Ambari data corruption/recovery process

Posted by Yusaku Sako <yu...@hortonworks.com>.
Ambari DB should be backed up on a regular basis.  This is the most important piece of information.
It is also advisable to back up /etc/ambari-server/conf/ambari.properties.
If you have these two, you can restore Ambari Server back to a running condition on a different host.
If the hostname of the Ambari Server changes, then you would have to update /etc/ambari-agent/conf/ambari-agent.ini to point to the new Ambari Server hostname and restart the agent.
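
A minimal Python sketch of two of these steps, under some assumptions: that the Ambari DB is Postgres (so pg_dump applies), and that the agent file keeps the server name as a hostname entry under a [server] section. Both function names are invented for illustration, and the database name and user are left as parameters because they depend on the installation.

```python
import re
from typing import List


def pg_dump_command(db_name: str, db_user: str, out_file: str) -> List[str]:
    """Build (but do not run) a pg_dump command line for a plain SQL dump."""
    return ["pg_dump", "-U", db_user, "-f", out_file, db_name]


def set_server_hostname(ini_text: str, new_hostname: str) -> str:
    """Return ini_text with hostname in the [server] section replaced.

    Works line by line so comments and all other settings stay untouched.
    Assumes a "hostname=..." entry under "[server]" (an assumption here).
    """
    out = []
    section = None
    replaced = False
    for line in ini_text.splitlines():
        header = re.match(r"\s*\[(.+?)\]\s*$", line)
        if header:
            # Track which section the following lines belong to.
            section = header.group(1)
        elif section == "server" and re.match(r"\s*hostname\s*=", line):
            line = f"hostname={new_hostname}"
            replaced = True
        out.append(line)
    if not replaced:
        raise ValueError("no hostname entry found in a [server] section")
    return "\n".join(out) + "\n"
```

The first function only returns the argument list, which could be passed to subprocess.run from a scheduled job. The second takes the text of ambari-agent.ini and returns the edited text, so writing it back and restarting the agent remain separate, explicit steps.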

Yusaku
