Posted to issues@ambari.apache.org by "Ishan Bhatt (JIRA)" <ji...@apache.org> on 2018/07/10 04:35:00 UTC

[jira] [Commented] (AMBARI-22848) Blueprint database inconsistency should be caught by Ambari DB consistency checker

    [ https://issues.apache.org/jira/browse/AMBARI-22848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537987#comment-16537987 ] 

Ishan Bhatt commented on AMBARI-22848:
--------------------------------------

Moving it out to 2.7.1

> Blueprint database inconsistency should be caught by Ambari DB consistency checker
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-22848
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22848
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.5.0
>            Reporter: Robert Nettleton
>            Assignee: Robert Nettleton
>            Priority: Critical
>             Fix For: 2.7.1
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> We've seen some Blueprint deployments fail after an upgrade (to Ambari 2.5.2/2.6) causes older configuration to be reset.
> 1. User deploys a cluster via Blueprints with an older version of Ambari (older than Ambari 2.5/2.6).
> 2. Cluster deployment fails, and either the user doesn't realize the deployment has failed, or works through the manual configuration changes required to get failed services up and running. 
> 3. Things run fine, sometimes for quite a while.
> 4. User upgrades ambari-server to Ambari 2.5 or Ambari 2.6.
> 5. Upon the restart of ambari-server, some services fail due to invalid or outdated configuration.
> The root cause of this problem is that the Blueprints TopologyManager class attempts to "replay" any failed requests, a feature originally implemented to allow a Blueprints install to continue working even if ambari-server is stopped and restarted.
> Since the original Blueprint deployment failed, the Ambari Server database is in an inconsistent state, which causes the Blueprints TopologyManager to attempt a replay of various configuration tasks. As a result, the TopologyManager sends configuration updates from the Blueprint's configuration sections, which by now may be quite out of date, as the cluster may have changed over time while being administered.
> This in turn causes some services to fail, as older configuration may not match the current environment.
>  
> The ambari-server upgrade mechanism should be modified to include integrity checks on the Blueprint-related tables in the database. In particular, if a Blueprint deployment is detected, at the very least the "clusterconfig" table needs to be checked to ensure that at least one configuration type's version has a "version_tag" of "TOPOLOGY_RESOLVED". If no configuration versions are found with that tag, the ambari-server upgrade should fail with appropriate messages, so that the user can make the manual changes required to resolve the problem, usually by applying a workaround.
> Running this check at ambari-server upgrade time seems like the correct way forward: it detects the problem sooner and keeps users from accidentally proceeding with an upgrade that would overwrite the cluster's configuration with older configuration items.
>  
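The check proposed in the description can be sketched roughly as follows. This is only an illustration, not Ambari's actual consistency checker: it uses an in-memory SQLite table as a stand-in for the real database, and the column names other than version_tag ("cluster_id", "type_name"), the "INITIAL" tag used in the demo data, the function names, and the "deployed_via_blueprint" flag are all assumptions made for the example. How a Blueprint deployment is actually detected is left out.

```python
import sqlite3

# Tag named in the issue description.
RESOLVED_TAG = "TOPOLOGY_RESOLVED"


def make_demo_db(rows):
    """Build an in-memory stand-in for the clusterconfig table.

    The schema is a simplified assumption, not the real Ambari schema.
    rows: iterable of (cluster_id, type_name, version_tag) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clusterconfig ("
        "cluster_id INTEGER, type_name TEXT, version_tag TEXT)"
    )
    conn.executemany("INSERT INTO clusterconfig VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def blueprint_configs_resolved(conn, cluster_id):
    """Return True if at least one config row for the cluster has the resolved tag."""
    row = conn.execute(
        "SELECT COUNT(*) FROM clusterconfig "
        "WHERE cluster_id = ? AND version_tag = ?",
        (cluster_id, RESOLVED_TAG),
    ).fetchone()
    return row[0] > 0


def check_before_upgrade(conn, cluster_id, deployed_via_blueprint):
    """Fail the (hypothetical) upgrade step if a Blueprint cluster has no resolved config."""
    if deployed_via_blueprint and not blueprint_configs_resolved(conn, cluster_id):
        raise RuntimeError(
            "Cluster %d was deployed via Blueprint but no configuration has "
            "version_tag %s; resolve this before upgrading."
            % (cluster_id, RESOLVED_TAG)
        )
```

A cluster whose rows carry only the initial tag would make check_before_upgrade raise, while a cluster with at least one "TOPOLOGY_RESOLVED" row, or one not deployed via Blueprint, would pass.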


