Posted to reviews@ambari.apache.org by Jonathan Hurley <jh...@hortonworks.com> on 2017/01/19 01:35:28 UTC

Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/
-----------------------------------------------------------

Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.


Bugs: AMBARI-19617
    https://issues.apache.org/jira/browse/AMBARI-19617


Repository: ambari


Description
-------

When a component that has complicated upgrade logic is restarted, the upgrade parameters are not sent to the agents. This can cause some components to fail when restarted during a suspended upgrade.

Example:

- Begin express upgrade from {{2.3.6.0-3796}} to {{2.5.3.0-37}}
- {{HIVE_METASTORE}} couldn't start b/c of a missing Kerberos property:
{code}
resource_management.core.exceptions.Fail: Configuration parameter 'hive.server2.authentication.kerberos.principal' was not found in configurations dictionary!
{code}
- Chose to {{Ignore and Proceed}}, which means that none of the Metastore SQL files ran.
- Paused the upgrade (presumably at Finalize) and tried to start Metastore. It fails to start because the new HDP 2.5 bits are using a non-upgraded database. That causes the {{-info}} option to fail and makes Ambari think it needs to run {{-initSchema}}.

RCA: Metastore failed to start during upgrade and the admin chose to skip it. This caused schema upgrade logic not to run. Ambari can examine the {{upgrade_suspended}} property to determine if we need to run upgrade commands while restarting Metastore during an upgrade. 

However, it might be more prudent to simply send along the suspended upgrade properties so that any actions which might need to happen (such as invoking an upgrade script during the restart) can happen when the upgrade is suspended.


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java 4fa942f 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java bdad015 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 5e8c803 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2ec43cf 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java 1d51b0d 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContextFactory.java 4b988e8 
  ambari-server/src/test/java/org/apache/ambari/server/state/UpgradeHelperTest.java 0d1a2fa 
  ambari-server/src/test/java/org/apache/ambari/server/state/stack/upgrade/StageWrapperBuilderTest.java f7f8325 

Diff: https://reviews.apache.org/r/55698/diff/


Testing
-------

Tested restarts during a suspended upgrade for Metastore.

UNIT TESTS PENDING...


Thanks,

Jonathan Hurley


Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/#review162334
-----------------------------------------------------------


Ship it!

- Alejandro Fernandez




Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/#review162342
-----------------------------------------------------------


Ship it!

- Nate Cole




Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/
-----------------------------------------------------------

(Updated Jan. 19, 2017, 2:03 p.m.)


Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.


Bugs: AMBARI-19617
    https://issues.apache.org/jira/browse/AMBARI-19617


Repository: ambari


Description
-------

When a component that has complicated upgrade logic is restarted, the upgrade parameters are not sent to the agents. This can cause some components to fail when restarted during a suspended upgrade.

Example:

- Begin express upgrade from {{2.3.6.0-3796}} to {{2.5.3.0-37}}
- {{HIVE_METASTORE}} couldn't start b/c of a missing Kerberos property:
{code}
resource_management.core.exceptions.Fail: Configuration parameter 'hive.server2.authentication.kerberos.principal' was not found in configurations dictionary!
{code}
- Chose to {{Ignore and Proceed}}, which means that none of the Metastore SQL files ran.
- Paused the upgrade (presumably at Finalize) and tried to start Metastore. It fails to start because the new HDP 2.5 bits are using a non-upgraded database. That causes the {{-info}} option to fail and makes Ambari think it needs to run {{-initSchema}}.

RCA: Metastore failed to start during upgrade and the admin chose to skip it. This caused schema upgrade logic not to run. Ambari can examine the {{upgrade_suspended}} property to determine if we need to run upgrade commands while restarting Metastore during an upgrade. 

However, it might be more prudent to simply send along the suspended upgrade properties so that any actions which might need to happen (such as invoking an upgrade script during the restart) can happen when the upgrade is suspended.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java ec0f7d0 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java bdad015 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 5e8c803 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2ec43cf 
  ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 4e37c92 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java 1d51b0d 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContextFactory.java 4b988e8 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 7b0b696 
  ambari-server/src/test/java/org/apache/ambari/server/controller/KerberosHelperTest.java 9693f98 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/ActiveWidgetLayoutResourceProviderTest.java 5cce3fc 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UserAuthorizationResourceProviderTest.java fd96c8e 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UserResourceProviderTest.java cc0f2b6 
  ambari-server/src/test/java/org/apache/ambari/server/state/ConfigHelperTest.java 526e462 
  ambari-server/src/test/java/org/apache/ambari/server/state/UpgradeHelperTest.java 0d1a2fa 
  ambari-server/src/test/java/org/apache/ambari/server/state/stack/upgrade/StageWrapperBuilderTest.java f7f8325 

Diff: https://reviews.apache.org/r/55698/diff/


Testing
-------

Tested restarts during a suspended upgrade for Metastore.

Tests run: 4864, Failures: 0, Errors: 0, Skipped: 38

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 26:24 min
[INFO] Finished at: 2017-01-19T10:53:14-05:00
[INFO] Final Memory: 57M/678M
[INFO] ------------------------------------------------------------------------


Thanks,

Jonathan Hurley


Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Jonathan Hurley <jh...@hortonworks.com>.

> On Jan. 19, 2017, 1:04 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java, lines 2456-2460
> > <https://reviews.apache.org/r/55698/diff/2/?file=1609200#file1609200line2456>
> >
> >     This pattern is used at least 3 times - should we just get this from the cluster itself?  A method that goes with cluster.isUpgradeSuspended()?  We also have a mix of upgradeContext.getInitializedCommandParams() and the key for suspended when they should just all be put on together.

Sure - I can refactor this to be reused.


> On Jan. 19, 2017, 1:04 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java, lines 283-286
> > <https://reviews.apache.org/r/55698/diff/2/?file=1609202#file1609202line283>
> >
> >     Can you mark this variable with @Experimental(ExperimentalFeature.PATCH_UPGRADES) ?  Will make it easier to find when we merge.

Will Do!


> On Jan. 19, 2017, 1:04 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java, lines 641-647
> > <https://reviews.apache.org/r/55698/diff/2/?file=1609202#file1609202line641>
> >
> >     These links don't exist anymore?

Hah! Right! I had moved them when I realized we need a central place. I'll update the doc.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/#review162322
-----------------------------------------------------------




Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/#review162322
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java (lines 2447 - 2451)
<https://reviews.apache.org/r/55698/#comment233652>

    This pattern is used at least 3 times - should we just get this from the cluster itself?  A method that goes with cluster.isUpgradeSuspended()?  We also have a mix of upgradeContext.getInitializedCommandParams() and the key for suspended when they should just all be put on together.
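
One way to read this suggestion, sketched with hypothetical names: the cluster exposes a single accessor that returns every upgrade-related command parameter, including the suspended flag, so each call site makes one call. `ClusterSketch`, `upgradeParams()`, `getUpgradeCommandParams()`, and the `version` key are assumptions made for illustration; only `isUpgradeSuspended()` and the `upgrade_suspended` property are named in this thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cluster; not Ambari's Cluster interface. */
interface ClusterSketch {
  boolean isUpgradeSuspended();

  /** Parameters describing the upgrade in progress (assumed accessor). */
  Map<String, String> upgradeParams();

  /**
   * Single place that decides what upgrade information a command receives:
   * nothing when no upgrade is suspended, otherwise the upgrade parameters
   * together with the suspended flag.
   */
  default Map<String, String> getUpgradeCommandParams() {
    if (!isUpgradeSuspended()) {
      return Collections.emptyMap();
    }
    Map<String, String> params = new HashMap<>(upgradeParams());
    params.put("upgrade_suspended", "true");
    return params;
  }
}

class ClusterSketchDemo {
  /** Builds a fixed cluster for the demo. */
  static ClusterSketch cluster(boolean suspended) {
    return new ClusterSketch() {
      @Override
      public boolean isUpgradeSuspended() {
        return suspended;
      }

      @Override
      public Map<String, String> upgradeParams() {
        return Collections.singletonMap("version", "2.5.3.0-37"); // assumed key
      }
    };
  }

  public static void main(String[] args) {
    // Every call site reduces to the same single line.
    Map<String, String> commandParams = new HashMap<>();
    commandParams.putAll(cluster(true).getUpgradeCommandParams());
    System.out.println(commandParams);
  }
}
```

Returning an empty map when no upgrade is suspended lets callers use `putAll` unconditionally, which removes the repeated check from the call sites.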



ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java (lines 272 - 275)
<https://reviews.apache.org/r/55698/#comment233653>

    Can you mark this variable with @Experimental(ExperimentalFeature.PATCH_UPGRADES) ?  Will make it easier to find when we merge.
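
For readers unfamiliar with marker annotations, here is a minimal, self-contained sketch of how such a marker makes code easy to find later. The annotation and enum below are simplified stand-ins written for this example, not Ambari's own definitions; in particular, the single `value()` element is an assumption chosen so that the shorthand used in the comment above compiles.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Stand-in enum; only PATCH_UPGRADES is mentioned in the review. */
enum ExperimentalFeature { PATCH_UPGRADES }

/** Stand-in marker annotation (element name is an assumption). */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD, ElementType.TYPE})
@interface Experimental {
  ExperimentalFeature value();
}

class ExperimentalMarkerDemo {

  /** Hypothetical class with one marked field and one unmarked field. */
  static class Context {
    @Experimental(ExperimentalFeature.PATCH_UPGRADES)
    private String targetVersion;

    private String direction;
  }

  /** Lists the declared fields of a type that carry the given feature marker. */
  static List<String> fieldsMarkedWith(Class<?> type, ExperimentalFeature feature) {
    List<String> names = new ArrayList<>();
    for (Field field : type.getDeclaredFields()) {
      Experimental marker = field.getAnnotation(Experimental.class);
      if (marker != null && marker.value() == feature) {
        names.add(field.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(fieldsMarkedWith(Context.class, ExperimentalFeature.PATCH_UPGRADES));
  }
}
```

In practice a plain text search for the annotation is usually enough at merge time; the reflective lookup only shows that the marker is also machine-readable.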



ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java (lines 622 - 628)
<https://reviews.apache.org/r/55698/#comment233654>

    These links don't exist anymore?


- Nate Cole




Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/
-----------------------------------------------------------

(Updated Jan. 19, 2017, 11:54 a.m.)


Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.


Bugs: AMBARI-19617
    https://issues.apache.org/jira/browse/AMBARI-19617


Repository: ambari


Description
-------

When a component that has complicated upgrade logic is restarted, the upgrade parameters are not sent to the agents. This can cause some components to fail when restarted during a suspended upgrade.

Example:

- Begin express upgrade from {{2.3.6.0-3796}} to {{2.5.3.0-37}}
- {{HIVE_METASTORE}} couldn't start b/c of a missing Kerberos property:
{code}
resource_management.core.exceptions.Fail: Configuration parameter 'hive.server2.authentication.kerberos.principal' was not found in configurations dictionary!
{code}
- Chose to {{Ignore and Proceed}}, which means that none of the Metastore SQL files ran.
- Paused the upgrade (presumably at Finalize) and tried to start Metastore. It fails to start because the new HDP 2.5 bits are using a non-upgraded database. That causes the {{-info}} option to fail and makes Ambari think it needs to run {{-initSchema}}.

RCA: Metastore failed to start during upgrade and the admin chose to skip it. This caused schema upgrade logic not to run. Ambari can examine the {{upgrade_suspended}} property to determine if we need to run upgrade commands while restarting Metastore during an upgrade. 

However, it might be more prudent to simply send along the suspended upgrade properties so that any actions which might need to happen (such as invoking an upgrade script during the restart) can happen when the upgrade is suspended.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java ec0f7d0 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java bdad015 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 5e8c803 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2ec43cf 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java 1d51b0d 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContextFactory.java 4b988e8 
  ambari-server/src/test/java/org/apache/ambari/server/controller/KerberosHelperTest.java 9693f98 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/ActiveWidgetLayoutResourceProviderTest.java 5cce3fc 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UserAuthorizationResourceProviderTest.java fd96c8e 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UserResourceProviderTest.java cc0f2b6 
  ambari-server/src/test/java/org/apache/ambari/server/state/ConfigHelperTest.java 526e462 
  ambari-server/src/test/java/org/apache/ambari/server/state/UpgradeHelperTest.java 0d1a2fa 
  ambari-server/src/test/java/org/apache/ambari/server/state/stack/upgrade/StageWrapperBuilderTest.java f7f8325 

Diff: https://reviews.apache.org/r/55698/diff/


Testing (updated)
-------

Tested restarts during a suspended upgrade for Metastore.

Tests run: 4864, Failures: 0, Errors: 0, Skipped: 38

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 26:24 min
[INFO] Finished at: 2017-01-19T10:53:14-05:00
[INFO] Final Memory: 57M/678M
[INFO] ------------------------------------------------------------------------


Thanks,

Jonathan Hurley


Re: Review Request 55698: Restarting Some Components During a Suspended Upgrade Fails Due To Missing Upgrade Parameters

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55698/#review162222
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java (lines 474 - 485)
<https://reviews.apache.org/r/55698/#comment233507>

    This is the functional change: in the three spots where we generate commands, we now also include the upgrade command params if there is a suspended upgrade.
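
A minimal sketch of that idea, not the actual Ambari code: `SuspendedUpgrade`, `CommandParamsSketch`, and `withUpgradeParams` are hypothetical names, and the parameter keys other than `upgrade_suspended` (which the description mentions) are illustrative assumptions. The point is only that each command-generation site merges the upgrade params when, and only when, a suspended upgrade exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a suspended upgrade; not the real Ambari entity.
final class SuspendedUpgrade {
    final String direction;
    final String targetVersion;

    SuspendedUpgrade(String direction, String targetVersion) {
        this.direction = direction;
        this.targetVersion = targetVersion;
    }
}

final class CommandParamsSketch {
    // Hypothetical helper: returns a copy of the command params, adding the
    // upgrade params only when a suspended upgrade is present.
    static Map<String, String> withUpgradeParams(Map<String, String> commandParams,
                                                 Optional<SuspendedUpgrade> suspended) {
        Map<String, String> result = new HashMap<>(commandParams);
        suspended.ifPresent(upgrade -> {
            result.put("upgrade_suspended", "true");           // named in the description
            result.put("upgrade_direction", upgrade.direction); // assumed key
            result.put("version", upgrade.targetVersion);       // assumed key
        });
        return result;
    }
}
```

Keeping the merge in one helper means the three call sites cannot drift apart, and a cluster with no suspended upgrade sees its command params unchanged.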



ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java (lines 717 - 720)
<https://reviews.apache.org/r/55698/#comment233511>

    Ewwwww ... This logic needed to be used in several places, so I moved it into UpgradeContext. In fact, the constructor of UpgradeContext now initializes these fields itself, without the need to invoke the `setSourceAndTargetVersion()` method.



ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java (line 1387)
<https://reviews.apache.org/r/55698/#comment233512>

    Moved to UpgradeContext for better re-use across the product.



ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java (lines 231 - 251)
<https://reviews.apache.org/r/55698/#comment233514>

    This new constructor initializes the fields from an existing UpgradeEntity.
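
Roughly, the shape is as follows. This is a simplified sketch under stated assumptions, not the real classes: `UpgradeEntitySketch` and `UpgradeContextSketch` are hypothetical stand-ins, and the field names are assumed for illustration. It shows the versions being fixed at construction time from the persisted entity, so no separate setter call is required afterwards.

```java
// Hypothetical, simplified stand-in for the persisted upgrade record.
final class UpgradeEntitySketch {
    final String fromVersion;
    final String toVersion;

    UpgradeEntitySketch(String fromVersion, String toVersion) {
        this.fromVersion = fromVersion;
        this.toVersion = toVersion;
    }
}

// Hypothetical, simplified stand-in for the upgrade context.
final class UpgradeContextSketch {
    private final String sourceVersion;
    private final String targetVersion;

    // Initializes the versions directly from an existing entity, so callers
    // never observe a context whose versions have not been set yet.
    UpgradeContextSketch(UpgradeEntitySketch entity) {
        this.sourceVersion = entity.fromVersion;
        this.targetVersion = entity.toVersion;
    }

    String getSourceVersion() {
        return sourceVersion;
    }

    String getTargetVersion() {
        return targetVersion;
    }
}
```

Because the fields are final, any code path that rebuilds a context for a suspended upgrade gets the same source and target versions that the original upgrade recorded.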


- Jonathan Hurley


On Jan. 18, 2017, 8:35 p.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55698/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2017, 8:35 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.
> 
> 
> Bugs: AMBARI-19617
>     https://issues.apache.org/jira/browse/AMBARI-19617
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> While attempting to restart a component that has complicated upgrade logic, the upgrade parameters are not sent to the agents. This can cause some components to fail during a suspended upgrade restart. 
> 
> Example:
> 
> - Begin express upgrade from {{2.3.6.0-3796}} to {{2.5.3.0-37}}
> - {{HIVE_METASTORE}} couldn't start b/c of a missing Kerberos property:
> {code}
> resource_management.core.exceptions.Fail: Configuration parameter 'hive.server2.authentication.kerberos.principal' was not found in configurations dictionary!
> {code}
> - Chose to {{Ignore and Proceed}} which means that none of the Metastore SQL files ran. 
> - Paused the upgrade (presumably at Finalize) and tried to start Metastore. It fails to start because the new HDP 2.5 bits are using a non-upgraded database. That causes the {{-info}} option to fail and makes Ambari think it needs to run {{-initSchema}}. 
> 
> RCA: Metastore failed to start during upgrade and the admin chose to skip it. This caused schema upgrade logic not to run. Ambari can examine the {{upgrade_suspended}} property to determine if we need to run upgrade commands while restarting Metastore during an upgrade. 
> 
> However, it might be more prudent to simply send along the suspended upgrade properties so that any actions which might need to happen (such as invoking an upgrade script during the restart) can happen when the upgrade is suspended.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java 4fa942f 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java bdad015 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 5e8c803 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2ec43cf 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContext.java 1d51b0d 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeContextFactory.java 4b988e8 
>   ambari-server/src/test/java/org/apache/ambari/server/state/UpgradeHelperTest.java 0d1a2fa 
>   ambari-server/src/test/java/org/apache/ambari/server/state/stack/upgrade/StageWrapperBuilderTest.java f7f8325 
> 
> Diff: https://reviews.apache.org/r/55698/diff/
> 
> 
> Testing
> -------
> 
> Tested restarts during a suspended upgrade for Metastore.
> 
> UNIT TESTS PENDING...
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>