Posted to dev@ambari.apache.org by Alejandro Fernandez <af...@hortonworks.com> on 2015/06/19 03:37:24 UTC

Review Request 35640: Install packages doesn't update actual version with build number if installation timesout on all hosts

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/
-----------------------------------------------------------

Review request for Ambari, Dmytro Sen and Nate Cole.


Bugs: AMBARI-12012
    https://issues.apache.org/jira/browse/AMBARI-12012


Repository: ambari


Description
-------

STR:
1. User registers repo version 2.3.0.0 (note that no build number is provided) and clicks the Install button.
2. On all of the hosts, the yum commands time out (or do only a partial install), so "hdp-select versions" reports that two versions exist (2.2.0.0-2041 and 2.3.0.0-2800). Because the install did not succeed, the command does not return the actual_version installed (which was 2.3.0.0-2800). Note: I reproduced this by decreasing the timeouts in the ambari.properties file to 5 mins and adding a sleep in install_packages.py after the first package was installed.
3. The Ambari server code then changes the state of the 2.3.0.0 version it knows about to INSTALL_FAILED so that the user can retry, but does not update the repo version with the actual version that includes the build number.
4. User retries and this time it succeeds. However, the delta of "hdp-select versions" is now empty (""), so no "actual_version" is returned! This is really bad because Ambari needs the build number whenever it calls "hdp-select set <comp> <version>".
5. The Ambari server code changes the state to INSTALLED.

The fix is for install_packages.py to always return the actual_version, even when the command fails or times out, so that Ambari server can correct the database entry. The correction only happens the first time; subsequent attempts to retry the installation will use the right value, so an exact match will be found in the database.
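
To illustrate the idea (this is a rough sketch, not the code in the patch): the script can build its result in a try/finally so that the version lookup runs whether or not the package install raised. All names here (run_install, find_actual_version, install_fn, list_versions_fn, the "install_status" key) are hypothetical, and the sketch assumes the version is found by matching the requested version against everything "hdp-select versions" reports rather than by a before/after delta.

```python
def build_number(version):
    """Return the numeric build suffix of e.g. "2.3.0.0-2800", or -1 if absent."""
    _, _, build = version.partition("-")
    return int(build) if build.isdigit() else -1


def find_actual_version(requested, installed_versions):
    """Return the installed version matching the requested repo version, or None."""
    matches = [v for v in installed_versions
               if v == requested or v.startswith(requested + "-")]
    # Assumption for this sketch: if several builds match, report the highest.
    return max(matches, key=build_number) if matches else None


def run_install(requested, install_fn, list_versions_fn):
    """Run the install and report actual_version even if the install fails."""
    result = {"install_status": "FAIL"}
    try:
        install_fn()  # hypothetical callable that installs the packages
        result["install_status"] = "SUCCESS"
    except Exception as err:
        result["error"] = str(err)
    finally:
        # Runs on success and on failure, so a partial install is still reported.
        actual = find_actual_version(requested, list_versions_fn())
        if actual is not None:
            result["actual_version"] = actual
    return result
```

With this shape, a timed-out first attempt on a host that already has 2.3.0.0-2800 partially installed still reports that version, so the retry no longer depends on a non-empty delta.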


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java f1d6aad 
  ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1 
  ambari-server/src/main/resources/custom_actions/scripts/install_packages.py f8b2308 

Diff: https://reviews.apache.org/r/35640/diff/


Testing
-------

Reproduced the issue on a live cluster and verified that the patch worked even when the agents reported that the packages failed to be installed.

Unit tests in progress


Thanks,

Alejandro Fernandez


Re: Review Request 35640: Install packages doesn't update actual version with build number if installation timesout on all hosts

Posted by Alejandro Fernandez <af...@hortonworks.com>.

(Updated June 19, 2015, 5:44 p.m.)


Review request for Ambari, Dmytro Sen and Nate Cole.


Changes
-------

Combined with another dev's request to add more logging to other files.


Bugs: AMBARI-12012
    https://issues.apache.org/jira/browse/AMBARI-12012


Repository: ambari


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java f1d6aad 
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 42f4708 
  ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1 
  ambari-server/src/main/java/org/apache/ambari/server/state/Alert.java be99d96 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 484bf79 
  ambari-server/src/main/resources/custom_actions/scripts/install_packages.py f8b2308 

Diff: https://reviews.apache.org/r/35640/diff/


Testing
-------

Reproduced the issue on a live cluster and verified that the patch worked even when the agents reported that the packages failed to be installed.

Unit tests in progress


Thanks,

Alejandro Fernandez


Re: Review Request 35640: Install packages doesn't update actual version with build number if installation timesout on all hosts

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/#review88483
-----------------------------------------------------------

Ship it!


Ship It!

- Nate Cole


Re: Review Request 35640: Install packages doesn't update actual version with build number if installation timesout on all hosts

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/#review88481
-----------------------------------------------------------



ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java (line 116)
<https://reviews.apache.org/r/35640/#comment141044>

    The indentation changed here, so the repositoryVersion is now calculated even when the HostRoleStatus is not COMPLETED.
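
    A rough sketch of that ordering, written in Python for brevity even though the listener itself is Java: the stored repo version is corrected first, and only then is the state set from the command status. The function name, the dict-based "database", and the "install_status" key are all hypothetical and are not taken from the patch.

```python
def handle_install_report(repo_versions, registered, report):
    """Apply one host report; repo_versions maps version string -> state.

    Returns the version key that is in use after the update.
    """
    actual = report.get("actual_version")
    current = registered
    # Correct the stored version first, regardless of the command status.
    if actual and actual != registered and registered in repo_versions:
        repo_versions[actual] = repo_versions.pop(registered)
        current = actual
    # Only the resulting state depends on whether the install succeeded.
    succeeded = report.get("install_status") == "SUCCESS"
    repo_versions[current] = "INSTALLED" if succeeded else "INSTALL_FAILED"
    return current
```

    In this sketch a failed first attempt renames 2.3.0.0 to 2.3.0.0-2800 and marks it INSTALL_FAILED, and a successful retry then finds an exact match and marks it INSTALLED.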



ambari-server/src/main/resources/custom_actions/scripts/install_packages.py (line 139)
<https://reviews.apache.org/r/35640/#comment141039>

    This is always returned now.


- Alejandro Fernandez

