Posted to dev@ambari.apache.org by Alejandro Fernandez <af...@hortonworks.com> on 2015/06/19 03:37:24 UTC
Review Request 35640: Install packages doesn't update actual version
with build number if installation timesout on all hosts
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/
-----------------------------------------------------------
Review request for Ambari, Dmytro Sen and Nate Cole.
Bugs: AMBARI-12012
https://issues.apache.org/jira/browse/AMBARI-12012
Repository: ambari
Description
-------
STR:
1. User registers repo version 2.3.0.0 (note that no build number is provided) and clicks the Install button.
2. On all of the hosts, the yum commands time out (or complete only a partial install), so "hdp-select versions" reports that two versions exist (2.2.0.0-2041 and 2.3.0.0-2800). Because the install did not succeed, the command does not return the actual_version installed (which was 2.3.0.0-2800). Note: I reproduced this by decreasing the timeouts in the ambari.properties file to 5 minutes and adding a sleep in install_packages.py after the first package was installed.
3. The Ambari server code then changes the state of the 2.3.0.0 version it knows about to INSTALL_FAILED so that the user can retry, but does not update the repo version with the actual version that includes the build number.
4. User retries, and this time the install succeeds. However, the delta of the "hdp-select versions" output is now empty (""), so no "actual_version" is returned. This is a serious problem because Ambari needs the build number whenever it calls "hdp-select set <comp> <version>".
5. The Ambari server code changes the state to INSTALLED.
The fix is for install_packages.py to always return the actual_version, even when the command fails or times out, so that the Ambari server can correct the database entry. The correction happens only on the first attempt; subsequent retries use the corrected value, so an exact match is found in the database.
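As an illustration only, the agent-side idea could be sketched roughly as follows. This is not the code in install_packages.py: the function names, the injected callables, and the "installation_succeeded" and "error" keys are hypothetical; only "actual_version" and the version strings come from the description above.

```python
def find_actual_version(versions_before, versions_after, requested_version):
    """Return the build-qualified version that appeared during this attempt, or None."""
    # The "delta": versions reported after the attempt that were absent before it.
    delta = [v for v in versions_after if v not in versions_before]
    # Keep only versions matching the request, e.g. 2.3.0.0 matches 2.3.0.0-2800.
    matches = [v for v in delta
               if v == requested_version or v.startswith(requested_version + "-")]
    return matches[0] if len(matches) == 1 else None


def install_packages(install_fn, list_versions_fn, requested_version):
    """Run the install and always report actual_version, even on failure.

    install_fn and list_versions_fn are hypothetical stand-ins for the yum
    calls and for parsing the output of "hdp-select versions".
    """
    structured_output = {"installation_succeeded": False, "actual_version": None}
    versions_before = list_versions_fn()
    try:
        install_fn()
        structured_output["installation_succeeded"] = True
    except Exception as err:  # a failure or timeout surfaces here in this sketch
        structured_output["error"] = str(err)
    finally:
        # Computed on every path, so a failed attempt still reports the build number.
        structured_output["actual_version"] = find_actual_version(
            versions_before, list_versions_fn(), requested_version)
    return structured_output
```

With this shape, a first attempt that fails after partially installing 2.3.0.0-2800 still reports that version, while a later retry reports no delta and relies on the server already holding the corrected value.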
Diffs
-----
ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java f1d6aad
ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1
ambari-server/src/main/resources/custom_actions/scripts/install_packages.py f8b2308
Diff: https://reviews.apache.org/r/35640/diff/
Testing
-------
Reproduced the issue on a live cluster and verified that the patch worked even when the agents reported that the packages failed to be installed.
Unit tests in progress
Thanks,
Alejandro Fernandez
Re: Review Request 35640: Install packages doesn't update actual
version with build number if installation timesout on all hosts
Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/
-----------------------------------------------------------
(Updated June 19, 2015, 5:44 p.m.)
Review request for Ambari, Dmytro Sen and Nate Cole.
Changes
-------
Combined with another dev's request to add more logging to other files.
Bugs: AMBARI-12012
https://issues.apache.org/jira/browse/AMBARI-12012
Repository: ambari
Description
-------
STR:
1. User registers repo version 2.3.0.0 (note that no build number is provided) and clicks the Install button.
2. On all of the hosts, the yum commands time out (or complete only a partial install), so "hdp-select versions" reports that two versions exist (2.2.0.0-2041 and 2.3.0.0-2800). Because the install did not succeed, the command does not return the actual_version installed (which was 2.3.0.0-2800). Note: I reproduced this by decreasing the timeouts in the ambari.properties file to 5 minutes and adding a sleep in install_packages.py after the first package was installed.
3. The Ambari server code then changes the state of the 2.3.0.0 version it knows about to INSTALL_FAILED so that the user can retry, but does not update the repo version with the actual version that includes the build number.
4. User retries, and this time the install succeeds. However, the delta of the "hdp-select versions" output is now empty (""), so no "actual_version" is returned. This is a serious problem because Ambari needs the build number whenever it calls "hdp-select set <comp> <version>".
5. The Ambari server code changes the state to INSTALLED.
The fix is for install_packages.py to always return the actual_version, even when the command fails or times out, so that the Ambari server can correct the database entry. The correction happens only on the first attempt; subsequent retries use the corrected value, so an exact match is found in the database.
Diffs (updated)
-----
ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java f1d6aad
ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 42f4708
ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1
ambari-server/src/main/java/org/apache/ambari/server/state/Alert.java be99d96
ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 484bf79
ambari-server/src/main/resources/custom_actions/scripts/install_packages.py f8b2308
Diff: https://reviews.apache.org/r/35640/diff/
Testing
-------
Reproduced the issue on a live cluster and verified that the patch worked even when the agents reported that the packages failed to be installed.
Unit tests in progress
Thanks,
Alejandro Fernandez
Re: Review Request 35640: Install packages doesn't update actual
version with build number if installation timesout on all hosts
Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/#review88483
-----------------------------------------------------------
Ship it!
- Nate Cole
Re: Review Request 35640: Install packages doesn't update actual
version with build number if installation timesout on all hosts
Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/#review88481
-----------------------------------------------------------
ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java (line 116)
<https://reviews.apache.org/r/35640/#comment141044>
The indentation changed here. The repositoryVersion will now be calculated even when the HostRoleStatus is not COMPLETED.
ambari-server/src/main/resources/custom_actions/scripts/install_packages.py (line 139)
<https://reviews.apache.org/r/35640/#comment141039>
This is always returned now.
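
Taken together, the two comments above describe a server-side flow that could be modelled roughly as below. This is a Python sketch of the idea, not the Java in DistributeRepositoriesActionListener: the function name and the dict standing in for the database are hypothetical; the state and status names come from this thread.

```python
def handle_install_report(repo_versions, requested_version, status, actual_version):
    """Update a repo version entry from an agent report.

    repo_versions is a hypothetical dict mapping version string -> state,
    standing in for the database table.
    """
    key = requested_version
    # Correct the stored version first, whatever the command status was.
    if actual_version and actual_version != requested_version:
        if requested_version in repo_versions:
            repo_versions[actual_version] = repo_versions.pop(requested_version)
        key = actual_version
    # Only then decide the state from the command status.
    repo_versions[key] = "INSTALLED" if status == "COMPLETED" else "INSTALL_FAILED"
    return key
```

In this model a failed first attempt that reports 2.3.0.0-2800 renames the 2.3.0.0 entry and marks it INSTALL_FAILED; the retry then matches the corrected entry exactly and moves it to INSTALLED.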
- Alejandro Fernandez