You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@ambari.apache.org by Alejandro Fernandez <af...@hortonworks.com> on 2016/03/09 05:46:19 UTC

Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/
-----------------------------------------------------------

(Updated March 9, 2016, 4:46 a.m.)


Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.


Summary (updated)
-----------------

AMBARI-15330. Bubble up errors during RU/EU


Bugs: AMBARI-15330
    https://issues.apache.org/jira/browse/AMBARI-15330


Repository: ambari


Description
-------

During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
It can print a human-readable error, plus stdout and stderr.
This would become part of the upgrade endpoint. e.g,
api/v1/clusters/$name/upgrade_summary/$request_id

```
{

cluster_name: "c1",
request_id: 1,
fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",

// Notice that the rest are inherited from the failed task if it exists.
attempt_cnt: 1,
command: "CUSTOM_COMMAND",
command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
custom_command_name: "RESTART",
end_time: -1,
error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
exit_code: 1,
host_name: "c6401.ambari.apache.org",
id: 1234,
output_log: "/var/lib/ambari-agent/data/output-1234.txt",
role: "ZOOKEEPER_SERVER",
stage_id: 1,
start_time: 123456789,
status: "HOLDING_FAILED",
stdout: "",
stderr: ""
}
```


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
  ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java e7bbec4 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 4954a96 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java dde934d 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java fbbc7c8 
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java fee3200 
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java a12b204 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 8213a78 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/44492/diff/


Testing
-------

Verified on RU and EU by introducing a failure.
Waiting for unit test results.


Thanks,

Alejandro Fernandez


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/#review123021
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java (line 87)
<https://reviews.apache.org/r/44492/#comment185155>

    Notice that the task elements are part of a failed_task subelement in the JSON structure.



ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java (line 700)
<https://reviews.apache.org/r/44492/#comment185156>

    Switched these to named queries.



ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java (line 245)
<https://reviews.apache.org/r/44492/#comment185157>

    Unit test that covers 4 cases that build on top of each other.


- Alejandro Fernandez


On March 10, 2016, 10:48 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44492/
> -----------------------------------------------------------
> 
> (Updated March 10, 2016, 10:48 p.m.)
> 
> 
> Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15330
>     https://issues.apache.org/jira/browse/AMBARI-15330
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
> It can print a human-readable error, plus stdout and stderr.
> This would become part of the upgrade endpoint. e.g,
> api/v1/clusters/$name/upgrade_summary/$request_id
> 
> ```
> {
> 
> cluster_name: "c1",
> request_id: 1,
> fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",
> 
> // Notice that the rest are inherited from the failed task if it exists.
> attempt_cnt: 1,
> command: "CUSTOM_COMMAND",
> command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
> custom_command_name: "RESTART",
> end_time: -1,
> error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
> exit_code: 1,
> host_name: "c6401.ambari.apache.org",
> id: 1234,
> output_log: "/var/lib/ambari-agent/data/output-1234.txt",
> role: "ZOOKEEPER_SERVER",
> stage_id: 1,
> start_time: 123456789,
> status: "HOLDING_FAILED",
> stdout: "",
> stderr: ""
> }
> ```
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
>   ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/44492/diff/
> 
> 
> Testing
> -------
> 
> Verified on RU and EU by introducing a failure.
> Waiting for unit test results.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/
-----------------------------------------------------------

(Updated March 11, 2016, 7:16 p.m.)


Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.


Bugs: AMBARI-15330
    https://issues.apache.org/jira/browse/AMBARI-15330


Repository: ambari


Description
-------

During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
It can print a human-readable error, plus stdout and stderr.
This would become part of the upgrade endpoint. e.g,
api/v1/clusters/$name/upgrade_summary/$request_id

```
{
cluster_name: "c1",
request_id: 1,
fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",

// Notice that the rest are inherited from the failed task if it exists.
failed_task: {
  attempt_cnt: 1,
command: "CUSTOM_COMMAND",
  command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
  custom_command_name: "RESTART",
  end_time: -1,
  error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
  exit_code: 1,
  host_name: "c6401.ambari.apache.org",
  id: 1234,
  output_log: "/var/lib/ambari-agent/data/output-1234.txt",
  role: "ZOOKEEPER_SERVER",
  stage_id: 1,
  start_time: 123456789,
  status: "HOLDING_FAILED",
  stdout: "",
  stderr: ""
  }
}
```


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
  ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/44492/diff/


Testing
-------

Verified on RU and EU by introducing a failure.
Waiting for unit test results.


Thanks,

Alejandro Fernandez


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Alejandro Fernandez <af...@hortonworks.com>.

> On March 11, 2016, 1:30 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java, lines 93-94
> > <https://reviews.apache.org/r/44492/diff/4/?file=1294961#file1294961line93>
> >
> >     This changes existing behavior; might not be what we want. The HOLDING_FAILED status is not technically a failure. 
> >     
> >     Upgrades and other areas may use this enum. If they do, we should not change it. I'd just create a new one to old what you need.
> >     
> >     Imagine upgrades starting to throw failures and aborting every time it hit a HOLDING_FAILED state.

I'll add my own enumset.


> On March 11, 2016, 1:30 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java, line 25
> > <https://reviews.apache.org/r/44492/diff/4/?file=1294967#file1294967line25>
> >
> >     This class is still basically assuming a failure. The upgrade summary could be a success, no?

I'm keeping the name generic for now. The first use case is to report an error. If that changes in the future, we can always add more properties.


> On March 11, 2016, 1:30 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java, lines 157-178
> > <https://reviews.apache.org/r/44492/diff/4/?file=1294968#file1294968line157>
> >
> >     This class is still basically assuming a failure. The upgrade summary could be a success, no?
> >     
> >     If we're saying that the upgrade summary is only going to be for a failure, then we should rename it to reflect that.
> >     
> >     Personally, I'm fine leaving it and populating the fields with success information if it finished.

If there's no failed_reason they can assume success. I only want to add the minimal set of properties I know are needed right now.


- Alejandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/#review123134
-----------------------------------------------------------


On March 10, 2016, 11:12 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44492/
> -----------------------------------------------------------
> 
> (Updated March 10, 2016, 11:12 p.m.)
> 
> 
> Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15330
>     https://issues.apache.org/jira/browse/AMBARI-15330
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
> It can print a human-readable error, plus stdout and stderr.
> This would become part of the upgrade endpoint. e.g,
> api/v1/clusters/$name/upgrade_summary/$request_id
> 
> ```
> {
> cluster_name: "c1",
> request_id: 1,
> fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",
> 
> // Notice that the rest are inherited from the failed task if it exists.
> failed_task: {
>   attempt_cnt: 1,
> command: "CUSTOM_COMMAND",
>   command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
>   custom_command_name: "RESTART",
>   end_time: -1,
>   error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
>   exit_code: 1,
>   host_name: "c6401.ambari.apache.org",
>   id: 1234,
>   output_log: "/var/lib/ambari-agent/data/output-1234.txt",
>   role: "ZOOKEEPER_SERVER",
>   stage_id: 1,
>   start_time: 123456789,
>   status: "HOLDING_FAILED",
>   stdout: "",
>   stderr: ""
>   }
> }
> ```
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
>   ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/44492/diff/
> 
> 
> Testing
> -------
> 
> Verified on RU and EU by introducing a failure.
> Waiting for unit test results.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/#review123134
-----------------------------------------------------------


Fix it, then Ship it!




Some minor issues; mostly around how the upgrade summary doesn't reflect a success.


ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java (lines 93 - 94)
<https://reviews.apache.org/r/44492/#comment185303>

    This changes existing behavior; might not be what we want. The HOLDING_FAILED status is not technically a failure. 
    
    Upgrades and other areas may use this enum. If they do, we should not change it. I'd just create a new one to old what you need.
    
    Imagine upgrades starting to throw failures and aborting every time it hit a HOLDING_FAILED state.



ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java (line 25)
<https://reviews.apache.org/r/44492/#comment185304>

    This class is still basically assuming a failure. The upgrade summary could be a success, no?



ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java (lines 157 - 178)
<https://reviews.apache.org/r/44492/#comment185305>

    This class is still basically assuming a failure. The upgrade summary could be a success, no?
    
    If we're saying that the upgrade summary is only going to be for a failure, then we should rename it to reflect that.
    
    Personally, I'm fine leaving it and populating the fields with success information if it finished.


- Jonathan Hurley


On March 10, 2016, 6:12 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44492/
> -----------------------------------------------------------
> 
> (Updated March 10, 2016, 6:12 p.m.)
> 
> 
> Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15330
>     https://issues.apache.org/jira/browse/AMBARI-15330
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
> It can print a human-readable error, plus stdout and stderr.
> This would become part of the upgrade endpoint. e.g,
> api/v1/clusters/$name/upgrade_summary/$request_id
> 
> ```
> {
> cluster_name: "c1",
> request_id: 1,
> fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",
> 
> // Notice that the rest are inherited from the failed task if it exists.
> failed_task: {
>   attempt_cnt: 1,
> command: "CUSTOM_COMMAND",
>   command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
>   custom_command_name: "RESTART",
>   end_time: -1,
>   error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
>   exit_code: 1,
>   host_name: "c6401.ambari.apache.org",
>   id: 1234,
>   output_log: "/var/lib/ambari-agent/data/output-1234.txt",
>   role: "ZOOKEEPER_SERVER",
>   stage_id: 1,
>   start_time: 123456789,
>   status: "HOLDING_FAILED",
>   stdout: "",
>   stderr: ""
>   }
> }
> ```
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
>   ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/44492/diff/
> 
> 
> Testing
> -------
> 
> Verified on RU and EU by introducing a failure.
> Waiting for unit test results.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/
-----------------------------------------------------------

(Updated March 10, 2016, 11:12 p.m.)


Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.


Bugs: AMBARI-15330
    https://issues.apache.org/jira/browse/AMBARI-15330


Repository: ambari


Description (updated)
-------

During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
It can print a human-readable error, plus stdout and stderr.
This would become part of the upgrade endpoint. e.g,
api/v1/clusters/$name/upgrade_summary/$request_id

```
{
cluster_name: "c1",
request_id: 1,
fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",

// Notice that the rest are inherited from the failed task if it exists.
failed_task: {
  attempt_cnt: 1,
command: "CUSTOM_COMMAND",
  command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
  custom_command_name: "RESTART",
  end_time: -1,
  error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
  exit_code: 1,
  host_name: "c6401.ambari.apache.org",
  id: 1234,
  output_log: "/var/lib/ambari-agent/data/output-1234.txt",
  role: "ZOOKEEPER_SERVER",
  stage_id: 1,
  start_time: 123456789,
  status: "HOLDING_FAILED",
  stdout: "",
  stderr: ""
  }
}
```


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
  ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/44492/diff/


Testing
-------

Verified on RU and EU by introducing a failure.
Waiting for unit test results.


Thanks,

Alejandro Fernandez


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/
-----------------------------------------------------------

(Updated March 10, 2016, 10:48 p.m.)


Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.


Changes
-------

Addressed all comments.


Bugs: AMBARI-15330
    https://issues.apache.org/jira/browse/AMBARI-15330


Repository: ambari


Description
-------

During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
It can print a human-readable error, plus stdout and stderr.
This would become part of the upgrade endpoint. e.g,
api/v1/clusters/$name/upgrade_summary/$request_id

```
{

cluster_name: "c1",
request_id: 1,
fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",

// Notice that the rest are inherited from the failed task if it exists.
attempt_cnt: 1,
command: "CUSTOM_COMMAND",
command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
custom_command_name: "RESTART",
end_time: -1,
error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
exit_code: 1,
host_name: "c6401.ambari.apache.org",
id: 1234,
output_log: "/var/lib/ambari-agent/data/output-1234.txt",
role: "ZOOKEEPER_SERVER",
stage_id: 1,
start_time: 123456789,
status: "HOLDING_FAILED",
stdout: "",
stderr: ""
}
```


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
  ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java 3526e23 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 7200b83 
  ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java d1d3fe6 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java e79f300 
  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java b48ffa8 
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java 1674175 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 2ac4d25 
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/44492/diff/


Testing
-------

Verified on RU and EU by introducing a failure.
Waiting for unit test results.


Thanks,

Alejandro Fernandez


Re: Review Request 44492: AMBARI-15330. Bubble up errors during RU/EU

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44492/#review122707
-----------------------------------------------------------


Ship it!




Ship It!

- Dmitro Lisnichenko


On March 9, 2016, 6:46 a.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44492/
> -----------------------------------------------------------
> 
> (Updated March 9, 2016, 6:46 a.m.)
> 
> 
> Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15330
>     https://issues.apache.org/jira/browse/AMBARI-15330
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During RU/EU, need a way to bubble up an error of the current item that failed. This is useful to quickly get a human-readable error that others UIs can quickly retrieve.
> It can print a human-readable error, plus stdout and stderr.
> This would become part of the upgrade endpoint. e.g,
> api/v1/clusters/$name/upgrade_summary/$request_id
> 
> ```
> {
> 
> cluster_name: "c1",
> request_id: 1,
> fail_reason: "Failed calling RESTART ZOOKEEPER/ZOOKEEPER_SERVER on host c6401.ambari.apache.org",
> 
> // Notice that the rest are inherited from the failed task if it exists.
> attempt_cnt: 1,
> command: "CUSTOM_COMMAND",
> command_detail: "RESTART ZOOKEEPER/ZOOKEEPER_SERVER",
> custom_command_name: "RESTART",
> end_time: -1,
> error_log: "/var/lib/ambari-agent/data/errors-1234.txt",
> exit_code: 1,
> host_name: "c6401.ambari.apache.org",
> id: 1234,
> output_log: "/var/lib/ambari-agent/data/output-1234.txt",
> role: "ZOOKEEPER_SERVER",
> stage_id: 1,
> start_time: 123456789,
> status: "HOLDING_FAILED",
> stdout: "",
> stderr: ""
> }
> ```
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java 52523c7 
>   ambari-server/src/main/java/org/apache/ambari/server/api/resources/ResourceInstanceFactoryImpl.java e7bbec4 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/ClusterService.java 4954a96 
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/UpgradeSummaryService.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/DefaultProviderModule.java dde934d 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/TaskResourceProvider.java 510d6fb 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummary.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProvider.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/spi/Resource.java fbbc7c8 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java fee3200 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java a12b204 
>   ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 8213a78 
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/44492/diff/
> 
> 
> Testing
> -------
> 
> Verified on RU and EU by introducing a failure.
> Waiting for unit test results.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>