Posted to dev@ambari.apache.org by Jonathan Hurley <jh...@hortonworks.com> on 2015/03/03 19:05:30 UTC

Review Request 31686: Alerts: YARN YM HA Alerts Are UNKNOWN Due to HA Redirects

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31686/
-----------------------------------------------------------

Review request for Ambari, Nate Cole and Tom Beerbower.


Bugs: AMBARI-9894
    https://issues.apache.org/jira/browse/AMBARI-9894


Repository: ambari


Description
-------

After configuring a 3-node cluster with YARN and adding ResourceManager HA, some RM alerts are UNKNOWN.

This fails because YARN forwards requests for the standby RM to the active one. In this scenario, the alert gets back an HTTP 200 response that looks like:

```
This is standby RM. Redirecting to the current active RM: http://c6403.ambari.apache.org:8088/
```

Unfortunately, this is a Refresh-header redirect, which the metric alert cannot handle. The alerts work after the VMs are restarted because the original RM becomes active again.
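
To illustrate why a status-code check alone misses this case, here is a minimal Python sketch. It is not taken from the patch: the function name `find_refresh_target`, the plain-dict header input, and the assumed header shape (`<seconds>; url=<target>`) are all illustrative assumptions.

```python
import re

# Assumed shape of the header value: "<seconds>; url=<target>".
_REFRESH_VALUE = re.compile(r"^\s*\d+\s*;\s*url\s*=\s*(\S+)\s*$", re.IGNORECASE)


def find_refresh_target(status, headers):
    """Return the URL a 200 response asks the client to refresh to, or None.

    `headers` is a plain dict of header names to values (a hypothetical
    input shape chosen for this sketch).
    """
    if status != 200:
        # Real 3xx redirects are already handled by the HTTP client.
        return None
    for name, value in headers.items():
        if name.lower() == "refresh":
            match = _REFRESH_VALUE.match(value)
            return match.group(1) if match else None
    return None


# A standby RM answers 200, so a client that only inspects the status code
# treats the "This is standby RM" page as the real payload.
standby_headers = {"Refresh": "3; url=http://c6403.ambari.apache.org:8088/"}
print(find_refresh_target(200, standby_headers))
```

A caller that gets a non-None result knows the body is only a redirect notice, not the metrics it asked for.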

There are a few issues here:

- YARN doesn't do HA in the same way that other services like HDFS do. As a result, there's no config property that could let the alert know what to do or which hosts to contact.
- YARN actually forwards to the active node after an HTTP 200, which doesn't jibe with how alerts work.

The fix adds the new `high-availability` structure to the YARN URIs and adds a new handler to urllib2 in order to follow a header-based redirect.
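
As a rough sketch of the handler idea, the snippet below shows one way to follow a header-based redirect. It is not the code from this patch: it uses Python 3's `urllib.request` rather than `urllib2`, the names `refresh_target` and `RefreshHeaderHandler` are hypothetical, and it assumes the header value looks like `<seconds>; url=<target>`.

```python
import urllib.request


def refresh_target(headers):
    """Extract the target URL from a Refresh header, or return None."""
    value = headers.get("Refresh")
    if not value:
        return None
    # Assumed value shape: "<seconds>; url=<target>".
    _, _, rest = value.partition(";")
    key, _, target = rest.strip().partition("=")
    target = target.strip()
    return target if key.strip().lower() == "url" and target else None


class RefreshHeaderHandler(urllib.request.BaseHandler):
    """Treat an HTTP 200 that carries a Refresh header as a temporary redirect."""

    def http_response(self, request, response):
        target = refresh_target(response.headers)
        if response.status == 200 and target:
            # Present the response to urllib's stock redirect handling as a
            # 307 with a Location header, so it re-issues the request.
            response.headers["Location"] = target
            return self.parent.error(
                "http", request, response, 307, response.reason, response.headers
            )
        return response

    https_response = http_response


# Building the opener does not touch the network.
opener = urllib.request.build_opener(RefreshHeaderHandler())
# opener.open("http://standby-rm.example.com:8088/")  # hypothetical host
```

Reusing the 307 path means the existing redirect limits and loop detection in `HTTPRedirectHandler` still apply, which seemed preferable to re-issuing the request by hand.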


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py 3ae3c6d 
  ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py d378254 
  ambari-agent/src/test/python/ambari_agent/TestAlerts.py e9e106d 
  ambari-common/src/main/python/ambari_commons/urllib_handlers.py PRE-CREATION 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json efef2d0 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py 6895889 
  ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/YARN/alerts.json 935fd1c 

Diff: https://reviews.apache.org/r/31686/diff/


Testing
-------

Verified that the YARN HA alerts work properly.

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12.530 s
[INFO] Finished at: 2015-03-03T10:43:54-05:00
[INFO] Final Memory: 8M/81M


Thanks,

Jonathan Hurley


Re: Review Request 31686: Alerts: YARN YM HA Alerts Are UNKNOWN Due to HA Redirects

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31686/#review75177
-----------------------------------------------------------

Ship it!

- Nate Cole




Re: Review Request 31686: Alerts: YARN YM HA Alerts Are UNKNOWN Due to HA Redirects

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31686/
-----------------------------------------------------------

(Updated March 4, 2015, 8:26 a.m.)


Review request for Ambari, Nate Cole and Tom Beerbower.


Bugs: AMBARI-9894
    https://issues.apache.org/jira/browse/AMBARI-9894


Repository: ambari


Description
-------

After configuring a 3-node cluster with YARN and adding ResourceManager HA, some RM alerts are UNKNOWN.

This fails because YARN forwards requests for the standby RM to the active one. In this scenario, the alert gets back an HTTP 200 response that looks like:

```
This is standby RM. Redirecting to the current active RM: http://c6403.ambari.apache.org:8088/
```

Unfortunately, this is a Refresh-header redirect, which the metric alert cannot handle. The alerts work after the VMs are restarted because the original RM becomes active again.

There are a few issues here:

- YARN doesn't do HA in the same way that other services like HDFS do. As a result, there's no config property that could let the alert know what to do or which hosts to contact.
- YARN actually forwards to the active node after an HTTP 200, which doesn't jibe with how alerts work.

The fix adds the new `high-availability` structure to the YARN URIs and adds a new handler to urllib2 in order to follow a header-based redirect.


Diffs (updated)
-----

  ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py 3ae3c6d 
  ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py d378254 
  ambari-agent/src/test/python/ambari_agent/TestAlerts.py e9e106d 
  ambari-common/src/main/python/ambari_commons/urllib_handlers.py PRE-CREATION 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json efef2d0 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py 6895889 
  ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/YARN/alerts.json 935fd1c 

Diff: https://reviews.apache.org/r/31686/diff/


Testing
-------

Verified that the YARN HA alerts work properly.

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12.530 s
[INFO] Finished at: 2015-03-03T10:43:54-05:00
[INFO] Final Memory: 8M/81M


Thanks,

Jonathan Hurley


Re: Review Request 31686: Alerts: YARN YM HA Alerts Are UNKNOWN Due to HA Redirects

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31686/#review75173
-----------------------------------------------------------



ambari-common/src/main/python/ambari_commons/urllib_handlers.py
<https://reviews.apache.org/r/31686/#comment122147>

    common code isn't just for [Alert] - it could be for anything



ambari-common/src/main/python/ambari_commons/urllib_handlers.py
<https://reviews.apache.org/r/31686/#comment122148>

    Not just alerts



ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/31686/#comment122149>

    Spacing


- Nate Cole




Re: Review Request 31686: Alerts: YARN YM HA Alerts Are UNKNOWN Due to HA Redirects

Posted by Tom Beerbower <tb...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31686/#review75166
-----------------------------------------------------------

Ship it!

- Tom Beerbower

