Posted to dev@ambari.apache.org by "Jaimin D Jetly (JIRA)" <ji...@apache.org> on 2015/12/12 01:08:46 UTC

[jira] [Commented] (AMBARI-12995) Ambari alerts reports "UNKNOWN" error for secondary YARN RM and NM in a kerberoized YARN HA deployment

    [ https://issues.apache.org/jira/browse/AMBARI-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053831#comment-15053831 ] 

Jaimin D Jetly commented on AMBARI-12995:
-----------------------------------------

The issue seems to have been largely fixed in YARN, and the fix should be available in a stack supported by a future release of Ambari.
So I am moving this out of 2.2.0 for now.

cc [~jonathan.hurley] [~arobertson]

> Ambari alerts reports "UNKNOWN" error for secondary YARN RM and NM in a kerberoized YARN HA deployment
> ------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-12995
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12995
>             Project: Ambari
>          Issue Type: Bug
>          Components: alerts
>    Affects Versions: 2.1.1
>         Environment: Requires YARN HA with Kerberos
>            Reporter: Andrew Robertson
>             Fix For: 2.3.0
>
>
> What is observed:
> On my currently active YARN NodeManager and ResourceManager, Ambari
> alerts are fine.
> On the secondary YARN NodeManager and ResourceManager, Ambari reports
> "Status: Unknown" / "HTTP 200 response (metrics unavailable)".  This
> is for the alerts:
>  - NodeManager Health Summary
>  - ResourceManager CPU Utilization
>  - ResourceManager RPC Latency
> The Ambari web interface does not make this error obvious, as it says
> "0 alerts" in the top bar. But you can see the alerts with "unknown"
> status when you go to the ambari alerts page, or if you query the
> alerts API.
> What is expected:
> Ambari alerts do not generate any alarms on a secondary YARN HA node as long as the node is responsive.
> ---
> A network dump of the ambari poll against the secondary RM looks like:
> Request:
> """
> GET /jmx?qry=Hadoop:service=ResourceManager,name=RMNMInfo HTTP/1.1
> ...
> """
> Response:
> """
> HTTP/1.1 200 OK
> ...
> Refresh: 3; url=http://{my-primary-rm}:8088/jmx
> Content-Length: 106
> Server: Jetty(6.1.26.hwx)
> This is standby RM. Redirecting to the current active RM:
> http://{my-primary-rm}:8088/jmx
> """
> --
> I'm also filing a JIRA against YARN (per request from jhurley) and will post that info here.
> --
> Comment from Jonathan Hurley jhurley@hortonworks.com:
> This is caused by how YARN does HA mode. With two YARN RMs, the standby RM returns a 200 response with a JavaScript redirect instead of a 3xx redirection. When not using Kerberos, Ambari should be able to parse the headers and follow the JS-based redirect. However, on a Kerberized cluster, we use curl, which cannot do this. Therefore, requests against the secondary RM produce an UNKNOWN result: the request did get a 200, but no metrics were available in the response. I think a few things can be improved here:
> 1) There should be a ticket filed for YARN to have their HA mode use a proper redirect
> 2) Ambari might not want to produce an UNKNOWN response here since it gives a false feeling that something went wrong.
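
For illustration only, here is a minimal Python sketch of how a client might recognize the standby RM reply shown in the network dump above (a 200 status, a Refresh header, and the "This is standby RM" body text) and extract the active RM URL. This is not Ambari's actual alert code: the function names parse_refresh_header and standby_redirect_target are hypothetical, and the sketch assumes the header and body look exactly like the dump quoted in this issue.

```python
from typing import Mapping, Optional, Tuple

# Text seen in the standby RM response body quoted in this issue.
STANDBY_MARKER = "This is standby RM"


def parse_refresh_header(value: str) -> Optional[Tuple[int, str]]:
    """Split a value such as '3; url=http://host:8088/jmx' into (delay, url)."""
    delay_part, sep, rest = value.partition(";")
    if not sep:
        return None
    rest = rest.strip()
    if not rest.lower().startswith("url="):
        return None
    try:
        delay = int(delay_part.strip())
    except ValueError:
        return None
    return delay, rest[len("url="):].strip()


def standby_redirect_target(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[str]:
    """Return the active RM URL if the reply looks like a standby RM, else None."""
    if status != 200:
        return None
    # HTTP header names are case-insensitive, so compare in lower case.
    refresh = next((v for k, v in headers.items() if k.lower() == "refresh"), None)
    if refresh is None or STANDBY_MARKER not in body:
        return None
    parsed = parse_refresh_header(refresh)
    return parsed[1] if parsed else None
```

A caller could use a non-None result either to re-issue the JMX query against the returned URL or to treat the node as a healthy standby rather than reporting UNKNOWN; which of those is appropriate is a design decision this sketch does not make.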



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)