Posted to dev@ambari.apache.org by "Tom Beerbower (JIRA)" <ji...@apache.org> on 2012/11/30 18:09:59 UTC
[jira] [Created] (AMBARI-1044) API is not returning Ganglia metrics for one of the hosts in the cluster
Tom Beerbower created AMBARI-1044:
-------------------------------------
Summary: API is not returning Ganglia metrics for one of the hosts in the cluster
Key: AMBARI-1044
URL: https://issues.apache.org/jira/browse/AMBARI-1044
Project: Ambari
Issue Type: Sub-task
Reporter: Tom Beerbower
Assignee: Tom Beerbower
A cluster was deployed with 4 hosts, with Ambari Server running on a different host.
Host graphs are showing for 3 of the hosts.
For one of the hosts, the API is not returning any temporal data.
Ganglia itself is showing host-level metrics for that host.
UI: http://ec2-54-242-174-25.compute-1.amazonaws.com:8080/#/main/hosts/ip-10-224-42-108.ec2.internal/summary
Ganglia UI: http://ec2-174-129-70-110.compute-1.amazonaws.com/ganglia/mobile_helper.php?show_host_metrics=1&h=ip-10-224-42-108.ec2.internal&c=HDPNameNode&r=hour&cs=&ce=
API response:
{
"href" : "http://ec2-54-242-174-25.compute-1.amazonaws.com:8080/api/v1/clusters/C2/hosts/ip-10-224-42-108.ec2.internal?fields=metrics/cpu/cpu_user1354227417,1354231017,15,metrics/cpu/cpu_wio1354227417,1354231017,15,metrics/cpu/cpu_nice1354227417,1354231017,15,metrics/cpu/cpu_aidle1354227417,1354231017,15,metrics/cpu/cpu_system1354227417,1354231017,15,metrics/cpu/cpu_idle1354227417,1354231017,15",
"Hosts" :
{ "cluster_name" : "C2", "host_name" : "ip-10-224-42-108.ec2.internal" }
}
We need to understand the root cause.
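A minimal sketch of how the symptom above can be checked, assuming the response body is parsed as JSON and that a host with temporal data would carry a top-level "metrics" entry next to "Hosts". The key name is an assumption inferred from the requested fields, the helper name is hypothetical, and the href is shortened to a placeholder:

```python
import json


def has_metrics(response_text):
    """Return True if the parsed host response carries a non-empty 'metrics' entry."""
    body = json.loads(response_text)
    return bool(body.get("metrics"))


# Same shape as the response reported above: only "href" and "Hosts".
# The href is a shortened placeholder, not the real request URL.
reported = """
{
  "href" : "http://example.invalid/api/v1/clusters/C2/hosts/ip-10-224-42-108.ec2.internal",
  "Hosts" : { "cluster_name" : "C2", "host_name" : "ip-10-224-42-108.ec2.internal" }
}
"""

print(has_metrics(reported))  # False
```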
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AMBARI-1044) API is not returning Ganglia metrics for one of the hosts in the cluster
Posted by "Tom Beerbower (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AMBARI-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom Beerbower updated AMBARI-1044:
----------------------------------
Attachment: AMBARI-1044.patch
[jira] [Commented] (AMBARI-1044) API is not returning Ganglia metrics for one of the hosts in the cluster
Posted by "Tom Beerbower (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AMBARI-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507459#comment-13507459 ]
Tom Beerbower commented on AMBARI-1044:
---------------------------------------
I don't see any related exceptions in the server log, which means that either it's not attempting to get the metrics for this host or they are just not being set on the host resource.
I think that I see what is happening. One of the arguments that can be specified for the rrd query is the Ganglia cluster (HDPHBaseMaster, HDPJobTracker, HDPNameNode or HDPSlaves). The question is: for a host level query, which Ganglia cluster should we specify?
It's hard to say, since a host isn't necessarily associated with any of the services related to those clusters... or it may be associated with more than one. It turns out it doesn't really matter. In this case I can see the system level rrd files that we use for host level metrics for ip-10-224-42-108.ec2.internal under any of the Ganglia cluster folders. For example ...
{code}
[root@ip-10-40-91-121 rrds]# ls ./HDPHBaseMaster/ip-10-224-42-108.ec2.internal
boottime.rrd bytes_out.rrd cpu_idle.rrd cpu_num.rrd cpu_system.rrd cpu_wio.rrd disk_total.rrd load_five.rrd mem_buffers.rrd mem_free.rrd mem_total.rrd pkts_in.rrd proc_run.rrd swap_free.rrd
bytes_in.rrd cpu_aidle.rrd cpu_nice.rrd cpu_speed.rrd cpu_user.rrd disk_free.rrd load_fifteen.rrd load_one.rrd mem_cached.rrd mem_shared.rrd part_max_used.rrd pkts_out.rrd proc_total.rrd swap_total.rrd
...
[root@ip-10-40-91-121 rrds]# ls HDPNameNode/ip-10-224-42-108.ec2.internal
boottime.rrd bytes_out.rrd cpu_idle.rrd cpu_num.rrd cpu_system.rrd cpu_wio.rrd disk_total.rrd load_five.rrd mem_buffers.rrd mem_free.rrd mem_total.rrd pkts_in.rrd proc_run.rrd swap_free.rrd
bytes_in.rrd cpu_aidle.rrd cpu_nice.rrd cpu_speed.rrd cpu_user.rrd disk_free.rrd load_fifteen.rrd load_one.rrd mem_cached.rrd mem_shared.rrd part_max_used.rrd pkts_out.rrd proc_total.rrd swap_total.rrd
{code}
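The manual check above can be sketched as a small script. This is only an illustration: the helper name and the rrd root argument are hypothetical, and it assumes the directory layout shown in the listing (one folder per Ganglia cluster, one subfolder per host, one .rrd file per metric).

```python
from pathlib import Path

# Ganglia cluster names as listed in the comment above.
GANGLIA_CLUSTERS = ["HDPHBaseMaster", "HDPJobTracker", "HDPNameNode", "HDPSlaves"]


def clusters_with_rrds(rrd_root, host, metric_names):
    """List the Ganglia clusters whose folder for `host` holds every named .rrd file."""
    found = []
    for cluster in GANGLIA_CLUSTERS:
        host_dir = Path(rrd_root) / cluster / host
        # A cluster qualifies only if all requested metric files are present.
        if all((host_dir / f"{name}.rrd").is_file() for name in metric_names):
            found.append(cluster)
    return found


# Example call, run from the rrds directory (hypothetical usage):
# clusters_with_rrds(".", "ip-10-224-42-108.ec2.internal", ["cpu_user", "cpu_idle"])
```

If the system level files really are present under every cluster folder, the call returns all four cluster names for the host in question.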
The approach that I've been using is to look through the host components for the host that we are interested in and try to map one of its component names back to a Ganglia cluster. In this case, it looks like the host with the missing metrics is not associated with any component that maps back under the mapping method that I am using.
Given what I am currently seeing with the system level metrics, I think that it would be safe to simply use HDPSlaves as the Ganglia cluster for host level queries.
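To make the difference between the two approaches concrete, here is a rough sketch. It is not the code from the attached patch: the component names in the mapping and both function names are illustrative assumptions; only the Ganglia cluster names come from this thread.

```python
# Hypothetical component-name -> Ganglia-cluster mapping. The component names
# are illustrative assumptions, not taken from the patch.
COMPONENT_TO_GANGLIA_CLUSTER = {
    "NAMENODE": "HDPNameNode",
    "JOBTRACKER": "HDPJobTracker",
    "HBASE_MASTER": "HDPHBaseMaster",
    "DATANODE": "HDPSlaves",
    "TASKTRACKER": "HDPSlaves",
}

# Proposed fixed choice for host level queries.
HOST_LEVEL_GANGLIA_CLUSTER = "HDPSlaves"


def cluster_by_components(component_names):
    """Earlier approach: the first component that maps wins; None if nothing maps."""
    for name in component_names:
        cluster = COMPONENT_TO_GANGLIA_CLUSTER.get(name)
        if cluster is not None:
            return cluster
    return None


def cluster_for_host_query():
    """Proposed approach: host level queries always use the same Ganglia cluster."""
    return HOST_LEVEL_GANGLIA_CLUSTER


# A host whose components do not appear in the mapping gets no cluster under
# the earlier approach, which matches the missing-metrics symptom.
print(cluster_by_components(["SOME_UNMAPPED_COMPONENT"]))  # None
print(cluster_for_host_query())  # HDPSlaves
```

The fixed choice relies on the observation above that the system level rrd files appear under every Ganglia cluster folder, so the component lookup adds nothing for host level metrics.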
[jira] [Updated] (AMBARI-1044) API is not returning Ganglia metrics for one of the hosts in the cluster
Posted by "Tom Beerbower (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AMBARI-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom Beerbower updated AMBARI-1044:
----------------------------------
Attachment: AMBARI-1044-2.patch