Posted to dev@ambari.apache.org by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org> on 2012/07/04 23:51:25 UTC

[jira] [Updated] (AMBARI-517) Dashboard shows HDFS is down though it's still running

     [ https://issues.apache.org/jira/browse/AMBARI-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated AMBARI-517:
-------------------------------------------

    Fix Version/s:     (was: ambari-186)
                   0.9.0
    
> Dashboard shows HDFS is down though it's still running
> ------------------------------------------------------
>
>                 Key: AMBARI-517
>                 URL: https://issues.apache.org/jira/browse/AMBARI-517
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: ambari-186
>            Reporter: vitthal (Suhas) Gogate
>            Assignee: vitthal (Suhas) Gogate
>             Fix For: 0.9.0
>
>         Attachments: AMBARI-517.patch
>
>
> This defect occasionally occurs when the JMX-over-HTTP call to the respective service master times out because of load on the service master or a network problem. For example, the httpd/error_log file contains:
> error_log:Mon Jun 04 12:56:07 2012 error client 24.7.53.89 hdp_mon_jmx_helpers.inc:522hdp_mon_jmx_get_jmx_data: Error when accessing jmx info: , url=http://ec2-72-44-58-186.compute-1.amazonaws.com:60010/jmx, errno=28, error=Operation timed out after 1 seconds with 0 bytes received, referer: http://ec2-72-44-58-186.compute-1.amazonaws.com/hdp/dashboard/ui/home.html
> The solution is to increase the timeout (currently 1 second) to 2 or 3 seconds. The drawback of a longer timeout is that when one of the services (HDFS, MR, HBase) is down, the backend call for that service will always time out, so the page will take that much longer to load. Since it is uncommon for an installed service to be down for a long time, I recommend increasing the timeout to 3 seconds to reduce the chance of this problem occurring on slower nodes and networks.
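
The trade-off described above can be sketched as follows. This is a minimal illustration in Python, not the code in hdp_mon_jmx_helpers.inc or the attached patch. The names `fetch_jmx`, `worst_case_delay`, and `JMX_TIMEOUT_SECS` are hypothetical, and the delay estimate assumes the dashboard backend polls the service masters one after another, which the report does not state.

```python
import json
import urllib.request

# Assumed setting: the 3-second timeout recommended in the report (was 1 second).
JMX_TIMEOUT_SECS = 3.0


def fetch_jmx(url, timeout=JMX_TIMEOUT_SECS, opener=urllib.request.urlopen):
    """Fetch JMX data over HTTP; return the parsed JSON, or None on failure.

    A timeout, a refused connection, and an unparsable body all yield None,
    so a caller cannot distinguish a slow master from one that is down.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers URLError and socket timeouts;
        # ValueError covers malformed JSON and bad encodings.
        return None


def worst_case_delay(num_down_services, timeout=JMX_TIMEOUT_SECS):
    """Extra page-load time when down services each wait out the full timeout.

    Assumes the calls are made sequentially (an assumption for illustration).
    """
    return num_down_services * timeout
```

Under these assumptions, one unreachable service adds about 3 seconds to the page load with the new timeout, compared with about 1 second before, which is the cost the report accepts in exchange for fewer false "down" readings on slow nodes.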

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira