Posted to common-dev@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2009/03/13 00:40:50 UTC

[jira] Commented: (HADOOP-5469) Exposing Hadoop metrics via HTTP

    [ https://issues.apache.org/jira/browse/HADOOP-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681549#action_12681549 ] 

Hadoop QA commented on HADOOP-5469:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12402000/HADOOP-5469.patch
  against trunk revision 752984.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs warning.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 release audit.  The applied patch generated 647 release audit warnings (more than the trunk's current 645 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/78/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/78/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/78/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/78/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/78/console

This message is automatically generated.

> Exposing Hadoop metrics via HTTP
> --------------------------------
>
>                 Key: HADOOP-5469
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5469
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: metrics
>            Reporter: Philip Zeyliger
>         Attachments: HADOOP-5469.patch
>
>
> I'd like to be able to query Hadoop's metrics via HTTP, e.g., by going to "/metrics" on any Hadoop daemon that has an HttpServer.  My motivation is pretty simple--if you're running on a lot of machines, tracking down the relevant metrics files is pretty time-consuming; this would be a useful debugging utility.  I'd also like the output to be parseable, so I could write a quick web app to query the metrics dynamically.
> This is similar in spirit to using JMX, but not the same.  (See also HADOOP-4756.)  JMX requires a client and, more annoyingly, requires setting up authentication.  If you just disable authentication, someone can do Bad Things, and if you enable it, you have to worry about yet another password.  The HTTP approach is also more complete--JMX requires separate instrumentation, so, for example, the JobTracker's metrics aren't exposed via JMX.
> To get the discussion going, I've attached a patch.  I had to add a method to ContextFactory to get all the active MetricsContexts, implement a do-little MetricsContext that simply inherits from AbstractMetricsContext, add a method to MetricsContext to get all the records, expose copy methods for the maps in OutputRecord, and implement an easy servlet.  I ended up removing some common code from all MetricsContexts, for setting the period; I'm open to taking that out if it muddies the patch significantly.
> I'd love to hear your suggestions.  There's a bug in the JSON representation, and there's some gross type-handling.
> The patch is missing tests.  I wanted to post to gather feedback before I got too far, but tests are forthcoming.
> Here's a sample output for a job tracker, while it was running a "pi" job:
> {noformat}
> jvm
>   metrics
>     {hostName=doorstop.local, processName=JobTracker, sessionId=}
>       gcCount=22
>       gcTimeMillis=68
>       logError=0
>       logFatal=0
>       logInfo=52
>       logWarn=0
>       memHeapCommittedM=7.4375
>       memHeapUsedM=4.2150116
>       memNonHeapCommittedM=23.1875
>       memNonHeapUsedM=18.438614
>       threadsBlocked=0
>       threadsNew=0
>       threadsRunnable=7
>       threadsTerminated=0
>       threadsTimedWaiting=8
>       threadsWaiting=15
> mapred
>   job
>     {counter=Map input records, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=2.0
>     {counter=Map output records, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=4.0
>     {counter=Data-local map tasks, group=Job Counters , hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=4.0
>     {counter=Map input bytes, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=48.0
>     {counter=FILE_BYTES_WRITTEN, group=FileSystemCounters, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=148.0
>     {counter=Combine output records, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=0.0
>     {counter=Launched map tasks, group=Job Counters , hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=4.0
>     {counter=HDFS_BYTES_READ, group=FileSystemCounters, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=236.0
>     {counter=Map output bytes, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=64.0
>     {counter=Launched reduce tasks, group=Job Counters , hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=1.0
>     {counter=Spilled Records, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=4.0
>     {counter=Combine input records, group=Map-Reduce Framework, hostName=doorstop.local, jobId=job_200903101702_0001, jobName=test-mini-mr, sessionId=, user=philip}
>       value=0.0
>   jobtracker
>     {hostName=doorstop.local, sessionId=}
>       jobs_completed=0
>       jobs_submitted=1
>       maps_completed=2
>       maps_launched=5
>       reduces_completed=0
>       reduces_launched=1
> rpc
>   metrics
>     {hostName=doorstop.local, port=50030}
>       NumOpenConnections=2
>       RpcProcessingTime_avg_time=0
>       RpcProcessingTime_num_ops=84
>       RpcQueueTime_avg_time=1
>       RpcQueueTime_num_ops=84
>       callQueueLen=0
>       getBuildVersion_avg_time=0
>       getBuildVersion_num_ops=1
>       getJobProfile_avg_time=0
>       getJobProfile_num_ops=17
>       getJobStatus_avg_time=0
>       getJobStatus_num_ops=32
>       getNewJobId_avg_time=0
>       getNewJobId_num_ops=1
>       getProtocolVersion_avg_time=0
>       getProtocolVersion_num_ops=2
>       getSystemDir_avg_time=0
>       getSystemDir_num_ops=2
>       getTaskCompletionEvents_avg_time=0
>       getTaskCompletionEvents_num_ops=19
>       heartbeat_avg_time=5
>       heartbeat_num_ops=9
>       submitJob_avg_time=0
>       submitJob_num_ops=1
> {noformat}
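
Since the description asks for parseable output, here is a minimal sketch of how a client might read the indented text layout shown in the sample above. It is not part of the attached patch. The class name MetricsTextParser and its Metric type are hypothetical, and the sketch assumes the layout stays exactly as in the sample (context at 0 spaces, record name at 2, tag line at 4, name=value pairs at 6), which may well change as the patch evolves. Tag lines are kept as raw strings because the sample does not make clear how tag values containing commas would be escaped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the HADOOP-5469 patch. */
class MetricsTextParser {

    /** One metric value plus the context, record name, and tag line it appeared under. */
    static final class Metric {
        final String context;
        final String record;
        final String tags;   // raw "{key=value, ...}" line, deliberately left unparsed
        final String name;
        final String value;

        Metric(String context, String record, String tags, String name, String value) {
            this.context = context;
            this.record = record;
            this.tags = tags;
            this.name = name;
            this.value = value;
        }
    }

    /** Parses text that uses the sample's 0/2/4/6-space indentation levels. */
    static List<Metric> parse(String text) {
        List<Metric> out = new ArrayList<>();
        String context = null;
        String record = null;
        String tags = null;
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Count leading spaces to decide which level this line belongs to.
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            String body = line.trim();
            switch (indent) {
                case 0: // context name, e.g. "jvm"
                    context = body;
                    record = null;
                    tags = null;
                    break;
                case 2: // record name, e.g. "metrics"
                    record = body;
                    tags = null;
                    break;
                case 4: // tag line, e.g. "{hostName=..., port=50030}"
                    tags = body;
                    break;
                case 6: { // metric line, split on the first '='
                    int eq = body.indexOf('=');
                    if (eq < 0 || tags == null) {
                        throw new IllegalArgumentException("Malformed metric line: " + line);
                    }
                    out.add(new Metric(context, record, tags,
                            body.substring(0, eq), body.substring(eq + 1)));
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unexpected indentation: " + line);
            }
        }
        return out;
    }
}
```

Calling MetricsTextParser.parse on the servlet's response body (without the "> " quoting added by this email) would give one Metric per name=value line. Values are kept as strings, so the caller decides how to handle integers versus floating-point values.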

-- 
This message is automatically generated by JIRA.