Posted to mapreduce-user@hadoop.apache.org by Rajan Dev <ku...@gmail.com> on 2010/02/03 13:36:36 UTC

Reading Counters

We have a Hadoop job running and use custom counters to track a few
metrics (such as the number of successfully processed documents matching
certain conditions).


Since we need to read these counters while the Hadoop job is still running,
we wrote a separate Java program to read them.

The *Counter reader* program does the following:


1) List all the running jobs.

2) Get the running job by job name.

3) Get all the counters for each running job.

4) Store these counters in variables.

We can read these counters successfully, but since we need to show them in
a custom UI, how can we make them available to it?

We looked into the following options for getting the counters to the UI:

      1. Dump the counters to a database; however, this may add overhead.
      2. Write a web service whose functions the UI invokes to fetch the
counters. (However, since the "Counter reader" program has to be run with
the hadoop command, writing a web service might not be feasible.)
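
As a rough illustration of option 2, the sketch below serves the most
recently read counters over HTTP using only the JDK's built-in
com.sun.net.httpserver package, so it could live inside the same process as
the counter reader. The class name CounterHttpEndpoint, the /counters path,
port 8080, the counter name DOCS_PROCESSED, and the Supplier that stands in
for the counter reader are all assumptions made for this example; none of
them come from Hadoop.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper: exposes a counter snapshot as plain text over HTTP.
class CounterHttpEndpoint {

    // Renders one "name=value" line per counter, sorted by counter name.
    static String render(Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : new TreeMap<>(counters).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Starts a loopback-only server (port 0 picks any free port).
    // The supplier is asked for a fresh snapshot on every request.
    static HttpServer start(int port, Supplier<Map<String, Long>> source)
            throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 0);
        server.createContext("/counters", exchange -> {
            byte[] body = render(source.get()).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders()
                    .set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the counter reader periodically updates this map.
        Map<String, Long> latest = new ConcurrentHashMap<>();
        latest.put("DOCS_PROCESSED", 0L);
        HttpServer server = start(8080, () -> latest);
        System.out.println("Serving /counters on port "
                + server.getAddress().getPort());
    }
}
```

The UI would then poll /counters instead of talking to Hadoop itself. A
thread-safe map is used because the reader and the HTTP handler run on
different threads.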

So the question is: can we read the counters using simple Java APIs? Does
anyone know how the default JobTracker JSP works? We want to build
something similar to it.

Thanks,
Rajan Dev

Re: Reading Counters

Posted by Ted Xu <te...@gmail.com>.
Rajan,

Clients can read counters through the Java API. You can create an RPC proxy
to retrieve them (by calling RPC.getProxy()); in fact, the hadoop command
works that way. However, counters for jobs that the JobTracker has retired
(those moved to the history page in the web UI) cannot be retrieved this
way. For a retired job, you can read the counters from the job history
file.

The web UI does not get JobTracker information through a client. When the
HTTP server starts, it calls setAttribute() so that the JSP pages have full
access to the *real* JobTracker instance.
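
To make that pattern concrete, here is a minimal sketch in plain Java with
no Hadoop or servlet classes. LiveTracker stands in for the JobTracker,
AppScope for the web container's application scope, and StatusPage for a
JSP; these names, the attribute key "tracker", and the counter name
DOCS_PROCESSED are all hypothetical and chosen only for illustration. The
point is that the page holds a reference to the live object, so it sees
updates without going through any client or RPC call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the live JobTracker object.
class LiveTracker {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    long get(String name) {
        AtomicLong value = counters.get(name);
        return value == null ? 0L : value.get();
    }
}

// Hypothetical stand-in for the web container's application scope.
class AppScope {
    private final Map<String, Object> attributes = new ConcurrentHashMap<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}

// Hypothetical stand-in for a status page.
class StatusPage {
    // Looks up the shared instance and reads its current state directly.
    static String render(AppScope scope, String counterName) {
        LiveTracker tracker = (LiveTracker) scope.getAttribute("tracker");
        return counterName + ": " + tracker.get(counterName);
    }

    public static void main(String[] args) {
        LiveTracker tracker = new LiveTracker();
        AppScope scope = new AppScope();
        scope.setAttribute("tracker", tracker); // done once, at server start

        tracker.increment("DOCS_PROCESSED", 3);
        System.out.println(render(scope, "DOCS_PROCESSED")); // DOCS_PROCESSED: 3

        tracker.increment("DOCS_PROCESSED", 2);
        System.out.println(render(scope, "DOCS_PROCESSED")); // DOCS_PROCESSED: 5
    }
}
```

This only works because the page and the tracker share one JVM. A separate
UI process cannot use this approach and has to go through the client API or
some exported copy of the counters.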

Best regards,

Ted Xu