Posted to user@pig.apache.org by Kris Coward <kr...@melon.org> on 2011/05/12 03:06:59 UTC

Resource use accounting for pig(/hadoop).

Is there any tool (or even just a good starting point in the logs) for
measuring the amount of cluster time used by a pig job? That is,
something that can be checked after a job has run on a cluster that was
already running other jobs, and that gives an indication of how much
time the job would've taken if the cluster had otherwise been idle.

Thanks,
Kris

-- 
Kris Coward					http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3

Re: Resource use accounting for pig(/hadoop).

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Pig reports mean/min/max time per mapper and reducer for each job, so
you can multiply that out, I suppose. It gets a bit tricky with
parallel jobs.
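The multiply-it-out estimate might look like the sketch below: sum task-seconds from Pig's per-job stats, then divide by the slots the job could have used on an idle cluster. All job numbers and slot counts here are hypothetical placeholders, not real Pig output, and the model assumes sequential jobs with evenly sized tasks, which is rough.

```python
# Back-of-envelope "idle-cluster" runtime estimate from Pig's per-job
# task statistics. All figures below are made-up examples.

jobs = [
    # (num_maps, avg_map_secs, num_reduces, avg_reduce_secs)
    (200, 45.0, 20, 120.0),
    (50, 30.0, 10, 60.0),
]

# Assumed cluster capacity (hypothetical): concurrent task slots.
map_slots = 100
reduce_slots = 40

total_wall = 0.0
for maps, map_avg, reduces, red_avg in jobs:
    # Total task-seconds for each phase, spread over available slots.
    map_wall = (maps * map_avg) / min(maps, map_slots)
    reduce_wall = (reduces * red_avg) / min(reduces, reduce_slots)
    total_wall += map_wall + reduce_wall

print(f"Estimated idle-cluster runtime: {total_wall:.0f} s")
```

Skew in task durations (which is why Pig also reports min/max) and overlapping jobs in a Pig script would both throw this off, so treat it as a lower-bound-ish guess rather than a measurement.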
