Posted to common-user@hadoop.apache.org by Bryan Duxbury <br...@rapleaf.com> on 2009/02/12 22:10:00 UTC
Measuring IO time in map/reduce jobs?
Hey all,
Does anyone have any experience trying to measure IO time spent in
their map/reduce jobs? I know how to profile a sample of map and
reduce tasks, but that appears to exclude IO time. Just subtracting
the total CPU time from the total run time of a task seems like too
coarse an approach.
-Bryan
Re: Measuring IO time in map/reduce jobs?
Posted by jdd dhok <jd...@gmail.com>.
Hi,
The Linux kernel provides delay accounting information to user space
through a netlink socket. You can read more about it here:
http://www.mjmwired.net/kernel/Documentation/accounting/taskstats.txt
I think there's a Python tool called iotop that uses this feature.
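As a rough illustration, here is a minimal sketch that avoids the netlink
interface altogether and instead reads the block I/O delay counter that
delay accounting also exposes in /proc (field 42 of the stat file,
delayacct_blkio_ticks, per proc(5)). This is a swapped-in, simpler route,
not the taskstats API described above. The helper names are hypothetical,
and it assumes a kernel with delay accounting enabled; otherwise the
counter will most likely just read 0. Since the counter appears to be
per-thread, the sketch sums over /proc/<pid>/task/*, which is probably
what you want for a multithreaded JVM task.

```python
import os

def parse_stat_line(line):
    """Return (utime, stime, blkio_delay) in clock ticks from one stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so split after the last ')' instead of on plain whitespace.
    rest = line[line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N of proc(5) lives at rest[N - 3].
    utime = int(rest[14 - 3])   # user-mode CPU time
    stime = int(rest[15 - 3])   # kernel-mode CPU time
    blkio = int(rest[42 - 3])   # aggregated block I/O delay
    return utime, stime, blkio

def io_delay_summary(pid):
    """Sum CPU time and block I/O delay over all threads of a process."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    cpu_ticks = 0
    blkio_ticks = 0
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                utime, stime, blkio = parse_stat_line(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were iterating
        cpu_ticks += utime + stime
        blkio_ticks += blkio
    return {"cpu_s": cpu_ticks / hz, "blkio_delay_s": blkio_ticks / hz}
```

Calling io_delay_summary(pid) with the pid of a running task JVM, once
near the start and once near the end, and taking the difference should
give a per-task estimate. Note that this only counts time spent waiting
on block devices; time spent waiting on sockets (for example, remote
reads) is not included.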
Hope this helps.
Regards,
Jaideep
On Fri, Feb 13, 2009 at 2:40 AM, Bryan Duxbury <br...@rapleaf.com> wrote:
> Hey all,
>
> Does anyone have any experience trying to measure IO time spent in their
> map/reduce jobs? I know how to profile a sample of map and reduce tasks, but
> that appears to exclude IO time. Just subtracting the total cpu time from
> the total run time of a task seems like too coarse an approach.
>
> -Bryan
>
--
- JDD