Posted to common-user@hadoop.apache.org by A Df <ab...@yahoo.com> on 2011/08/13 13:57:57 UTC

What is the best way to perform job performance analysis on Hadoop?

Dear All:

The web interfaces for the namenode and jobtracker display various information about jobs in both pseudo-distributed mode and cluster mode (which I have yet to set up; please see my other post). However, which variables would you suggest using to compare the performance and scalability of pseudo-distributed vs. cluster mode? I am sure that time, total map and reduce jobs, DFS % used, and heap size are relevant. Are there any other variables that can help determine how well each mode ran its jobs? I want to create graphs to compare both modes for my project. Thank you.

Cheers,
A Df