Posted to mapreduce-user@hadoop.apache.org by WANG Shicai <Ev...@yahoo.cn> on 2010/06/02 09:54:10 UTC

Need Suggestion: Tuning MR performance by changing parameters in Hadoop project and JVM

Hi,

This message is a little long, so please bear with me.

Our team would like to tune MR performance by adjusting Hadoop and JVM parameters according to the status and results of MR jobs.

The plan is as follows. First, classify MR jobs into several kinds. Then monitor CPU, memory, etc. during an MR job, structure the monitoring data, and store it in HBase. The crucial step is to build one or more models to analyze the data. Finally, derive tuning proposals for MR jobs, such as increasing or reducing the memory allocated to a job.
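
To make the plan above more concrete, here is a minimal Python sketch of the monitor, store, and analyze steps. Everything in it is an assumption made for illustration only: the TaskSample fields, the job class labels, the row key layout, and the 0.85 / 0.40 thresholds are hypothetical, and the "model" is just a simple rule on average heap usage rather than anything we have actually built.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskSample:
    """One monitoring sample for a task (hypothetical schema)."""
    job_id: str
    job_class: str       # assumed labels, e.g. "cpu_bound" or "io_bound"
    cpu_percent: float
    heap_used_mb: float
    heap_max_mb: float


def row_key(sample: TaskSample, timestamp_ms: int) -> str:
    # Hypothetical HBase row key: job class, job id, zero-padded timestamp,
    # so that samples of one job sort together in time order.
    return f"{sample.job_class}:{sample.job_id}:{timestamp_ms:013d}"


def suggest_heap(samples, high=0.85, low=0.40) -> str:
    """Return "increase", "reduce", or "keep" from average heap usage.

    The thresholds are illustrative guesses, not measured values.
    """
    ratio = mean(s.heap_used_mb / s.heap_max_mb for s in samples)
    if ratio > high:
        return "increase"
    if ratio < low:
        return "reduce"
    return "keep"


if __name__ == "__main__":
    samples = [
        TaskSample("job_1", "cpu_bound", 95.0, 900.0, 1000.0),
        TaskSample("job_1", "cpu_bound", 90.0, 950.0, 1000.0),
    ]
    print(row_key(samples[0], 1275465250000))
    print(suggest_heap(samples))
```

A real version would replace suggest_heap with whatever model comes out of the analysis step, and would write the samples to an HBase table instead of keeping them in a list.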

However, I am a developer on the HBase subproject and am not very familiar with MR jobs. I would appreciate suggestions on the following points:
* Is this plan feasible? Why or why not?
* Has any person or team done something like this before?
* Which processes in an MR job should we monitor most carefully?
* Which parameters in those processes should we pay attention to?
* What can we refer to when building the model?
* Any other suggestions about our plan are also welcome.
Thank you very much!

Evan,
2010-06-02



Re: Need Suggestion: Tuning MR performance by changing parameters in Hadoop project and JVM

Posted by Amogh Vasekar <am...@yahoo-inc.com>.
Hi,
You might want to check https://issues.apache.org/jira/browse/HADOOP-4179
and http://hadoop.apache.org/common/docs/current/vaidya.html

Amogh

