Posted to common-user@hadoop.apache.org by zsongbo <zs...@gmail.com> on 2009/05/11 14:49:41 UTC

How to do load control of MapReduce

Hi all,
We have a large dataset to process with MapReduce, and the MapReduce job
will take as many machine resources as it can get.

So when one such big MapReduce job is running, the cluster becomes very
busy and can hardly do anything else.

For example, we have an HDFS+MapReduce+HBase cluster. There is a large
dataset in HDFS to be processed by MapReduce periodically, and the
workload is CPU- and I/O-heavy. The cluster also provides other services
for queries (querying HBase and reading files in HDFS). So when the job
is running, the query latency becomes very long.

Since the MapReduce job is not time-sensitive, I want to control the load
of MapReduce. Do you have any advice?

Thanks in advance.
Schubert
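
One knob that directly throttles this kind of job is the number of task
slots each node offers. A minimal sketch, assuming a 0.19/0.20-era
mapred-site.xml; the property names are standard for that release line,
but the values are illustrative, not from this thread:

  <!-- mapred-site.xml: cap concurrent tasks per TaskTracker so one big
       job cannot saturate every core and disk on the node -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>

Fewer slots means the batch job runs longer, but it leaves CPU and disk
headroom for the HBase and HDFS queries.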

Re: How to do load control of MapReduce

Posted by zsongbo <zs...@gmail.com>.
We found that disk I/O is the major bottleneck; here is the extended
iostat output from one node:
Device:         rrqm/s   wrqm/s     r/s      w/s    rsec/s    wsec/s  avgrq-sz  avgqu-sz    await  svctm   %util
sda               1.00     0.00   85.21     0.00  20926.32      0.00    245.58     31.59   364.49  11.77  100.28
sdb               5.76  4752.88   53.13   131.08  10145.36  39206.02    267.91    168.34   857.96   5.44  100.28
dm-0              0.00     0.00    5.26     7.52     78.20     60.15     10.82      5.60   461.24  78.31  100.10
dm-1              0.00     0.00  146.12  4875.94  32617.54  39007.52     14.26   5498.79  1021.17   0.20  100.28
dm-2              0.00     0.00    0.00     0.00      0.00      0.00      0.00      0.00     0.00   0.00    0.00
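
These look like the extended device statistics from the sysstat package;
a sampled capture (the first report since boot is cumulative and usually
discarded) can be taken with:

  # extended per-device I/O statistics, one report every 5 seconds
  iostat -x 5

%util pinned near 100 together with the huge avgqu-sz on sda and dm-1
means those devices are saturated and requests are queueing, which
matches the long query latencies.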


On Wed, May 13, 2009 at 12:01 AM, Steve Loughran <st...@apache.org> wrote:

> Stefan Will wrote:
>
>> Yes, I think the JVM uses way more memory than just its heap. Some of it
>> might be just reserved memory, not actually used (I'm not sure how to
>> tell the difference). There are also things like thread stacks, the JIT
>> compiler cache, direct NIO byte buffers, etc. that take up process space
>> outside of the Java heap. But none of that should, IMHO, add up to
>> gigabytes...
>>
>
> good article on this
> http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/
>
>

Re: How to do load control of MapReduce

Posted by Steve Loughran <st...@apache.org>.
Stefan Will wrote:
> Yes, I think the JVM uses way more memory than just its heap. Some of it
> might be just reserved memory, not actually used (I'm not sure how to
> tell the difference). There are also things like thread stacks, the JIT
> compiler cache, direct NIO byte buffers, etc. that take up process space
> outside of the Java heap. But none of that should, IMHO, add up to
> gigabytes...

good article on this
http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/


Re: How to do load control of MapReduce

Posted by Stefan Will <st...@gmx.net>.
Yes, I think the JVM uses way more memory than just its heap. Some of it
might be just reserved memory, not actually used (I'm not sure how to
tell the difference). There are also things like thread stacks, the JIT
compiler cache, direct NIO byte buffers, etc. that take up process space
outside of the Java heap. But none of that should, IMHO, add up to
gigabytes...

-- Stefan 
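
To see where the extra process space goes, the mapped regions of a running
JVM can be listed with standard Linux tools; a sketch, assuming jps is on
the PATH:

  # per-region breakdown of the TaskTracker JVM's address space; thread
  # stacks, mapped jars, and direct buffers all show up outside the heap
  pmap -x $(jps | awk '/TaskTracker/ {print $1}')

The totals at the bottom separate mapped size from resident size, which
helps tell reserved-but-unused memory from memory that actually competes
for RAM.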


> From: zsongbo <zs...@gmail.com>
> Reply-To: <co...@hadoop.apache.org>
> Date: Tue, 12 May 2009 20:06:37 +0800
> To: <co...@hadoop.apache.org>
> Subject: Re: How to do load control of MapReduce
> 
> Yes, I also found that the TaskTracker should not need so much memory.
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 32480 schubert  35  10 1411m 172m 9212 S    0  2.2   8:54.78 java
>
> The previous 1GB was just the default value; I changed the TT heap to
> 384MB an hour ago.
>
> I also found that the DataNode does not need much memory either:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 32399 schubert  25   0 1638m 372m 9208 S    2  4.7  32:46.28 java
>
> In fact, I set -Xmx512m in the child opts for the MapReduce tasks, but I
> found the child tasks use more memory than that:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 10577 schubert  30  10  942m 572m 9092 S   46  7.2  51:02.21 java
> 10507 schubert  29  10  878m 570m 9092 S   48  7.1  50:49.52 java
>
> Schubert
> 
> On Tue, May 12, 2009 at 6:53 PM, Steve Loughran <st...@apache.org> wrote:
> 
>> zsongbo wrote:
>> 
>>> Hi Stefan,
>>> Yes, 'nice' cannot solve this problem.
>>>
>>> Now, each node in my cluster has 8GB of RAM. My Java heap configuration is:
>>>
>>> HDFS DataNode : 1GB
>>> HBase-RegionServer: 1.5GB
>>> MR-TaskTracker: 1GB
>>> MR-child: 512MB   (max 6 child tasks: 4 map + 2 reduce)
>>> 
>>> But the memory usage is still tight.
>>> 
>> 
>> does TT need to be so big if you are running all your work in external VMs?
>> 



Re: How to do load control of MapReduce

Posted by zsongbo <zs...@gmail.com>.
Yes, I also found that the TaskTracker should not need so much memory.
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32480 schubert  35  10 1411m 172m 9212 S    0  2.2   8:54.78 java

The previous 1GB was just the default value; I changed the TT heap to
384MB an hour ago.

I also found that the DataNode does not need much memory either:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32399 schubert  25   0 1638m 372m 9208 S    2  4.7  32:46.28 java

In fact, I set -Xmx512m in the child opts for the MapReduce tasks, but I
found the child tasks use more memory than that:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10577 schubert  30  10  942m 572m 9092 S   46  7.2  51:02.21 java
10507 schubert  29  10  878m 570m 9092 S   48  7.1  50:49.52 java

Schubert
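
The child heap is set through mapred.child.java.opts, and the same
property can also bound some of the non-heap memory. A sketch; the -Xss
and MaxDirectMemorySize values are illustrative additions, not something
tested in this thread:

  <!-- mapred-site.xml: -Xmx bounds only the Java heap; smaller thread
       stacks and a direct-buffer cap trim the rest of the footprint -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m -Xss256k -XX:MaxDirectMemorySize=128m</value>
  </property>

VIRT will still exceed -Xmx (as in the top output above), since it counts
reserved address space, not just committed heap.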

On Tue, May 12, 2009 at 6:53 PM, Steve Loughran <st...@apache.org> wrote:

> zsongbo wrote:
>
>> Hi Stefan,
>> Yes, 'nice' cannot solve this problem.
>>
>> Now, each node in my cluster has 8GB of RAM. My Java heap configuration is:
>>
>> HDFS DataNode : 1GB
>> HBase-RegionServer: 1.5GB
>> MR-TaskTracker: 1GB
>> MR-child: 512MB   (max 6 child tasks: 4 map + 2 reduce)
>>
>> But the memory usage is still tight.
>>
>
> does TT need to be so big if you are running all your work in external VMs?
>

Re: How to do load control of MapReduce

Posted by Steve Loughran <st...@apache.org>.
zsongbo wrote:
> Hi Stefan,
> Yes, 'nice' cannot solve this problem.
> 
> Now, each node in my cluster has 8GB of RAM. My Java heap configuration is:
> 
> HDFS DataNode : 1GB
> HBase-RegionServer: 1.5GB
> MR-TaskTracker: 1GB
> MR-child: 512MB   (max 6 child tasks: 4 map + 2 reduce)
> 
> But the memory usage is still tight.

does TT need to be so big if you are running all your work in external VMs?
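
The TT heap can be shrunk without touching the other daemons; a sketch,
assuming a 0.20-style hadoop-env.sh with per-daemon option variables:

  # hadoop-env.sh: give only the TaskTracker a smaller heap (the 384MB
  # value tried elsewhere in this thread); a later -Xmx overrides the
  # default HADOOP_HEAPSIZE-derived one
  export HADOOP_TASKTRACKER_OPTS="-Xmx384m $HADOOP_TASKTRACKER_OPTS"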

Re: How to do load control of MapReduce

Posted by zsongbo <zs...@gmail.com>.
Hi Stefan,
Yes, 'nice' cannot solve this problem.

Now, each node in my cluster has 8GB of RAM. My Java heap configuration is:

HDFS DataNode : 1GB
HBase-RegionServer: 1.5GB
MR-TaskTracker: 1GB
MR-child: 512MB   (max 6 child tasks: 4 map + 2 reduce)

But the memory usage is still tight.

Schubert
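
Adding those up for a single 8GB node gives a budget that is already
tight before any JVM overhead:

  DataNode           1.0 GB
  RegionServer       1.5 GB
  TaskTracker        1.0 GB
  6 x MR child       3.0 GB   (6 x 512MB)
  ------------------------
  heap total         6.5 GB

The top output elsewhere in the thread shows virtual size running well
above heap (942m for a 512m child), so the real footprint can plausibly
exceed the 8GB of RAM, which is consistent with the swapping Stefan
describes below.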

On Tue, May 12, 2009 at 11:39 AM, Stefan Will <st...@gmx.net> wrote:

> I'm having similar performance issues and have been running my Hadoop
> processes using a nice level of 10 for a while, and haven't noticed any
> improvement.
>
> In my case, I believe what's happening is that the peak combined RAM usage
> of all the Hadoop task processes and the service processes exceeds the
> amount of RAM on my machines. This in turn causes parts of the server
> processes to get paged out to disk while the nightly Hadoop batch processes
> are running. Since the swap space is typically on the same physical disks
> as the DFS and MapReduce working directories, I'm heavily I/O bound and
> real-time queries pretty much slow down to a crawl.
>
> I think the key is to make absolutely sure that all of your processes fit
> in
> your available RAM at all times. I'm actually having a hard time achieving
> this since the virtual memory usage of the JVM is usually way higher than
> the maximum heap size (see my other thread).
>
> -- Stefan
>
>
> > From: zsongbo <zs...@gmail.com>
> > Reply-To: <co...@hadoop.apache.org>
> > Date: Tue, 12 May 2009 10:58:49 +0800
> > To: <co...@hadoop.apache.org>
> > Subject: Re: How to do load control of MapReduce
> >
> > Thanks Billy, I am trying 'nice' and will report the result later.
> >
> > On Tue, May 12, 2009 at 3:42 AM, Billy Pearson
> > <sa...@pearsonwholesale.com> wrote:
> >
> >> You might try setting the TaskTracker's Linux nice level to, say, 5 or
> >> 10, leaving the DFS and HBase settings at 0.
> >>
> >> Billy
> >> "zsongbo" <zs...@gmail.com> wrote in message
> >> news:fa03480d0905110549j7f09be13qd434ca41c9f84d1d@mail.gmail.com...
> >>
> >>> Hi all,
> >>> We have a large dataset to process with MapReduce, and the MapReduce
> >>> job will take as many machine resources as it can get.
> >>>
> >>> So when one such big MapReduce job is running, the cluster becomes
> >>> very busy and can hardly do anything else.
> >>>
> >>> For example, we have an HDFS+MapReduce+HBase cluster. There is a
> >>> large dataset in HDFS to be processed by MapReduce periodically, and
> >>> the workload is CPU- and I/O-heavy. The cluster also provides other
> >>> services for queries (querying HBase and reading files in HDFS). So
> >>> when the job is running, the query latency becomes very long.
> >>>
> >>> Since the MapReduce job is not time-sensitive, I want to control the
> >>> load of MapReduce. Do you have any advice?
> >>>
> >>> Thanks in advance.
> >>> Schubert
> >>>
> >>>
> >>
> >>
>
>
>

Re: How to do load control of MapReduce

Posted by Stefan Will <st...@gmx.net>.
I'm having similar performance issues and have been running my Hadoop
processes using a nice level of 10 for a while, and haven't noticed any
improvement.

In my case, I believe what's happening is that the peak combined RAM usage
of all the Hadoop task processes and the service processes exceeds the
amount of RAM on my machines. This in turn causes parts of the server
processes to get paged out to disk while the nightly Hadoop batch processes
are running. Since the swap space is typically on the same physical disks
as the DFS and MapReduce working directories, I'm heavily I/O bound and
real-time queries pretty much slow down to a crawl.

I think the key is to make absolutely sure that all of your processes fit in
your available RAM at all times. I'm actually having a hard time achieving
this since the virtual memory usage of the JVM is usually way higher than
the maximum heap size (see my other thread).

-- Stefan
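
Beyond making the processes smaller, the kernel can be told to prefer
dropping page cache over swapping out resident processes; a hedged
sketch (the value is illustrative; the default is typically 60):

  # make the kernel much less eager to page out the service processes;
  # persist the setting in /etc/sysctl.conf to survive reboots
  sysctl -w vm.swappiness=10

This does not help if the working set genuinely exceeds RAM, but it can
keep latency-sensitive daemons resident when the pressure comes mostly
from batch-job page cache churn.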


> From: zsongbo <zs...@gmail.com>
> Reply-To: <co...@hadoop.apache.org>
> Date: Tue, 12 May 2009 10:58:49 +0800
> To: <co...@hadoop.apache.org>
> Subject: Re: How to do load control of MapReduce
> 
> Thanks Billy, I am trying 'nice' and will report the result later.
> 
> On Tue, May 12, 2009 at 3:42 AM, Billy Pearson
> <sa...@pearsonwholesale.com> wrote:
> 
>> You might try setting the TaskTracker's Linux nice level to, say, 5 or 10,
>> leaving the DFS and HBase settings at 0.
>> 
>> Billy
>> "zsongbo" <zs...@gmail.com> wrote in message
>> news:fa03480d0905110549j7f09be13qd434ca41c9f84d1d@mail.gmail.com...
>> 
>>> Hi all,
>>> We have a large dataset to process with MapReduce, and the MapReduce
>>> job will take as many machine resources as it can get.
>>>
>>> So when one such big MapReduce job is running, the cluster becomes
>>> very busy and can hardly do anything else.
>>>
>>> For example, we have an HDFS+MapReduce+HBase cluster. There is a
>>> large dataset in HDFS to be processed by MapReduce periodically, and
>>> the workload is CPU- and I/O-heavy. The cluster also provides other
>>> services for queries (querying HBase and reading files in HDFS). So
>>> when the job is running, the query latency becomes very long.
>>>
>>> Since the MapReduce job is not time-sensitive, I want to control the
>>> load of MapReduce. Do you have any advice?
>>>
>>> Thanks in advance.
>>> Schubert
>>> 
>>> 
>> 
>> 



Re: How to do load control of MapReduce

Posted by zsongbo <zs...@gmail.com>.
Thanks Billy, I am trying 'nice' and will report the result later.

On Tue, May 12, 2009 at 3:42 AM, Billy Pearson
<sa...@pearsonwholesale.com> wrote:

> You might try setting the TaskTracker's Linux nice level to, say, 5 or 10,
> leaving the DFS and HBase settings at 0.
>
> Billy
> "zsongbo" <zs...@gmail.com> wrote in message
> news:fa03480d0905110549j7f09be13qd434ca41c9f84d1d@mail.gmail.com...
>
>> Hi all,
>> We have a large dataset to process with MapReduce, and the MapReduce
>> job will take as many machine resources as it can get.
>>
>> So when one such big MapReduce job is running, the cluster becomes very
>> busy and can hardly do anything else.
>>
>> For example, we have an HDFS+MapReduce+HBase cluster. There is a large
>> dataset in HDFS to be processed by MapReduce periodically, and the
>> workload is CPU- and I/O-heavy. The cluster also provides other
>> services for queries (querying HBase and reading files in HDFS). So
>> when the job is running, the query latency becomes very long.
>>
>> Since the MapReduce job is not time-sensitive, I want to control the
>> load of MapReduce. Do you have any advice?
>>
>> Thanks in advance.
>> Schubert
>>
>>
>
>

Re: How to do load control of MapReduce

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
You might try setting the TaskTracker's Linux nice level to, say, 5 or 10,
leaving the DFS and HBase settings at 0.

Billy
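
For reference, the stock daemon launcher already supports this; a sketch,
assuming the 0.19/0.20 bin/hadoop-daemon.sh, which honors HADOOP_NICENESS:

  # start only the TaskTracker at a lower priority; the DataNode and
  # HBase daemons keep the default niceness of 0, and child task JVMs
  # forked by the TT inherit its niceness
  HADOOP_NICENESS=10 bin/hadoop-daemon.sh start tasktracker

  # or adjust an already-running TaskTracker by PID
  renice 10 -p <tasktracker pid>

Note that nice only arbitrates CPU; as reported later in the thread, it
does not help when the contention is disk I/O or swapping.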
"zsongbo" <zs...@gmail.com> wrote in message 
news:fa03480d0905110549j7f09be13qd434ca41c9f84d1d@mail.gmail.com...
> Hi all,
> We have a large dataset to process with MapReduce, and the MapReduce
> job will take as many machine resources as it can get.
>
> So when one such big MapReduce job is running, the cluster becomes very
> busy and can hardly do anything else.
>
> For example, we have an HDFS+MapReduce+HBase cluster. There is a large
> dataset in HDFS to be processed by MapReduce periodically, and the
> workload is CPU- and I/O-heavy. The cluster also provides other
> services for queries (querying HBase and reading files in HDFS). So
> when the job is running, the query latency becomes very long.
>
> Since the MapReduce job is not time-sensitive, I want to control the
> load of MapReduce. Do you have any advice?
>
> Thanks in advance.
> Schubert
>