Posted to mapreduce-user@hadoop.apache.org by exception <ex...@taomee.com> on 2011/03/11 08:48:27 UTC

memory leak in reduce side

Hi all,

I have run into a weird problem while running a job on a cluster of 8 machines (1 master, 7 slaves). I am using Hadoop 0.21.

What I am trying to do is storing the values from the map side into a list and doing some computation. This is the code:

public void reduce(MoleCommandKey key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException
{
    mList.clear();
    mList = null;
    mList = new ArrayList<TimeFreq>();

    for (LongWritable val : values)
    {
        long time = val.get();

        for (TimeFreq tf : mList)
        {
            // do something with tf
            tf.doSomething();
        }

        mList.add(new TimeFreq(time, 1));
    }

    ......
}

public static class TimeFreq
{
    public void doSomething()
    {        ......   }

    ......
}

This job has 7 reducers. Two of them finish successfully, but the others stop at 66%-70%. These reducers have already finished copying and sorting, so it looks like they are blocked for some reason.

I checked the memory usage of the blocked Java process and found that there
may be a memory leak in my code. This is the memory usage printed by jmap:
  4:        149810        4793920  org.taomee.job.MoleCMDFreq$TimeFreq

The number of allocated TimeFreq objects keeps increasing, and it looks like the GC cannot free them. But no out-of-memory exception is thrown. I already clear the list, so why is there still a memory leak?

Any ideas / help is much appreciated!


Thanks
Exception


Re: memory leak in reduce side

Posted by Harsh J <qw...@gmail.com>.
Hello,

On Fri, Mar 11, 2011 at 1:18 PM, exception <ex...@taomee.com> wrote:
> What I am trying to do is storing the values from the map side into a list
> and doing some computation.

Before you attempt this, know how many values you can possibly receive
for a grouped key in your reducer. Storing a few values is alright,
but storing everything that comes in a single reduce call is not sane
in most cases - you will easily run out of memory.
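
For example, if the per-key work can be folded into running aggregates
(a big assumption on my part -- I don't know what TimeFreq.doSomething()
actually computes), a rough sketch like this never holds more than one
value in memory at a time:

    // Sketch only: assumes the computation reduces to running aggregates
    // and that the job's output value type is LongWritable.
    public void reduce(MoleCommandKey key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException
    {
        long count = 0;
        long sum = 0;

        for (LongWritable val : values)
        {
            long time = val.get();   // copy the primitive out of the reused Writable
            count++;
            sum += time;
        }

        // Emit an aggregate instead of buffering every value in a list.
        context.write(key, new LongWritable(count));
    }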

>     mList.add(new TimeFreq(time,1));

Ensure you clone the 'time' object. Hadoop's Reducer reuses the key and
value objects, and you may run into weird issues where the only thing
you have left is the last value of the last key that came in.
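
To make that concrete: for a LongWritable, copying the primitive with
get() is enough; if you ever store the Writable objects themselves, take
a copy first. A small illustration (the list here is just for the example):

    // Hadoop hands you the *same* LongWritable instance on every iteration,
    // so store a copy of the data, not a reference to the object itself.
    List<Long> times = new ArrayList<Long>();
    for (LongWritable val : values)
    {
        times.add(val.get());    // the primitive is a safe copy
        // For arbitrary Writables you can use:
        // WritableUtils.clone(val, context.getConfiguration());
    }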

> This job has 7 reducers. Two of them finish successfully, but the others
> stop at 66%-70%. These reducers have already finished copying and sorting,
> so it looks like they are blocked for some reason.

If two always pass, it is probably because their key-[value] sizes
were within limits somehow (the result of the partitioner and the data
used as key).

> I checked the memory usage of the blocked Java process and found that there
> may be a memory leak in my code.

You are clearing your list at every reduce call, which is correct and
should avoid leaks. But you need to see just how many values get
entered into the list in _each_ reduce call.
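
A quick way to check is to instrument the existing loop with a counter
(the group/name strings below are just labels I made up):

    long n = 0;
    for (LongWritable val : values)
    {
        n++;
        // ... existing per-value work ...
    }
    context.getCounter("Debug", "ValuesSeen").increment(n);
    if (n > 100000)
        System.err.println(key + " received " + n + " values");  // shows up in the task's stderr log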


-- 
Harsh J
www.harshj.com