Posted to common-user@hadoop.apache.org by Zheng Lv <lv...@gmail.com> on 2010/04/01 09:34:07 UTC

reduce takes too long

Hello Everyone,
    One of our jobs has 4 reduce tasks, but we find that one of them runs
normally while the others take too long.
    Following is the normal task's log:
    2010-04-01 15:01:48,596 INFO org.apache.hadoop.mapred.Merger: Merging 1
sorted segments
2010-04-01 15:01:48,601 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 1 segments left of total size: 9907055 bytes
2010-04-01 15:01:48,605 WARN org.apache.hadoop.mapred.JobConf: The variable
mapred.task.maxvmem is no longer used. Instead use mapred.job.map.memory.mb
and mapred.job.reduce.memory.mb
2010-04-01 15:01:48,622 WARN org.apache.hadoop.mapred.JobConf: The variable
mapred.task.maxvmem is no longer used. Instead use mapred.job.map.memory.mb
and mapred.job.reduce.memory.mb
2010-04-01 15:01:48,672 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new compressor
2010-04-01 15:02:03,744 INFO org.apache.hadoop.mapred.TaskRunner:
Task:attempt_201003301656_0139_r_000001_0 is done. And is in the process of
commiting
2010-04-01 15:02:05,756 INFO org.apache.hadoop.mapred.TaskRunner: Task
attempt_201003301656_0139_r_000001_0 is allowed to commit now
2010-04-01 15:02:05,762 INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of
task 'attempt_201003301656_0139_r_000001_0' to
/user/root/nginxlog/sessionjob/output/20100401140001-20100401150001
2010-04-01 15:02:05,765 INFO org.apache.hadoop.mapred.TaskRunner: Task
'attempt_201003301656_0139_r_000001_0' done.

    And here is the log of one of the others:
    2010-04-01 15:01:49,549 INFO org.apache.hadoop.mapred.Merger: Merging 1
sorted segments
2010-04-01 15:01:49,554 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 1 segments left of total size: 9793700 bytes
2010-04-01 15:01:49,563 WARN org.apache.hadoop.mapred.JobConf: The variable
mapred.task.maxvmem is no longer used. Instead use mapred.job.map.memory.mb
and mapred.job.reduce.memory.mb
2010-04-01 15:01:49,582 WARN org.apache.hadoop.mapred.JobConf: The variable
mapred.task.maxvmem is no longer used. Instead use mapred.job.map.memory.mb
and mapred.job.reduce.memory.mb
2010-04-01 15:04:49,690 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new compressor
2010-04-01 15:05:07,103 INFO org.apache.hadoop.mapred.TaskRunner:
Task:attempt_201003301656_0139_r_000000_0 is done. And is in the process of
commiting
2010-04-01 15:05:09,114 INFO org.apache.hadoop.mapred.TaskRunner: Task
attempt_201003301656_0139_r_000000_0 is allowed to commit now
2010-04-01 15:05:09,120 INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of
task 'attempt_201003301656_0139_r_000000_0' to
/user/root/nginxlog/sessionjob/output/20100401140001-20100401150001
2010-04-01 15:05:09,123 INFO org.apache.hadoop.mapred.TaskRunner: Task
'attempt_201003301656_0139_r_000000_0' done.

   It looks like something is waiting before "2010-04-01 15:05:07,103 INFO
org.apache.hadoop.mapred.TaskRunner:
Task:attempt_201003301656_0139_r_000000_0 is done. And is in the process of
commiting"; there is a gap of about three minutes between 15:01:49 and
15:04:49 in the second log, while the first task shows no such gap. Any
suggestions?
   Regards,
        LvZheng