You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-user@hadoop.apache.org by felix gao <gr...@gmail.com> on 2011/03/02 19:19:32 UTC

How to speed up my slaves

Hello experts,

I am recently testing a set of logs that I converted to avro format in
hadoop.  I am notice really really slow performance when compare to raw
logs.  The map logs showing below seems to indicate setting up JVM took the
longest time. I am wondering if there is anything I can tweak in my

2011-03-02 08:41:21,887 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=MAP, sessionId=
2011-03-02 08:43:53,684 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-03-02 08:43:53,737 INFO org.apache.hadoop.mapred.MapTask: data
buffer = 79691776/99614720
2011-03-02 08:43:53,737 INFO org.apache.hadoop.mapred.MapTask: record
buffer = 262144/327680
2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 0; bufend = 11272192; bufvoid = 99614720
2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask: kvstart
= 0; kvend = 262144; length = 327680
2011-03-02 08:44:05,703 INFO org.apache.hadoop.util.NativeCodeLoader:
Loaded the native-hadoop library
2011-03-02 08:44:05,704 INFO
org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded &
initialized native-zlib library
2011-03-02 08:44:05,706 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new compressor
2011-03-02 08:44:08,103 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 11272192; bufend = 22544341; bufvoid = 99614720
2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask: kvstart
= 262144; kvend = 196607; length = 327680
2011-03-02 08:44:18,921 INFO org.apache.hadoop.mapred.MapTask: Finished spill 1
2011-03-02 08:44:27,029 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 08:44:27,029 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 22544341; bufend = 33816490; bufvoid = 99614720
2011-03-02 08:44:27,030 INFO org.apache.hadoop.mapred.MapTask: kvstart
= 196607; kvend = 131070; length = 327680
2011-03-02 08:44:29,454 INFO org.apache.hadoop.mapred.MapTask: Finished spill 2
2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 33816490; bufend = 45088639; bufvoid = 99614720
2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask: kvstart
= 131070; kvend = 65533; length = 327680
2011-03-02 08:44:43,740 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 45088639; bufend = 56360831; bufvoid = 99614720
2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask: kvstart
= 65533; kvend = 327677; length = 327680
2011-03-02 08:44:53,066 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4
2011-03-02 08:44:53,419 INFO org.apache.hadoop.mapred.MapTask:
Starting flush of map output
2011-03-02 08:44:54,079 INFO org.apache.hadoop.mapred.MapTask: Finished spill 5
2011-03-02 08:44:54,085 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:54,091 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,093 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,093 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,094 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,095 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,096 INFO org.apache.hadoop.io.compress.CodecPool:
Got brand-new decompressor
2011-03-02 08:44:54,096 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 476686 bytes
2011-03-02 08:44:54,519 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:54,523 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477643 bytes
2011-03-02 08:44:54,882 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:54,886 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474730 bytes
2011-03-02 08:44:55,241 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:55,245 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 478581 bytes
2011-03-02 08:44:55,647 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:55,651 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 479353 bytes
2011-03-02 08:44:56,011 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:56,015 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474061 bytes
2011-03-02 08:44:56,367 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:56,371 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 476289 bytes
2011-03-02 08:44:56,725 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:56,729 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477047 bytes
2011-03-02 08:44:57,082 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:57,086 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474144 bytes
2011-03-02 08:44:57,437 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:57,441 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 479785 bytes
2011-03-02 08:44:57,819 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:57,823 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 478787 bytes
2011-03-02 08:44:58,177 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:58,182 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474947 bytes
2011-03-02 08:44:58,531 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:58,535 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 476642 bytes
2011-03-02 08:44:58,890 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:58,894 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474380 bytes
2011-03-02 08:44:59,248 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:59,252 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 474939 bytes
2011-03-02 08:44:59,610 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:59,614 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 476284 bytes
2011-03-02 08:44:59,964 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:44:59,968 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477268 bytes
2011-03-02 08:45:00,321 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:00,325 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477738 bytes
2011-03-02 08:45:00,680 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:00,684 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477485 bytes
2011-03-02 08:45:01,038 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:01,042 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477476 bytes
2011-03-02 08:45:01,395 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:01,399 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 475864 bytes
2011-03-02 08:45:01,749 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:01,753 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 476185 bytes
2011-03-02 08:45:02,105 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:02,109 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 477883 bytes
2011-03-02 08:45:02,463 INFO org.apache.hadoop.mapred.Merger: Merging
6 sorted segments
2011-03-02 08:45:02,467 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 6 segments left of total size: 478753 bytes
2011-03-02 08:45:02,830 INFO org.apache.hadoop.mapred.TaskRunner:
Task:attempt_201102071809_6255_m_000008_0 is done. And is in the
process of commiting
2011-03-02 08:45:02,833 INFO org.apache.hadoop.mapred.TaskRunner: Task
'attempt_201102071809_6255_m_000008_0' done.



When I ran the job with raw logs I see

2011-03-02 10:40:56,113 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=MAP, sessionId=
2011-03-02 10:41:01,514 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2011-03-02 10:41:01,579 INFO org.apache.hadoop.mapred.MapTask: data
buffer = 79691776/99614720
2011-03-02 10:41:01,579 INFO org.apache.hadoop.mapred.MapTask: record
buffer = 262144/327680
2011-03-02 10:42:01,446 INFO org.apache.hadoop.mapred.MapTask:
Spilling map output: record full = true
2011-03-02 10:42:01,446 INFO org.apache.hadoop.mapred.MapTask:
bufstart = 0; bufend = 11272192; bufvoid = 99614720


notice the io.sort.mb in avro format takes more than 2 minutes while
the raw version is almost instant. So what is going on there?

Re: How to speed up my slaves

Posted by felix gao <gr...@gmail.com>.
Just tried to launch the same job from a secondary machine. The speed was
really fast.  Must be some kind of environment problem or configuration
problem on the primary launching machine.   Anyone has an idea on what could
cause it to take that long for every map task that is loading the avro
formatted log?

On Wed, Mar 2, 2011 at 10:19 AM, felix gao <gr...@gmail.com> wrote:

> Hello experts,
>
> I am recently testing a set of logs that I converted to avro format in
> hadoop.  I am notice really really slow performance when compare to raw
> logs.  The map logs showing below seems to indicate setting up JVM took the
> longest time. I am wondering if there is anything I can tweak in my
>
> 2011-03-02 08:41:21,887 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2011-03-02 08:43:53,684 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
> 2011-03-02 08:43:53,737 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
> 2011-03-02 08:43:53,737 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
> 2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 11272192; bufvoid = 99614720
> 2011-03-02 08:44:05,057 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 262144; length = 327680
> 2011-03-02 08:44:05,703 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2011-03-02 08:44:05,704 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 2011-03-02 08:44:05,706 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
> 2011-03-02 08:44:08,103 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
> 2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask: bufstart = 11272192; bufend = 22544341; bufvoid = 99614720
> 2011-03-02 08:44:16,281 INFO org.apache.hadoop.mapred.MapTask: kvstart = 262144; kvend = 196607; length = 327680
> 2011-03-02 08:44:18,921 INFO org.apache.hadoop.mapred.MapTask: Finished spill 1
> 2011-03-02 08:44:27,029 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 08:44:27,029 INFO org.apache.hadoop.mapred.MapTask: bufstart = 22544341; bufend = 33816490; bufvoid = 99614720
> 2011-03-02 08:44:27,030 INFO org.apache.hadoop.mapred.MapTask: kvstart = 196607; kvend = 131070; length = 327680
> 2011-03-02 08:44:29,454 INFO org.apache.hadoop.mapred.MapTask: Finished spill 2
> 2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask: bufstart = 33816490; bufend = 45088639; bufvoid = 99614720
> 2011-03-02 08:44:41,450 INFO org.apache.hadoop.mapred.MapTask: kvstart = 131070; kvend = 65533; length = 327680
> 2011-03-02 08:44:43,740 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
> 2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask: bufstart = 45088639; bufend = 56360831; bufvoid = 99614720
> 2011-03-02 08:44:51,010 INFO org.apache.hadoop.mapred.MapTask: kvstart = 65533; kvend = 327677; length = 327680
> 2011-03-02 08:44:53,066 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4
> 2011-03-02 08:44:53,419 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
> 2011-03-02 08:44:54,079 INFO org.apache.hadoop.mapred.MapTask: Finished spill 5
> 2011-03-02 08:44:54,085 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:54,091 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,093 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,093 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,094 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,095 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,096 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-03-02 08:44:54,096 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 476686 bytes
> 2011-03-02 08:44:54,519 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:54,523 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477643 bytes
> 2011-03-02 08:44:54,882 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:54,886 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474730 bytes
> 2011-03-02 08:44:55,241 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:55,245 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 478581 bytes
> 2011-03-02 08:44:55,647 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:55,651 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 479353 bytes
> 2011-03-02 08:44:56,011 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:56,015 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474061 bytes
> 2011-03-02 08:44:56,367 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:56,371 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 476289 bytes
> 2011-03-02 08:44:56,725 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:56,729 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477047 bytes
> 2011-03-02 08:44:57,082 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:57,086 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474144 bytes
> 2011-03-02 08:44:57,437 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:57,441 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 479785 bytes
> 2011-03-02 08:44:57,819 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:57,823 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 478787 bytes
> 2011-03-02 08:44:58,177 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:58,182 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474947 bytes
> 2011-03-02 08:44:58,531 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:58,535 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 476642 bytes
> 2011-03-02 08:44:58,890 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:58,894 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474380 bytes
> 2011-03-02 08:44:59,248 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:59,252 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 474939 bytes
> 2011-03-02 08:44:59,610 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:59,614 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 476284 bytes
> 2011-03-02 08:44:59,964 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:44:59,968 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477268 bytes
> 2011-03-02 08:45:00,321 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:00,325 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477738 bytes
> 2011-03-02 08:45:00,680 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:00,684 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477485 bytes
> 2011-03-02 08:45:01,038 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:01,042 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477476 bytes
> 2011-03-02 08:45:01,395 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:01,399 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 475864 bytes
> 2011-03-02 08:45:01,749 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:01,753 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 476185 bytes
> 2011-03-02 08:45:02,105 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:02,109 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 477883 bytes
> 2011-03-02 08:45:02,463 INFO org.apache.hadoop.mapred.Merger: Merging 6 sorted segments
> 2011-03-02 08:45:02,467 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 6 segments left of total size: 478753 bytes
> 2011-03-02 08:45:02,830 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_201102071809_6255_m_000008_0 is done. And is in the process of commiting
> 2011-03-02 08:45:02,833 INFO org.apache.hadoop.mapred.TaskRunner: Task 'attempt_201102071809_6255_m_000008_0' done.
>
>
>
> When I ran the job with raw logs I see
>
> 2011-03-02 10:40:56,113 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2011-03-02 10:41:01,514 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
> 2011-03-02 10:41:01,579 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
> 2011-03-02 10:41:01,579 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
> 2011-03-02 10:42:01,446 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
> 2011-03-02 10:42:01,446 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 11272192; bufvoid = 99614720
>
>
> notice the io.sort.mb in avro format takes more than 2 minutes while the raw version is almost instant. So what is going on there?
>
>