You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Rajat Jain (JIRA)" <ji...@apache.org> on 2015/04/18 08:58:59 UTC

[jira] [Updated] (MAPREDUCE-6321) Map tasks take a lot of time to start up

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajat Jain updated MAPREDUCE-6321:
----------------------------------
    Component/s: mrv2

> Map tasks take a lot of time to start up
> ----------------------------------------
>
>                 Key: MAPREDUCE-6321
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.6.0
>            Reporter: Rajat Jain
>            Priority: Critical
>              Labels: performance
>
> I have noticed repeatedly that the map tasks take a lot of time to startup on YARN clusters. This is not the scheduling part, this is after the actual container is launched containing the Map task. Take for example, the sample log from a mapper of a Pi job that I launched. The command I used to launch the Pi job was:
> {code}
> hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
> {code}
> This is the sample job from one of the mappers which took 14 seconds to complete. If you notice from the logs, most of the time taken by this job is during the start up. I notice that the most mappers take anywhere between 7 to 15 seconds during start up and have seen this behavior consistent across mapreduce jobs. This really affects the performance of short running mappers.
> I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper memory is always specified as 2048m and so on.
> Log:
> {code}
> 2015-04-18 06:48:34,081 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1429338752209_0059, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
> 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
> 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
> 2015-04-18 06:48:36,657 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.reserved.local.dir for child: 
> 2015-04-18 06:48:36,706 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:37,387 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:39,388 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
> 2015-04-18 06:48:39,448 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:41,060 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up progress from Task
> 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
> 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
> 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 234881020(939524080)
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 896
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 657666880
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 939524096
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020; length = 58720256
> 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 18; bufvoid = 939524096
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
> 2015-04-18 06:48:44,065 INFO [main] org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector: Returning directory: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
> 2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
> 2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0
> 2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of committing
> 2015-04-18 06:48:44,164 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null comment clearing progress now that task is done
> 2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for attempt_1429338752209_0059_m_000004_0
> 2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes read = 0
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes written = 116166
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes read = 297
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes written = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read operations = 4
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input records = 1
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes = 18
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized bytes = 34
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes = 179
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine input records = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms) = 158
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms) = 2080
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input Format Counters 
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read = 118
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes read = 0
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes written = 116166
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes read = 297
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes written = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read operations = 4
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input records = 1
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes = 18
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized bytes = 34
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes = 179
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine input records = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms) = 158
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms) = 2080
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
> 2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1429338752209_0059_m_000004_0' done.
> 2015-04-18 06:48:44,225 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
> 2015-04-18 06:48:44,231 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
> 2015-04-18 06:48:44,233 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)