Posted to hdfs-user@hadoop.apache.org by "Phan, Truong Q" <Tr...@team.telstra.com> on 2014/04/10 07:25:43 UTC

Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Fails to run on larger jobs

Hi

My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce Streaming job.
I have no issue running the MapReduce Streaming job when it has a single input CSV file of around 400 MB.
However, the job fails when I try to run it with 11 input data files of around 400 MB each.
The job failed with the following error.

I would appreciate any hints or suggestions to fix this issue.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
MAPREDUCE SCRIPT:
$ cat devices-hdfs-mr-PyIterGen-v3.sh
#!/bin/sh
export HADOOP_CMD=/usr/bin/hadoop
export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar

# Clean up the previous runs
sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device

sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
-D mapreduce.job.reduces=160 \
-files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py \
-mapper ./device-mapper-v1.py \
-combiner ./device-combiner-v1.py \
-reducer ./device-reducer-v1.py \
-mapdebug ./map-debug.py \
-input /data/db/bdms1p/input/*.csv \
-output /data/db/bdms1p/device
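For reference, a streaming subprocess that dies with exit code 1 usually means the mapper script raised an unhandled exception on some record. As a debugging aid, the per-record logic can be wrapped so a bad record is skipped and counted instead of killing the task. This is a hypothetical sketch, not the actual device-mapper-v1.py: map_line stands in for the real per-record logic, and the reporter:counter: lines on stderr use the Hadoop streaming counter protocol.

```python
#!/usr/bin/env python
# Hypothetical defensive wrapper for a streaming mapper: instead of letting
# an unhandled exception kill the subprocess (exit code 1), skip the bad
# record, report it through a streaming counter, and keep going.
import sys

def map_line(line):
    # Placeholder for the real per-record logic in device-mapper-v1.py;
    # here we just emit "first field TAB second field as an integer".
    fields = line.rstrip("\n").split(",")
    return "%s\t%d" % (fields[0], int(fields[1]))

def run(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    bad = 0
    for line in stdin:
        try:
            stdout.write(map_line(line) + "\n")
        except Exception as e:
            bad += 1
            # "reporter:counter:<group>,<name>,<amount>" on stderr is the
            # Hadoop streaming counter protocol; the offending line is also
            # echoed so it shows up in the attempt's stderr log.
            stderr.write("reporter:counter:mapper,bad_records,1\n")
            stderr.write("bad record: %r (%s)\n" % (line, e))
    return bad

if __name__ == "__main__":
    run()
```

With this in place, the job keeps running past a malformed record and the bad_records counter (visible in the job UI) shows how many were skipped, which narrows down whether one specific input file is at fault.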

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
OUTPUT ON THE CONSOLE:
$ ./devices-hdfs-mr-PyIterGen-v3.sh
14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to trash at: hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar] /tmp/streamjob781154149428893352.jar tmpDir=null
14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process : 106
14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.task.debug.script is deprecated. Instead, use mapreduce.map.debug.script
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1395628276810_0062
14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application application_1395628276810_0062 to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job: http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in uber mode : false
14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
14/04/10 10:28:10 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
14/04/10 10:28:14 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
14/04/10 10:28:19 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)

14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with state FAILED due to: Task failed task_1395628276810_0062_m_000149
Job failed as tasks failed. failedMaps:1 failedReduces:0

14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=15667286
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=21753912258
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=486
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Failed map tasks=4
                Killed map tasks=10
                Launched map tasks=176
                Other local map tasks=3
                Data-local map tasks=173
                Total time spent by all maps in occupied slots (ms)=1035708
                Total time spent by all reduces in occupied slots (ms)=0
        Map-Reduce Framework
                Map input records=164217466
                Map output records=0
                Map output bytes=0
                Map output materialized bytes=414720
                Input split bytes=23490
                Combine input records=0
                Combine output records=0
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=4750
                CPU time spent (ms)=321980
                Physical memory (bytes) snapshot=91335024640
                Virtual memory (bytes) snapshot=229819834368
                Total committed heap usage (bytes)=128240713728
        File Input Format Counters
                Bytes Read=21753888768
14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!


Thanks and Regards,
Truong Phan

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Fails to run on larger jobs

Posted by Silvina Caíno Lores <si...@gmail.com>.
Hi!

I've faced the same issue a couple of times and found nothing in the logs
that led me to the source of the error. However, I've found that smart
container and block configuration can prevent these issues.

First of all, check the RM logs to find any problematic container, since the
same task is failing every time (maybe that split is violating container
resource limits, which should be reflected in that log). For instance, in my
particular case, I was running a memory-intensive map and some records
needed more memory than others in large test cases, hence I observed the
behaviour you describe because containers were getting killed.

I usually find the application log files under userlogs; just go to the
directory of the container that triggers the error, as indicated by the RM
logs.

Hope it helps.
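To make the "go to the container's directory" step concrete, here is a small sketch. Assumptions: NodeManager container logs sit under a per-application directory such as <log-root>/application_<id>/container_<id>/{stdout,stderr,syslog}; the exact root depends on yarn.nodemanager.log-dirs and varies by install. It derives the application id from a failed attempt id and collects the matching log files:

```python
import os

# Log file names typically written per container.
LOG_NAMES = ("stderr", "stdout", "syslog")

def app_id_from_attempt(attempt_id):
    # attempt_1395628276810_0062_m_000149_0 -> application_1395628276810_0062
    parts = attempt_id.split("_")
    return "application_{}_{}".format(parts[1], parts[2])

def find_task_logs(log_root, attempt_id):
    """Collect stdout/stderr/syslog files under the failed attempt's
    application directory, wherever it sits below log_root."""
    app_id = app_id_from_attempt(attempt_id)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        if app_id in dirpath:
            hits.extend(os.path.join(dirpath, n)
                        for n in filenames if n in LOG_NAMES)
    return sorted(hits)
```

The stderr file of the failing container is usually where a streaming script's Python traceback ends up, so it is the first file worth reading.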

Regards,
Silvina



On 11 April 2014 09:15, Phan, Truong Q <Tr...@team.telstra.com> wrote:

> I could not find the "attempt_1395628276810_0062_m_000149_0 attemp*" in
> the HDFS "/tmp" directory.
> Where can I find these log files?
>
> Thanks and Regards,
> Truong Phan
>
>
> P    + 61 2 8576 5771
> M   + 61 4 1463 7424
> E    troung.phan@team.telstra.com
> W  www.telstra.com
>
>
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Thursday, 10 April 2014 4:32 PM
> To: <us...@hadoop.apache.org>
> Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed
> to run on a larger jobs
>
> It appears to me that whatever chunk of the input CSV files your map task
> 000149 gets, the program is unable to process it and throws an error and
> exits.
>
> Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to
> see if there's any stdout/stderr printed that may help. The syslog in the
> attempt's task log will also carry a "Processing split ..."
> message that may help you know which file and what offset+length under
> that file was being processed.
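The "Processing split ..." syslog line typically has the form "Processing split: <path>:<start>+<length>" (the FileSplit's string form); this format is assumed here rather than confirmed in the thread, so treat the sketch as a heuristic for pulling the file and byte range out of such a line:

```python
import re

# Assumed shape of the syslog line (FileSplit.toString()):
#   ... Processing split: hdfs://nn:8020/path/file.csv:<start>+<length>
_SPLIT_RE = re.compile(r"Processing split:\s+(\S+):(\d+)\+(\d+)")

def parse_split(syslog_line):
    """Return (path, start_offset, length) from a 'Processing split' line,
    or None if the line does not match."""
    m = _SPLIT_RE.search(syslog_line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))
```

Knowing the file and offset lets you extract just that slice of the CSV and feed it to the mapper script locally, which usually reproduces the crash outside the cluster.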
>
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_2, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> >
> > 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062
> > failed with state FAILED due to: Task failed
> > task_1395628276810_0062_m_000149
> >
> > Job failed as tasks failed. failedMaps:1 failedReduces:0
> >
> >
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
> >
> >         File System Counters
> >
> >                 FILE: Number of bytes read=0
> >
> >                 FILE: Number of bytes written=15667286
> >
> >                 FILE: Number of read operations=0
> >
> >                 FILE: Number of large read operations=0
> >
> >                 FILE: Number of write operations=0
> >
> >                 HDFS: Number of bytes read=21753912258
> >
> >                 HDFS: Number of bytes written=0
> >
> >                 HDFS: Number of read operations=486
> >
> >                 HDFS: Number of large read operations=0
> >
> >                 HDFS: Number of write operations=0
> >
> >         Job Counters
> >
> >                 Failed map tasks=4
> >
> >                 Killed map tasks=10
> >
> >                 Launched map tasks=176
> >
> >                 Other local map tasks=3
> >
> >                 Data-local map tasks=173
> >
> >                 Total time spent by all maps in occupied slots
> > (ms)=1035708
> >
> >                 Total time spent by all reduces in occupied slots
> > (ms)=0
> >
> >         Map-Reduce Framework
> >
> >                 Map input records=164217466
> >
> >                 Map output records=0
> >
> >                 Map output bytes=0
> >
> >                 Map output materialized bytes=414720
> >
> >                 Input split bytes=23490
> >
> >                 Combine input records=0
> >
> >                 Combine output records=0
> >
> >                 Spilled Records=0
> >
> >                 Failed Shuffles=0
> >
> >                 Merged Map outputs=0
> >
> >                 GC time elapsed (ms)=4750
> >
> >                 CPU time spent (ms)=321980
> >
> >                 Physical memory (bytes) snapshot=91335024640
> >
> >                 Virtual memory (bytes) snapshot=229819834368
> >
> >                 Total committed heap usage (bytes)=128240713728
> >
> >         File Input Format Counters
> >
> >                 Bytes Read=21753888768
> >
> > 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> >
> > Streaming Command Failed!
> >
> >
> >
> >
> >
> > Thanks and Regards,
> >
> > Truong Phan
>
>
>
> --
> Harsh J
>

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by Silvina Caíno Lores <si...@gmail.com>.
Hi!

I've faced the same issue a couple of times and found nothing in the logs
that led me to the source of the error. However, I've found that careful
container and block configuration can prevent these issues.

First of all, check the ResourceManager (RM) logs for any problematic
container, since the same task is failing every time (that split may be
violating the container's resource limits, which should show up in those
logs). For instance, in my case I was running a memory-intensive map, and in
large test cases some records needed more memory than others, so I observed
the behaviour you describe because containers were getting killed.

I usually find application log files under userlogs; just go to the
directory of the container that triggered the error, as indicated by the RM
logs.
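
The pointers above can be turned into a quick lookup. A minimal sketch,
assuming YARN log aggregation is enabled and using the application ID from
this thread purely as a placeholder (substitute your own values):

```shell
# Placeholder ID taken from the job submission output in this thread.
APP_ID="application_1395628276810_0062"
if command -v yarn >/dev/null 2>&1; then
    # With log aggregation on, this fetches stdout/stderr/syslog of every
    # container of the finished application in one stream.
    yarn logs -applicationId "$APP_ID"
else
    # Without the CLI (or without aggregation), the per-container logs live
    # on the NodeManager host under yarn.nodemanager.log-dirs, e.g.
    # .../userlogs/<application_id>/<container_id>/{stdout,stderr,syslog}
    echo "yarn CLI not found; check the NodeManager's local userlogs for $APP_ID"
fi
```

Grepping that output for the failed attempt ID should surface the mapper's
stderr.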

Hope it helps.

Regards,
Silvina



On 11 April 2014 09:15, Phan, Truong Q <Tr...@team.telstra.com> wrote:

> I could not find the "attempt_1395628276810_0062_m_000149_0 attemp*" in
> the HDFS "/tmp" directory.
> Where can I find these log files?
>
> Thanks and Regards,
> Truong Phan
>
>
> P    + 61 2 8576 5771
> M   + 61 4 1463 7424
> E    troung.phan@team.telstra.com
> W  www.telstra.com
>
>
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Thursday, 10 April 2014 4:32 PM
> To: <us...@hadoop.apache.org>
> Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed
> to run on a larger jobs
>
> It appears to me that whatever chunk of the input CSV files your map task
> 000149 gets, the program is unable to process it, so it throws an error
> and exits.
>
> Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to
> see if there's any stdout/stderr printed that may help. The syslog in the
> attempt's task log will also carry a "Processing split ..."
> message that may help you know which file and what offset+length under
> that file was being processed.
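
Once the syslog's "Processing split ..." line identifies the file and
offset, the failure can often be reproduced off-cluster, since Streaming
simply pipes the split's contents into the mapper's stdin. A sketch, not a
drop-in command — the script names come from the job script in this thread,
and `bad_split.csv` is a hypothetical local copy of the offending input:

```shell
# Assumes the streaming scripts and a local copy of the suspect CSV sit in
# the current directory.
INPUT=bad_split.csv   # placeholder for the file named in "Processing split"
if [ -x ./device-mapper-v1.py ] && [ -f "$INPUT" ]; then
    # Same data flow Hadoop Streaming uses: stdin -> mapper -> sort -> reducer.
    ./device-mapper-v1.py < "$INPUT" | sort | ./device-reducer-v1.py > /dev/null
    echo "pipeline exit status: $?"
else
    echo "place device-mapper-v1.py and $INPUT here to reproduce locally"
fi
```

A non-zero exit status here reproduces the "subprocess failed with code 1"
condition with the full Python traceback visible on the terminal.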
>
> On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q <
> Troung.Phan@team.telstra.com> wrote:
> > Hi
> >
> >
> >
> > My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> > Streaming job.
> >
> > I have no issue running the MapReduce Streaming job with a single input
> > CSV file of around 400 MB.
> >
> > However, it fails when I try to run the job with 11 input data files of
> > 400 MB each.
> >
> > The job failed with the following error.
> >
> >
> >
> > I would appreciate any hints or suggestions to fix this issue.
> >
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> > attempt_1395628276810_0062_m_000149_0 - exited :
> java.lang.RuntimeException:
> > PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report
> > from
> > attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> > PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
> > Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error:
> > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > MAPREDUCE SCRIPT:
> >
> > $ cat devices-hdfs-mr-PyIterGen-v3.sh
> >
> > #!/bin/sh
> >
> > export HADOOP_CMD=/usr/bin/hadoop
> >
> > export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
> >
> > export
> > HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hado
> > op-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
> >
> >
> >
> > # Clean up the previous runs
> >
> > sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
> >
> >
> >
> > sudo -u hdfs hadoop jar
> > $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
> >
> > -D mapreduce.job.reduces=160 \
> >
> > -files
> > ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,.
> > /nem-dms-stats-parameter.txt,./map-debug.py
> > \
> >
> > -mapper ./device-mapper-v1.py \
> >
> > -combiner ./device-combiner-v1.py \
> >
> > -reducer ./device-reducer-v1.py \
> >
> > -mapdebug ./map-debug.py \
> >
> > -input /data/db/bdms1p/input/*.csv \
> >
> > -output /data/db/bdms1p/device
> >
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > OUTPUT ON THE CONSOLE:
> >
> > $ ./devices-hdfs-mr-PyIterGen-v3.sh
> >
> > 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash
> configuration:
> > Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
> >
> > Moved:
> > 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device'
> > to trash at:
> > hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Curren
> > t
> >
> > packageJobJar: []
> > [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> > /tmp/streamjob781154149428893352.jar tmpDir=null
> >
> > 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> > at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> > at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to
> > process
> > : 106
> >
> > 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is
> deprecated.
> > Instead, use mapreduce.job.user.name
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is
> deprecated.
> > Instead, use mapreduce.job.jar
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.cache.files.filesizes is deprecated. Instead, use
> > mapreduce.job.cache.files.filesizes
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files
> > is deprecated. Instead, use mapreduce.job.cache.files
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks
> > is deprecated. Instead, use mapreduce.job.reduces
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.output.value.class is deprecated. Instead, use
> > mapreduce.job.output.value.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.mapoutput.value.class is deprecated. Instead, use
> > mapreduce.map.output.value.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.used.genericoptionsparser is deprecated. Instead, use
> > mapreduce.client.genericoptionsparser.used
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> > deprecated. Instead, use mapreduce.job.name
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> > deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> > deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.map.task.debug.script is deprecated. Instead, use
> > mapreduce.map.debug.script
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> > deprecated. Instead, use mapreduce.job.maps
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.cache.files.timestamps is deprecated. Instead, use
> > mapreduce.job.cache.files.timestamps
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.output.key.class is deprecated. Instead, use
> > mapreduce.job.output.key.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.mapoutput.key.class is deprecated. Instead, use
> > mapreduce.map.output.key.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir
> > is deprecated. Instead, use mapreduce.job.working.dir
> >
> > 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> > job_1395628276810_0062
> >
> > 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> > application_1395628276810_0062 to ResourceManager at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> > http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
> >
> > 14/04/10 10:26:33 INFO mapreduce.Job: Running job:
> > job_1395628276810_0062
> >
> > 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062
> > running in uber mode : false
> >
> > 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
> >
> > 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
> >
> > 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
> >
> > 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
> >
> > 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
> >
> > 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
> >
> > 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
> >
> > 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
> >
> > 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
> >
> > 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
> >
> > 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
> >
> > 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
> >
> > 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
> >
> > 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
> >
> > 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
> >
> > 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
> >
> > 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
> >
> > 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
> >
> > 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
> >
> > 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
> >
> > 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
> >
> > 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
> >
> > 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
> >
> > 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
> >
> > 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
> >
> > 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
> >
> > 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
> >
> > 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
> >
> > 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
> >
> > 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
> >
> > 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
> >
> > 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
> >
> > 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
> >
> > 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
> >
> > 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
> >
> > 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
> >
> > 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
> >
> > 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
> >
> > 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
> >
> > 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
> >
> > 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
> >
> > 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
> >
> > 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
> >
> > 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
> >
> > 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
> >
> > 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_0, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
> >
> > 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_1, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_2, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> >
> > 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062
> > failed with state FAILED due to: Task failed
> > task_1395628276810_0062_m_000149
> >
> > Job failed as tasks failed. failedMaps:1 failedReduces:0
> >
> >
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
> >
> >         File System Counters
> >
> >                 FILE: Number of bytes read=0
> >
> >                 FILE: Number of bytes written=15667286
> >
> >                 FILE: Number of read operations=0
> >
> >                 FILE: Number of large read operations=0
> >
> >                 FILE: Number of write operations=0
> >
> >                 HDFS: Number of bytes read=21753912258
> >
> >                 HDFS: Number of bytes written=0
> >
> >                 HDFS: Number of read operations=486
> >
> >                 HDFS: Number of large read operations=0
> >
> >                 HDFS: Number of write operations=0
> >
> >         Job Counters
> >
> >                 Failed map tasks=4
> >
> >                 Killed map tasks=10
> >
> >                 Launched map tasks=176
> >
> >                 Other local map tasks=3
> >
> >                 Data-local map tasks=173
> >
> >                 Total time spent by all maps in occupied slots
> > (ms)=1035708
> >
> >                 Total time spent by all reduces in occupied slots
> > (ms)=0
> >
> >         Map-Reduce Framework
> >
> >                 Map input records=164217466
> >
> >                 Map output records=0
> >
> >                 Map output bytes=0
> >
> >                 Map output materialized bytes=414720
> >
> >                 Input split bytes=23490
> >
> >                 Combine input records=0
> >
> >                 Combine output records=0
> >
> >                 Spilled Records=0
> >
> >                 Failed Shuffles=0
> >
> >                 Merged Map outputs=0
> >
> >                 GC time elapsed (ms)=4750
> >
> >                 CPU time spent (ms)=321980
> >
> >                 Physical memory (bytes) snapshot=91335024640
> >
> >                 Virtual memory (bytes) snapshot=229819834368
> >
> >                 Total committed heap usage (bytes)=128240713728
> >
> >         File Input Format Counters
> >
> >                 Bytes Read=21753888768
> >
> > 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> >
> > Streaming Command Failed!
> >
> >
> >
> >
> >
> > Thanks and Regards,
> >
> > Truong Phan
>
>
>
> --
> Harsh J
>

> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
> > Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error:
> > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > MAPREDUCE SCRIPT:
> >
> > $ cat devices-hdfs-mr-PyIterGen-v3.sh
> >
> > #!/bin/sh
> >
> > export HADOOP_CMD=/usr/bin/hadoop
> >
> > export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
> >
> > export
> > HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hado
> > op-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
> >
> >
> >
> > # Clean up the previous runs
> >
> > sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
> >
> >
> >
> > sudo -u hdfs hadoop jar
> > $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
> >
> > -D mapreduce.job.reduces=160 \
> >
> > -files
> > ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,.
> > /nem-dms-stats-parameter.txt,./map-debug.py
> > \
> >
> > -mapper ./device-mapper-v1.py \
> >
> > -combiner ./device-combiner-v1.py \
> >
> > -reducer ./device-reducer-v1.py \
> >
> > -mapdebug ./map-debug.py \
> >
> > -input /data/db/bdms1p/input/*.csv \
> >
> > -output /data/db/bdms1p/device
> >
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > OUTPUT ON THE CONSOLE:
> >
> > $ ./devices-hdfs-mr-PyIterGen-v3.sh
> >
> > 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash
> configuration:
> > Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
> >
> > Moved:
> > 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device'
> > to trash at:
> > hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Curren
> > t
> >
> > packageJobJar: []
> > [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> > /tmp/streamjob781154149428893352.jar tmpDir=null
> >
> > 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> > at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> > at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to
> > process
> > : 106
> >
> > 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is
> deprecated.
> > Instead, use mapreduce.job.user.name
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is
> deprecated.
> > Instead, use mapreduce.job.jar
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.cache.files.filesizes is deprecated. Instead, use
> > mapreduce.job.cache.files.filesizes
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files
> > is deprecated. Instead, use mapreduce.job.cache.files
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks
> > is deprecated. Instead, use mapreduce.job.reduces
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.output.value.class is deprecated. Instead, use
> > mapreduce.job.output.value.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.mapoutput.value.class is deprecated. Instead, use
> > mapreduce.map.output.value.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.used.genericoptionsparser is deprecated. Instead, use
> > mapreduce.client.genericoptionsparser.used
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> > deprecated. Instead, use mapreduce.job.name
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> > deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> > deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.map.task.debug.script is deprecated. Instead, use
> > mapreduce.map.debug.script
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> > deprecated. Instead, use mapreduce.job.maps
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.cache.files.timestamps is deprecated. Instead, use
> > mapreduce.job.cache.files.timestamps
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.output.key.class is deprecated. Instead, use
> > mapreduce.job.output.key.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation:
> > mapred.mapoutput.key.class is deprecated. Instead, use
> > mapreduce.map.output.key.class
> >
> > 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir
> > is deprecated. Instead, use mapreduce.job.working.dir
> >
> > 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> > job_1395628276810_0062
> >
> > 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> > application_1395628276810_0062 to ResourceManager at
> > bpdevdmsdbs01/172.18.127.245:8032
> >
> > 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> > http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
> >
> > 14/04/10 10:26:33 INFO mapreduce.Job: Running job:
> > job_1395628276810_0062
> >
> > 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062
> > running in uber mode : false
> >
> > 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
> >
> > 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
> >
> > 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
> >
> > 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
> >
> > 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
> >
> > 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
> >
> > 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
> >
> > 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
> >
> > 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
> >
> > 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
> >
> > 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
> >
> > 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
> >
> > 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
> >
> > 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
> >
> > 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
> >
> > 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
> >
> > 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
> >
> > 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
> >
> > 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
> >
> > 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
> >
> > 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
> >
> > 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
> >
> > 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
> >
> > 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
> >
> > 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
> >
> > 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
> >
> > 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
> >
> > 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
> >
> > 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
> >
> > 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
> >
> > 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
> >
> > 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
> >
> > 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
> >
> > 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
> >
> > 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
> >
> > 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
> >
> > 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
> >
> > 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
> >
> > 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
> >
> > 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
> >
> > 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
> >
> > 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
> >
> > 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
> >
> > 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
> >
> > 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
> >
> > 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_0, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
> >
> > 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_1, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_2, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> >
> > 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062
> > failed with state FAILED due to: Task failed
> > task_1395628276810_0062_m_000149
> >
> > Job failed as tasks failed. failedMaps:1 failedReduces:0
> >
> >
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
> >
> >         File System Counters
> >
> >                 FILE: Number of bytes read=0
> >
> >                 FILE: Number of bytes written=15667286
> >
> >                 FILE: Number of read operations=0
> >
> >                 FILE: Number of large read operations=0
> >
> >                 FILE: Number of write operations=0
> >
> >                 HDFS: Number of bytes read=21753912258
> >
> >                 HDFS: Number of bytes written=0
> >
> >                 HDFS: Number of read operations=486
> >
> >                 HDFS: Number of large read operations=0
> >
> >                 HDFS: Number of write operations=0
> >
> >         Job Counters
> >
> >                 Failed map tasks=4
> >
> >                 Killed map tasks=10
> >
> >                 Launched map tasks=176
> >
> >                 Other local map tasks=3
> >
> >                 Data-local map tasks=173
> >
> >                 Total time spent by all maps in occupied slots
> > (ms)=1035708
> >
> >                 Total time spent by all reduces in occupied slots
> > (ms)=0
> >
> >         Map-Reduce Framework
> >
> >                 Map input records=164217466
> >
> >                 Map output records=0
> >
> >                 Map output bytes=0
> >
> >                 Map output materialized bytes=414720
> >
> >                 Input split bytes=23490
> >
> >                 Combine input records=0
> >
> >                 Combine output records=0
> >
> >                 Spilled Records=0
> >
> >                 Failed Shuffles=0
> >
> >                 Merged Map outputs=0
> >
> >                 GC time elapsed (ms)=4750
> >
> >                 CPU time spent (ms)=321980
> >
> >                 Physical memory (bytes) snapshot=91335024640
> >
> >                 Virtual memory (bytes) snapshot=229819834368
> >
> >                 Total committed heap usage (bytes)=128240713728
> >
> >         File Input Format Counters
> >
> >                 Bytes Read=21753888768
> >
> > 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> >
> > Streaming Command Failed!
> >
> >
> >
> >
> >
> > Thanks and Regards,
> >
> > Truong Phan
>
>
>
> --
> Harsh J
>

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by Silvina Caíno Lores <si...@gmail.com>.
Hi!

I've faced the same issue a couple of times, and I found nothing in the logs
that led me to the source of the error. However, I've found that careful
container and block configuration can prevent these issues.

First of all, check the ResourceManager (RM) logs for any problematic
container, since the same task is failing every time (that split may be
violating the container's resource limits, which should show up in that
log). For instance, in my particular case I was running a memory-intensive
map, and in large test cases some records needed more memory than others;
hence I observed the behaviour you describe, because containers were
getting killed.
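
If containers are indeed being killed for exceeding their limits, one thing
to try is requesting larger map containers when resubmitting the streaming
job. The sketch below reuses the command from the original post with two
extra -D properties; the property names are the standard MRv2 ones, but the
2048 MB container / -Xmx1638m heap values are purely illustrative and must
be sized for your cluster (the heap is conventionally ~80% of the container):

```shell
# Illustrative sizing only: 2 GB per map container, JVM heap ~80% of that.
sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
  -D mapreduce.job.reduces=160 \
  -D mapreduce.map.memory.mb=2048 \
  -D "mapreduce.map.java.opts=-Xmx1638m" \
  -files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py \
  -mapper ./device-mapper-v1.py \
  -combiner ./device-combiner-v1.py \
  -reducer ./device-reducer-v1.py \
  -input /data/db/bdms1p/input/*.csv \
  -output /data/db/bdms1p/device
```

This is a cluster configuration fragment, so it can only be validated
against your own ResourceManager settings.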

I usually find the application log files under userlogs; just go to the
directory of the container that triggered the error, as indicated by the
RM logs.
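
Locating the right container directory can be sketched as a small search
helper. This is a minimal sketch assuming the usual YARN NodeManager layout
of <userlogs_root>/application_*/container_*/{stdout,stderr,syslog}; the
function name and the /var/log/hadoop-yarn/userlogs path are illustrative,
not something stated in this thread:

```python
import os

def find_containers_mentioning(userlogs_root, attempt_id):
    """Walk a YARN userlogs tree and return the container directories whose
    log files (stderr/stdout/syslog) mention the given task attempt id."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(userlogs_root):
        for name in filenames:
            if name not in ("stderr", "stdout", "syslog"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as fh:
                    if attempt_id in fh.read():
                        matches.append(dirpath)
                        break  # one hit is enough for this container dir
            except OSError:
                pass  # unreadable file: skip it
    return sorted(set(matches))

if __name__ == "__main__":
    # Adjust the root to your yarn.nodemanager.log-dirs setting.
    for d in find_containers_mentioning(
        "/var/log/hadoop-yarn/userlogs",
        "attempt_1395628276810_0062_m_000149_0",
    ):
        print(d)
```

Once the container directory is found, its stderr file usually carries the
Python traceback that made the streaming subprocess exit with code 1.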

Hope it helps.

Regards,
Silvina



> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
> >
> > 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
> >
> > 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_1, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> >
> > 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> > attempt_1395628276810_0062_m_000149_2, Status : FAILED
> >
> > Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess failed with code 1
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> > va:320)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> > 533)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
> >
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> >
> >         at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
> >
> >         at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> >
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> >
> >         at
> > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> >
> >         at java.security.AccessController.doPrivileged(Native Method)
> >
> >         at javax.security.auth.Subject.doAs(Subject.java:415)
> >
> >         at
> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> > ion.java:1491)
> >
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> >
> >
> >
> > 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> >
> > 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062
> > failed with state FAILED due to: Task failed
> > task_1395628276810_0062_m_000149
> >
> > Job failed as tasks failed. failedMaps:1 failedReduces:0
> >
> >
> >
> > 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
> >
> >         File System Counters
> >
> >                 FILE: Number of bytes read=0
> >
> >                 FILE: Number of bytes written=15667286
> >
> >                 FILE: Number of read operations=0
> >
> >                 FILE: Number of large read operations=0
> >
> >                 FILE: Number of write operations=0
> >
> >                 HDFS: Number of bytes read=21753912258
> >
> >                 HDFS: Number of bytes written=0
> >
> >                 HDFS: Number of read operations=486
> >
> >                 HDFS: Number of large read operations=0
> >
> >                 HDFS: Number of write operations=0
> >
> >         Job Counters
> >
> >                 Failed map tasks=4
> >
> >                 Killed map tasks=10
> >
> >                 Launched map tasks=176
> >
> >                 Other local map tasks=3
> >
> >                 Data-local map tasks=173
> >
> >                 Total time spent by all maps in occupied slots
> > (ms)=1035708
> >
> >                 Total time spent by all reduces in occupied slots
> > (ms)=0
> >
> >         Map-Reduce Framework
> >
> >                 Map input records=164217466
> >
> >                 Map output records=0
> >
> >                 Map output bytes=0
> >
> >                 Map output materialized bytes=414720
> >
> >                 Input split bytes=23490
> >
> >                 Combine input records=0
> >
> >                 Combine output records=0
> >
> >                 Spilled Records=0
> >
> >                 Failed Shuffles=0
> >
> >                 Merged Map outputs=0
> >
> >                 GC time elapsed (ms)=4750
> >
> >                 CPU time spent (ms)=321980
> >
> >                 Physical memory (bytes) snapshot=91335024640
> >
> >                 Virtual memory (bytes) snapshot=229819834368
> >
> >                 Total committed heap usage (bytes)=128240713728
> >
> >         File Input Format Counters
> >
> >                 Bytes Read=21753888768
> >
> > 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> >
> > Streaming Command Failed!
> >
> >
> >
> >
> >
> > Thanks and Regards,
> >
> > Truong Phan
>
>
>
> --
> Harsh J
>

RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by "Phan, Truong Q" <Tr...@team.telstra.com>.
I could not find the "attempt_1395628276810_0062_m_000149_0" attempt logs (attempt_*) in the HDFS "/tmp" directory.
Where can I find these log files?

Thanks and Regards,
Truong Phan


P    + 61 2 8576 5771
M   + 61 4 1463 7424
E    troung.phan@team.telstra.com
W  www.telstra.com


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, 10 April 2014 4:32 PM
To: <us...@hadoop.apache.org>
Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

It appears to me that whatever chunk of the input CSV files your map task 000149 gets, the program is unable to process it, so it throws an error and exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to see if there is any stdout/stderr output that may help. The syslog in the attempt's task log will also carry a "Processing split ..."
message that tells you which file, and what offset+length within that file, was being processed.
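For reference, the per-attempt stdout/stderr/syslog can usually be pulled from the command line rather than browsed on disk. This is a sketch, assuming YARN log aggregation is enabled (yarn.log-aggregation-enable=true); if it is not, the files live under yarn.nodemanager.log-dirs on the node that ran the attempt:

```sh
# Fetch all aggregated logs for the failing application, then locate
# the failing attempt's stderr/syslog sections.
yarn logs -applicationId application_1395628276810_0062 > app_0062.log
grep -B 2 -A 40 'attempt_1395628276810_0062_m_000149' app_0062.log
```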

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q <Tr...@team.telstra.com> wrote:
> Hi
>
>
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> Streaming job.
>
> I have no issue running the MapReduce Streaming job against a single
> input CSV file of around 400 MB.
>
> However, the job fails when I run it against 11 input data files of
> around 400 MB each.
>
> The job failed with the following error.
>
>
>
> I would appreciate any hints or suggestions to fix this issue.
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report
> from
> attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
> Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error:
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
>
> #!/bin/sh
>
> export HADOOP_CMD=/usr/bin/hadoop
>
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
>
> export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
>
>
> # Clean up the previous runs
>
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
>
>
> sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>
> -D mapreduce.job.reduces=160 \
>
> -files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py \
>
> -mapper ./device-mapper-v1.py \
>
> -combiner ./device-combiner-v1.py \
>
> -reducer ./device-reducer-v1.py \
>
> -mapdebug ./map-debug.py \
>
> -input /data/db/bdms1p/input/*.csv \
>
> -output /data/db/bdms1p/device
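[Editorial note on the error above: "PipeMapRed.waitOutputThreads(): subprocess failed with code 1" usually means the streaming mapper itself raised an unhandled exception on some record and exited non-zero. A defensive outer loop for the mapper is sketched below; the real field layout of device-mapper-v1.py is unknown, so the parsing here is hypothetical. Bad records are skipped and reported via Streaming's reporter:counter protocol on stderr instead of killing the task.]

```python
#!/usr/bin/env python
# Hypothetical defensive streaming mapper sketch. Any unhandled exception
# makes the subprocess exit non-zero, which Streaming surfaces as
# "subprocess failed with code 1"; this loop never lets one record do that.
import sys

def parse_line(line):
    """Parse one CSV record; return (key, value) or None if malformed."""
    fields = line.rstrip("\n").split(",")
    if len(fields) < 2:
        return None  # too few columns: treat as malformed, do not crash
    return fields[0], fields[1]

def main(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    for line in stdin:
        try:
            kv = parse_line(line)
        except Exception:
            # reporter:counter:<group>,<counter>,<amount> is the Hadoop
            # Streaming counter protocol, written to stderr.
            stderr.write("reporter:counter:mapper,parse_errors,1\n")
            continue
        if kv is None:
            stderr.write("reporter:counter:mapper,skipped_records,1\n")
            continue
        stdout.write("%s\t%s\n" % kv)

if __name__ == "__main__":
    main()
```

With a loop like this, malformed splits show up as counters in the job UI rather than as failed attempts, which also makes the offending input much easier to locate.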
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
>
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration:
> Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
>
> Moved: 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device'
> to trash at:
> hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
>
> packageJobJar: []
> [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> /tmp/streamjob781154149428893352.jar tmpDir=null
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to
> process
> : 106
>
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated.
> Instead, use mapreduce.job.user.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.filesizes is deprecated. Instead, use
> mapreduce.job.cache.files.filesizes
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files
> is deprecated. Instead, use mapreduce.job.cache.files
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks
> is deprecated. Instead, use mapreduce.job.reduces
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.output.value.class is deprecated. Instead, use
> mapreduce.job.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.map.task.debug.script is deprecated. Instead, use
> mapreduce.map.debug.script
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.timestamps is deprecated. Instead, use
> mapreduce.job.cache.files.timestamps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.output.key.class is deprecated. Instead, use
> mapreduce.job.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.key.class is deprecated. Instead, use
> mapreduce.map.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir
> is deprecated. Instead, use mapreduce.job.working.dir
>
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1395628276810_0062
>
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> application_1395628276810_0062 to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
>
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job:
> job_1395628276810_0062
>
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062
> running in uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
>
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
>
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
>
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
>
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
>
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
>
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
>
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
>
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
>
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
>
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
>
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
>
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
>
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
>
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
>
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
>
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
>
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
>
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
>
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
>
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
>
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
>
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
>
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
>
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
>
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
>
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
>
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
>
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
>
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
>
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
>
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
>
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
>
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
>
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
>
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
>
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
>
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
>
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
>
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
>
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
>
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
>
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
>
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
>
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_0, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
>
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_1, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_2, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
>
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
>
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062
> failed with state FAILED due to: Task failed
> task_1395628276810_0062_m_000149
>
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
>
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>
>         File System Counters
>
>                 FILE: Number of bytes read=0
>
>                 FILE: Number of bytes written=15667286
>
>                 FILE: Number of read operations=0
>
>                 FILE: Number of large read operations=0
>
>                 FILE: Number of write operations=0
>
>                 HDFS: Number of bytes read=21753912258
>
>                 HDFS: Number of bytes written=0
>
>                 HDFS: Number of read operations=486
>
>                 HDFS: Number of large read operations=0
>
>                 HDFS: Number of write operations=0
>
>         Job Counters
>
>                 Failed map tasks=4
>
>                 Killed map tasks=10
>
>                 Launched map tasks=176
>
>                 Other local map tasks=3
>
>                 Data-local map tasks=173
>
>                 Total time spent by all maps in occupied slots
> (ms)=1035708
>
>                 Total time spent by all reduces in occupied slots
> (ms)=0
>
>         Map-Reduce Framework
>
>                 Map input records=164217466
>
>                 Map output records=0
>
>                 Map output bytes=0
>
>                 Map output materialized bytes=414720
>
>                 Input split bytes=23490
>
>                 Combine input records=0
>
>                 Combine output records=0
>
>                 Spilled Records=0
>
>                 Failed Shuffles=0
>
>                 Merged Map outputs=0
>
>                 GC time elapsed (ms)=4750
>
>                 CPU time spent (ms)=321980
>
>                 Physical memory (bytes) snapshot=91335024640
>
>                 Virtual memory (bytes) snapshot=229819834368
>
>                 Total committed heap usage (bytes)=128240713728
>
>         File Input Format Counters
>
>                 Bytes Read=21753888768
>
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
>
> Streaming Command Failed!
>
>
>
>
>
> Thanks and Regards,
>
> Truong Phan



--
Harsh J

RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by "Phan, Truong Q" <Tr...@team.telstra.com>.
I could not find the "attempt_1395628276810_0062_m_000149_0 attemp*" in the HDFS "/tmp" directory.
Where can I find these log files.

Thanks and Regards,
Truong Phan


P    + 61 2 8576 5771
M   + 61 4 1463 7424
E    troung.phan@team.telstra.com
W  www.telstra.com


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, 10 April 2014 4:32 PM
To: <us...@hadoop.apache.org>
Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

It appears to me that whatever chunk of the input CSV files your map task 000149 gets, the program is unable to process it and throws an error and exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to see if there's any stdout/stderr printed that may help. The syslog in the attempt's task log will also carry a "Processing split ..."
message that may help you know which file and what offset+length under that file was being processed.

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q <Tr...@team.telstra.com> wrote:
> Hi
>
>
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 is failed to run on a larger MapReduce
> Streaming job.
>
> I have no issue in running the MapReduce Streaming job which has an
> input data file of around 400Mb CSV file.
>
> However, it is failed when I try to run the job which has 11 input
> data files of size 400Mb each.
>
> The job failed with the following error.
>
>
>
> I appreciate for any hints or suggestions to fix this issue.
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>         at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [stack trace identical to the one above]
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [stack trace identical to the one above]
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
> #!/bin/sh
> export HADOOP_CMD=/usr/bin/hadoop
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
> export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
> # Clean up the previous runs
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
> sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>     -D mapreduce.job.reduces=160 \
>     -files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py \
>     -mapper ./device-mapper-v1.py \
>     -combiner ./device-combiner-v1.py \
>     -reducer ./device-reducer-v1.py \
>     -mapdebug ./map-debug.py \
>     -input /data/db/bdms1p/input/*.csv \
>     -output /data/db/bdms1p/device
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
> Moved: 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to trash at: hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
> packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar] /tmp/streamjob781154149428893352.jar tmpDir=null
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process : 106
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.task.debug.script is deprecated. Instead, use mapreduce.map.debug.script
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1395628276810_0062
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application application_1395628276810_0062 to ResourceManager at bpdevdmsdbs01/172.18.127.245:8032
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job: http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_0, Status : FAILED
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [stack trace identical to the one above]
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_1, Status : FAILED
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [stack trace identical to the one above]
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_2, Status : FAILED
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [stack trace identical to the one above]
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with state FAILED due to: Task failed task_1395628276810_0062_m_000149
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>         File System Counters
>                 FILE: Number of bytes read=0
>                 FILE: Number of bytes written=15667286
>                 FILE: Number of read operations=0
>                 FILE: Number of large read operations=0
>                 FILE: Number of write operations=0
>                 HDFS: Number of bytes read=21753912258
>                 HDFS: Number of bytes written=0
>                 HDFS: Number of read operations=486
>                 HDFS: Number of large read operations=0
>                 HDFS: Number of write operations=0
>         Job Counters
>                 Failed map tasks=4
>                 Killed map tasks=10
>                 Launched map tasks=176
>                 Other local map tasks=3
>                 Data-local map tasks=173
>                 Total time spent by all maps in occupied slots (ms)=1035708
>                 Total time spent by all reduces in occupied slots (ms)=0
>         Map-Reduce Framework
>                 Map input records=164217466
>                 Map output records=0
>                 Map output bytes=0
>                 Map output materialized bytes=414720
>                 Input split bytes=23490
>                 Combine input records=0
>                 Combine output records=0
>                 Spilled Records=0
>                 Failed Shuffles=0
>                 Merged Map outputs=0
>                 GC time elapsed (ms)=4750
>                 CPU time spent (ms)=321980
>                 Physical memory (bytes) snapshot=91335024640
>                 Virtual memory (bytes) snapshot=229819834368
>                 Total committed heap usage (bytes)=128240713728
>         File Input Format Counters
>                 Bytes Read=21753888768
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> Streaming Command Failed!
>
> Thanks and Regards,
> Truong Phan



--
Harsh J

RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by "Phan, Truong Q" <Tr...@team.telstra.com>.
I could not find "attempt_1395628276810_0062_m_000149_0 attemp*" in the HDFS "/tmp" directory.
Where can I find these log files?
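
[Editor's note: a sketch of how such task-attempt logs are usually retrieved on a YARN cluster. The commands below assume the stock `yarn logs` CLI and that log aggregation (`yarn.log-aggregation-enable=true`) is turned on; the grep target and local paths are illustrative, not taken from this thread.]

    # Pull all aggregated container logs for the failed application:
    yarn logs -applicationId application_1395628276810_0062 > /tmp/app_0062.log
    # The attempt's syslog section names the input file and offset+length being processed:
    grep "Processing split" /tmp/app_0062.log
    # Without aggregation, the logs stay on each NodeManager host under
    # ${yarn.nodemanager.log-dirs}/application_1395628276810_0062/<container_id>/
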

Thanks and Regards,
Truong Phan


P    + 61 2 8576 5771
M   + 61 4 1463 7424
E    troung.phan@team.telstra.com
W  www.telstra.com


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, 10 April 2014 4:32 PM
To: <us...@hadoop.apache.org>
Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

It appears that whatever chunk of the input CSV files your map task 000149 is given, the program is unable to process it: it throws an error and exits with a non-zero code.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to see if there is any stdout/stderr output that may help. The syslog in the attempt's task log will also carry a "Processing split ..."
message telling you which file, and what offset+length within that file, was being processed.
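
[Editor's note: this failure mode — a Python mapper that works on one 400 MB file but dies with exit code 1 on a larger input set — is often an unhandled exception on a single malformed record. A minimal defensive wrapper is sketched below; the field layout, delimiter, and names are assumptions for illustration, not Phan's actual device-mapper-v1.py.]

```python
#!/usr/bin/env python
"""Hypothetical defensive streaming mapper: a malformed CSV row is counted
and reported on stderr instead of crashing the task with exit code 1."""
import sys

def map_line(line):
    # Assumed record layout: device_id,metric,value -- adjust to the real schema.
    device_id, metric, value = line.rstrip("\n").split(",")[:3]
    return "%s\t%s:%s" % (device_id, metric, float(value))

def main(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    bad = 0
    for line in stdin:
        try:
            stdout.write(map_line(line) + "\n")
        except (ValueError, IndexError) as exc:
            bad += 1
            # Hadoop Streaming picks counter updates off stderr lines like this:
            stderr.write("reporter:counter:mapper,bad_records,1\n")
            stderr.write("skipping bad record %r: %s\n" % (line[:100], exc))
    return bad

if __name__ == "__main__":
    main()
```

With this in place the job keeps running past bad rows, and the `bad_records` counter plus the stderr lines in the attempt's task log show exactly which records were at fault.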

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q <Tr...@team.telstra.com> wrote:
> Hi
>
>
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 is failed to run on a larger MapReduce
> Streaming job.
>
> I have no issue in running the MapReduce Streaming job which has an
> input data file of around 400Mb CSV file.
>
> However, it is failed when I try to run the job which has 11 input
> data files of size 400Mb each.
>
> The job failed with the following error.
>
>
>
> I appreciate for any hints or suggestions to fix this issue.
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report
> from
> attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
> Diagnostics report from attempt_1395628276810_0062_m_000149_0: Error:
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
>
> #!/bin/sh
>
> export HADOOP_CMD=/usr/bin/hadoop
>
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
>
> export
> HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hado
> op-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
>
>
> # Clean up the previous runs
>
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
>
>
> sudo -u hdfs hadoop jar
> $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>
> -D mapreduce.job.reduces=160 \
>
> -files
> ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,.
> /nem-dms-stats-parameter.txt,./map-debug.py
> \
>
> -mapper ./device-mapper-v1.py \
>
> -combiner ./device-combiner-v1.py \
>
> -reducer ./device-reducer-v1.py \
>
> -mapdebug ./map-debug.py \
>
> -input /data/db/bdms1p/input/*.csv \
>
> -output /data/db/bdms1p/device
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
>
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration:
> Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
>
> Moved:
> 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device'
> to trash at:
> hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Curren
> t
>
> packageJobJar: []
> [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> /tmp/streamjob781154149428893352.jar tmpDir=null
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager
> at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to
> process
> : 106
>
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated.
> Instead, use mapreduce.job.user.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.filesizes is deprecated. Instead, use
> mapreduce.job.cache.files.filesizes
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files
> is deprecated. Instead, use mapreduce.job.cache.files
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks
> is deprecated. Instead, use mapreduce.job.reduces
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.output.value.class is deprecated. Instead, use
> mapreduce.job.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.map.task.debug.script is deprecated. Instead, use
> mapreduce.map.debug.script
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.timestamps is deprecated. Instead, use
> mapreduce.job.cache.files.timestamps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.output.key.class is deprecated. Instead, use
> mapreduce.job.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.key.class is deprecated. Instead, use
> mapreduce.map.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir
> is deprecated. Instead, use mapreduce.job.working.dir
>
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1395628276810_0062
>
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> application_1395628276810_0062 to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
>
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job:
> job_1395628276810_0062
>
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062
> running in uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
>
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
>
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
>
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
>
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
>
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
>
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
>
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
>
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
>
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
>
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
>
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
>
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
>
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
>
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
>
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
>
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
>
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
>
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
>
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
>
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
>
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
>
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
>
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
>
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
>
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
>
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
>
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
>
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
>
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
>
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
>
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
>
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
>
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
>
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
>
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
>
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
>
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
>
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
>
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
>
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
>
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
>
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
>
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
>
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_0, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
>
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_1, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.ja
> va:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:
> 533)
>
>         at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_2, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>         at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
>
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
>
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with state FAILED due to: Task failed task_1395628276810_0062_m_000149
>
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
>
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>
>         File System Counters
>
>                 FILE: Number of bytes read=0
>
>                 FILE: Number of bytes written=15667286
>
>                 FILE: Number of read operations=0
>
>                 FILE: Number of large read operations=0
>
>                 FILE: Number of write operations=0
>
>                 HDFS: Number of bytes read=21753912258
>
>                 HDFS: Number of bytes written=0
>
>                 HDFS: Number of read operations=486
>
>                 HDFS: Number of large read operations=0
>
>                 HDFS: Number of write operations=0
>
>         Job Counters
>
>                 Failed map tasks=4
>
>                 Killed map tasks=10
>
>                 Launched map tasks=176
>
>                 Other local map tasks=3
>
>                 Data-local map tasks=173
>
>                 Total time spent by all maps in occupied slots (ms)=1035708
>
>                 Total time spent by all reduces in occupied slots (ms)=0
>
>         Map-Reduce Framework
>
>                 Map input records=164217466
>
>                 Map output records=0
>
>                 Map output bytes=0
>
>                 Map output materialized bytes=414720
>
>                 Input split bytes=23490
>
>                 Combine input records=0
>
>                 Combine output records=0
>
>                 Spilled Records=0
>
>                 Failed Shuffles=0
>
>                 Merged Map outputs=0
>
>                 GC time elapsed (ms)=4750
>
>                 CPU time spent (ms)=321980
>
>                 Physical memory (bytes) snapshot=91335024640
>
>                 Virtual memory (bytes) snapshot=229819834368
>
>                 Total committed heap usage (bytes)=128240713728
>
>         File Input Format Counters
>
>                 Bytes Read=21753888768
>
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
>
> Streaming Command Failed!
>
>
>
>
>
> Thanks and Regards,
>
> Truong Phan



--
Harsh J

RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by "Phan, Truong Q" <Tr...@team.telstra.com>.
I could not find the "attempt_1395628276810_0062_m_000149_0" attempt logs (searched for "attemp*") in the HDFS "/tmp" directory.
Where can I find these log files?
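[Editor's note: with YARN, finished-task logs are usually not under HDFS /tmp; if log aggregation is enabled on this cluster (an assumption), they can be pulled with the `yarn logs` CLI. A minimal sketch, using the attempt ID from the error above:]

```shell
# Derive the application ID from the failing attempt ID
# (attempt_<cluster-ts>_<job>_m_<task>_<try> -> application_<cluster-ts>_<job>).
ATTEMPT="attempt_1395628276810_0062_m_000149_0"
APP_ID="application_$(echo "$ATTEMPT" | cut -d_ -f2-3)"
echo "$APP_ID"   # application_1395628276810_0062

# With yarn.log-aggregation-enable=true, fetch every container's
# stdout/stderr/syslog for the finished job:
# yarn logs -applicationId "$APP_ID" | less

# Without aggregation, the logs stay on each NodeManager host under
# ${yarn.nodemanager.log-dirs}/application_.../container_.../
```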

Thanks and Regards,
Truong Phan


P    + 61 2 8576 5771
M   + 61 4 1463 7424
E    troung.phan@team.telstra.com
W  www.telstra.com


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, 10 April 2014 4:32 PM
To: <us...@hadoop.apache.org>
Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

It appears to me that whatever chunk of the input CSV files your map task 000149 gets, the program is unable to process it and throws an error and exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to see if there's any stdout/stderr printed that may help. The syslog in the attempt's task log will also carry a "Processing split ..."
message that may help you know which file and what offset+length under that file was being processed.




--
Harsh J

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by Harsh J <ha...@cloudera.com>.
It appears to me that whatever chunk of the input CSV files your map
task 000149 gets, the program is unable to process it and throws an
error and exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log
to see if there's any stdout/stderr printed that may help. The syslog
in the attempt's task log will also carry a "Processing split ..."
message that may help you know which file and what offset+length under
that file was being processed.
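[Editor's note: one way to act on this advice is to make the streaming mapper itself surface the bad input on stderr, where it lands in the attempt's task log. A hedged sketch; process() is a hypothetical stand-in for the real per-line logic in device-mapper-v1.py:]

```python
#!/usr/bin/env python
# Sketch: wrap the per-line logic so one malformed CSV line is logged,
# counted, and skipped instead of killing the streaming subprocess
# with exit code 1.
import sys
import traceback

def process(line):
    # Hypothetical stand-in for the real logic in device-mapper-v1.py.
    fields = line.rstrip("\n").split(",")
    return "\t".join(fields[:2])

def run(stream, out=sys.stdout, err=sys.stderr):
    for lineno, line in enumerate(stream, 1):
        try:
            out.write(process(line) + "\n")
        except Exception:
            # Streaming convention: stderr lines of this form bump a job counter.
            err.write("reporter:counter:mapper,bad_lines,1\n")
            err.write("skipping input line %d: %r\n%s"
                      % (lineno, line, traceback.format_exc()))

if __name__ == "__main__":
    run(sys.stdin)
```

Re-run with such a wrapper and the job finishes while the bad_lines counter and stderr entries pinpoint the records that the failing split contains.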

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q
<Tr...@team.telstra.com> wrote:
> Hi
>
>
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> Streaming job.
>
> I have no issue running the MapReduce Streaming job against a single
> input CSV file of around 400 MB.
>
> However, the job fails when I run it against 11 input data files of
> about 400 MB each.
>
> The job failed with the following error.
>
>
>
> I would appreciate any hints or suggestions to fix this issue.
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from
> attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1395628276810_0062_m_000149_0: Error:
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
>
> #!/bin/sh
>
> export HADOOP_CMD=/usr/bin/hadoop
>
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
>
> export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
>
>
> # Clean up the previous runs
>
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
>
>
> sudo -u hdfs hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>
> -D mapreduce.job.reduces=160 \
>
> -files ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py \
>
> -mapper ./device-mapper-v1.py \
>
> -combiner ./device-combiner-v1.py \
>
> -reducer ./device-reducer-v1.py \
>
> -mapdebug ./map-debug.py \
>
> -input /data/db/bdms1p/input/*.csv \
>
> -output /data/db/bdms1p/device
>
>
>
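[Editor's note: before resubmitting to the cluster, the streaming stages in the script above can be reproduced locally on a sample of the input, which surfaces the Python traceback directly on the terminal. A hedged sketch; sample.csv and the cut stage are hypothetical stand-ins for a real input extract and device-mapper-v1.py:]

```shell
set -o pipefail   # fail the pipeline if any stage exits non-zero, as on the cluster

# Pull a sample of the input (path from the script above), e.g.:
# hadoop fs -cat /data/db/bdms1p/input/some.csv | head -n 100000 > sample.csv
printf '2,b\n1,a\n' > sample.csv   # tiny stand-in sample

# Local dry run of the streaming stages; replace cut/sort with the real scripts:
# cat sample.csv | ./device-mapper-v1.py | sort | ./device-combiner-v1.py | ./device-reducer-v1.py
cut -d, -f1 sample.csv | sort

rm -f sample.csv
```

If the mapper exits non-zero here, the traceback printed locally is the same failure that the cluster reports only as "subprocess failed with code 1".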
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
>
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration:
> Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
>
> Moved:
> 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to
> trash at:
> hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
>
> packageJobJar: []
> [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> /tmp/streamjob781154149428893352.jar tmpDir=null
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process
> : 106
>
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated.
> Instead, use mapreduce.job.user.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.filesizes is deprecated. Instead, use
> mapreduce.job.cache.files.filesizes
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is
> deprecated. Instead, use mapreduce.job.cache.files
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is
> deprecated. Instead, use mapreduce.job.reduces
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class
> is deprecated. Instead, use mapreduce.job.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.map.task.debug.script is deprecated. Instead, use
> mapreduce.map.debug.script
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.timestamps is deprecated. Instead, use
> mapreduce.job.cache.files.timestamps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class
> is deprecated. Instead, use mapreduce.map.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
>
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1395628276810_0062
>
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> application_1395628276810_0062 to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
>
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
>
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in
> uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
>
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
>
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
>
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
>
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
>
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
>
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
>
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
>
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
>
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
>
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
>
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
>
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
>
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
>
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
>
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
>
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
>
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
>
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
>
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
>
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
>
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
>
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
>
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
>
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
>
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
>
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
>
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
>
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
>
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
>
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
>
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
>
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
>
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
>
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
>
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
>
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
>
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
>
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
>
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
>
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
>
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
>
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
>
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
>
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_0, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
>
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_1, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_2, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
>
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
>
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with
> state FAILED due to: Task failed task_1395628276810_0062_m_000149
>
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
>
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>
>         File System Counters
>
>                 FILE: Number of bytes read=0
>
>                 FILE: Number of bytes written=15667286
>
>                 FILE: Number of read operations=0
>
>                 FILE: Number of large read operations=0
>
>                 FILE: Number of write operations=0
>
>                 HDFS: Number of bytes read=21753912258
>
>                 HDFS: Number of bytes written=0
>
>                 HDFS: Number of read operations=486
>
>                 HDFS: Number of large read operations=0
>
>                 HDFS: Number of write operations=0
>
>         Job Counters
>
>                 Failed map tasks=4
>
>                 Killed map tasks=10
>
>                 Launched map tasks=176
>
>                 Other local map tasks=3
>
>                 Data-local map tasks=173
>
>                 Total time spent by all maps in occupied slots (ms)=1035708
>
>                 Total time spent by all reduces in occupied slots (ms)=0
>
>         Map-Reduce Framework
>
>                 Map input records=164217466
>
>                 Map output records=0
>
>                 Map output bytes=0
>
>                 Map output materialized bytes=414720
>
>                 Input split bytes=23490
>
>                 Combine input records=0
>
>                 Combine output records=0
>
>                 Spilled Records=0
>
>                 Failed Shuffles=0
>
>                 Merged Map outputs=0
>
>                 GC time elapsed (ms)=4750
>
>                 CPU time spent (ms)=321980
>
>                 Physical memory (bytes) snapshot=91335024640
>
>                 Virtual memory (bytes) snapshot=229819834368
>
>                 Total committed heap usage (bytes)=128240713728
>
>         File Input Format Counters
>
>                 Bytes Read=21753888768
>
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
>
> Streaming Command Failed!
>
>
>
>
>
> Thanks and Regards,
>
> Truong Phan



-- 
Harsh J
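[Editor's note: a "subprocess failed with code 1" error that appears only on one split, as diagnosed above, is frequently a mapper script that exits on a malformed input record. The sketch below shows a defensive pattern for a streaming mapper; the two-column layout and the minimum-field check are hypothetical, since the original device-mapper-v1.py is not shown in the thread. The reporter:counter stderr line is the standard Hadoop Streaming counter protocol.]

```python
#!/usr/bin/env python
# Defensive Hadoop Streaming mapper sketch (hypothetical field layout).
# A single malformed CSV row is counted and skipped instead of crashing
# the whole task attempt with a non-zero exit code.
import sys


def map_stream(lines, out=sys.stdout, err=sys.stderr):
    """Emit key<TAB>value per CSV line; skip rows with too few fields."""
    bad = 0
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if len(fields) < 2:  # hypothetical minimum column count
            bad += 1
            # Hadoop Streaming reads counter updates from stderr in the
            # form: reporter:counter:<group>,<counter>,<amount>
            err.write("reporter:counter:csv,malformed,1\n")
            continue
        out.write("%s\t%s\n" % (fields[0], fields[1]))
    return bad


if __name__ == "__main__":
    map_stream(sys.stdin)
```

With this pattern the task completes and the skipped-row count is visible in the job's counters, so the bad split (here, whatever chunk task 000149 received) can be located without the whole job failing.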


>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
>
> #!/bin/sh
>
> export HADOOP_CMD=/usr/bin/hadoop
>
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
>
> export
> HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
>
>
> # Clean up the previous runs
>
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
>
>
> sudo -u hdfs hadoop jar
> $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>
> -D mapreduce.job.reduces=160 \
>
> -files
> ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py
> \
>
> -mapper ./device-mapper-v1.py \
>
> -combiner ./device-combiner-v1.py \
>
> -reducer ./device-reducer-v1.py \
>
> -mapdebug ./map-debug.py \
>
> -input /data/db/bdms1p/input/*.csv \
>
> -output /data/db/bdms1p/device
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
>
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration:
> Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
>
> Moved:
> 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to
> trash at:
> hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
>
> packageJobJar: []
> [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> /tmp/streamjob781154149428893352.jar tmpDir=null
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process
> : 106
>
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated.
> Instead, use mapreduce.job.user.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.filesizes is deprecated. Instead, use
> mapreduce.job.cache.files.filesizes
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is
> deprecated. Instead, use mapreduce.job.cache.files
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is
> deprecated. Instead, use mapreduce.job.reduces
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class
> is deprecated. Instead, use mapreduce.job.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.map.task.debug.script is deprecated. Instead, use
> mapreduce.map.debug.script
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.timestamps is deprecated. Instead, use
> mapreduce.job.cache.files.timestamps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class
> is deprecated. Instead, use mapreduce.map.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
>
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1395628276810_0062
>
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> application_1395628276810_0062 to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
>
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
>
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in
> uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
>
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
>
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
>
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
>
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
>
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
>
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
>
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
>
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
>
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
>
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
>
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
>
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
>
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
>
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
>
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
>
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
>
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
>
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
>
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
>
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
>
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
>
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
>
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
>
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
>
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
>
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
>
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
>
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
>
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
>
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
>
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
>
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
>
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
>
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
>
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
>
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
>
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
>
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
>
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
>
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
>
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
>
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
>
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
>
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_0, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
>
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_1, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
>
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_2, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
>
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
>
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with
> state FAILED due to: Task failed task_1395628276810_0062_m_000149
>
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
>
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>
>         File System Counters
>
>                 FILE: Number of bytes read=0
>
>                 FILE: Number of bytes written=15667286
>
>                 FILE: Number of read operations=0
>
>                 FILE: Number of large read operations=0
>
>                 FILE: Number of write operations=0
>
>                 HDFS: Number of bytes read=21753912258
>
>                 HDFS: Number of bytes written=0
>
>                 HDFS: Number of read operations=486
>
>                 HDFS: Number of large read operations=0
>
>                 HDFS: Number of write operations=0
>
>         Job Counters
>
>                 Failed map tasks=4
>
>                 Killed map tasks=10
>
>                 Launched map tasks=176
>
>                 Other local map tasks=3
>
>                 Data-local map tasks=173
>
>                 Total time spent by all maps in occupied slots (ms)=1035708
>
>                 Total time spent by all reduces in occupied slots (ms)=0
>
>         Map-Reduce Framework
>
>                 Map input records=164217466
>
>                 Map output records=0
>
>                 Map output bytes=0
>
>                 Map output materialized bytes=414720
>
>                 Input split bytes=23490
>
>                 Combine input records=0
>
>                 Combine output records=0
>
>                 Spilled Records=0
>
>                 Failed Shuffles=0
>
>                 Merged Map outputs=0
>
>                 GC time elapsed (ms)=4750
>
>                 CPU time spent (ms)=321980
>
>                 Physical memory (bytes) snapshot=91335024640
>
>                 Virtual memory (bytes) snapshot=229819834368
>
>                 Total committed heap usage (bytes)=128240713728
>
>         File Input Format Counters
>
>                 Bytes Read=21753888768
>
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
>
> Streaming Command Failed!
>
>
>
>
>
> Thanks and Regards,
>
> Truong Phan



-- 
Harsh J

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

Posted by Harsh J <ha...@cloudera.com>.
It appears to me that whatever chunk of the input CSV files your map
task 000149 gets, the program is unable to process it, so it throws an
error and exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log
to see if there's any stdout/stderr printed that may help. The syslog
in the attempt's task log will also carry a "Processing split ..."
message that may help you know which file and what offset+length under
that file was being processed.
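To follow that advice from the command line: once the job has finished, the failed attempt's logs can be pulled with yarn logs (this assumes YARN log aggregation is enabled on the cluster; the application id is the one printed in the console output below):

```shell
# Pull the aggregated logs for the failed application.
yarn logs -applicationId application_1395628276810_0062 > app.log

# Which input file and offset+length was task 000149 processing?
grep "Processing split" app.log | grep "m_000149"

# Any traceback the mapper printed to stderr before exiting with code 1?
grep -A 10 "Traceback" app.log
```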

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q
<Tr...@team.telstra.com> wrote:
> Hi
>
>
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> Streaming job.
>
> I have no issue running the MapReduce Streaming job with a single input
> CSV file of around 400 MB.
>
> However, the job fails when I try to run it with 11 input data files of
> about 400 MB each.
>
> The job failed with the following error.
>
>
>
> I would appreciate any hints or suggestions to fix this issue.
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from
> attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1395628276810_0062_m_000149_0: Error:
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> MAPREDUCE SCRIPT:
>
> $ cat devices-hdfs-mr-PyIterGen-v3.sh
>
> #!/bin/sh
>
> export HADOOP_CMD=/usr/bin/hadoop
>
> export HADOOP_HOME=/usr/lib/hadoop-0.20-mapreduce
>
> export
> HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.2.0-mr1-cdh5.0.0-beta-1.jar
>
>
>
> # Clean up the previous runs
>
> sudo -u hdfs hadoop fs -rm -f -R /data/db/bdms1p/device
>
>
>
> sudo -u hdfs hadoop jar
> $HADOOP_HOME/contrib/streaming/hadoop-*streaming*.jar \
>
> -D mapreduce.job.reduces=160 \
>
> -files
> ./device-mapper-v1.py,./device-combiner-v1.py,./device-reducer-v1.py,./nem-dms-stats-parameter.txt,./map-debug.py
> \
>
> -mapper ./device-mapper-v1.py \
>
> -combiner ./device-combiner-v1.py \
>
> -reducer ./device-reducer-v1.py \
>
> -mapdebug ./map-debug.py \
>
> -input /data/db/bdms1p/input/*.csv \
>
> -output /data/db/bdms1p/device
>
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> OUTPUT ON THE CONSOLE:
>
> $ ./devices-hdfs-mr-PyIterGen-v3.sh
>
> 14/04/10 10:26:27 INFO fs.TrashPolicyDefault: Namenode trash configuration:
> Deletion interval = 86400000 minutes, Emptier interval = 0 minutes.
>
> Moved:
> 'hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/data/db/bdms1p/device' to
> trash at:
> hdfs://nsda3dmsrpt02.internal.bigpond.com:8020/user/hdfs/.Trash/Current
>
> packageJobJar: []
> [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0-cdh5.0.0-beta-1.jar]
> /tmp/streamjob781154149428893352.jar tmpDir=null
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:30 INFO client.RMProxy: Connecting to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:32 INFO mapred.FileInputFormat: Total input paths to process
> : 106
>
> 14/04/10 10:26:32 INFO mapreduce.JobSubmitter: number of splits:317
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: user.name is deprecated.
> Instead, use mapreduce.job.user.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.filesizes is deprecated. Instead, use
> mapreduce.job.cache.files.filesizes
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.cache.files is
> deprecated. Instead, use mapreduce.job.cache.files
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.reduce.tasks is
> deprecated. Instead, use mapreduce.job.reduces
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.value.class
> is deprecated. Instead, use mapreduce.job.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.mapoutput.value.class is deprecated. Instead, use
> mapreduce.map.output.value.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.map.task.debug.script is deprecated. Instead, use
> mapreduce.map.debug.script
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation:
> mapred.cache.files.timestamps is deprecated. Instead, use
> mapreduce.job.cache.files.timestamps
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.mapoutput.key.class
> is deprecated. Instead, use mapreduce.map.output.key.class
>
> 14/04/10 10:26:32 INFO Configuration.deprecation: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
>
> 14/04/10 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1395628276810_0062
>
> 14/04/10 10:26:33 INFO impl.YarnClientImpl: Submitted application
> application_1395628276810_0062 to ResourceManager at
> bpdevdmsdbs01/172.18.127.245:8032
>
> 14/04/10 10:26:33 INFO mapreduce.Job: The url to track the job:
> http://bpdevdmsdbs01:8088/proxy/application_1395628276810_0062/
>
> 14/04/10 10:26:33 INFO mapreduce.Job: Running job: job_1395628276810_0062
>
> 14/04/10 10:26:42 INFO mapreduce.Job: Job job_1395628276810_0062 running in
> uber mode : false
>
> 14/04/10 10:26:42 INFO mapreduce.Job:  map 0% reduce 0%
>
> 14/04/10 10:26:51 INFO mapreduce.Job:  map 1% reduce 0%
>
> 14/04/10 10:26:52 INFO mapreduce.Job:  map 2% reduce 0%
>
> 14/04/10 10:26:53 INFO mapreduce.Job:  map 3% reduce 0%
>
> 14/04/10 10:26:55 INFO mapreduce.Job:  map 4% reduce 0%
>
> 14/04/10 10:26:58 INFO mapreduce.Job:  map 5% reduce 0%
>
> 14/04/10 10:26:59 INFO mapreduce.Job:  map 6% reduce 0%
>
> 14/04/10 10:27:01 INFO mapreduce.Job:  map 7% reduce 0%
>
> 14/04/10 10:27:02 INFO mapreduce.Job:  map 8% reduce 0%
>
> 14/04/10 10:27:04 INFO mapreduce.Job:  map 9% reduce 0%
>
> 14/04/10 10:27:06 INFO mapreduce.Job:  map 10% reduce 0%
>
> 14/04/10 10:27:08 INFO mapreduce.Job:  map 11% reduce 0%
>
> 14/04/10 10:27:10 INFO mapreduce.Job:  map 12% reduce 0%
>
> 14/04/10 10:27:12 INFO mapreduce.Job:  map 13% reduce 0%
>
> 14/04/10 10:27:13 INFO mapreduce.Job:  map 14% reduce 0%
>
> 14/04/10 10:27:15 INFO mapreduce.Job:  map 15% reduce 0%
>
> 14/04/10 10:27:18 INFO mapreduce.Job:  map 16% reduce 0%
>
> 14/04/10 10:27:19 INFO mapreduce.Job:  map 17% reduce 0%
>
> 14/04/10 10:27:20 INFO mapreduce.Job:  map 18% reduce 0%
>
> 14/04/10 10:27:23 INFO mapreduce.Job:  map 19% reduce 0%
>
> 14/04/10 10:27:25 INFO mapreduce.Job:  map 20% reduce 0%
>
> 14/04/10 10:27:27 INFO mapreduce.Job:  map 21% reduce 0%
>
> 14/04/10 10:27:28 INFO mapreduce.Job:  map 22% reduce 0%
>
> 14/04/10 10:27:30 INFO mapreduce.Job:  map 23% reduce 0%
>
> 14/04/10 10:27:32 INFO mapreduce.Job:  map 24% reduce 0%
>
> 14/04/10 10:27:34 INFO mapreduce.Job:  map 25% reduce 0%
>
> 14/04/10 10:27:35 INFO mapreduce.Job:  map 26% reduce 0%
>
> 14/04/10 10:27:38 INFO mapreduce.Job:  map 27% reduce 0%
>
> 14/04/10 10:27:40 INFO mapreduce.Job:  map 28% reduce 0%
>
> 14/04/10 10:27:41 INFO mapreduce.Job:  map 29% reduce 0%
>
> 14/04/10 10:27:43 INFO mapreduce.Job:  map 30% reduce 0%
>
> 14/04/10 10:27:45 INFO mapreduce.Job:  map 31% reduce 0%
>
> 14/04/10 10:27:47 INFO mapreduce.Job:  map 32% reduce 0%
>
> 14/04/10 10:27:48 INFO mapreduce.Job:  map 33% reduce 0%
>
> 14/04/10 10:27:51 INFO mapreduce.Job:  map 34% reduce 0%
>
> 14/04/10 10:27:53 INFO mapreduce.Job:  map 35% reduce 0%
>
> 14/04/10 10:27:54 INFO mapreduce.Job:  map 36% reduce 0%
>
> 14/04/10 10:27:55 INFO mapreduce.Job:  map 37% reduce 0%
>
> 14/04/10 10:27:59 INFO mapreduce.Job:  map 38% reduce 0%
>
> 14/04/10 10:28:00 INFO mapreduce.Job:  map 39% reduce 0%
>
> 14/04/10 10:28:02 INFO mapreduce.Job:  map 40% reduce 0%
>
> 14/04/10 10:28:04 INFO mapreduce.Job:  map 41% reduce 0%
>
> 14/04/10 10:28:06 INFO mapreduce.Job:  map 42% reduce 0%
>
> 14/04/10 10:28:07 INFO mapreduce.Job:  map 43% reduce 0%
>
> 14/04/10 10:28:09 INFO mapreduce.Job:  map 44% reduce 0%
>
> 14/04/10 10:28:10 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_0, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:11 INFO mapreduce.Job:  map 45% reduce 0%
>
> 14/04/10 10:28:13 INFO mapreduce.Job:  map 46% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job:  map 47% reduce 0%
>
> 14/04/10 10:28:14 INFO mapreduce.Job: Task Id :
> attempt_1395628276810_0062_m_000149_1, Status : FAILED
>
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess failed with code 1
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>
>         at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>
>         at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
>
>
> 14/04/10 10:28:17 INFO mapreduce.Job:  map 48% reduce 0%
> 14/04/10 10:28:19 INFO mapreduce.Job:  map 49% reduce 0%
> 14/04/10 10:28:19 INFO mapreduce.Job: Task Id : attempt_1395628276810_0062_m_000149_2, Status : FAILED
> Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>         at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
> 14/04/10 10:28:21 INFO mapreduce.Job:  map 50% reduce 0%
> 14/04/10 10:28:23 INFO mapreduce.Job:  map 51% reduce 0%
> 14/04/10 10:28:24 INFO mapreduce.Job:  map 100% reduce 100%
> 14/04/10 10:28:24 INFO mapreduce.Job: Job job_1395628276810_0062 failed with state FAILED due to: Task failed task_1395628276810_0062_m_000149
> Job failed as tasks failed. failedMaps:1 failedReduces:0
>
> 14/04/10 10:28:24 INFO mapreduce.Job: Counters: 33
>         File System Counters
>                 FILE: Number of bytes read=0
>                 FILE: Number of bytes written=15667286
>                 FILE: Number of read operations=0
>                 FILE: Number of large read operations=0
>                 FILE: Number of write operations=0
>                 HDFS: Number of bytes read=21753912258
>                 HDFS: Number of bytes written=0
>                 HDFS: Number of read operations=486
>                 HDFS: Number of large read operations=0
>                 HDFS: Number of write operations=0
>         Job Counters
>                 Failed map tasks=4
>                 Killed map tasks=10
>                 Launched map tasks=176
>                 Other local map tasks=3
>                 Data-local map tasks=173
>                 Total time spent by all maps in occupied slots (ms)=1035708
>                 Total time spent by all reduces in occupied slots (ms)=0
>         Map-Reduce Framework
>                 Map input records=164217466
>                 Map output records=0
>                 Map output bytes=0
>                 Map output materialized bytes=414720
>                 Input split bytes=23490
>                 Combine input records=0
>                 Combine output records=0
>                 Spilled Records=0
>                 Failed Shuffles=0
>                 Merged Map outputs=0
>                 GC time elapsed (ms)=4750
>                 CPU time spent (ms)=321980
>                 Physical memory (bytes) snapshot=91335024640
>                 Virtual memory (bytes) snapshot=229819834368
>                 Total committed heap usage (bytes)=128240713728
>         File Input Format Counters
>                 Bytes Read=21753888768
> 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful!
> Streaming Command Failed!
>
> Thanks and Regards,
> Truong Phan
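For anyone else hitting this: "subprocess failed with code 1" is Hadoop Streaming reporting that the mapper (or reducer) command itself exited nonzero; the Java side is only relaying the script's failure, which is why the stack trace never names the real cause. Since the job succeeds on one 400 MB file but fails on eleven, a malformed record deep in one of the larger inputs is a likely culprit. A quick way to reproduce the check outside the cluster is to pipe a sample of the input through the command and inspect its exit status. The sketch below mimics what PipeMapRed does (the failing mapper command here is a hypothetical stand-in, not the poster's actual script):

```python
import subprocess
import sys

def run_streaming_command(cmd, records):
    """Feed sample records to a streaming command on stdin, as Hadoop's
    PipeMapRed does, and fail loudly on a nonzero exit status."""
    proc = subprocess.run(cmd, input="\n".join(records),
                          capture_output=True, text=True)
    if proc.returncode != 0:
        # This is the condition Hadoop reports as
        # "PipeMapRed.waitOutputThreads(): subprocess failed with code N".
        raise RuntimeError("subprocess failed with code %d; stderr was: %s"
                           % (proc.returncode, proc.stderr.strip()))
    return proc.stdout

# A mapper that dies on a malformed record reproduces the failure mode:
bad_mapper = [sys.executable, "-c",
              "import sys; [int(line) for line in sys.stdin]"]
try:
    run_streaming_command(bad_mapper, ["1", "2", "not-a-number"])
except RuntimeError as e:
    print(e)
```

If the real command passes on a sample but fails at scale, check the failed task's stderr in the YARN logs for the script's own error message, and look at the specific input split the failing attempt was assigned.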



-- 
Harsh J