Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2009/06/19 08:01:10 UTC

[jira] Created: (HADOOP-6089) Streaming task stuck in MapTask$DirectMapOutputCollector.close

Streaming task stuck in MapTask$DirectMapOutputCollector.close
--------------------------------------------------------------

                 Key: HADOOP-6089
                 URL: https://issues.apache.org/jira/browse/HADOOP-6089
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs, mapred
            Reporter: Amareshwari Sriramadasu


Observed a streaming task stuck in MapTask$DirectMapOutputCollector.close

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6089) Streaming task stuck in MapTask$DirectMapOutputCollector.close

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721668#action_12721668 ] 

Amareshwari Sriramadasu commented on HADOOP-6089:
-------------------------------------------------

Task logs show:
Syslog:
2009-06-11 03:36:25,815 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/local/bin/perl, ./hod-farm, node, -user=bolong, -dry=0, -exec_outdir=ascii/hk_ascii_usukin_data, -dfs_data=/projects/srelevance/modeling/data/hk/7.0/ascii/ASCII.TEST.UMLR.quj:ascii/ASCII.TEST.UMLR.quj,/projects/srelevance/modeling/data/hk/7.0/ascii/NON_ASCII.TEST.UMLR.quj:ascii/NON_ASCII.TEST.UMLR.quj,/projects/srelevance/modeling/data/hk/7.0.b/TEST.UMLR.quj:7.0.b/TEST.UMLR.quj,rslt:rslt,/projects/srelevance/modeling/data/um/4.0/train/tmp/usukin_hk_ascii.train.quj:tmp/usukin_hk_ascii.train.quj,/projects/srelevance/modeling/data/um/4.0/train/tmp/usukin_hk_ascii.train.fv:tmp/usukin_hk_ascii.train.fv,/projects/srelevance/modeling/data/hk/7.0/ascii/ASCII.TEST.UMLR.fv:ascii/ASCII.TEST.UMLR.fv,/projects/srelevance/modeling/data/hk/7.0/ascii/NON_ASCII.TEST.UMLR.fv:ascii/NON_ASCII.TEST.UMLR.fv,/projects/srelevance/modeling/data/hk/7.0.b/TEST.UMLR.fv:7.0.b/TEST.UMLR.fv,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight0:tmp/hk_ascii_usukin.weight0,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight1:tmp/hk_ascii_usukin.weight1,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight2:tmp/hk_ascii_usukin.weight2,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight3:tmp/hk_ascii_usukin.weight3,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight4:tmp/hk_ascii_usukin.weight4,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight5:tmp/hk_ascii_usukin.weight5,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight6:tmp/hk_ascii_usukin.weight6,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight7:tmp/hk_ascii_usukin.weight7,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight8:tmp/hk_ascii_usukin.weight8,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight9:tmp/hk_ascii_usukin.weight9,/projects/srelevance/modeling/data/um/4.0/train/tmp/hk_ascii_usukin.weight10:tmp/hk_ascii_usukin.weight10, -suicide]
2009-06-11 03:36:25,843 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2009-06-12 03:41:08,775 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=1/1
2009-06-12 03:41:14,606 INFO org.apache.hadoop.streaming.PipeMapRed: MROutputThread done
2009-06-12 03:41:14,606 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2009-06-12 03:41:14,657 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished

Stderr:
/grid/0/gs/hadoop/current/bin/../bin/hadoop dfs -put hk_ascii_usukin314 /user/bolong/ascii/hk_ascii_usukin_data/hk_ascii_usukin314
86689 === done copy
A thread exited while 2 threads were running, <> line 1.
Exception in thread "IPC Client (47) connection to namenode.com/72.30.117.6:8020 from bolong" java.lang.IllegalMonitorStateException
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:183)
	at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
	at java.io.FilterInputStream.read(FilterInputStream.java:116)
	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at java.io.DataInputStream.readInt(DataInputStream.java:370)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")
log4j:ERROR cleanUpRegex == null || !cleanUpRegex.contains("$fileName")

Attached the complete stack trace for the task.
The stack trace contains:
"Thread-11" daemon prio=10 tid=0xf559c000 nid=0x1a89 in Object.wait() [0xf51d7000..0xf51d81b0]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.ipc.Client.call(Client.java:725)
	- locked <0xcfcf4200> (a org.apache.hadoop.ipc.Client$Call)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
	at $Proxy2.addBlock(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy2.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
	- locked <0xcfac2738> (a java.util.LinkedList)

and 
"main" prio=10 tid=0x0805a800 nid=0x19f3 in Object.wait() [0xf7e70000..0xf7e71288]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3073)
	- locked <0xcfac2738> (a java.util.LinkedList)
	- locked <0xcfac4518> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3164)
	- locked <0xcfac4518> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3113)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
	at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:106)
	- locked <0xcfac46c0> (a org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter)
	at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:565)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:361)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)

Both the main thread and Thread-11 show "- locked <0xcfac2738> (a java.util.LinkedList)" and are in state WAITING (on object monitor).
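
A note on reading the two traces above, offered as a minimal sketch and not as a diagnosis of this issue: Object.wait() releases the monitor for the duration of the wait, so two threads can both have entered a synchronized block on the same object and both be in state WAITING at the same time. The dump format here appears to keep listing such frames as "- locked". The class, field, and method names below (SharedMonitorWaitDemo, dataQueue, startWaiter, and so on) are hypothetical and are not taken from DFSClient.

```java
import java.util.LinkedList;

public class SharedMonitorWaitDemo {
    // Hypothetical stand-in for the shared LinkedList monitor seen in the dump.
    private static final LinkedList<Object> dataQueue = new LinkedList<>();
    // Guarded by dataQueue; set to true to let the waiters exit.
    private static boolean released = false;

    /** Starts a daemon thread that synchronizes on dataQueue and waits on it. */
    static Thread startWaiter(String name) {
        Thread t = new Thread(() -> {
            synchronized (dataQueue) {
                // Loop guards against spurious wakeups.
                while (!released) {
                    try {
                        dataQueue.wait(); // releases the monitor while waiting
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Polls until the thread reports WAITING, or gives up after the timeout. */
    static boolean awaitWaiting(Thread t, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (t.getState() != Thread.State.WAITING) {
            if (System.currentTimeMillis() > deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Returns true if two threads are WAITING on the same monitor at once. */
    static boolean bothWaitOnSameMonitor() {
        synchronized (dataQueue) {
            released = false;
        }
        Thread a = startWaiter("waiter-a");
        Thread b = startWaiter("waiter-b");
        // Neither waiter can leave WAITING until released is set below,
        // so if both checks pass, both were waiting simultaneously.
        boolean both = awaitWaiting(a, 5000) && awaitWaiting(b, 5000);
        synchronized (dataQueue) {
            released = true;
            dataQueue.notifyAll();
        }
        return both;
    }

    public static void main(String[] args) {
        System.out.println("both waiting: " + bothWaitOnSameMonitor());
    }
}
```

Taking a thread dump of this program while both waiters are parked gives a picture similar to the one above. It says nothing about why the addBlock call in Thread-11 never returned.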




[jira] Updated: (HADOOP-6089) Streaming task stuck in MapTask$DirectMapOutputCollector.close

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-6089:
--------------------------------------------

    Attachment: thread_dump.txt

Complete thread dump for the task

