Posted to mapreduce-dev@hadoop.apache.org by "Arpit Gupta (JIRA)" <ji...@apache.org> on 2013/05/01 00:24:16 UTC

[jira] [Created] (MAPREDUCE-5198) Race condition in cleanup during task tracker reinit with LinuxTaskController

Arpit Gupta created MAPREDUCE-5198:
--------------------------------------

             Summary: Race condition in cleanup during task tracker reinit with LinuxTaskController
                 Key: MAPREDUCE-5198
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5198
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 1.2.0
            Reporter: Arpit Gupta


This was noticed when the JobTracker was restarted while jobs were running and then asked the TaskTracker to reinitialize.

The TaskTracker would fail during reinitialization with an error like:

{code}
2013-04-27 20:19:09,627 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /grid/0/hdp/mapred/local,/grid/1/hdp/mapred/local,/grid/2/hdp/mapred/local,/grid/3/hdp/mapred/local,/grid/4/hdp/mapred/local,/grid/5/hdp/mapred/local
2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075 caught: java.nio.channels.ClosedChannelException
	at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
	at org.apache.hadoop.ipc.Server.channelWrite(Server.java:1717)
	at org.apache.hadoop.ipc.Server.access$2000(Server.java:98)
	at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:744)
	at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:808)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1433)

2013-04-27 20:19:09,628 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42075: exiting
2013-04-27 20:19:10,414 ERROR org.apache.hadoop.mapred.TaskTracker: Got fatal exception while reinitializing TaskTracker: org.apache.hadoop.util.Shell$ExitCodeException: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
	at org.apache.hadoop.util.Shell.run(Shell.java:182)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
	at org.apache.hadoop.mapred.LinuxTaskController.deleteAsUser(LinuxTaskController.java:281)
	at org.apache.hadoop.mapred.TaskTracker.deleteUserDirectories(TaskTracker.java:779)
	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:816)
	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2704)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3934)
{code} 
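
The log above does not show what the directory cleanup raced against, so the following is only a hypothetical Java sketch of this general class of failure, not the TaskTracker code. It assumes a cleanup step that empties a directory and then removes it, while a writer that has not yet stopped drops a new file in between. All names (CleanupRaceSketch, deleteFlatDir, cleanup) are made up for illustration, and the concurrent writer is simulated with a callback so that the outcome is deterministic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; this is not Hadoop code.
public class CleanupRaceSketch {

    // Deletes every entry directly inside dir, then dir itself.
    // betweenSteps runs after the children are removed and before the
    // directory is removed; it stands in for a concurrent writer.
    public static void deleteFlatDir(Path dir, Runnable betweenSteps) throws IOException {
        List<Path> children;
        try (Stream<Path> entries = Files.list(dir)) {
            children = entries.collect(Collectors.toList());
        }
        for (Path child : children) {
            Files.delete(child);
        }
        betweenSteps.run();
        // Fails with DirectoryNotEmptyException if a file appeared meanwhile.
        Files.delete(dir);
    }

    // Returns true if the cleanup succeeded, false if it lost the race.
    public static boolean cleanup(boolean writerStillActive) throws IOException {
        Path dir = Files.createTempDirectory("race-sketch");
        Files.createFile(dir.resolve("old-output"));

        // Simulated writer: creates one more file if it has not been stopped.
        Runnable writer = () -> {
            if (!writerStillActive) {
                return;
            }
            try {
                Files.createFile(dir.resolve("late-output"));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };

        try {
            deleteFlatDir(dir, writer);
            return true;
        } catch (DirectoryNotEmptyException e) {
            // Tidy up the temp directory so the sketch leaves nothing behind.
            Files.delete(dir.resolve("late-output"));
            Files.delete(dir);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("writer stopped first: " + cleanup(false));
        System.out.println("writer still active:  " + cleanup(true));
    }
}
```

In this sketch the cleanup succeeds only when the writer has stopped before deletion begins, which is the usual reason to stop all writers before cleaning up their directories.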
