Posted to common-dev@hadoop.apache.org by "tim.wu (Created) (JIRA)" <ji...@apache.org> on 2012/04/13 08:19:13 UTC

[jira] [Created] (HADOOP-8274) Under cygwin, hadoop throws exception in pseudo or cluster model

Under cygwin, hadoop throws exception in pseudo or cluster model
----------------------------------------------------------------

                 Key: HADOOP-8274
                 URL: https://issues.apache.org/jira/browse/HADOOP-8274
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.22.0, 1.0.1, 1.0.0, 0.20.205.0
         Environment: windows7+cygwin 1.7.11-1+jdk1.6.0_31+hadoop 1.0.0
            Reporter: tim.wu


Standalone mode works fine. In pseudo-distributed or cluster mode, however, every example job fails, even the simple wordcount example.

HDFS works fine, but the tasktracker cannot start the child JVM (thread) for a
new job, and the directory /logs/userlogs/job-xxxx/attempt-xxxx/ is empty.

The cause appears to be that, on Windows, Java does not recognize a symlink to a folder as a folder.

A detailed description follows.

----------------------------------------------------

First, the tasktracker error log looks like this:


======================
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID:
jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner
jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed
jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM :
jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of
tasks it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner:
attempt_201203280212_0005_m_000002_0 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots
: 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask):
attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch :
attempt_201203280212_0005_m_000002_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free
slots : 2 and trying to launch attempt_201203280212_0005_m_000002_1 which
needs 1 slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for
task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException:
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index
(The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for
task: attempt_201203280212_0005_m_000002_0
java.io.FileNotFoundException:
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index
(The system cannot find the path specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
        at
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
        at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
        at
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
        at
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
        at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

=======================================

I attached a remote debugger to the tasktracker. The relevant code is
org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID,
boolean, String[]), line 97:

   public static void createTaskAttemptLogDir(TaskAttemptID taskID,
       boolean isCleanup, String[] localDirs) throws IOException {
     String cleanupSuffix = isCleanup ? ".cleanup" : "";
     // Real log directory under the local dirs
     String strAttemptLogDir = getTaskAttemptLogDir(taskID,
         cleanupSuffix, localDirs);
     File attemptLogDir = new File(strAttemptLogDir);
     if (!attemptLogDir.mkdirs()) {
       throw new IOException("Creation of " + attemptLogDir + " failed.");
     }
     // Symlink under the job's userlogs directory that points at it
     String strLinkAttemptLogDir =
         getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar +
         taskID.toString() + cleanupSuffix;
     if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
       throw new IOException("Creation of symlink from " +
                             strLinkAttemptLogDir + " to " + strAttemptLogDir +
                             " failed.");
     }
     //Set permissions for target attempt log dir
     //FsPermission userOnly = new FsPermission((short) 0700);
     FsPermission userOnly = new FsPermission((short) 0777);
     FileUtil.setPermission(attemptLogDir, userOnly);
   }

and the FileUtil.symLink() function:

   public static int symLink(String target, String linkname) throws IOException {
     String cmd = "ln -s " + target + " " + linkname;
     Process p = Runtime.getRuntime().exec(cmd, null);
     int returnVal = -1;
     try {
       returnVal = p.waitFor();
     } catch (InterruptedException e) {
       //do nothing as of yet
     }
     if (returnVal != 0) {
       LOG.warn("Command '" + cmd + "' failed " + returnVal +
                " with: " + copyStderr(p));
     }
     return returnVal;
   }


So Hadoop creates a log folder under ${hadoop.tmp.dir} and then invokes
"ln -s" to create a symlink to it under
/logs/userlogs/job-xxxx/attempt-xxxx.

In my case,

   strLinkAttemptLogDir = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
   strAttemptLogDir = /tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
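
To make the symLink() call above concrete, here is a minimal sketch of the command it would build from these two values. Runtime.exec(String) is documented to split its command string with a default StringTokenizer, so the sketch tokenizes the string the same way. The class and method names (SymLinkCommandSketch, buildCommand, tokenize) are hypothetical and are not part of Hadoop; nothing here actually runs "ln".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration only: not Hadoop code, and it never executes "ln".
final class SymLinkCommandSketch {

    // Builds the command string the same way the quoted symLink() does.
    static String buildCommand(String target, String linkname) {
        return "ln -s " + target + " " + linkname;
    }

    // Splits on whitespace with a default StringTokenizer, which is how
    // Runtime.exec(String) is documented to break up its command string.
    static List<String> tokenize(String cmd) {
        List<String> tokens = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(cmd);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Values copied from the report above.
        String target = "/tmp/hadoop-timwu/mapred/local\\userlogs"
            + "\\job_201203280212_0005\\attempt_201203280212_0005_m_000002_1";
        String linkname = "D:\\cygwin\\home\\timwu\\hadoop-1.0.0\\logs\\userlogs"
            + "\\job_201203280212_0005\\attempt_201203280212_0005_m_000002_1";
        for (String token : tokenize(buildCommand(target, linkname))) {
            System.out.println(token);
        }
    }
}
```

The link name handed to "ln" is a Windows-style path and the target mixes both separator styles. A side effect of the whitespace splitting is that any path containing a space would be broken into several arguments.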


After the tasktracker creates a subtask, it fails in the following function,
org.apache.hadoop.mapred.DefaultTaskController.launchTask(String, String,
String, List<String>, List<String>, File, String, String), line 107:

   ...
   //mkdir the loglocation
   String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
   if (!localFs.mkdirs(new Path(logLocation))) {
     throw new IOException("Mkdirs failed to create "
                + logLocation);
   }
   ...


mkdirs() returns false because logLocation is a symlink file. In my case,
logLocation=D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1.
If I open it from Windows Explorer, it is just a regular file, not a folder
or a shortcut. Its content looks like this:

 <symlink>/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

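This is because mkdirs() is implemented as:

   public boolean mkdirs(Path f) throws IOException {
     Path parent = f.getParent();
     File p2f = pathToFile(f);
     // Succeeds only if the parent chain exists (or is created) and this
     // path is either newly created or already a directory.
     return (parent == null || mkdirs(parent)) &&
       (p2f.mkdir() || p2f.isDirectory());
   }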


So p2f.mkdir() fails and p2f.isDirectory() returns false, while p2f.isFile()
returns true: to Java, the path is a regular file. Hence IOException("Mkdirs failed to create
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1")
is thrown in the child thread, which returns -1. We then get the
exception shown above in the main thread.
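
The failing check can be reproduced on any platform with a small sketch. It assumes that a regular file is a fair stand-in for what the JVM sees at the symlink path. The class name SymlinkFileDemo, its helper methods, and the placeholder file content are hypothetical and are not taken from Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration only: simulates the symlink file with a regular file.
final class SymlinkFileDemo {

    // Same final check as the mkdirs() quoted above, for a single path component.
    static boolean mkdirOrIsDirectory(File f) {
        return f.mkdir() || f.isDirectory();
    }

    // Creates a regular file standing in for the symlink file that "ln -s"
    // left behind; the content is a made-up placeholder.
    static File createPlaceholder(File parent, String name) throws IOException {
        File f = new File(parent, name);
        Files.write(f.toPath(),
            "<symlink>/tmp/example-target".getBytes(StandardCharsets.UTF_8));
        return f;
    }

    public static void main(String[] args) throws IOException {
        File parent = Files.createTempDirectory("userlogs-demo").toFile();
        File attempt = createPlaceholder(parent, "attempt_demo");
        System.out.println("isFile:      " + attempt.isFile());
        System.out.println("isDirectory: " + attempt.isDirectory());
        System.out.println("mkdir check: " + mkdirOrIsDirectory(attempt));
    }
}
```

Because a file already exists at that path, mkdir() cannot create a directory there and isDirectory() is false, so the combined check evaluates to false. That is the condition under which launchTask() throws "Mkdirs failed to create".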

Is there any way to disable this symlink creation? Or is there another approach I can follow?

BTW, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name},
and my ${user.name} is timwu. So it should create the tmp folder
/tmp/hadoop-timwu under Cygwin's root. In fact, however, it creates the folder
d:/tmp/hadoop-timwu, which in Cygwin is /cygdrive/d/tmp/hadoop-timwu.
Is that correct?
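
One possible explanation, offered as an assumption rather than a confirmed diagnosis: a Windows JVM knows nothing about Cygwin's mount table, and it treats a path that starts with a slash but has no drive letter as relative to the current drive. The sketch below prints how the running JVM resolves the configured value. The class name TmpDirResolution and its resolve() method are hypothetical.

```java
import java.io.File;

// Hypothetical illustration only: shows how the running JVM resolves a path.
final class TmpDirResolution {

    // Returns the absolute form of the configured path as this JVM sees it.
    static String resolve(String configured) {
        return new File(configured).getAbsolutePath();
    }

    public static void main(String[] args) {
        // Value of hadoop.tmp.dir from the report, with ${user.name} expanded.
        System.out.println(resolve("/tmp/hadoop-timwu"));
    }
}
```

The output depends on the platform and on the drive the JVM was started from, so none is shown here. Running it with the same JDK from the Hadoop installation directory would show whether the D: drive root is picked up.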
