Posted to common-issues@hadoop.apache.org by "peter xie (JIRA)" <ji...@apache.org> on 2014/04/24 12:29:14 UTC

[jira] [Commented] (HADOOP-10538) NumberFormatException happened when hadoop 1.2.1 running on Cygwin

    [ https://issues.apache.org/jira/browse/HADOOP-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979553#comment-13979553 ] 

peter xie commented on HADOOP-10538:
------------------------------------

After reading the core code of Hadoop 1.2.1, the root cause is now clear.
The details are:
During startup, the TaskTracker launches a child process by executing the script "taskjvm.sh", and this script runs the main method of org.apache.hadoop.mapred.Child. This main method contains the following code:
   -----------------------------------------
    String pid = "";
    if (!Shell.WINDOWS) {
      pid = System.getenv().get("JVM_PID");
    }
    JvmContext context = new JvmContext(jvmId, pid);
    .......................
    final JvmContext jvmContext = context;
    try {
      while (true) {
        ..........................;
        JvmTask myTask = umbilical.getTask(context);
  ----------------------------------------------
So the pid saved in JvmContext is an empty string on the Windows platform, and this pid is sent to the TaskTracker via the RPC interface "getTask".
However, the kill method of JvmManager in the parent process only checks pidStr for null. When the empty pidStr is parsed to an int, the NumberFormatException shown in the log is thrown.
     ----------------------------------------------------------
      synchronized void kill() throws IOException, InterruptedException {
        if (!killed) {
          TaskController controller = tracker.getTaskController();
          // Check inital context before issuing a kill to prevent situations
          // where kill is issued before task is launched.
          String pidStr = jvmIdToPid.get(jvmId);
          if (null != pidStr) {
            String user = env.conf.getUser();
            int pid = Integer.parseInt(pidStr);
            try {
              // start a thread that will kill the process dead
              if (sleeptimeBeforeSigkill > 0) {
                new DelayedProcessKiller(user, pid, sleeptimeBeforeSigkill, 
                                         Signal.KILL).start();
                controller.signalTask(user, pid, Signal.TERM);
              } else {
                controller.signalTask(user, pid, Signal.KILL);
              }
            } catch (IOException e) {
              LOG.error("Catch Exception caused by lack of user information to prevent inconsistent state: ", e);
            }
          } else {
            LOG.info(String.format("JVM Not killed %s but just removed", jvmId
                .toString()));
          }
          killed = true;
        }
      }

-------------------------------------------------
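The failure path can be reproduced outside Hadoop. The following is a minimal sketch, not Hadoop code: the class EmptyPidRepro and the method throwsOnParse are hypothetical names, and the map only stands in for jvmIdToPid. It shows that an empty string passes a null-only check and then makes Integer.parseInt throw.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone reproduction; not part of Hadoop.
final class EmptyPidRepro {

    // Mirrors the guard in kill(): only null is rejected before parsing.
    // Returns true if parsing the pid string throws NumberFormatException.
    static boolean throwsOnParse(String pidStr) {
        if (null != pidStr) {
            try {
                Integer.parseInt(pidStr);
            } catch (NumberFormatException e) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for jvmIdToPid; the keys are made-up JVM ids.
        Map<String, String> jvmIdToPid = new HashMap<>();
        jvmIdToPid.put("jvm_windows", "");   // empty pid, as reported by a child on Windows
        jvmIdToPid.put("jvm_unix", "4242");  // numeric pid, as JVM_PID would supply elsewhere

        System.out.println(throwsOnParse(jvmIdToPid.get("jvm_windows"))); // true
        System.out.println(throwsOnParse(jvmIdToPid.get("jvm_unix")));    // false
        System.out.println(throwsOnParse(jvmIdToPid.get("jvm_missing"))); // false (null is skipped)
    }
}
```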
The fix I used locally is to replace the null check with "!StringUtils.isBlank(pidStr)".
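The blank check can be sketched on its own as follows. This is an illustration under assumptions, not the actual patch: PidGuard, isBlank and parsePidOrNull are hypothetical names, and the local isBlank only approximates StringUtils.isBlank so that the example needs no extra library.

```java
// Hypothetical helper illustrating the blank check; not part of Hadoop.
final class PidGuard {

    // Approximates StringUtils.isBlank: true for null, "" and whitespace-only strings.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Returns the parsed pid, or null when there is no pid to signal.
    // A non-blank, non-numeric string still throws NumberFormatException.
    static Integer parsePidOrNull(String pidStr) {
        if (isBlank(pidStr)) {
            return null;
        }
        return Integer.valueOf(pidStr.trim());
    }

    public static void main(String[] args) {
        System.out.println(parsePidOrNull(""));     // null
        System.out.println(parsePidOrNull(null));   // null
        System.out.println(parsePidOrNull("4242")); // 4242
    }
}
```

With a guard of this kind, a blank pid falls through to the existing "JVM Not killed ... but just removed" branch instead of reaching Integer.parseInt.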
 But I have another question: why is the pid hard-coded to an empty string on the Windows platform?




> NumberFormatException happened  when hadoop 1.2.1 running on Cygwin
> -------------------------------------------------------------------
>
>                 Key: HADOOP-10538
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10538
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>         Environment: OS: windows 7 / Cygwin
>            Reporter: peter xie
>
> The TaskTracker always fails to start up when running on Cygwin. The error logged in xxx-tasktracker-xxxx.log is:
> 2014-04-21 22:13:51,439 DEBUG org.apache.hadoop.mapred.TaskRunner: putting jobToken file name into environment D:/hadoop/mapred/local/taskTracker/pxie/jobcache/job_201404212205_0001/jobToken
> 2014-04-21 22:13:51,439 INFO org.apache.hadoop.mapred.JvmManager: Killing JVM: jvm_201404212205_0001_m_1895177159
> 2014-04-21 22:13:51,439 WARN org.apache.hadoop.mapred.TaskRunner: attempt_201404212205_0001_m_000000_0 : Child Error
> java.lang.NumberFormatException: For input string: ""
> 	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> 	at java.lang.Integer.parseInt(Integer.java:504)
> 	at java.lang.Integer.parseInt(Integer.java:527)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:552)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvmRunner(JvmManager.java:314)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:378)
> 	at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:189)
> 	at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:122)
> 	at org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
> 2014-04-21 22:13:51,511 DEBUG org.apache.hadoop.ipc.Server: IPC Server listener on 59983: disconnecting client 127.0.0.1:60154. Number of active connections: 1
> 2014-04-21 22:13:51,531 WARN org.apache.hadoop.fs.FileUtil: Failed to set permissions of path: 



--
This message was sent by Atlassian JIRA
(v6.2#6252)