Posted to dev@hive.apache.org by "Zhenxiao Luo (JIRA)" <ji...@apache.org> on 2012/07/26 03:36:34 UTC

[jira] [Commented] (HIVE-3303) Fix error code inconsistency bug in mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q when running hive on hadoop23

    [ https://issues.apache.org/jira/browse/HIVE-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422816#comment-13422816 ] 

Zhenxiao Luo commented on HIVE-3303:
------------------------------------

The problem is that hadoop23 retrieves task diagnostics differently from hadoop20.

In hadoop20, task diagnostics are retrieved via jobSubmitClient in JobClient.java:

  public String[] getTaskDiagnostics(TaskAttemptID id) throws IOException {
    return jobSubmitClient.getTaskDiagnostics(id);
  }

And in JobTracker.java, all of the related task logs are put into the diagnostic info:


public synchronized String[] getTaskDiagnostics(TaskAttemptID taskId)
    throws IOException {

    JobID jobId = taskId.getJobID();
    TaskID tipId = taskId.getTaskID();
    JobInProgress job = jobs.get(jobId);
    if (job == null) {
      throw new IllegalArgumentException("Job " + jobId + " not found.");
    }
    TaskInProgress tip = job.getTaskInProgress(tipId);
    if (tip == null) {
      throw new IllegalArgumentException("TIP " + tipId + " not found.");
    }
    List<String> taskDiagnosticInfo = tip.getDiagnosticInfo(taskId);
    return ((taskDiagnosticInfo == null) ? null
            : taskDiagnosticInfo.toArray(new String[0]));
  }

Here is the diagnostic info in hadoop20:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    [junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    [junit]     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    [junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    [junit]     at org.apache.hadoop.mapred.Child.main(Child.java:170)
    [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    [junit]     ... 4 more
    [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20000]: Unable to initialize custom script.
    [junit]     at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:346)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    [junit]     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    [junit]     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    [junit]     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:529)
    [junit]     ... 5 more
    [junit] Caused by: java.io.IOException: Cannot run program "script_does_not_exist": java.io.IOException: error=2, No such file or directory
    [junit]     at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
    [junit]     at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:305)
    [junit]     ... 14 more
    [junit] Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
    [junit]     at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
    [junit]     at java.lang.ProcessImpl.start(ProcessImpl.java:81)
    [junit]     at java.lang.ProcessBuilder.start(ProcessBuilder.java:468)
    [junit]     ... 15 more

The error code [20000] appears in the diagnostic info and can be extracted by Hive.

In hadoop23, however, Job.java takes a different execution path:


public String[] getTaskDiagnostics(final TaskAttemptID taskid)
      throws IOException, InterruptedException {
    ensureState(JobState.RUNNING);
    return ugi.doAs(new PrivilegedExceptionAction<String[]>() {
      @Override
      public String[] run() throws IOException, InterruptedException {
        return cluster.getClient().getTaskDiagnostics(taskid);
      }
    });
  }

Here is the diagnostic info in hadoop23:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    [junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    [junit]     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
    [junit]     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    [junit]     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
    [junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
    [junit]     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    [junit]     at java.security.AccessController.doPrivileged(Native Method)
    [junit]     at javax.security.auth.Subject.doAs(Subject.java:416)
    [junit]     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    [junit]     at org.apache.hadoop.mapred.Child.main(Child.java:264)
    [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    [junit]     ... 8 more
    [junit] C
    [junit]     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
    [junit]     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    [junit]     at java.security.AccessController.doPrivileged(Native Method)
    [junit]     at javax.security.auth.Subject.doAs(Subject.java:416)
    [junit]     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    [junit]     at org.apache.hadoop.mapred.Child.main(Child.java:264)
    [junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
    [junit]     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:548)
    [junit]     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    [junit]     ... 8 more

Apart from the Hive runtime exception about an error while processing the row, no further information appears in the diagnostic info.

Since Hive uses Hadoop's diagnostic info to extract the error code, on hadoop23 the error code ends up being 2 (as no other error code could be extracted).

In JobDebugger.java, the getTasksInfo() function does:

    if (t.getTaskStatus() != TaskCompletionEvent.Status.SUCCEEDED) {
      if (ti.getErrorCode() == 0) {
        String[] diags = rj.getTaskDiagnostics(t.getTaskAttemptId());
        ti.setErrorCode(extractErrorCode(diags));
        ti.setDiagnosticMesgs(diags);
      }
    }
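
To illustrate why the hadoop20 diagnostics yield 20000 while the hadoop23 diagnostics fall back to 2, here is a minimal standalone sketch of marker-based extraction. This is a hypothetical helper for illustration only, not Hive's actual extractErrorCode implementation:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ErrorCodeSketch {
  // Matches markers like "[Error 20000]" in a diagnostic line.
  private static final Pattern ERROR_MARKER = Pattern.compile("\\[Error (\\d+)\\]");

  // Returns the first error code found in the diagnostics, or 0 if none is present.
  static int extractErrorCode(String[] diags) {
    if (diags == null) {
      return 0;
    }
    for (String line : diags) {
      Matcher m = ERROR_MARKER.matcher(line);
      if (m.find()) {
        return Integer.parseInt(m.group(1));
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    // hadoop20-style diagnostics: the marker is present, so 20000 is recovered.
    String[] hadoop20 = {
      "Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20000]: Unable to initialize custom script."
    };
    // hadoop23-style diagnostics: no marker, so extraction yields 0 and Hive
    // ends up reporting the generic return code 2 instead.
    String[] hadoop23 = {
      "Hive Runtime Error while processing row {\"key\":\"238\",\"value\":\"val_238\"}"
    };
    System.out.println(extractErrorCode(hadoop20)); // 20000
    System.out.println(extractErrorCode(hadoop23)); // 0
  }
}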


I think a possible solution is to have separate expected outputs for hadoop20 and hadoop23. However, since mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q are negative MiniMRCluster testcases, there is no such utility for version-specific expected outputs there yet.
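
As a rough illustration of the separate-expected-output idea (a hypothetical helper, not the actual Hive qfile test harness), the expected-results file could be resolved per Hadoop major version and fall back to the shared file:

import java.io.File;

public class ExpectedOutputResolverSketch {
  // Hypothetical: map the running Hadoop version onto a results-file suffix.
  static String versionSuffix(String hadoopVersion) {
    return hadoopVersion.startsWith("0.23") ? "hadoop23" : "hadoop20";
  }

  // Prefer a version-specific expected output (e.g. mapreduce_stack_trace.q.out.hadoop23)
  // and fall back to the shared one if it does not exist.
  static File resolve(File resultsDir, String qFile, String hadoopVersion) {
    File versioned = new File(resultsDir, qFile + ".out." + versionSuffix(hadoopVersion));
    return versioned.exists() ? versioned : new File(resultsDir, qFile + ".out");
  }
}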

Any suggestions are appreciated.

                
> Fix error code inconsistency bug in mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q when running hive on hadoop23
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3303
>                 URL: https://issues.apache.org/jira/browse/HIVE-3303
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zhenxiao Luo
>            Assignee: Zhenxiao Luo
>
> when running hive on hadoop23, mapreduce_stack_trace.q and mapreduce_stack_trace_turnoff.q are having inconsistent error code diffs:
> [junit] diff -a /home/cloudera/Code/hive/build/ql/test/logs/clientnegative/mapreduce_stack_trace.q.out /home/cloudera/Code/hive/ql/src/test/results/clientnegative/mapreduce_stack_trace.q.out
> [junit] < FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> [junit] > FAILED: Execution Error, return code 20000 from org.apache.hadoop.hive.ql.exec.MapRedTask. Unable to initialize custom script.
> [junit] diff -a /home/cloudera/Code/hive/build/ql/test/logs/clientnegative/mapreduce_stack_trace_turnoff.q.out /home/cloudera/Code/hive/ql/src/test/results/clientnegative/mapreduce_stack_trace_turnoff.q.out
> [junit] 5c5
> [junit] < FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> [junit] ---
> [junit] > FAILED: Execution Error, return code 20000 from org.apache.hadoop.hive.ql.exec.MapRedTask. Unable to initialize custom script
> The error code 20000 (which indicates "unable to initialize custom script") could not be retrieved.
