Posted to common-dev@hadoop.apache.org by "Amareshwari Sri Ramadasu (JIRA)" <ji...@apache.org> on 2007/10/01 11:54:51 UTC

[jira] Commented: (HADOOP-1857) Ability to run a script when a task fails to capture stack traces

    [ https://issues.apache.org/jira/browse/HADOOP-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531456 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1857:
--------------------------------------------------

Usage Documentation:

A facility is provided, via user-provided scripts, for post-processing a failed task's logs (stdout, stderr, syslog) and core files. A default script processes core dumps under gdb and prints the stack trace. The last five lines of the debug script's stdout and stderr are added to the task diagnostics. These outputs are displayed on the job UI on demand.
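
To illustrate the "last five lines" behaviour, here is a minimal Python sketch. It is not the framework's code: the function name tail_lines and the sample output are made up for illustration, and it only shows what keeping the final five lines of a script's output means.

```python
from collections import deque


def tail_lines(text, n=5):
    """Return the last n lines of text (hypothetical helper, not a Hadoop API).

    If the text has fewer than n lines, all of them are returned.
    """
    # A bounded deque discards older lines as newer ones are appended.
    return list(deque(text.splitlines(), maxlen=n))


# Made-up debug script output, for illustration only.
sample_stderr = "\n".join("frame %d" % i for i in range(1, 9))
for line in tail_lines(sample_stderr):
    print(line)
```

So a debug script that wants its most useful information to reach the diagnostics should print that information last.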

How to submit a debug command:
A quick way to set the debug command is to set the properties "mapred.map.task.debug.command" and "mapred.reduce.task.debug.command", for debugging map tasks and reduce tasks respectively. These properties can also be set through the APIs conf.setMapDebugCommand(String cmd) and conf.setReduceDebugCommand(String cmd). The debug command can contain the placeholders @stdout@, @stderr@, @syslog@ and @core@ to refer to the task's stdout, stderr, syslog and core files respectively. For streaming, the debug command can be submitted with the command-line options -mapdebug and -reducedebug, for debugging the mapper and reducer respectively.
For example, the debug command can be 'myScript @stderr@'. Here myScript is the executable, and it processes the failed task's stderr.
The debug command can also be a gdb command, in which case the user can supply a command file to execute with the -x option. The debug command then looks like 'gdb <program-name> -c @core@ -x <gdb-cmd-file>'. This command processes the core file of the failed task <program-name> and executes the commands in <gdb-cmd-file>. Please make sure the gdb command file has 'quit' as its last line.
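
As a rough illustration of the placeholders and of the 'quit' requirement, here is a small Python sketch. It is not taken from the patch: expand_debug_command and gdb_command_file are hypothetical helper names, the file paths are invented, and the actual substitution in the framework may differ in its details.

```python
PLACEHOLDERS = ("stdout", "stderr", "syslog", "core")


def expand_debug_command(command, paths):
    """Replace @stdout@, @stderr@, @syslog@ and @core@ with file paths.

    Hypothetical helper: `paths` maps a placeholder name to the (assumed)
    location of that file for the failed task. Unknown names are left as-is.
    """
    for name in PLACEHOLDERS:
        if name in paths:
            command = command.replace("@" + name + "@", paths[name])
    return command


def gdb_command_file(commands):
    """Return gdb command file text that always ends with 'quit'."""
    lines = list(commands)
    if not lines or lines[-1].strip() != "quit":
        lines.append("quit")
    return "\n".join(lines) + "\n"


# Invented paths, for illustration only.
task_files = {"stderr": "/tmp/task_0001/stderr", "core": "/tmp/task_0001/core"}
print(expand_debug_command("myScript @stderr@", task_files))
print(gdb_command_file(["bt"]), end="")
```

Without the trailing 'quit', gdb would presumably stay at its prompt after running the command file, which is why the last line matters.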

How to submit a debug script:
To submit the debug script file, first put the file in DFS.
The executable can be added by setting the property "mapred.cache.executables" to the value <path>#<executable-name>. To add more than one executable, give the paths as a comma-separated list. The property can also be set through the APIs DistributedCache.addCacheExecutable(URI,conf) and DistributedCache.setCacheExecutables(URI[],conf), where each URI is of the form "hdfs://host:port/<path>#<executable-name>". For Streaming, the executable can be added through -cacheExecutable URI.
For gdb, the gdb command file need not be executable, but it does need to be in DFS. It can be added to the cache by setting the property "mapred.cache.files" to the value <path>#<cmd-file> or through the API DistributedCache.addCacheFile(URI,conf). Please make sure the property "mapred.create.symlink" is set to "yes".
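
To make the property values concrete, here is a short Python sketch that only assembles the strings described above. It does not call any Hadoop API: the helper names, the plain dictionary standing in for the job configuration, and the example paths are all assumptions for illustration.

```python
def cache_entry(path, link_name):
    """Build one '<path>#<name>' entry (hypothetical helper)."""
    return path + "#" + link_name


def cache_value(entries):
    """Join several (path, name) pairs into a comma-separated value."""
    return ",".join(cache_entry(path, name) for path, name in entries)


# Invented DFS paths; host:port is left as a placeholder on purpose.
executables = [
    ("hdfs://host:port/debug/myScript", "myScript"),
    ("hdfs://host:port/debug/otherScript", "otherScript"),
]

# A plain dict standing in for the job configuration, not a real Hadoop object.
properties = {
    "mapred.cache.executables": cache_value(executables),
    "mapred.cache.files": cache_entry("hdfs://host:port/debug/gdbCmds", "gdbCmds"),
    "mapred.create.symlink": "yes",
    "mapred.map.task.debug.command": "myScript @stderr@",
}

for key, value in properties.items():
    print(key + "=" + value)
```

The name after '#' is the one the debug command refers to, so 'myScript' in the command above matches the '#myScript' suffix of the cached executable.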


All of this documentation is also included in the Javadoc.


> Ability to run a script when a task fails to capture stack traces
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1857
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Amareshwari Sri Ramadasu
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.15.0
>
>         Attachments: patch-1857.txt, patch1857.txt
>
>
> This basically is for providing a better user interface for debugging failed
> jobs. Today we see stack traces for failed tasks on the job ui if the job
> happened to be a Java MR job. For non-Java jobs like Streaming, Pipes, the
> diagnostic info on the job UI is not helpful enough to debug what might have
> gone wrong. They are usually framework traces and not app traces.
> We want to be able to provide a facility, via user-provided scripts, for doing
> post-processing on task logs, input, output, etc. There should be some default
> scripts like running core dumps under gdb for locating illegal instructions,
> the last few lines from stderr, etc.  These outputs could be sent to the
> tasktracker and in turn to the jobtracker which would then display it on the
> job UI on demand.
