Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2016/03/02 06:45:18 UTC

[jira] [Commented] (TEZ-3154) Debuggability : Add an option to take threaddump from a specific vertex/task

    [ https://issues.apache.org/jira/browse/TEZ-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175078#comment-15175078 ] 

Hitesh Shah commented on TEZ-3154:
----------------------------------

bq. Creating this ticket to explore the possibility of adding thread-dump on periodic basis for specific tasks.

This seems like a debugging-only feature that needs to be set up in advance. It would be more useful to trigger a thread dump when certain conditions are hit:
   - on a task timeout: take a thread dump before killing the container
   - when a user-defined policy fires, e.g. certain counter thresholds are hit (GC counters, etc.)
   - on demand, via a command-line tool, if the user wants to see what is going on within a task
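
To make the counter-threshold case concrete, here is a minimal sketch using only the standard JMX beans in the JDK. The class name ConditionalThreadDump and the method dumpIfGcTimeExceeds are hypothetical and are not part of Tez; a real policy would presumably read Tez counters rather than the JVM's GC beans.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Optional;

// Hypothetical helper, not a Tez class: dumps threads only when a condition holds.
public class ConditionalThreadDump {

    // Builds a plain-text dump of every live thread in this JVM.
    public static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details to keep the dump cheap.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" id=").append(info.getThreadId())
              .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Sums the accumulated collection time (ms) over all garbage collectors.
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means "undefined" for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Example policy (an assumption): dump once total GC time passes a threshold.
    public static Optional<String> dumpIfGcTimeExceeds(long thresholdMillis) {
        if (totalGcTimeMillis() > thresholdMillis) {
            return Optional.of(dumpAllThreads());
        }
        return Optional.empty();
    }
}
```

The same dumpAllThreads() call could be invoked from a task-timeout handler just before the container is killed; where that hook would live in Tez is not specified here.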

Also, please take a look at the support YARN is introducing, or has already introduced, for sending a signal to a container. This would be useful for triggering a jstack dump if needed.



> Debuggability : Add an option to take threaddump from a specific vertex/task
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-3154
>                 URL: https://issues.apache.org/jira/browse/TEZ-3154
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>
> tez.task-specific.launch.cmd-opts and tez.task-specific.launch.cmd-opts.list (e.g "Map 1[10]", 10th task in map 1) options are available to add certain parameters to task specific command line options. It has been useful for launching profilers on specific tasks.
> There are scenarios in which taking threaddumps on periodic basis on specific tasks could be helpful. E.g
> - In certain clusters it could be difficult to add profilers. 
> - There could be scenarios where tasks are slow due to the apps using Tez, while the counters indicate no issues in Tez itself (e.g. parsing with SimpleDateFormat for every record could be time-consuming).
> - In certain clusters, access might not be there to take threaddumps of tasks from NM. YARN's threadstack  (in RM UI) is mainly for NM and doesn't work on task level.
> Creating this ticket to explore the possibility of adding thread-dump on periodic basis for specific tasks.
> High-level example: --hiveconf tez.task-specific.launch.cmd-opts="-DthreadDumpInterval=5" --hiveconf tez.task-specific.launch.cmd-opts.list="Map 1[10,15]". This should print thread dumps in tasks 10 and 15 of Map 1 every 5 seconds.
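
One way the task JVM might honor such an option is sketched below. It assumes the interval arrives as the system property threadDumpInterval (in seconds), as in the example above. The class PeriodicThreadDumper and its methods are hypothetical, not existing Tez code, and the sink parameter stands in for whatever logger the task would use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper, not a Tez class: periodic thread dumps driven by a -D option.
public class PeriodicThreadDumper {

    // Property name taken from the example in the issue description.
    public static final String INTERVAL_PROPERTY = "threadDumpInterval";

    // Returns the interval in seconds, or -1 if the value is missing or invalid.
    public static long parseIntervalSeconds(String value) {
        if (value == null) {
            return -1;
        }
        try {
            long seconds = Long.parseLong(value.trim());
            return seconds > 0 ? seconds : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Formats all live threads with their full stack traces.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    // Starts a daemon scheduler if the property is set; returns null otherwise.
    public static ScheduledExecutorService startIfConfigured(Consumer<String> sink) {
        long interval = parseIntervalSeconds(System.getProperty(INTERVAL_PROPERTY));
        if (interval <= 0) {
            return null;
        }
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "thread-dump-scheduler");
            t.setDaemon(true); // never keep the task JVM alive just for dumps
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> sink.accept(dump()), interval, interval, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A task could call PeriodicThreadDumper.startIfConfigured(System.err::println) during startup and shutdownNow() on the returned scheduler at exit. Tasks launched without the property would pay no cost.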



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)