Posted to issues@tez.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2014/10/31 03:47:33 UTC

[jira] [Resolved] (TEZ-1698) Cut down on ResourceCalculatorProcessTree overheads in Tez

     [ https://issues.apache.org/jira/browse/TEZ-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan resolved TEZ-1698.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.5.2
     Hadoop Flags: Reviewed

Thanks [~gopalv].  Committed to master and branch-0.5

> Cut down on ResourceCalculatorProcessTree overheads in Tez
> ----------------------------------------------------------
>
>                 Key: TEZ-1698
>                 URL: https://issues.apache.org/jira/browse/TEZ-1698
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.2
>            Reporter: Gopal V
>            Assignee: Rajesh Balamohan
>             Fix For: 0.5.2
>
>         Attachments: ProcfsBasedProcessTree.png, ProcfsFiles.png, TEZ-1698.1.patch, TEZ-1698.2.patch, TEZ-1698.3.patch, TEZ-1698.4.patch
>
>
> ResourceCalculatorProcessTree scrapes all of /proc/ for PIDs that are part of the current task's process group.
> This work is mostly wasted in Tez. YARN has to do it because it has the PID for the container-executor process (bash) and must trace the bash -> java spawn inheritance; Tez has no such need.
> !ProcfsBasedProcessTree.png!
> The latency effect of this is less clearly visible with the profiler turned on, as it is primarily related to the rate of syscalls and the overhead in the kernel (via the following codepath in YARN).
> !ProcfsFiles.png!
> {code}
>   private List<String> getProcessList() {
>     String[] processDirs = (new File(procfsDir)).list();
> ...
>     for (String dir : processDirs) {
>       try {
>         if ((new File(procfsDir, dir)).isDirectory()) {
>           processList.add(dir);
>         }
> ...
>   public void updateProcessTree() {
>     if (!pid.equals(deadPid)) {
>       // Get the list of processes
>       List<String> processList = getProcessList();
> ...
>       for (String proc : processList) {
>         // Get information for each process
>         ProcessInfo pInfo = new ProcessInfo(proc);
>         if (constructProcessInfo(pInfo, procfsDir) != null) {
>           allProcessInfo.put(proc, pInfo);
>           if (proc.equals(this.pid)) {
>             me = pInfo; // cache 'me'
>             processTree.put(proc, pInfo);
>           }
>         }
>       }
> {code}
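
For contrast with the full /proc/ scan quoted above, here is a minimal sketch of reading only one process's stat file. It is not the committed TEZ-1698 patch, which is not shown in this message. The class SingleProcessStat, its Info holder and both method names are hypothetical. The field positions (ppid = field 4, vsize = field 23, rss = field 24) follow the Linux proc(5) layout of /proc/[pid]/stat, and main assumes a Linux host where /proc/self resolves to the running JVM.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of Tez or YARN. */
public class SingleProcessStat {

  /** Subset of the fields found in /proc/[pid]/stat. */
  public static final class Info {
    public final String name;      // field 2 (comm), without the parentheses
    public final long ppid;        // field 4
    public final long vsizeBytes;  // field 23, virtual memory size in bytes
    public final long rssPages;    // field 24, resident set size in pages

    Info(String name, long ppid, long vsizeBytes, long rssPages) {
      this.name = name;
      this.ppid = ppid;
      this.vsizeBytes = vsizeBytes;
      this.rssPages = rssPages;
    }
  }

  /** Parses one stat line. comm may contain spaces or ')', so split on the last ')'. */
  public static Info parseStatLine(String line) {
    int open = line.indexOf('(');
    int close = line.lastIndexOf(')');
    if (open < 0 || close < open) {
      throw new IllegalArgumentException("Unexpected stat format: " + line);
    }
    String name = line.substring(open + 1, close);
    // rest[0] is field 3 (state), so field k lives at rest[k - 3].
    String[] rest = line.substring(close + 1).trim().split("\\s+");
    if (rest.length < 22) {
      throw new IllegalArgumentException("Too few fields in stat line: " + line);
    }
    return new Info(name,
        Long.parseLong(rest[1]),    // ppid
        Long.parseLong(rest[20]),   // vsize
        Long.parseLong(rest[21]));  // rss
  }

  /** Reads exactly one file, procfsDir/pid/stat, instead of listing every entry in procfsDir. */
  public static Info readStat(Path procfsDir, String pid) throws IOException {
    Path stat = procfsDir.resolve(pid).resolve("stat");
    String line = new String(Files.readAllBytes(stat), StandardCharsets.UTF_8).trim();
    return parseStatLine(line);
  }

  public static void main(String[] args) throws IOException {
    // Assumption: Linux procfs mounted at /proc; "self" refers to this JVM.
    Info me = readStat(Paths.get("/proc"), "self");
    System.out.println(me.name + " ppid=" + me.ppid
        + " vsize=" + me.vsizeBytes + " rssPages=" + me.rssPages);
  }
}
```

A process that already knows its own PID needs one file read per update this way, rather than one directory check and one stat-file parse per entry under /proc/. This sketch does not account for child processes, which the YARN code above does by building the full tree.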



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)