Posted to issues@tez.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2014/04/20 04:35:14 UTC

[jira] [Created] (TEZ-1074) DAGAppMaster takes lots of CPU when running a reasonably large job in the cluster

Rajesh Balamohan created TEZ-1074:
-------------------------------------

             Summary: DAGAppMaster takes lots of CPU when running a reasonably large job in the cluster
                 Key: TEZ-1074
                 URL: https://issues.apache.org/jira/browse/TEZ-1074
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Rajesh Balamohan


- Ran a job which used 200 containers.
- DAGAppMaster was running at about 70% CPU for most of the job.
- Profiling revealed that lots of time was spent on TezEvent.readFields --> ... --> TaskStatusUpdateEvent.readFields().
- The default "tez.task.am.heartbeat.interval-ms.max" is 100 ms.  With 200 containers, that means roughly 2000 events per second (each containing counters) were processed by DAGAppMaster.
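
As a rough check of the 2000-event figure, here is a small sketch of the arithmetic. It assumes every container sends exactly one status update per heartbeat interval, which is a simplification; the class and method names are made up for illustration and are not part of Tez.

```java
class HeartbeatLoad {
    // Status-update events per second arriving at the AM, assuming each
    // container sends one update per heartbeat interval (a simplification).
    static long eventsPerSecond(int containers, long heartbeatIntervalMs) {
        return containers * 1000L / heartbeatIntervalMs;
    }

    public static void main(String[] args) {
        // 200 containers at the default 100 ms interval.
        System.out.println(eventsPerSecond(200, 100));   // prints 2000
        // The same job with a hypothetical 1000 ms interval.
        System.out.println(eventsPerSecond(200, 1000));  // prints 200
    }
}
```

Under this assumption the load scales linearly with the container count, so a job with ten times as many containers would deliver ten times as many events to deserialize.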

With a larger job, CPU usage can grow significantly.

One option to reduce CPU usage could be to send only the modified TezCounters in TaskStatusUpdateEvent.
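
A minimal sketch of the "send only modified counters" idea follows. It uses a plain Map<String, Long> as a stand-in for counters and a hypothetical CounterDelta helper; it does not use the real TezCounters API and is not a proposed patch, only an illustration of the technique.

```java
import java.util.HashMap;
import java.util.Map;

class CounterDelta {
    // Counter values most recently sent to the AM (hypothetical task-side state).
    private final Map<String, Long> lastSent = new HashMap<>();

    // Returns only the counters whose value differs from what was last sent,
    // then records those values as sent.
    Map<String, Long> changedSince(Map<String, Long> current) {
        Map<String, Long> delta = new HashMap<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long previous = lastSent.get(e.getKey());
            if (!e.getValue().equals(previous)) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        lastSent.putAll(delta);
        return delta;
    }
}
```

With this approach the receiver would have to merge each delta into the last full set of counters it holds for the task, so the saving in deserialization cost comes with some extra bookkeeping on the AM side.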



--
This message was sent by Atlassian JIRA
(v6.2#6252)