Posted to mapreduce-issues@hadoop.apache.org by "Ravi Prakash (Commented) (JIRA)" <ji...@apache.org> on 2011/11/28 18:15:40 UTC
[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158577#comment-13158577 ]
Ravi Prakash commented on MAPREDUCE-3476:
-----------------------------------------
Courtesy of [~amar_kamat]: these APIs need to be investigated for optimization.
{quote}
1. JobClient.getClusterStatus()
2. clusterStatus.getMaxMapTasks()
3. clusterStatus.getMaxReduceTasks()
4. clusterStatus.getTaskTrackers()
5. o.a.h.mapreduce.Job.mapProgress()
6. o.a.h.mapreduce.Job.reduceProgress()
{quote}
From another quote:
{quote}
While improving Gridmix we also got a chance to benchmark a few YARN APIs. Here is the summary:
1. APIs to get map and reduce slot capacity cost ~0 secs.
2. API to get the job's map task progress takes 115 secs in the worst case. Around 8 calls took more than 10 secs.
Around 26 calls took more than 5 secs. Around 144 calls took more than 1 sec. There were ~43,883 calls made to this
API.
3. API to get the job's reduce task progress takes 16 secs in the worst case. Around 3 calls took more than 10 secs. Around
4 calls took more than 5 secs. Around 34 calls took more than 1 sec. Around 22,446 calls were made to this API.
4. API to get the number of trackers also takes ~0 secs.
The fact that getting the map progress of a single job can take ~115 secs in the worst case is surprising! I guess
optimizing the map progress and reduce progress APIs can be the first step.
{quote}
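For anyone wanting to reproduce numbers of this shape, here is a minimal sketch of how per-call latencies could be bucketed by threshold (more than 1, 5 and 10 secs) while tracking the worst case. This is not the Gridmix instrumentation: the class name LatencyBuckets and all of its methods are hypothetical, and the lambda in main() is only a stand-in for a real client call such as the map progress API.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of Hadoop or Gridmix): counts calls slower than each threshold. */
public class LatencyBuckets {
    private final long[] thresholdsMillis;
    private final int[] counts;
    private long worstMillis = 0;
    private int calls = 0;

    public LatencyBuckets(long... thresholdsMillis) {
        this.thresholdsMillis = thresholdsMillis.clone();
        this.counts = new int[thresholdsMillis.length];
    }

    /** Records one measured latency; a slow call is counted under every threshold it exceeds. */
    public void record(long elapsedMillis) {
        calls++;
        worstMillis = Math.max(worstMillis, elapsedMillis);
        for (int i = 0; i < thresholdsMillis.length; i++) {
            if (elapsedMillis > thresholdsMillis[i]) {
                counts[i]++;
            }
        }
    }

    /** Times a single invocation of the given call and records its latency. */
    public <T> T time(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            record((System.nanoTime() - start) / 1_000_000);
        }
    }

    public int countAbove(int thresholdIndex) { return counts[thresholdIndex]; }
    public long worstMillis() { return worstMillis; }
    public int calls() { return calls; }

    public static void main(String[] args) {
        LatencyBuckets buckets = new LatencyBuckets(1_000, 5_000, 10_000);
        // Stand-in for a real API call; it just returns a constant progress value.
        float progress = buckets.time(() -> 0.5f);
        System.out.println("calls=" + buckets.calls()
            + " worstMs=" + buckets.worstMillis()
            + " progress=" + progress);
    }
}
```

Counting a call under every threshold it exceeds matches how the summary above reads (the calls over 10 secs are also over 5 secs and over 1 sec), though that reading of the quoted numbers is itself an assumption.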
> Optimize YARN API calls
> -----------------------
>
> Key: MAPREDUCE-3476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
> Priority: Critical
>
> Several YARN API calls are taking inordinately long. This might be a performance blocker.