Posted to mapreduce-issues@hadoop.apache.org by "Ravi Prakash (Commented) (JIRA)" <ji...@apache.org> on 2011/11/28 18:15:40 UTC

[jira] [Commented] (MAPREDUCE-3476) Optimize YARN API calls

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158577#comment-13158577 ] 

Ravi Prakash commented on MAPREDUCE-3476:
-----------------------------------------

Courtesy of [~amar_kamat]: these APIs need to be investigated for optimization.

{quote}
1. JobClient.getClusterStatus()
2. clusterStatus.getMaxMapTasks()
3. clusterStatus.getMaxReduceTasks()
4. clusterStatus.getTaskTrackers()
5. o.a.h.mapreduce.Job.mapProgress()
6. o.a.h.mapreduce.Job.reduceProgress()
{quote}

From another quote:
{quote}
While improving Gridmix we also got a chance to benchmark a few YARN APIs. Here is the summary:
1. The APIs to get map and reduce slot capacity cost ~0 secs.
2. The API to get a job's map task progress takes 115 secs in the worst case. Around 8 calls took more than 10 secs.
Around 26 calls took more than 5 secs. Around 144 calls took more than 1 sec. There were ~43,883 calls made to this
API.
3. The API to get a job's reduce task progress takes 16 secs in the worst case. Around 3 calls took more than 10 secs.
Around 4 calls took more than 5 secs. Around 34 calls took more than 1 sec. Around 22,446 calls were made to this API.
4. The API to get the number of trackers also takes ~0 secs.

The fact that getting the map progress of a single job can take ~115 secs in the worst case is surprising! I guess
optimizing the map progress and reduce progress APIs can be the first step.
{quote}
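
For anyone wanting to reproduce numbers of this shape, here is a minimal sketch of a latency profile that records per-call wall-clock time and counts the calls slower than a threshold (1 sec, 5 secs, 10 secs in the summary above). The class name CallLatencyProfile and its methods are hypothetical and not part of Hadoop or Gridmix. The lambda in main() is only a stand-in for a real call such as Job.mapProgress(); how Gridmix actually took its measurements is not described in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper (not a Hadoop class): times calls and buckets them by latency. */
public class CallLatencyProfile {
    private final List<Long> latenciesMillis = new ArrayList<>();

    /** Runs the call once, records its wall-clock latency, and returns its result. */
    public <T> T time(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            latenciesMillis.add((System.nanoTime() - start) / 1_000_000L);
        }
    }

    /** Records an already-measured latency, e.g. when replaying logged timings. */
    public void record(long millis) {
        latenciesMillis.add(millis);
    }

    /** Total number of calls recorded so far. */
    public int calls() {
        return latenciesMillis.size();
    }

    /** Worst-case latency in milliseconds, or 0 if nothing was recorded. */
    public long worstMillis() {
        long worst = 0;
        for (long l : latenciesMillis) {
            worst = Math.max(worst, l);
        }
        return worst;
    }

    /** Number of recorded calls that took strictly longer than the threshold. */
    public long countSlowerThan(long thresholdMillis) {
        long n = 0;
        for (long l : latenciesMillis) {
            if (l > thresholdMillis) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        CallLatencyProfile profile = new CallLatencyProfile();
        // Stand-in for a real progress call; replace the lambda with the API under test.
        for (int i = 0; i < 1000; i++) {
            profile.time(() -> 0.5f);
        }
        System.out.println("calls=" + profile.calls()
                + " worstMs=" + profile.worstMillis()
                + " >1s=" + profile.countSlowerThan(1_000)
                + " >5s=" + profile.countSlowerThan(5_000)
                + " >10s=" + profile.countSlowerThan(10_000));
    }
}
```

Keeping the raw latencies rather than only a running maximum makes it possible to report the same thresholds as the summary, at the cost of one long per call.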

                
> Optimize YARN API calls
> -----------------------
>
>                 Key: MAPREDUCE-3476
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3476
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>            Priority: Critical
>
> Several YARN API calls are taking inordinately long. This might be a performance blocker.
