Posted to mapreduce-issues@hadoop.apache.org by "Matei Zaharia (JIRA)" <ji...@apache.org> on 2009/12/03 08:48:20 UTC

[jira] Commented: (MAPREDUCE-1261) Enhance mumak to implement a 'stress-test' for the JobTracker

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785213#action_12785213 ] 

Matei Zaharia commented on MAPREDUCE-1261:
------------------------------------------

Is a TaskTracker that's not doing any work really so resource-intensive that you need 50 machines to simulate a 4000-node cluster? It would be nice to find a more efficient approach than one thread per TaskTracker, so that these stress tests can be run without a 50-node cluster. For example, we might be able to run many TaskTrackers in one thread using asynchronous I/O, if the RPC framework supports that.
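
To make the idea concrete, here is a rough sketch of many simulated trackers sharing a small, fixed thread pool instead of each owning a thread. Everything in it is an assumption for illustration: SimulatedTracker here is a hypothetical stand-in (not mumak's SimulatedTaskTracker), heartbeat() only increments a counter where a real tracker would make an RPC to the JobTracker, and the sketch uses timer-driven scheduling rather than true asynchronous I/O, which would depend on what the RPC framework supports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class MultiplexedTrackers {
    // Hypothetical stand-in for a simulated TaskTracker: it holds only state
    // and exposes a short, non-blocking heartbeat() that a shared thread can call.
    static final class SimulatedTracker {
        final String name;
        final AtomicLong heartbeats = new AtomicLong();

        SimulatedTracker(String name) {
            this.name = name;
        }

        void heartbeat() {
            // Assumption: a real tracker would send a heartbeat RPC here.
            heartbeats.incrementAndGet();
        }
    }

    // Drives `trackers` simulated trackers on a pool of `threads` threads for
    // roughly `durationMs` milliseconds and returns the total heartbeats sent.
    static long run(int trackers, int threads, long periodMs, long durationMs) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(threads);
        List<SimulatedTracker> all = new ArrayList<>();
        for (int i = 0; i < trackers; i++) {
            SimulatedTracker t = new SimulatedTracker("tracker_" + i);
            all.add(t);
            // Each tracker is a periodic task, not a dedicated thread.
            pool.scheduleAtFixedRate(t::heartbeat, 0, periodMs, TimeUnit.MILLISECONDS);
        }
        try {
            Thread.sleep(durationMs);
            pool.shutdownNow();
            pool.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        long total = 0;
        for (SimulatedTracker t : all) {
            total += t.heartbeats.get();
        }
        return total;
    }

    public static void main(String[] args) {
        // 4000 simulated trackers multiplexed onto a single thread.
        long total = run(4000, 1, 100, 500);
        System.out.println("heartbeats sent: " + total);
    }
}
```

This only works if each heartbeat is short and never blocks the shared thread; a blocking RPC call per tracker would bring back the need for many threads.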

> Enhance mumak to implement a 'stress-test' for the JobTracker
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-1261
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1261
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/mumak
>            Reporter: Arun C Murthy
>
> I propose we enhance mumak to implement a proper 'stress-test' tool for the JobTracker. The idea is that we enhance mumak to have a mode where it can use the *real* JobTracker (and Scheduler of course) and mumak's SimulatedTaskTracker to run real workloads from production job-history traces. Clearly we will need to make necessary changes to allow the SimulatedTaskTrackers to run independently (a thread per SimulatedTT) in a distributed manner.
> We can then simulate very large clusters and workloads using a handful of machines (say ~50 machines to simulate a workload that originally ran on a 4000-node cluster). We can also use this to stress the JobTracker with synthetic workloads.
> Thoughts?
