You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Mahadev konar (JIRA)" <ji...@apache.org> on 2007/01/18 00:49:30 UTC
[jira] Commented: (HADOOP-719) Integration of Hadoop with batch schedulers

    [ https://issues.apache.org/jira/browse/HADOOP-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465610 ] 

Mahadev konar commented on HADOOP-719:
--------------------------------------

We have implemented a basic integration of Hadoop with Torque resource manager -- Hadoop On Demand (HOD). It is implemented in python. The implementation allows you to bring up 

multiple map reduce clusters on demand. In the current implementation you can specify a given number of nodes to run your job. HOD gets these nodes from the 

resource manager and brings up a map reduce cluster on these nodes. A job is specified to HOD to run on this transient cluster. HOD will run the job on this 

cluster and then bring down the transient map reduce cluster and give back the nodes to the resource manager. All the logs are shipped back to the client 

that invoked HOD. The drawbacks of the current implementation are --
1) it just allows one job to run per map reduce cluster -- i.e. until the main() method in your job.jar returns.
2)  the machines allocated via torque might not be machines having data locally.
3) There is no support for software distribution to run multiple instances of different hadoop versions
4) The current implementation waits for node allocation (rather than putting them in the queue and exiting) until the nodes are allocated.


The current implementation is also capable of bringing up a transient DFS. This would be helpful for testing purposes where you could bring up a transient 

DFS and a transient MapRed cluster and run jobs on it. The current implementation is also capable of bringing up transient map reduce on a static list of nodes (via ssh) and locally.

We intend to address the current drawbacks in the later upgrades to HOD.

Would this be a good candidate to check into hadoop contrib?

> Integration of Hadoop with batch schedulers
> -------------------------------------------
>
>                 Key: HADOOP-719
>                 URL: https://issues.apache.org/jira/browse/HADOOP-719
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: contrib/streaming
>            Reporter: Mahadev konar
>         Assigned To: Mahadev konar
>
> Hadoop On Demand (HOD) is an integration of Hadoop with batch schedulers like Condor/torque/sun grid etc. Hadoop On Demand or HOD hereafter is a system that populates a Hadoop instance using a shared batch scheduler. HOD will find a requested number of nodes and start up Hadoop daemons on them. Users map reduce jobs can then run on the hadoop instance. After the job is done, HOD gives back the  nodes to the shared batch scheduler. A group of users will use HOD to acquire Hadoop instances of varying sizes and the batch scheduler will schedule requests in a way that important jobs gain more importance/resources and finish fast. Here are a list of requirements for HOD and batch schedulers:
> Key Requirements :
> --- Should allocate the specified minimum number of nodes for a job 
>    Many batch jobs can finish in time, only when enough resources are allocated. Therefore batch scheduler should allocate the asked number of nodes for a given job when the job starts. This is simple form of what's known as gang    scheduling.
>   Often the minimum nodes are not available right away, especially if the job asked for a large number. The batch scheduler should support advance reservation for important jobs so that the wait time can be determined. In advance   reservation, a reservation is created on earliest future point when the preoccupied nodes become available. When nodes are currently idle but booked by future reservations, batch scheduler is ok to give them to other jobs to increase system utilization, but only when doing so does not delay existing reservations.
> --- run short urgent job without costing too much loss to long job. Especially, should not kill job tracker of long job. 
>   Some jobs, mostly short ones, are time sensitive and need urgent treatment. Often, large portion of cluster nodes will be occupied by long running jobs. Batch scheduler should be able to preempt long jobs and run urgent jobs. Then, urgent jobs will finish quickly and long jobs can re-gain the nodes afterward. 
> When preemption happens, HOD should minimize the loss to long jobs. Especially, it should not kill job tracker of long job.
> --- be able to dial up, at run time, share of resources for more important projects.
>   Viewed at high level, a given cluster is shared by multiple projects. A project consists of a number of jobs submitted by a group of users.Batch scheduler should allow important projects to have more resources. This should be tunable at run time as what projects deem more important may change over time. 
> --- prevent malicious abuse of the system. 
>   A shared cluster environment can be put in jeopardy if malicious or erroneous job code does: 
>  -- hold unneeded resources for a long period 
>  -- use privileges for unworthy work 
>   Such abuse can easily cause under-utilization or starvation of other jobs. Batch scheduler should allow  setting up policies for preventing resource abuse by: 
>  -- limit privileges to legitimate uses asking for proper amount 
>  -- throttle peak use of resources per player 
>  -- monitor and reduce starvation 
> --- The behavior should be simple and predictable 
>    When status of the system is queried, we should be able to determine what factors caused it to reach current status and what could be the future behavior with or without our tuning on the system. 
> --- be portable to major resource managers 
>    HOD design should be portable so that in future we are able to plugin other resource manager. 
> Some of the key requirements are implemented by the batch schedulers. The others need to be implemented by HOD.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira