Posted to common-dev@hadoop.apache.org by "Sreekanth Ramakrishnan (JIRA)" <ji...@apache.org> on 2008/08/19 08:43:44 UTC

[jira] Issue Comment Edited: (HADOOP-3930) Decide how to integrate scheduler info into CLI and job tracker web page

    [ https://issues.apache.org/jira/browse/HADOOP-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623575#action_12623575 ] 

sreekanth edited comment on HADOOP-3930 at 8/18/08 11:42 PM:
--------------------------------------------------------------------------

Attaching a patch that adds APIs to TaskScheduler to expose its scheduling information.

Added the following methods to TaskScheduler:

Map<String,String> getSchedulingInfo(JobInProgress job) - Returns a map containing scheduling information for a particular job.
Map<String,String> getSchedulingInfo(String queueName) - Returns a map containing scheduling information for a particular queue.
Collection<JobInProgress> getJobs(String queue) - Returns the jobs in a particular queue.
List<String> getQueues() - Returns all the queues which the scheduler uses.

List<String> getQueueSchedulingParameterList() - Returns an ordered list of the scheduling parameters related to queues.
List<String> getJobSchedulingParameterList() - Returns an ordered list of the scheduling parameters related to a particular job.

The above two methods were introduced to determine the order in which the web UI generates the columns of a table.
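
To make the shape of the proposed API concrete, here is a minimal sketch of the six methods and a toy scheduler that implements them. It is not the code from the attached patch: JobStub is a hypothetical stand-in for JobInProgress, SchedulingInfoProvider stands in for the additions to TaskScheduler, and the queue name and parameter names ("Priority", "Position", "Queued jobs") are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: every type below is a hypothetical stand-in, not Hadoop code.
class SchedulingInfoSketch {

    /** Hypothetical stand-in for JobInProgress. */
    static class JobStub {
        final String id;
        final int priority;

        JobStub(String id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }

    /** The six methods listed above, gathered into one interface. */
    interface SchedulingInfoProvider {
        Map<String, String> getSchedulingInfo(JobStub job);
        Map<String, String> getSchedulingInfo(String queueName);
        Collection<JobStub> getJobs(String queue);
        List<String> getQueues();
        List<String> getQueueSchedulingParameterList();
        List<String> getJobSchedulingParameterList();
    }

    /** Toy scheduler with a single queue; parameter names are made up. */
    static class SingleQueueScheduler implements SchedulingInfoProvider {
        private static final String QUEUE = "default";
        private final List<JobStub> jobs = new ArrayList<>();

        void addJob(JobStub job) {
            jobs.add(job);
        }

        @Override
        public Map<String, String> getSchedulingInfo(JobStub job) {
            Map<String, String> info = new LinkedHashMap<>();
            info.put("Priority", String.valueOf(job.priority));
            // 1-based position of the job in the queue (0 if it is not queued).
            info.put("Position", String.valueOf(jobs.indexOf(job) + 1));
            return info;
        }

        @Override
        public Map<String, String> getSchedulingInfo(String queueName) {
            Map<String, String> info = new LinkedHashMap<>();
            if (QUEUE.equals(queueName)) {
                info.put("Queue", QUEUE);
                info.put("Queued jobs", String.valueOf(jobs.size()));
            }
            return info; // empty map for an unknown queue
        }

        @Override
        public Collection<JobStub> getJobs(String queue) {
            if (QUEUE.equals(queue)) {
                return Collections.unmodifiableList(jobs);
            }
            return Collections.<JobStub>emptyList();
        }

        @Override
        public List<String> getQueues() {
            return Collections.singletonList(QUEUE);
        }

        @Override
        public List<String> getQueueSchedulingParameterList() {
            // The order here is the column order the web UI would use.
            return Arrays.asList("Queue", "Queued jobs");
        }

        @Override
        public List<String> getJobSchedulingParameterList() {
            return Arrays.asList("Priority", "Position");
        }
    }

    public static void main(String[] args) {
        SingleQueueScheduler scheduler = new SingleQueueScheduler();
        JobStub job = new JobStub("job_0001", 5);
        scheduler.addJob(job);
        System.out.println(scheduler.getQueues());
        System.out.println(scheduler.getSchedulingInfo("default"));
        System.out.println(job.id + " " + scheduler.getSchedulingInfo(job));
    }
}
```

In this sketch the keys of each returned map match the entries of the corresponding parameter list, so a caller can walk the list and look up each value in turn.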

A new method was introduced in JobTracker:
TaskScheduler getTaskScheduler() - Returns the task scheduler instance used by the JobTracker.

JobQueueTaskScheduler and LimitTasksPerJobTaskScheduler have been modified to implement the new APIs to expose scheduling information.

Changed jobtracker.jsp to do the following:

Create a new section called Scheduler information and build a table dynamically to display the scheduler information for the queues which the scheduler holds. The column order is determined by the value returned from getQueueSchedulingParameterList().
Create sections in the Job Table generation to display the scheduling information for each job. The column order is determined by the value returned from getJobSchedulingParameterList().
 
If a scheduler returns null from getQueueSchedulingParameterList(), the Scheduler information section is not displayed in jobtracker.jsp.
If a scheduler returns null from getSchedulingInfo(JobInProgress job), no new section is added to the Job Table.
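
As a rough illustration of the rendering rule for the queue table, the sketch below builds the table from an ordered parameter list plus one map per row, and emits nothing when the parameter list is null. It is not the jobtracker.jsp code from the patch: the class and method names are hypothetical, and rendering a missing value as an empty cell is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a hypothetical helper, not code from jobtracker.jsp.
class SchedulerTableSketch {

    /**
     * Renders one header row and one row per map, with columns in the order
     * given by the parameter list. A null list means the scheduler exposes
     * no information, so the section is skipped. No HTML escaping is done
     * here; real JSP code would need it.
     */
    static String renderTable(List<String> columns, List<Map<String, String>> rows) {
        if (columns == null) {
            return "";
        }
        StringBuilder html = new StringBuilder("<table><tr>");
        for (String column : columns) {
            html.append("<th>").append(column).append("</th>");
        }
        html.append("</tr>");
        for (Map<String, String> row : rows) {
            html.append("<tr>");
            for (String column : columns) {
                String value = row.get(column);
                // Assumption: a missing value becomes an empty cell.
                html.append("<td>").append(value == null ? "" : value).append("</td>");
            }
            html.append("</tr>");
        }
        return html.append("</table>").toString();
    }

    public static void main(String[] args) {
        // HashMap has no defined iteration order; the column list supplies it.
        Map<String, String> queueInfo = new HashMap<>();
        queueInfo.put("Running jobs", "2");
        queueInfo.put("Queue", "default");
        List<String> columns = Arrays.asList("Queue", "Running jobs");
        System.out.println(renderTable(columns, Collections.singletonList(queueInfo)));
        System.out.println(renderTable(null, Collections.singletonList(queueInfo)).isEmpty());
    }
}
```

Keeping the column order in a separate list means the maps themselves do not need to be ordered, which is why the two parameter-list methods exist alongside getSchedulingInfo.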


Any thoughts on improving the above approach?

> Decide how to integrate scheduler info into CLI and job tracker web page
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-3930
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3930
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.17.2
>            Reporter: Matei Zaharia
>            Priority: Minor
>         Attachments: 3930-1.patch, mockup.JPG
>
>
> We need a way for job schedulers such as HADOOP-3445 and HADOOP-3476 to provide info to display on the JobTracker web interface and in the CLI. The main things needed seem to be:
> * A way for schedulers to provide info to show in a column on the web UI and in the CLI - something as simple as a single string, or a map<string, int> for multiple parameters.
> * Some sorting order for jobs - maybe a method to sort a list of jobs.
> Let's figure out what the best way to do this is and implement it in the existing schedulers.
> My first-order proposal at an API: Augment the TaskScheduler with
> * public Map<String, String> getSchedulingInfo(JobInProgress job) -- returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of jobs.
> * public Map<String, String> getSchedulingInfo(String queue) -- returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of queues.
> * public Collection<JobInProgress> getJobs(String queueName) -- returns the list of jobs in a given queue, sorted by a scheduler-specific order (the order it wants to run them in / schedule the next task in / etc).
> * public List<String> getQueues();

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.