Posted to issues@tajo.apache.org by "Min Zhou (JIRA)" <ji...@apache.org> on 2014/02/04 01:54:10 UTC

[jira] [Comment Edited] (TAJO-540) (Umbrella) Implement Tajo Query Scheduler

    [ https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890213#comment-13890213 ] 

Min Zhou edited comment on TAJO-540 at 2/4/14 12:53 AM:
--------------------------------------------------------

OK, I finally have time to write a more detailed plan for this ticket.

Historically, the first scheduler in the Hadoop ecosystem was the JobTracker in MapReduce. The JobTracker actually plays two roles in a MapReduce cluster: one is resource management and the other is scheduling the tasks of each job. Because the JobTracker plays both roles, its job response time and scalability are poor. Google, where MapReduce originated, ran into the same kind of issue and later started a project named Borg, with one of its goals being to address this problem. Borg became the cluster resource management scheduler at Google, and according to their paper its current version is named Omega (see https://medium.com/large-scale-data-processing/a7a81f278e6f ).

Later, this kind of resource scheduler came into our view in the form of Mesos and Hadoop YARN. The difference between the two is that Mesos supports gang scheduling while YARN supports incremental scheduling. Both divide cluster scheduling into two layers. The upper layer is resource management, which is the responsibility of Mesos or YARN itself: they control the resources given to each application/framework/job. The JobTracker's other role, scheduling a job's tasks, has been pushed down into the lower layer: the master of each application/framework/job plays this role, coordinating the tasks of that one application/framework/job.
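
To make the two-layer split concrete, here is a minimal sketch in Python. It is not Mesos, YARN, or Tajo code: the class names ResourceManager and AppMaster, the notion of a "slot", and the wave-by-wave loop are all assumptions made for illustration. The upper layer only hands out slots; the lower layer decides which of its own tasks run on them.

```python
class ResourceManager:
    """Upper layer (hypothetical): tracks free slots for the whole cluster."""

    def __init__(self, total_slots):
        self.free_slots = total_slots

    def allocate(self, requested):
        # Grant as many slots as are free, up to the requested number.
        granted = min(requested, self.free_slots)
        self.free_slots -= granted
        return granted

    def release(self, slots):
        # Return slots to the cluster-wide pool.
        self.free_slots += slots


class AppMaster:
    """Lower layer (hypothetical): schedules the tasks of one application."""

    def __init__(self, name, tasks):
        self.name = name
        self.pending = list(tasks)

    def run(self, rm):
        # Run the pending tasks in waves, each wave sized by what the
        # resource manager grants. Returns the list of waves.
        waves = []
        while self.pending:
            granted = rm.allocate(len(self.pending))
            if granted == 0:
                raise RuntimeError("no slots available")
            wave = self.pending[:granted]
            self.pending = self.pending[granted:]
            waves.append(wave)  # a real master would launch these tasks here
            rm.release(granted)
        return waves


rm = ResourceManager(total_slots=2)
master = AppMaster("query-1", ["t1", "t2", "t3"])
print(master.run(rm))
```

With two slots and three tasks, the master runs two tasks in the first wave and the remaining one in the second. The resource manager never looks at individual tasks.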

From our benchmarking, a job with 10 tasks that each sleep for 0 ms took about 20 seconds on Hadoop 1.0 because of the JobTracker's scheduling, and Hadoop YARN takes about the same amount of time. What we need here is not a scheduler like MRAppMaster but a low-latency scheduler. From Jeff Dean's paper ( http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/abstract ), we learn that Google is, as usual, ahead of us: they developed a technique called tied requests to meet low-latency requirements. Please see the tied requests section in http://static.googleusercontent.com/media/research.google.com/en//people/jeff/MIT_BigData_Sep2012.pdf if you can't download the ACM paper.
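
For readers who have not seen the paper, the idea of a tied request is roughly this: the client enqueues the same request on two servers, each copy tagged with the identity of the other server, and whichever server starts the request first tells its peer to cancel the queued twin. The following is only a toy, single-threaded simulation under that description. The Server class and submit_tied function are hypothetical names, not anything from Google or Tajo, and a real system would do this with network messages and would also have to handle both copies starting at nearly the same time.

```python
from collections import deque


class Server:
    """Toy server (hypothetical) with a FIFO queue of pending requests."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.executed = []

    def enqueue(self, request_id, peer):
        # Each queued copy remembers which server holds its twin (or None).
        self.queue.append((request_id, peer))

    def cancel(self, request_id):
        # Drop the queued copy of request_id if it has not started yet.
        self.queue = deque(
            item for item in self.queue if item[0] != request_id
        )

    def run_next(self):
        # Start the request at the head of the queue and cancel its twin.
        if not self.queue:
            return None
        request_id, peer = self.queue.popleft()
        if peer is not None:
            peer.cancel(request_id)
        self.executed.append(request_id)
        return request_id


def submit_tied(request_id, first, second):
    # Enqueue the same request on both servers, each tagged with the other.
    first.enqueue(request_id, second)
    second.enqueue(request_id, first)


a, b = Server("a"), Server("b")
a.enqueue("other-work", None)  # server a is already busy
submit_tied("q1", a, b)
b.run_next()                   # b is idle, so it starts q1 and cancels a's copy
a.run_next()                   # a runs its earlier work; q1 is no longer queued
print(a.executed, b.executed)
```

The request runs exactly once, on whichever server reaches it first, so a backlog on one server does not delay it.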




was (Author: coderplay):
OK, I finally have time to write a more detailed plan for this ticket.

Historically, the first scheduler in the Hadoop ecosystem was the JobTracker in MapReduce. The JobTracker actually plays two roles in a MapReduce cluster: one is resource management and the other is scheduling the tasks of each job. Because the JobTracker plays both roles, its job response time and scalability are poor. Google, where MapReduce originated, ran into the same kind of issue and later started a project named Borg, with one of its goals being to address this problem. Borg became the cluster resource management scheduler at Google, and according to their paper its code name is Omega (see https://medium.com/large-scale-data-processing/a7a81f278e6f ).

Later, this kind of resource scheduler came into our view in the form of Mesos and Hadoop YARN. The difference between the two is that Mesos supports gang scheduling while YARN supports incremental scheduling. Both divide cluster scheduling into two layers. The upper layer is resource management, which is the responsibility of Mesos or YARN itself: they control the resources given to each application/framework/job. The JobTracker's other role, scheduling a job's tasks, has been pushed down into the lower layer: the master of each application/framework/job plays this role, coordinating the tasks of that one application/framework/job.

From our benchmarking, a job with 10 tasks that each sleep for 0 ms took about 20 seconds on Hadoop 1.0 because of the JobTracker's scheduling, and Hadoop YARN takes about the same amount of time. What we need here is not a scheduler like MRAppMaster but a low-latency scheduler. From Jeff Dean's paper ( http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/abstract ), we learn that Google is, as usual, ahead of us: they developed a technique called tied requests to meet low-latency requirements. Please see the tied requests section in http://static.googleusercontent.com/media/research.google.com/en//people/jeff/MIT_BigData_Sep2012.pdf if you can't download the ACM paper.



> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
>                 Key: TAJO-540
>                 URL: https://issues.apache.org/jira/browse/TAJO-540
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler, so all queries launched simultaneously compete for cluster resources, which are managed by TajoResourceManager.
> In this issue, we will investigate, design, and implement a Tajo query scheduler. This is an umbrella issue for that work, and we will create subtasks for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)