Posted to user@tez.apache.org by Raajay <ra...@gmail.com> on 2016/01/13 18:20:31 UTC

DAGScheduler

Hello,

I want to implement a DAGScheduler that ensures that all (or some) map
vertices of a given DAG are not scheduled immediately but only after a
configurable amount of time, T.

I can think of the following three ways to do this:

1. Buffer all the "scheduleTask" events and raise
"TaskAttemptEventSchedule" after time T. For this, I would need a clock
within the DAGScheduler, or I could subscribe to an event that is raised
periodically. Is there such an event generator in Tez?

2. Use delayed events. Is there support for raising delayed events?

3. Modify the priority levels in "DAGSchedulerNaturalOrderControlled" based
on T, to delay the start of vertices.


Which of the three is easiest to implement and likely to have the fewest
side effects? Any help or pointers would be appreciated.
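
A minimal sketch of option 2, using only the JDK's ScheduledExecutorService
rather than any Tez facility. The class name DelayedEventRaiser, its methods,
and the string event are hypothetical and exist only for illustration; in a
real scheduler the handler would be whatever forwards the event for scheduling.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper (not a Tez class): hands each event to a handler after a fixed delay. */
final class DelayedEventRaiser<E> {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final Consumer<E> handler;
    private final long delayMillis;

    DelayedEventRaiser(Consumer<E> handler, long delayMillis) {
        this.handler = handler;
        this.delayMillis = delayMillis;
    }

    /** Queues the event; the handler receives it no earlier than delayMillis from now. */
    ScheduledFuture<?> raiseLater(E event) {
        return timer.schedule(() -> handler.accept(event), delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the timer thread and drops any events that have not fired yet. */
    void stop() {
        timer.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        DelayedEventRaiser<String> raiser =
                new DelayedEventRaiser<>(e -> System.out.println("scheduling " + e), 200);
        // Block until the delayed event has been delivered, then shut down.
        raiser.raiseLater("map-task-0").get();
        raiser.stop();
    }
}
```

The handler runs on the timer thread, so anything it touches would need to be
thread-safe.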

Thanks
Raajay

Re: DAGScheduler

Posted by Raajay <ra...@gmail.com>.
Hi Jeff,

I need it for a research prototype exploring joint DAG scheduling, so there
are cases where we want some stages to be delayed. The current DAGScheduler
(NaturalOrderControlled) immediately sends the map tasks to be scheduled.

I wrote up a DAGScheduler, which can read a flat file, and release tasks to
be scheduled when natural order and time constraints are satisfied. The
code is at (
https://github.com/raajay/tez/blob/crossquery/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/DAGSchedulerCrossQuery.java)
if you are interested.


Basically, when an instance of the DAGScheduler is created, an
ExecutorService is launched that periodically calls a function in the
DAGScheduler to clear out the pending events.
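
The pattern described here could look roughly like the sketch below. It is
not the code from the linked repository: PendingTaskBuffer, buffer, and drain
are hypothetical names, tasks are plain strings, and only the time constraint
is modelled (the natural-order check is left out).

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical buffer (not a Tez class): holds tasks until their release time has passed. */
final class PendingTaskBuffer {
    /** A buffered task and the earliest time (epoch millis) at which it may be released. */
    private static final class Pending {
        final String taskId;
        final long releaseAtMillis;

        Pending(String taskId, long releaseAtMillis) {
            this.taskId = taskId;
            this.releaseAtMillis = releaseAtMillis;
        }
    }

    // Ordered by release time, so the head is always the next task that becomes due.
    private final PriorityQueue<Pending> pending =
            new PriorityQueue<>(Comparator.comparingLong((Pending p) -> p.releaseAtMillis));
    private final Consumer<String> scheduler;

    PendingTaskBuffer(Consumer<String> scheduler) {
        this.scheduler = scheduler;
    }

    synchronized void buffer(String taskId, long releaseAtMillis) {
        pending.add(new Pending(taskId, releaseAtMillis));
    }

    /** Releases every task that is due at nowMillis; returns how many were released. */
    synchronized int drain(long nowMillis) {
        int released = 0;
        while (!pending.isEmpty() && pending.peek().releaseAtMillis <= nowMillis) {
            scheduler.accept(pending.poll().taskId);
            released++;
        }
        return released;
    }

    public static void main(String[] args) throws InterruptedException {
        PendingTaskBuffer buffer =
                new PendingTaskBuffer(id -> System.out.println("scheduling " + id));
        long start = System.currentTimeMillis();
        buffer.buffer("map-0", start + 300);
        buffer.buffer("map-1", start + 100);

        // Periodic tick that clears out whatever has become due since the last tick.
        ScheduledExecutorService ticker = Executors.newSingleThreadScheduledExecutor();
        ticker.scheduleAtFixedRate(
                () -> buffer.drain(System.currentTimeMillis()), 0, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(500);
        ticker.shutdownNow();
    }
}
```

Passing the current time into drain keeps the release logic independent of
the ticker, which makes it easy to exercise without real waiting.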

I would like to avoid reading the start times from a flat file. Is it
possible to send them as a string via "localResources"? If so, which methods
should I use?
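
Whichever channel ends up carrying the string, the start times would need a
compact text form. The sketch below assumes a made-up "name=millis,name=millis"
format; StartDelays and its methods are hypothetical and say nothing about
how Tez itself would transport the value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical codec (not a Tez API): vertex start delays as "name=millis,name=millis". */
final class StartDelays {
    private StartDelays() {}

    /** Joins the entries in iteration order, e.g. {a=1, b=2} becomes "a=1,b=2". */
    static String encode(Map<String, Long> delays) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : delays.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Parses the encoded form; whitespace around names and numbers is ignored. */
    static Map<String, Long> parse(String encoded) {
        Map<String, Long> delays = new LinkedHashMap<>();
        if (encoded == null || encoded.trim().isEmpty()) {
            return delays;
        }
        for (String entry : encoded.split(",")) {
            // Split on the last '=' so that a name may itself contain '='.
            int eq = entry.lastIndexOf('=');
            String name = eq < 0 ? "" : entry.substring(0, eq).trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Expected name=millis, got: " + entry);
            }
            delays.put(name, Long.parseLong(entry.substring(eq + 1).trim()));
        }
        return delays;
    }

    public static void main(String[] args) {
        Map<String, Long> delays = parse("Map 1=5000, Map 2=0");
        System.out.println(delays);
        System.out.println(encode(delays));
    }
}
```

This format breaks if a vertex name contains a comma; a real implementation
would need escaping or a structured encoding.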

Thanks
Raajay

On Wed, Jan 13, 2016 at 11:58 AM, Jianfeng (Jeff) Zhang <
jzhang@hortonworks.com> wrote:

>
> Hi Raajay,
>
> I don’t think this would be an easy change (maybe you can do it in a
> VertexManager, but it may still not be a trivial change). Would you mind
> telling me what your purpose is for waiting for a period of time?
>
>
> Best Regards,
> Jeff Zhang

Re: DAGScheduler

Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Raajay,

I don't think this would be an easy change (maybe you can do it in a VertexManager, but it may still not be a trivial change). Would you mind telling me what your purpose is for waiting for a period of time?


Best Regards,
Jeff Zhang