Posted to dev@spark.apache.org by Imran Rashid <ir...@cloudera.com.INVALID> on 2019/02/04 16:42:04 UTC

scheduler braindump: architecture, gotchas, etc.

The scheduler has been pretty error-prone and hard to work on, and I feel
like there may be a dwindling core of active experts.  I'm sure it's very
discouraging to folks who try to make what seem like simple changes, and
then find they are in a rat's nest of complex issues they weren't
expecting.  But for those who are still trying, THANK YOU!  More
involvement and more folks becoming experts is definitely needed.

I put together a doc going over the architecture of the scheduler, and
things I've seen us get bitten by in the past.  It's sort of a brain dump,
but I'm hopeful it'll help orient new folks to the scheduler.  I also hope
more experts will chime in -- there are places in the doc I know I've
missed things, and called that out, but there are probably even more that
should be discussed, & mistakes I've made.  All input welcome.

https://docs.google.com/document/d/1oiE21t-8gXLXk5evo-t-BXpO5Hdcob5D-Ps40hogsp8/edit?usp=sharing

Re: scheduler braindump: architecture, gotchas, etc.

Posted by Parth Gandhi <pa...@gmail.com>.
Thank you Imran, this is quite helpful.

Regards,
Parth Kamlesh Gandhi


On Mon, Feb 4, 2019 at 11:01 AM Rubén Berenguel <rb...@gmail.com>
wrote:

> Thanks Imran, will definitely give it a look (even if just out of sheer
> interest in how the sausage is made)
>
> R
>
>
> On 4 February 2019 at 17:59:33, John Zhuge (jzhuge@apache.org) wrote:
>
> Thanks Imran!

Re: scheduler braindump: architecture, gotchas, etc.

Posted by Rubén Berenguel <rb...@gmail.com>.
Thanks Imran, will definitely give it a look (even if just out of sheer
interest in how the sausage is made)

R


On 4 February 2019 at 17:59:33, John Zhuge (jzhuge@apache.org) wrote:

Thanks Imran!


Re: scheduler braindump: architecture, gotchas, etc.

Posted by sujith chacko <su...@gmail.com>.
Thanks Li and Imran for providing us an overview of one of the complex
modules in Spark 👍 Excellent sharing.

Regards
Sujith.

On Mon, 4 Feb 2019 at 10:54 PM, Xiao Li <ga...@gmail.com> wrote:

> Thank you, Imran!
>
> Also, I attached the slides of "Deep Dive: Scheduler of Apache Spark".
>
> Cheers,
>
> Xiao

Re: scheduler braindump: architecture, gotchas, etc.

Posted by John Zhuge <jz...@apache.org>.
Thx Xiao!

On Mon, Feb 4, 2019 at 9:04 AM Xiao Li <ga...@gmail.com> wrote:

> Thank you, Imran!
>
> Also, I attached the slides of "Deep Dive: Scheduler of Apache Spark".
>
> Cheers,
>
> Xiao

-- 
John Zhuge

Re: scheduler braindump: architecture, gotchas, etc.

Posted by Xiao Li <ga...@gmail.com>.
Thank you, Imran!

Also, I attached the slides of "Deep Dive: Scheduler of Apache Spark".

Cheers,

Xiao



John Zhuge <jz...@apache.org> wrote on Mon, Feb 4, 2019 at 8:59 AM:

> Thanks Imran!

Re: scheduler braindump: architecture, gotchas, etc.

Posted by John Zhuge <jz...@apache.org>.
Thanks Imran!



-- 
John Zhuge