Posted to dev@spark.apache.org by Marcelo Vanzin <va...@cloudera.com> on 2017/07/31 17:27:57 UTC

[VOTE] [SPIP] SPARK-18085: Better History Server scalability

Hey all,

Following the SPIP process, I'm putting this SPIP up for a vote. It's
been open for comments as an SPIP for about 3 weeks now, and had been
open without the SPIP label for about 9 months before that. There has
been no new feedback since it was tagged as an SPIP, so I'm assuming
all the people who looked at it are OK with the current proposal.

The vote will be up for the next 72 hours. Please reply with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following
technical reasons.

Thanks!

-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Reynold Xin <rx...@databricks.com>.

A late +1 too.

On Thu, Aug 3, 2017 at 1:37 PM Marcelo Vanzin <va...@cloudera.com> wrote:

> This vote passes with 3 binding +1 votes, 5 non-binding votes, and no -1
> votes.
>
> Thanks all!
>
> +1 votes (binding):
> Tom Graves
> Sean Owen
> Marcelo Vanzin
>
> +1 votes (non-binding):
> Ryan Blue
> Denis Bolshakov
> Dong Joon Hyun
> Hyukjin Kwon
> Ashutosh Pathak

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Marcelo Vanzin <va...@cloudera.com>.

This vote passes with 3 binding +1 votes, 5 non-binding votes, and no -1 votes.

Thanks all!

+1 votes (binding):
Tom Graves
Sean Owen
Marcelo Vanzin

+1 votes (non-binding):
Ryan Blue
Denis Bolshakov
Dong Joon Hyun
Hyukjin Kwon
Ashutosh Pathak


-- 
Marcelo


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Marcelo Vanzin <va...@cloudera.com>.

Thanks all for the comments. Just a clarification:

On Tue, Aug 1, 2017 at 2:18 AM, Sean Owen <so...@cloudera.com> wrote:
> Is 'spark-ui' too broad? doesn't sound like this module would actually house
> all the UIs. spark-shs-ui or something?
> Good that this can be implemented in parallel to the existing mechanism for
> the initial milestones.
> So the "Mx" milestones are essentially required, in your view? and the "SMx"
> are optional and stand-alone?

The spec attached to the bug was written before I had a chance to work
on most of the implementation, so it doesn't accurately reflect the
current status of the code. The major changes are all there, but I
changed some details when I ran into issues during implementation that
I thought would end up being blockers. The three main changes from the
spec are:

- I gave up separating the UI into a separate module for now. It would
probably break someone's workflow, and it made it really hard to track
UI changes made concurrently. With the current approach I know I'm not
missing anything that was added to the UI while I was writing that
code.

- While doing some perf testing I decided to make some optional
milestones required; mainly, having an in-memory implementation, and
having that be the default. This also keeps the current behavior as
the default, and people have to opt-in to using the new code.

- In spite of calling it out as a non-goal, I found that at least the
current stage page became really slow with a LevelDB backend with
large jobs (it's already slow now, but it got worse), so I included
SPARK-20657 in the things I think are needed for the first
implementation.

That should be all.

-- 
Marcelo


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Hyukjin Kwon <gu...@gmail.com>.

+1

Although I am not familiar with this code path, I read the proposal a few
times and it makes sense to me.


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Denis Bolshakov <bo...@gmail.com>.

+1

Absolutely agree on SPARK-18085.

-- 
//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.denis@gmail.com

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Sean Owen <so...@cloudera.com>.

(Direct link to design doc, linked from JIRA)
https://issues.apache.org/jira/browse/SPARK-18085
https://issues.apache.org/jira/secure/attachment/12835040/spark_hs_next_gen.pdf

I know Marcelo has looked closely at this issue for a long while and trust
his judgment about what needs to be fixed, and how. I know he has a good
view into practical pain points like this from customer deployments.

I read the whole doc. Although I have not followed the SHS issues closely,
the recounting of issues makes sense to me, as well as the stated goals.

There is a considerable amount of change proposed here, but it's broken
down into milestones. The change doesn't seem excessive given what needs to
be improved -- it does need a rearchitecting.

Now, some minor comments

+1 to reusing a 'database' library already used by Spark.
Is 'spark-ui' too broad? doesn't sound like this module would actually
house all the UIs. spark-shs-ui or something?
Good that this can be implemented in parallel to the existing mechanism for
the initial milestones.
So the "Mx" milestones are essentially required, in your view? and the
"SMx" are optional and stand-alone?

So yes I have no concerns, trust the analysis, think it's a real problem
that will take a sustained effort. +1


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Dong Joon Hyun <dh...@hortonworks.com>.

+1 (non-binding)

Dongjoon.


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Ryan Blue <rb...@netflix.com.INVALID>.

+1 (non-binding)

-- 
Ryan Blue
Software Engineer
Netflix

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Tom Graves <tg...@yahoo.com.INVALID>.

+1. 

Tom


Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

Posted by Marcelo Vanzin <va...@cloudera.com>.

Adding my own +1 (binding).

-- 
Marcelo
