Posted to yarn-dev@hadoop.apache.org by Karthik Kambatla <ka...@cloudera.com> on 2015/07/29 19:37:54 UTC

"Reservation" ambiguity

Hi folks

We use the word "reservation" to mean both (1) reservations on nodes to
avoid starvation of big container asks, and (2) the recent SLA work. This
is confusing both to developers and end-users.

I was wondering if people are open to calling the first one a "hold" and
the second one a "reservation". We can change the terminology in the code
and add new metrics for hold in branch-2 and remove the metrics for
reserved* in Hadoop-3?

Thoughts?

Re: "Reservation" ambiguity

Posted by Allen Wittenauer <aw...@altiscale.com>.
On Jul 29, 2015, at 10:37 AM, Karthik Kambatla <ka...@cloudera.com> wrote:

> Hi folks
> 
> We use the word "reservation" to mean both (1) reservations on nodes to
> avoid starvation of big container asks, and (2) the recent SLA work. This
> is confusing both to developers and end-users.
> 
> I was wondering if people are open to calling the first one a "hold" and
> the second one a "reservation". We can change the terminology in the code
> and add new metrics for hold in branch-2 and remove the metrics for
> reserved* in Hadoop-3?
> 
> Thoughts?


Whichever one got checked in first gets ‘reservation’.



Re: "Reservation" ambiguity

Posted by Ray Chiang <rc...@cloudera.com>.
Sorry for the delayed response.  My two cents:

1) "Hold" has some ambiguity as a word, since it can be both a noun and a
verb.  Using "hold" in a sentence will be confusing in documentation.

2) Here are some alternatives.  Sadly, a lot of these words are quite long
and could be painful for coding:

   - accommodation
   - allowance
   - booking
   - prerequisite
   - settlement
   - withholding

-Ray


On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> Hi folks
>
> We use the word "reservation" to mean both (1) reservations on nodes to
> avoid starvation of big container asks, and (2) the recent SLA work. This
> is confusing both to developers and end-users.
>
> I was wondering if people are open to calling the first one a "hold" and
> the second one a "reservation". We can change the terminology in the code
> and add new metrics for hold in branch-2 and remove the metrics for
> reserved* in Hadoop-3?
>
> Thoughts?
>

Re: "Reservation" ambiguity

Posted by Karthik Kambatla <ka...@cloudera.com>.
Thinking more about it, I kind of agree with you, Chris and Vinod, on not
removing old metrics in Hadoop-3. Would it be reasonable to keep them
around but deprecate them? Or should we just not mess with them at all?

Internal variables could be changed for better readability irrespective; we
might end up doing that for FairScheduler anyhow.

On Wed, Aug 5, 2015 at 10:34 AM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> It wasn’t confusing to me given we know the internals, but I can see why
> it would be.
>
> In my mind, what existed for a long time was ‘internal reservation of
> resources for individual containers within an application’ and the newer
> feature is about user-driven reservation of resources at application /
> workload granularity. For the newer metrics, if/when we add them, we should
> just make the distinction explicit, e.g. numUserReservedResources. Agreeing
> with others, this doesn’t warrant breaking existing metrics even in the
> next major release.
>
> Thanks
> +Vinod
>
> On Aug 5, 2015, at 10:16 AM, Carlo Curino <ccurino@microsoft.com> wrote:
>
> +1 on keeping the name "reservation" for the user-visible (2).
>
> On top of the external/internal argument that Chris makes (which I
> completely agree with), I noticed the following:
>
> While developing (2)  we spoke with lots and lots of folks both in
> industry and academia, and the term
> "reservation" was very evocative and intuitive. Within seconds people were
> using it to refer to the functionality
> and easily grasping the idea.  On the other hand, every time I spoke about
> (1) using the keyword "reservation",
> I had to add a bunch of context, expand, explain, and even then people
> were naturally drawn to refer to it
> as "hoarding of resources for large containers", or "large container
> management".
>
> Other alternative names for (1) could be: "hoarded" or "prefetched"
> resources.
>
> My 2 cents...
>
> Cheers,
> Carlo
>
> -----Original Message-----
> From: Karthik Kambatla [mailto:kasha@cloudera.com]
> Sent: Wednesday, August 5, 2015 8:20 AM
> To: yarn-dev@hadoop.apache.org
> Subject: Re: "Reservation" ambiguity
>
> Inline.
>
> On Tue, Aug 4, 2015 at 6:48 PM, Chris Douglas <cdouglas@apache.org> wrote:
>
> How visible are (1) reservations? They're an internal, implementation
> detail exposed in metrics only to explain the edge cases they create.
> Are users typically aware of them?
>
>
> This is internal, and I don't think users are aware of the mechanics.
> However, they do see metrics for "reserved" resources.
>
>
>
> SLA reservations (2) are user-visible, and express the contract with
> users/operators symmetrically. While (1) is a concept, renaming (2)
> would require user-breaking code changes.
>
>
> Yes, I don't think we should rename (2).
>
>
>
> Unless you're discussing the intersection- the effect of reservations
> (1) on a reservation (2)- it's usually clear from context... I'd
> rather avoid breaking anyone listening to the metrics in Hadoop-3.
>
>
> I propose to add new metrics holdMB, holdCores for reservedMB,
> reservedCores. Could we deprecate the older metrics in Hadoop-2 and
> Hadoop-3, and remove them in Hadoop-4?
>
>
>
> Maybe reservations (2) could have been named "sessions", but that
> collided with applications that already used it for a similar concept.
> -C
>
> On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla
> <ka...@cloudera.com>
> wrote:
> Hi folks
>
> We use the word "reservation" to mean both (1) reservations on nodes
> to avoid starvation of big container asks, and (2) the recent SLA
> work. This is confusing both to developers and end-users.
>
> I was wondering if people are open to calling the first one a "hold"
> and the second one a "reservation". We can change the terminology in
> the code and add new metrics for hold in branch-2 and remove the
> metrics for
> reserved* in Hadoop-3?
>
> Thoughts?
>
>
>

Re: "Reservation" ambiguity

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
It wasn’t confusing to me given we know the internals, but I can see why it would be.

In my mind, what existed for a long time was ‘internal reservation of resources for individual containers within an application’ and the newer feature is about user-driven reservation of resources at application / workload granularity. For the newer metrics, if/when we add them, we should just make the distinction explicit, e.g. numUserReservedResources. Agreeing with others, this doesn’t warrant breaking existing metrics even in the next major release.

Thanks
+Vinod

On Aug 5, 2015, at 10:16 AM, Carlo Curino <cc...@microsoft.com> wrote:

+1 on keeping the name "reservation" for the user-visible (2).

On top of the external/internal argument that Chris makes (which I completely agree with), I noticed the following:

While developing (2)  we spoke with lots and lots of folks both in industry and academia, and the term
"reservation" was very evocative and intuitive. Within seconds people were using it to refer to the functionality
and easily grasping the idea.  On the other hand, every time I spoke about (1) using the keyword "reservation",
I had to add a bunch of context, expand, explain, and even then people were naturally drawn to refer to it
as "hoarding of resources for large containers", or "large container management".

Other alternative names for (1) could be: "hoarded" or "prefetched" resources.

My 2 cents...

Cheers,
Carlo

-----Original Message-----
From: Karthik Kambatla [mailto:kasha@cloudera.com]
Sent: Wednesday, August 5, 2015 8:20 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: "Reservation" ambiguity

Inline.

On Tue, Aug 4, 2015 at 6:48 PM, Chris Douglas <cd...@apache.org> wrote:

How visible are (1) reservations? They're an internal, implementation
detail exposed in metrics only to explain the edge cases they create.
Are users typically aware of them?


This is internal, and I don't think users are aware of the mechanics.
However, they do see metrics for "reserved" resources.



SLA reservations (2) are user-visible, and express the contract with
users/operators symmetrically. While (1) is a concept, renaming (2)
would require user-breaking code changes.


Yes, I don't think we should rename (2).



Unless you're discussing the intersection- the effect of reservations
(1) on a reservation (2)- it's usually clear from context... I'd
rather avoid breaking anyone listening to the metrics in Hadoop-3.


I propose to add new metrics holdMB, holdCores for reservedMB, reservedCores. Could we deprecate the older metrics in Hadoop-2 and Hadoop-3, and remove them in Hadoop-4?



Maybe reservations (2) could have been named "sessions", but that
collided with applications that already used it for a similar concept.
-C

On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla
<ka...@cloudera.com>
wrote:
Hi folks

We use the word "reservation" to mean both (1) reservations on nodes
to avoid starvation of big container asks, and (2) the recent SLA
work. This is confusing both to developers and end-users.

I was wondering if people are open to calling the first one a "hold"
and the second one a "reservation". We can change the terminology in
the code and add new metrics for hold in branch-2 and remove the
metrics for
reserved* in Hadoop-3?

Thoughts?



RE: "Reservation" ambiguity

Posted by Carlo Curino <cc...@microsoft.com>.
+1 on keeping the name "reservation" for the user-visible (2). 

On top of the external/internal argument that Chris makes (which I completely agree with), I noticed the following:

While developing (2)  we spoke with lots and lots of folks both in industry and academia, and the term 
"reservation" was very evocative and intuitive. Within seconds people were using it to refer to the functionality 
and easily grasping the idea.  On the other hand, every time I spoke about (1) using the keyword "reservation",
I had to add a bunch of context, expand, explain, and even then people were naturally drawn to refer to it 
as "hoarding of resources for large containers", or "large container management".

Other alternative names for (1) could be: "hoarded" or "prefetched" resources.

My 2 cents...

Cheers,
Carlo

-----Original Message-----
From: Karthik Kambatla [mailto:kasha@cloudera.com] 
Sent: Wednesday, August 5, 2015 8:20 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: "Reservation" ambiguity

Inline.

On Tue, Aug 4, 2015 at 6:48 PM, Chris Douglas <cd...@apache.org> wrote:

> How visible are (1) reservations? They're an internal, implementation 
> detail exposed in metrics only to explain the edge cases they create.
> Are users typically aware of them?
>

This is internal, and I don't think users are aware of the mechanics.
However, they do see metrics for "reserved" resources.


>
> SLA reservations (2) are user-visible, and express the contract with 
> users/operators symmetrically. While (1) is a concept, renaming (2) 
> would require user-breaking code changes.
>

Yes, I don't think we should rename (2).


>
> Unless you're discussing the intersection- the effect of reservations
> (1) on a reservation (2)- it's usually clear from context... I'd 
> rather avoid breaking anyone listening to the metrics in Hadoop-3.
>

I propose to add new metrics holdMB, holdCores for reservedMB, reservedCores. Could we deprecate the older metrics in Hadoop-2 and Hadoop-3, and remove them in Hadoop-4?


>
> Maybe reservations (2) could have been named "sessions", but that 
> collided with applications that already used it for a similar concept.
> -C
>
> On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla 
> <ka...@cloudera.com>
> wrote:
> > Hi folks
> >
> > We use the word "reservation" to mean both (1) reservations on nodes 
> > to avoid starvation of big container asks, and (2) the recent SLA 
> > work. This is confusing both to developers and end-users.
> >
> > I was wondering if people are open to calling the first one a "hold" 
> > and the second one a "reservation". We can change the terminology in 
> > the code and add new metrics for hold in branch-2 and remove the 
> > metrics for
> > reserved* in Hadoop-3?
> >
> > Thoughts?
>

Re: "Reservation" ambiguity

Posted by Karthik Kambatla <ka...@cloudera.com>.
Inline.

On Tue, Aug 4, 2015 at 6:48 PM, Chris Douglas <cd...@apache.org> wrote:

> How visible are (1) reservations? They're an internal, implementation
> detail exposed in metrics only to explain the edge cases they create.
> Are users typically aware of them?
>

This is internal, and I don't think users are aware of the mechanics.
However, they do see metrics for "reserved" resources.


>
> SLA reservations (2) are user-visible, and express the contract with
> users/operators symmetrically. While (1) is a concept, renaming (2)
> would require user-breaking code changes.
>

Yes, I don't think we should rename (2).


>
> Unless you're discussing the intersection- the effect of reservations
> (1) on a reservation (2)- it's usually clear from context... I'd
> rather avoid breaking anyone listening to the metrics in Hadoop-3.
>

I propose to add new metrics holdMB, holdCores for reservedMB,
reservedCores. Could we deprecate the older metrics in Hadoop-2 and
Hadoop-3, and remove them in Hadoop-4?
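
To make the proposal concrete, here is a minimal Java sketch of how the new
names could sit alongside the old ones during a deprecation window.
Everything in it is an assumption for illustration: the class
HoldMetricsSketch, its methods, and the snapshot() map are hypothetical and
are not the actual YARN metrics code. Only the metric names holdMB,
holdCores, reservedMB, and reservedCores come from this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch only; not the real YARN QueueMetrics class. */
final class HoldMetricsSketch {
    // Single source of truth: the counters use the new "hold" terminology.
    private final AtomicLong holdMB = new AtomicLong();
    private final AtomicLong holdCores = new AtomicLong();

    /** Record a hold placed on a node for a large container ask. */
    public void addHold(long memoryMB, long cores) {
        holdMB.addAndGet(memoryMB);
        holdCores.addAndGet(cores);
    }

    /** Release a previously recorded hold. */
    public void releaseHold(long memoryMB, long cores) {
        holdMB.addAndGet(-memoryMB);
        holdCores.addAndGet(-cores);
    }

    public long getHoldMB() { return holdMB.get(); }

    public long getHoldCores() { return holdCores.get(); }

    /** Old name kept as an alias so existing consumers keep working. */
    @Deprecated
    public long getReservedMB() { return getHoldMB(); }

    /** Old name kept as an alias so existing consumers keep working. */
    @Deprecated
    public long getReservedCores() { return getHoldCores(); }

    /** Publish the same values under both the new and the deprecated names. */
    public Map<String, Long> snapshot() {
        Map<String, Long> values = new LinkedHashMap<>();
        values.put("holdMB", getHoldMB());
        values.put("holdCores", getHoldCores());
        values.put("reservedMB", getHoldMB());       // deprecated alias
        values.put("reservedCores", getHoldCores()); // deprecated alias
        return values;
    }

    public static void main(String[] args) {
        HoldMetricsSketch metrics = new HoldMetricsSketch();
        metrics.addHold(4096, 2);
        System.out.println(metrics.snapshot());
    }
}
```

Backing both names with one counter means anyone listening to reserved*
sees no change, while new consumers can move to hold*. Whether the old
names are ever removed is the open question above.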


>
> Maybe reservations (2) could have been named "sessions", but that
> collided with applications that already used it for a similar concept.
> -C
>
> On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
> > Hi folks
> >
> > We use the word "reservation" to mean both (1) reservations on nodes to
> > avoid starvation of big container asks, and (2) the recent SLA work. This
> > is confusing both to developers and end-users.
> >
> > I was wondering if people are open to calling the first one a "hold" and
> > the second one a "reservation". We can change the terminology in the code
> > and add new metrics for hold in branch-2 and remove the metrics for
> > reserved* in Hadoop-3?
> >
> > Thoughts?
>

Re: "Reservation" ambiguity

Posted by Chris Douglas <cd...@apache.org>.
How visible are (1) reservations? They're an internal, implementation
detail exposed in metrics only to explain the edge cases they create.
Are users typically aware of them?

SLA reservations (2) are user-visible, and express the contract with
users/operators symmetrically. While (1) is a concept, renaming (2)
would require user-breaking code changes.

Unless you're discussing the intersection- the effect of reservations
(1) on a reservation (2)- it's usually clear from context... I'd
rather avoid breaking anyone listening to the metrics in Hadoop-3.

Maybe reservations (2) could have been named "sessions", but that
collided with applications that already used it for a similar concept.
-C

On Wed, Jul 29, 2015 at 10:37 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> Hi folks
>
> We use the word "reservation" to mean both (1) reservations on nodes to
> avoid starvation of big container asks, and (2) the recent SLA work. This
> is confusing both to developers and end-users.
>
> I was wondering if people are open to calling the first one a "hold" and
> the second one a "reservation". We can change the terminology in the code
> and add new metrics for hold in branch-2 and remove the metrics for
> reserved* in Hadoop-3?
>
> Thoughts?