You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Alexander Rukletsov <ru...@gmail.com> on 2016/02/19 14:12:36 UTC

Review Request 43760: Propagated executor shutdown grace period to executors.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/
-----------------------------------------------------------

Review request for mesos, Anand Mazumdar and Ben Mahler.


Bugs: MESOS-1571
    https://issues.apache.org/jira/browse/MESOS-1571


Repository: mesos


Description
-------

Executor shutdown grace period, which configured on the agent, is
propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
environment variable. The executor library uses this timeout to delay
the hard shutdown of the related executor.


Diffs
-----

  src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
  src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 

Diff: https://reviews.apache.org/r/43760/diff/


Testing
-------

The complete chain was tested. See https://reviews.apache.org/r/43764/ .


Thanks,

Alexander Rukletsov


Re: Review Request 43760: Propagated executor shutdown grace period to executors.

Posted by Guangya Liu <gy...@gmail.com>.

> On 二月 19, 2016, 1:38 p.m., Guangya Liu wrote:
> > src/exec/exec.cpp, lines 698-699
> > <https://reviews.apache.org/r/43760/diff/1/?file=1259192#file1259192line698>
> >
> >     Since there is already a default value for `shutdownGracePeriod` here, what about just log warning message here and continue?
> >     
> >         Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
> >         value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
> >         if (value.isSome()) {
> >           Try<Duration> parse = Duration::parse(value.get());
> >           if (parse.isError()) {
> >             LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
> >                          << value.get() << "': " << parse.error();
> >           }
> >     
> >           shutdownGracePeriod = parse.get();
> >         }

Sorry, I mean as following or similiar.

Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
if (value.isSome()) {
  Try<Duration> parse = Duration::parse(value.get());
  if (parse.isError()) {
    LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
                 << value.get() << "': " << parse.error();
  } else {
    shutdownGracePeriod = parse.get();
  }
}


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/#review119875
-----------------------------------------------------------


On 二月 19, 2016, 1:12 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43760/
> -----------------------------------------------------------
> 
> (Updated 二月 19, 2016, 1:12 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Ben Mahler.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Executor shutdown grace period, which configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor library uses this timeout to delay
> the hard shutdown of the related executor.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
>   src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 
> 
> Diff: https://reviews.apache.org/r/43760/diff/
> 
> 
> Testing
> -------
> 
> The complete chain was tested. See https://reviews.apache.org/r/43764/ .
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 43760: Propagated executor shutdown grace period to executors.

Posted by Guangya Liu <gy...@gmail.com>.

> On 二月 19, 2016, 1:38 p.m., Guangya Liu wrote:
> > src/exec/exec.cpp, lines 698-699
> > <https://reviews.apache.org/r/43760/diff/1/?file=1259192#file1259192line698>
> >
> >     Since there is already a default value for `shutdownGracePeriod` here, what about just log warning message here and continue?
> >     
> >         Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
> >         value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
> >         if (value.isSome()) {
> >           Try<Duration> parse = Duration::parse(value.get());
> >           if (parse.isError()) {
> >             LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
> >                          << value.get() << "': " << parse.error();
> >           }
> >     
> >           shutdownGracePeriod = parse.get();
> >         }
> 
> Guangya Liu wrote:
>     Sorry, I mean as following or similiar.
>     
>     Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
>     value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
>     if (value.isSome()) {
>       Try<Duration> parse = Duration::parse(value.get());
>       if (parse.isError()) {
>         LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
>                      << value.get() << "': " << parse.error();
>       } else {
>         shutdownGracePeriod = parse.get();
>       }
>     }
> 
> Alexander Rukletsov wrote:
>     It's indeed not critical from the executor's point of view, but it indicates something strange happened with the environment. We usually "fail early, fail fast", hence I opted for an abnormal exit.

OK, we can follow same way as other env vars to `fail early, fail fast`. But there is indeed a bit difference for this and other env vars is that this var does have a default value but others do not have. I will drop this comments.


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/#review119875
-----------------------------------------------------------


On 二月 19, 2016, 1:12 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43760/
> -----------------------------------------------------------
> 
> (Updated 二月 19, 2016, 1:12 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Ben Mahler.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Executor shutdown grace period, which configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor library uses this timeout to delay
> the hard shutdown of the related executor.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
>   src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 
> 
> Diff: https://reviews.apache.org/r/43760/diff/
> 
> 
> Testing
> -------
> 
> The complete chain was tested. See https://reviews.apache.org/r/43764/ .
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 43760: Propagated executor shutdown grace period to executors.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On Feb. 19, 2016, 1:38 p.m., Guangya Liu wrote:
> > src/exec/exec.cpp, lines 698-699
> > <https://reviews.apache.org/r/43760/diff/1/?file=1259192#file1259192line698>
> >
> >     Since there is already a default value for `shutdownGracePeriod` here, what about just log warning message here and continue?
> >     
> >         Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
> >         value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
> >         if (value.isSome()) {
> >           Try<Duration> parse = Duration::parse(value.get());
> >           if (parse.isError()) {
> >             LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
> >                          << value.get() << "': " << parse.error();
> >           }
> >     
> >           shutdownGracePeriod = parse.get();
> >         }
> 
> Guangya Liu wrote:
>     Sorry, I mean as following or similiar.
>     
>     Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
>     value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
>     if (value.isSome()) {
>       Try<Duration> parse = Duration::parse(value.get());
>       if (parse.isError()) {
>         LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
>                      << value.get() << "': " << parse.error();
>       } else {
>         shutdownGracePeriod = parse.get();
>       }
>     }

It's indeed not critical from the executor's point of view, but it indicates something strange happened with the environment. We usually "fail early, fail fast", hence I opted for an abnormal exit.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/#review119875
-----------------------------------------------------------


On Feb. 19, 2016, 1:12 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43760/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2016, 1:12 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Ben Mahler.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Executor shutdown grace period, which configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor library uses this timeout to delay
> the hard shutdown of the related executor.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
>   src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 
> 
> Diff: https://reviews.apache.org/r/43760/diff/
> 
> 
> Testing
> -------
> 
> The complete chain was tested. See https://reviews.apache.org/r/43764/ .
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 43760: Propagated executor shutdown grace period to executors.

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/#review119875
-----------------------------------------------------------




src/exec/exec.cpp (lines 697 - 698)
<https://reviews.apache.org/r/43760/#comment181266>

    Since there is already a default value for `shutdownGracePeriod` here, what about just log warning message here and continue?
    
        Duration shutdownGracePeriod = EXECUTOR_SHUTDOWN_GRACE_PERIOD;
        value = os::getenv("MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD");
        if (value.isSome()) {
          Try<Duration> parse = Duration::parse(value.get());
          if (parse.isError()) {
            LOG(WARNING) << "Cannot parse MESOS_SHUTDOWN_GRACE_PERIOD '"
                         << value.get() << "': " << parse.error();
          }
    
          shutdownGracePeriod = parse.get();
        }


- Guangya Liu


On 二月 19, 2016, 1:12 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43760/
> -----------------------------------------------------------
> 
> (Updated 二月 19, 2016, 1:12 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Ben Mahler.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Executor shutdown grace period, which configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor library uses this timeout to delay
> the hard shutdown of the related executor.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
>   src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 
> 
> Diff: https://reviews.apache.org/r/43760/diff/
> 
> 
> Testing
> -------
> 
> The complete chain was tested. See https://reviews.apache.org/r/43764/ .
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 43760: Propagated executor shutdown grace period to executors.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43760/#review120928
-----------------------------------------------------------




src/exec/exec.cpp (line 693)
<https://reviews.apache.org/r/43760/#comment182459>

    I pulled down these patches to tweak them as I wanted to add the configurable shutdown grace period. If we kept in mind that the shutdown grace period should be configurable, this would be 'MESOS_DEFAULT_EXECUTOR_SHUTDOWN_GRACE_PERIOD'. It would be just a default specified by the agent, that can be overridden by ExecutorInfo.
    
    In the new API (see src/executor/executor.cpp), it looks like we still need to send this default through an environment variable because we have self-initiated shutdowns. These self-initiated shutdowns won't have the 'Shutdown' Event's grace period so the executor needs to know the agent's default outside of the 'Shutdown' Event.
    
    Hope this makes some sense about how I'm thinking about tweaking this. I'll send out the tweaked patches and we can discuss!



src/exec/exec.cpp (lines 697 - 698)
<https://reviews.apache.org/r/43760/#comment182457>

    I see "Cannot parse" was copied, but the language we use to consistently print failures is:
    
    "Failed to <do something>: <reason>"


- Ben Mahler


On Feb. 19, 2016, 1:12 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43760/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2016, 1:12 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Ben Mahler.
> 
> 
> Bugs: MESOS-1571
>     https://issues.apache.org/jira/browse/MESOS-1571
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Executor shutdown grace period, which configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor library uses this timeout to delay
> the hard shutdown of the related executor.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 83dbee9dd2d9a4e7ebf395c8070bc7f9f8412ef1 
>   src/slave/containerizer/containerizer.cpp 59904684cdeb17ef2b42092a3558802c42bfb6bd 
> 
> Diff: https://reviews.apache.org/r/43760/diff/
> 
> 
> Testing
> -------
> 
> The complete chain was tested. See https://reviews.apache.org/r/43764/ .
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>