Posted to reviews@mesos.apache.org by Andrei Budnik <ab...@mesosphere.com> on 2019/08/14 16:11:52 UTC

Review Request 71289: Fixed out-of-order processing of requests in composing containerizer.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71289/
-----------------------------------------------------------

Review request for mesos, Gilbert Song, Greg Mann, and Qian Zhang.


Bugs: MESOS-9887
    https://issues.apache.org/jira/browse/MESOS-9887


Repository: mesos


Description
-------

Previously, the composing containerizer could return a
"Container not found" failure before all preceding requests
had been processed by the underlying containerizer. This could
lead to subtle errors on the client side, which expects all
requests to be processed strictly in the order they arrived.
This patch introduces a DESTROYED state and forces all requests
to a DESTROYED container to wait until the underlying containerizer
finishes processing the requests sent before the container transitioned
to the DESTROYED state.
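
To illustrate the ordering guarantee described above, here is a minimal single-threaded sketch. It is not the code from the patch: `Underlying`, `Composing`, `dispatch`, `run`, and the reply strings are hypothetical names, and the FIFO mailbox only approximates how an underlying containerizer handles requests in order. It shows a request to a DESTROYED container being queued behind earlier requests instead of failing immediately.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an underlying containerizer: requests are
// queued and handled strictly in FIFO order.
struct Underlying {
  std::deque<std::function<void()>> mailbox;

  void dispatch(std::function<void()> request) {
    assert(request != nullptr);
    mailbox.push_back(std::move(request));
  }

  // Handles every queued request in arrival order.
  void run() {
    while (!mailbox.empty()) {
      std::function<void()> request = std::move(mailbox.front());
      mailbox.pop_front();
      request();
    }
  }
};

enum class State { LAUNCHED, DESTROYED };

// Hypothetical model of the composing containerizer for one container.
struct Composing {
  explicit Composing(Underlying& u) : underlying(u) {}

  Underlying& underlying;
  State state = State::LAUNCHED;
  std::vector<std::string> replies;  // Replies in the order clients see them.

  void usage() {
    if (state == State::DESTROYED) {
      // Do not fail right away: queue the failure behind earlier requests.
      underlying.dispatch([this] { replies.push_back("Container not found"); });
      return;
    }
    underlying.dispatch([this] { replies.push_back("usage ok"); });
  }

  void destroyed() { state = State::DESTROYED; }
};
```

In this model, calling `usage()`, then `destroyed()`, then `usage()` again, and finally `run()` on the underlying object delivers the `usage ok` reply before `Container not found`.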


Diffs
-----

  src/slave/containerizer/composing.cpp d854794fc4775fb8a05efc233d488a64b9ef620a 


Diff: https://reviews.apache.org/r/71289/diff/1/


Testing
-------


Thanks,

Andrei Budnik


Re: Review Request 71289: Fixed out-of-order processing of requests in composing containerizer.

Posted by Andrei Budnik <ab...@mesosphere.com>.

> On Aug. 19, 2019, 12:02 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 662 (patched)
> > <https://reviews.apache.org/r/71289/diff/1/?file=2160913#file2160913line678>
> >
> >     Why do we need to call `wait` here? In this case (i.e., `finalAcknowledgement` is set and the container has been removed from `containers_`), `wait` will always return `None()`, right?

Added a duplicate comment clarifying why we need to call `wait` (a terminated nested container, e.g., a health check).


> On Aug. 19, 2019, 12:02 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 631-635 (original), 688-695 (patched)
> > <https://reviews.apache.org/r/71289/diff/1/?file=2160913#file2160913line704>
> >
> >     I understand that we call the underlying containerizer's `status` method here to ensure that previous requests have been processed before the container is removed from the composing containerizer's `containers_` map. However, I am not sure we can actually achieve that. Say the previous request is UCR's `usage` method: with this code we can ensure that the `usage` method **returns** before the container is removed from the `containers_` map, but that does not mean `usage` has **finished** all its work, i.e., it may just return a future while the underlying isolators are still collecting the container's usage.
> >     
> >     So I think it might still be possible for a subsequent request to the composing containerizer to get a `Container not found` failure while the previous request has not finished its work yet.

Yeah, I realised that it's not even a composing containerizer bug. The same problem might happen with the Mesos containerizer.


- Andrei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71289/#review217272
-----------------------------------------------------------


On Aug. 14, 2019, 4:11 p.m., Andrei Budnik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71289/
> -----------------------------------------------------------
> 
> (Updated Aug. 14, 2019, 4:11 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Greg Mann, and Qian Zhang.
> 
> 
> Bugs: MESOS-9887
>     https://issues.apache.org/jira/browse/MESOS-9887
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the composing containerizer could return a
> "Container not found" failure before all preceding requests
> had been processed by the underlying containerizer. This could
> lead to subtle errors on the client side, which expects all
> requests to be processed strictly in the order they arrived.
> This patch introduces a DESTROYED state and forces all requests
> to a DESTROYED container to wait until the underlying containerizer
> finishes processing the requests sent before the container transitioned
> to the DESTROYED state.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/composing.cpp d854794fc4775fb8a05efc233d488a64b9ef620a 
> 
> 
> Diff: https://reviews.apache.org/r/71289/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>


Re: Review Request 71289: Fixed out-of-order processing of requests in composing containerizer.

Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71289/#review217272
-----------------------------------------------------------




src/slave/containerizer/composing.cpp
Line 136 (original), 137-138 (patched)
<https://reviews.apache.org/r/71289/#comment304567>

    A newline between.



src/slave/containerizer/composing.cpp
Lines 325-326 (original), 336-337 (patched)
<https://reviews.apache.org/r/71289/#comment304568>

    Would it be better to name the method `waited` rather than `destroyed`, since it gets called when `wait` is done?



src/slave/containerizer/composing.cpp
Lines 541 (patched)
<https://reviews.apache.org/r/71289/#comment304569>

    Kill this blank line, and ditto for other places.



src/slave/containerizer/composing.cpp
Lines 662 (patched)
<https://reviews.apache.org/r/71289/#comment304571>

    Why do we need to call `wait` here? In this case (i.e., `finalAcknowledgement` is set and the container has been removed from `containers_`), `wait` will always return `None()`, right?



src/slave/containerizer/composing.cpp
Lines 631-635 (original), 688-695 (patched)
<https://reviews.apache.org/r/71289/#comment304580>

    I understand that we call the underlying containerizer's `status` method here to ensure that previous requests have been processed before the container is removed from the composing containerizer's `containers_` map. However, I am not sure we can actually achieve that. Say the previous request is UCR's `usage` method: with this code we can ensure that the `usage` method **returns** before the container is removed from the `containers_` map, but that does not mean `usage` has **finished** all its work, i.e., it may just return a future while the underlying isolators are still collecting the container's usage.
    
    So I think it might still be possible for a subsequent request to the composing containerizer to get a `Container not found` failure while the previous request has not finished its work yet.
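
The difference between a call that **returns** and work that **finishes** can be sketched with standard C++ futures. This is only an illustration under assumptions: `Isolator`, `finish`, this `usage`, and `isReady` are hypothetical names, and `std::future` stands in for the futures used by the real containerizers. It shows a `usage`-like call returning immediately with a future that is still pending.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical stand-in for an isolator that reports usage some time later.
struct Isolator {
  std::promise<int> result;

  // Called when the isolator has finished collecting usage.
  void finish(int value) { result.set_value(value); }
};

// Hypothetical `usage`: returns at once, but the returned future stays
// pending until the isolator calls `finish`.
std::future<int> usage(Isolator& isolator) {
  return isolator.result.get_future();
}

// Reports whether the work behind the future has actually completed.
bool isReady(const std::future<int>& f) {
  assert(f.valid());
  return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

In this sketch, `isReady` is false right after `usage` returns and becomes true only once `finish` has been called, which is the window the comment above describes.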


- Qian Zhang

