You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Chun-Hung Hsiao <ch...@apache.org> on 2018/06/13 23:58:53 UTC
Re: Review Request 65975: Sequentialize reconciliations of storage pools in SLRP.


> On April 20, 2018, 1:27 p.m., Benjamin Bannier wrote:
> > First round of reviews.
> > 
> > I am not a big fan on how reconcilations are modelled here. The counting seems to lead to an incomplete encapsulation of correct behavior. I'd much rather see standard `libprocess` actor semantics and better use of data structures instead. Right now the code appears too complicated and potentially brittle to the touch to me.
> 
> Benjamin Bannier wrote:
>     I am nto sure what a good example of a solution would, but have a look at e.g., the master registrar in how it pushes updates (operations) into a simple `deque` which it then works on sequentially. While it triggers work on the updates manually the control flow is easier to see (still not easy enough though). Maybe we have other examples as well.

A few notes here:

There are two concurrent sources that will trigger the reconciliation:
A. Changes in the set of known profiles.
B. Deletion of a volume/block with a gone profile (this is not supported before this patch).
The reconciliation triggered by B could be combined with that triggered by A, but not vice versa.

Also, I'd like to point out that the reconciliation must wait for all pending deletion to complete,
and during the wait, the resource provider should not accept new related operations.

So let's consider that such a deletion arrives: (i) before, (ii) during, or (iii) after a
reconciliation (either A or B):

For (i), if B is triggered for the deletion first, then A needs to wait for B;
if B is not tiggered yet, then B will wait for the deletion to be completed,
and B will be combined with the reconciliation (A/B).

For (ii), the deletion will be dropped. Note that we need to drop all operations
when a reconciliation is *enqueued*, not when it is executed.

For (iii), the deletion will be accepted, and B will be triggered after the reconciliation.

In other words, all A-type reconciliations need to be executed after the previous reconciliation,
and a B-type reconcilation can be combined with another A-type or B-type reconciliation that is in progress.

Meanwhile, whenever a reconciliation is enqueued to the actor, we need to *synchronously* mark the actor
in some "reconciliation" mode to drop related incoming operations.
The primary reason we want to do that is because the current code synchronously apply all speculative operations.
This decision is made to simplify the overall logic and maintain the invariant that
once an operation is accepted (and checkpointed), it must be able to be applied on the current total resources.
So, the only way to support both A and B and maintaining this invariant is to have ensure that
whenever a reconciliation is enqueued, no more related operation will be enqueued.

Hope this summary could help reviewing this patch.

I'll update this patch later with a simpler logic.


> On April 20, 2018, 1:27 p.m., Benjamin Bannier wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 1092 (patched)
> > <https://reviews.apache.org/r/65975/diff/4/?file=1996920#file1996920line1123>
> >
> >     Could you add a comment why this and the assertion in the continuation below are valid? The guaranteed execution paths of this `loop` with below `defer`s are not obvious to me.
> >     
> >     Ideally we would not need this variable at all to ensure correct push-pop semantics, but defer to actor behavior and queues. This is hidden under very thick continuation layers at the moment.

As described above, whenever a reconciliation is enqueued, it is not allowed to enqueue more operations,
so the main purpose of this variable is to *synchronously* mark this actor in a special mode to drop all incoming operations.

But yes, this counter is hard to understand and kinda error prone. I'll update the patch with a different (and hopefully easier) logic.


> On April 20, 2018, 1:27 p.m., Benjamin Bannier wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 1284 (patched)
> > <https://reviews.apache.org/r/65975/diff/4/?file=1996920#file1996920line1315>
> >
> >     It would be great if this function would either return a value, or trigger `fatal()` to make sure we do our part to keep the agent up-to-date.

Will do.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65975/#review201629
-----------------------------------------------------------


On April 12, 2018, 3:36 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65975/
> -----------------------------------------------------------
> 
> (Updated April 12, 2018, 3:36 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8825
>     https://issues.apache.org/jira/browse/MESOS-8825
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The storage pools needs to be reconciled in the following two scenarios:
> 
> 1. When there is a change in the set of known profiles.
> 2. When a volume/block of an unknown profile is destroyed, because the
>    disk space being freed up may belong to a known profile.
> 
> This patch adds a sequence to coordinate the reconciliations for the
> above two cases.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/storage/provider.cpp a07620d1c4bf618f669575b3e183bf598da905a6 
> 
> 
> Diff: https://reviews.apache.org/r/65975/diff/4/
> 
> 
> Testing
> -------
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>