You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Gaston Kleiman <ga...@mesosphere.io> on 2017/12/01 00:10:57 UTC

Re: Review Request 64095: Added a generic actor to be used by status update managers.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/
-----------------------------------------------------------

(Updated Nov. 30, 2017, 4:10 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Changed the signature of the `recover` method.


Bugs: MESOS-8197
    https://issues.apache.org/jira/browse/MESOS-8197


Repository: mesos


Description
-------

This actor handles the checkpointing, recovery, and retry of status
updates, and will initially be used by the offer operations status
update manager.


Diffs (updated)
-----

  src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64095/diff/3/

Changes: https://reviews.apache.org/r/64095/diff/2-3/


Testing
-------

Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.


Thanks,

Gaston Kleiman


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review192528
-----------------------------------------------------------




src/status_update_manager/status_update_manager_process.hpp
Lines 588 (patched)
<https://reviews.apache.org/r/64095/#comment270729>

    Could we switch to returning an `Owned` here instead of a raw pointer?



src/status_update_manager/status_update_manager_process.hpp
Lines 593-596 (patched)
<https://reviews.apache.org/r/64095/#comment270730>

    What are the implications of this for the empty file case? It seems the caller must be aware that they should delete an empty file, but I'm not sure that they receive specific enough feedback for this.
    
    Perhaps `recover` should remove the file if it's empty?



src/status_update_manager/status_update_manager_process.hpp
Lines 609-610 (patched)
<https://reviews.apache.org/r/64095/#comment270733>

    Is there a reason you're not using `O_APPEND` here?



src/status_update_manager/status_update_manager_process.hpp
Lines 650 (patched)
<https://reviews.apache.org/r/64095/#comment270746>

    You can probably remove this comment.



src/status_update_manager/status_update_manager_process.hpp
Lines 651 (patched)
<https://reviews.apache.org/r/64095/#comment270745>

    s/update stream/updates/



src/status_update_manager/status_update_manager_process.hpp
Lines 654 (patched)
<https://reviews.apache.org/r/64095/#comment270748>

    Given the log line above, I think this comment can be removed as well.


- Greg Mann


On Dec. 1, 2017, 1:42 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Dec. 1, 2017, 1:42 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/3/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review192544
-----------------------------------------------------------




src/status_update_manager/status_update_manager_process.hpp
Lines 654 (patched)
<https://reviews.apache.org/r/64095/#comment270763>

    Actually, after looking at this again I was thinking: perhaps this comment should be changed to tell readers that we are doing two things here: both building our in-memory streams and the state object which we return to the caller. WDYT?



src/status_update_manager/status_update_manager_process.hpp
Lines 666 (patched)
<https://reviews.apache.org/r/64095/#comment270759>

    Could we use a `switch` here instead of `if`?



src/status_update_manager/status_update_manager_process.hpp
Lines 678 (patched)
<https://reviews.apache.org/r/64095/#comment270761>

    s/.get()./->/



src/status_update_manager/status_update_manager_process.hpp
Lines 692 (patched)
<https://reviews.apache.org/r/64095/#comment270769>

    As written, this is a bit opaque. I had to look up why we're seeking from the current position with an offset of zero. I now see that we're doing this in an attempt to return the current file position, ensuring that the current position is valid.
    
    Could you clarify this? I could imagine doing it by renaming this variable from `lseek` to `currentPosition`, or putting a note in the comment above.



src/status_update_manager/status_update_manager_process.hpp
Lines 709-710 (patched)
<https://reviews.apache.org/r/64095/#comment270771>

    Fits on one line.



src/status_update_manager/status_update_manager_process.hpp
Lines 717 (patched)
<https://reviews.apache.org/r/64095/#comment270775>

    Could you get rid of this `else`?



src/status_update_manager/status_update_manager_process.hpp
Lines 725 (patched)
<https://reviews.apache.org/r/64095/#comment270776>

    We could probably use a `std::pair` here instead. And maybe we can use `std::make_pair`, as long as it doesn't have issues with type deduction?



src/status_update_manager/status_update_manager_process.hpp
Lines 734 (patched)
<https://reviews.apache.org/r/64095/#comment270783>

    s/duplicate/duplicate or has already been acknowledged/



src/status_update_manager/status_update_manager_process.hpp
Lines 747 (patched)
<https://reviews.apache.org/r/64095/#comment270780>

    s/uuid/status_uuid/



src/status_update_manager/status_update_manager_process.hpp
Lines 754 (patched)
<https://reviews.apache.org/r/64095/#comment270782>

    Could you remove the exclamation point?



src/status_update_manager/status_update_manager_process.hpp
Lines 780-783 (patched)
<https://reviews.apache.org/r/64095/#comment270787>

    I know that in this context, using just `uuid` is not as ambiguous, but given the presence of multiple UUIDs in general I think it's better to be explicit. Could we use `status_uuid` for the function argument?



src/status_update_manager/status_update_manager_process.hpp
Lines 790-791 (patched)
<https://reviews.apache.org/r/64095/#comment270790>

    Since we print the status UUID in the stream operator for the update type, we should be able to eliminate the explicit UUID in the log line here and not lose any information.



src/status_update_manager/status_update_manager_process.hpp
Lines 847-849 (patched)
<https://reviews.apache.org/r/64095/#comment270810>

    I think it's more common for us to keep the empty function body on the same line:
    
    ```
    : terminated(false), streamId(_streamId), path(_path), fd(_fd) {}
    ```
    
    Here and elsewhere.



src/status_update_manager/status_update_manager_process.hpp
Lines 855 (patched)
<https://reviews.apache.org/r/64095/#comment270812>

    s/its/it's/



src/status_update_manager/status_update_manager_process.hpp
Lines 873 (patched)
<https://reviews.apache.org/r/64095/#comment270813>

    Ditto - use a `switch` here.



src/status_update_manager/status_update_manager_process.hpp
Lines 885-886 (patched)
<https://reviews.apache.org/r/64095/#comment270815>

    Hmm this error message seems strange in the ACK case?



src/status_update_manager/status_update_manager_process.hpp
Lines 905 (patched)
<https://reviews.apache.org/r/64095/#comment270816>

    Ditto - use a `switch` here.



src/status_update_manager/status_update_manager_process.hpp
Lines 907-908 (patched)
<https://reviews.apache.org/r/64095/#comment270823>

    Let's pass framework ID into the creation method to eliminate the NONE case here.



src/status_update_manager/status_update_manager_process.hpp
Lines 912 (patched)
<https://reviews.apache.org/r/64095/#comment270819>

    I think we can remove this comment.



src/status_update_manager/status_update_manager_process.hpp
Lines 923 (patched)
<https://reviews.apache.org/r/64095/#comment270824>

    Can remove this comment.


- Greg Mann


On Dec. 1, 2017, 1:42 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Dec. 1, 2017, 1:42 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/3/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review192904
-----------------------------------------------------------




src/status_update_manager/status_update_manager_process.hpp
Lines 56-58 (patched)
<https://reviews.apache.org/r/64095/#comment271247>

    Should we add "Recovering status update state after failover" to this list?
    
    Also, should we explicitly say that the SUM is not responsible for garbage collection of completed streams?



src/status_update_manager/status_update_manager_process.hpp
Lines 67 (patched)
<https://reviews.apache.org/r/64095/#comment271368>

    After looking at the recovery code, looks like we should replace: s/immediately after recovery/during recovery/



src/status_update_manager/status_update_manager_process.hpp
Lines 73-74 (patched)
<https://reviews.apache.org/r/64095/#comment271365>

    I did a quick search and it looks like we usually indent 4 spaces for template parameters? Could you confirm and update if appropriate?



src/status_update_manager/status_update_manager_process.hpp
Lines 95 (patched)
<https://reviews.apache.org/r/64095/#comment271249>

    Add a comment explaining this member?



src/status_update_manager/status_update_manager_process.hpp
Lines 110-111 (patched)
<https://reviews.apache.org/r/64095/#comment271250>

    Place on same line as initialization list.



src/status_update_manager/status_update_manager_process.hpp
Lines 115-120 (patched)
<https://reviews.apache.org/r/64095/#comment271254>

    Is this code necessary?



src/status_update_manager/status_update_manager_process.hpp
Lines 128 (patched)
<https://reviews.apache.org/r/64095/#comment271271>

    s/a call-back//



src/status_update_manager/status_update_manager_process.hpp
Lines 152 (patched)
<https://reviews.apache.org/r/64095/#comment271302>

    Is this the appropriate place for this comment? Perhaps it can be removed?



src/status_update_manager/status_update_manager_process.hpp
Lines 182 (patched)
<https://reviews.apache.org/r/64095/#comment271304>

    s/frameworkId/`frameworkId`/
    
    or
    
    s/frameworkId/framework ID/



src/status_update_manager/status_update_manager_process.hpp
Lines 206 (patched)
<https://reviews.apache.org/r/64095/#comment271299>

    I don't understand this comment?



src/status_update_manager/status_update_manager_process.hpp
Lines 227 (patched)
<https://reviews.apache.org/r/64095/#comment271307>

    Backticks instead of quotes for consistency.



src/status_update_manager/status_update_manager_process.hpp
Lines 237-238 (patched)
<https://reviews.apache.org/r/64095/#comment271310>

    Maybe we should leave a TODO either here or next to the constant declaration saying that we should move this constant into this file once MESOS-8296 is resolved?



src/status_update_manager/status_update_manager_process.hpp
Lines 268 (patched)
<https://reviews.apache.org/r/64095/#comment271319>

    s/the status update stream for stream/status update stream/



src/status_update_manager/status_update_manager_process.hpp
Lines 268-269 (patched)
<https://reviews.apache.org/r/64095/#comment271320>

    Indent 4 spaces.



src/status_update_manager/status_update_manager_process.hpp
Lines 285 (patched)
<https://reviews.apache.org/r/64095/#comment271323>

    Can probably remove this comment.



src/status_update_manager/status_update_manager_process.hpp
Lines 311 (patched)
<https://reviews.apache.org/r/64095/#comment271339>

    Period.



src/status_update_manager/status_update_manager_process.hpp
Lines 331-332 (patched)
<https://reviews.apache.org/r/64095/#comment271340>

    The following is more consistent with our style guide:
    
    ```
    const std::string message =
      "Failed to recover status update stream " +
      stringify(streamId) + ": " + result.error();
    ```



src/status_update_manager/status_update_manager_process.hpp
Lines 405 (patched)
<https://reviews.apache.org/r/64095/#comment271346>

    Are we sure that this is being resent? Isn't it possible that the update was queued while the manager was paused?



src/status_update_manager/status_update_manager_process.hpp
Lines 429-432 (patched)
<https://reviews.apache.org/r/64095/#comment271290>

    4 Space indent



src/status_update_manager/status_update_manager_process.hpp
Lines 431 (patched)
<https://reviews.apache.org/r/64095/#comment271350>

    Let's not use the C-style cast here.
    
    How about:
    ```
    checkpoint ? Option<std::string>(getPath(streamId)) : None());
    ```
    
    I think you'll need to use `Option<std::string>::none()` for the NONE if the compiler complains.



src/status_update_manager/status_update_manager_process.hpp
Lines 454-457 (patched)
<https://reviews.apache.org/r/64095/#comment271363>

    This indentation is difficult to parse.
    
    How about
    ```
    Result<std::pair<
        process::Owned<StatusUpdateStream>,
        typename StatusUpdateStream::State>> result =
          StatusUpdateStream::recover(streamId, getPath(streamId), strict);
    ```



src/status_update_manager/status_update_manager_process.hpp
Lines 496 (patched)
<https://reviews.apache.org/r/64095/#comment271265>

    Let's return an `Option<StatusUpdateStream*>` here.
    
    See this related Slack discussion: https://mesos.slack.com/archives/C1LPTK50T/p1512502377000080



src/status_update_manager/status_update_manager_process.hpp
Lines 497 (patched)
<https://reviews.apache.org/r/64095/#comment271367>

    Indentation.



src/status_update_manager/status_update_manager_process.hpp
Lines 536 (patched)
<https://reviews.apache.org/r/64095/#comment271369>

    Backticks instead of quotes for consistency.



src/status_update_manager/status_update_manager_process.hpp
Lines 539 (patched)
<https://reviews.apache.org/r/64095/#comment271370>

    Is this comment necessary? I think the entire class should only be used for messages which expect an ACK?



src/status_update_manager/status_update_manager_process.hpp
Lines 551-558 (patched)
<https://reviews.apache.org/r/64095/#comment271371>

    Indent only four spaces.



src/status_update_manager/status_update_manager_process.hpp
Lines 565 (patched)
<https://reviews.apache.org/r/64095/#comment271374>

    You can probably remove this TODO.



src/status_update_manager/status_update_manager_process.hpp
Lines 573-587 (patched)
<https://reviews.apache.org/r/64095/#comment271376>

    Sorry, after looking at this again I feel like we should switch to just sending the single update associated with the delay that has elapsed. Should be pretty simple to pass a stream ID into this?



src/status_update_manager/status_update_manager_process.hpp
Lines 593-594 (patched)
<https://reviews.apache.org/r/64095/#comment271377>

    Are these used?



src/status_update_manager/status_update_manager_process.hpp
Lines 631-633 (patched)
<https://reviews.apache.org/r/64095/#comment271381>

    Indentation.



src/status_update_manager/status_update_manager_process.hpp
Lines 640 (patched)
<https://reviews.apache.org/r/64095/#comment271384>

    Indentation.
    
    I'm gonna stop raising issues for these :)
    Please do a sweep and fix indentation throughout the chain.



src/status_update_manager/status_update_manager_process.hpp
Lines 671 (patched)
<https://reviews.apache.org/r/64095/#comment271385>

    As we discussed, you should be able to `return std::move(stream);` here.



src/status_update_manager/status_update_manager_process.hpp
Lines 715-725 (patched)
<https://reviews.apache.org/r/64095/#comment271387>

    I'm having trouble understanding why we don't allow these errors during non-strict recovery, but we will allow errors during protobuf parsing?



src/status_update_manager/status_update_manager_process.hpp
Lines 739-740 (patched)
<https://reviews.apache.org/r/64095/#comment271386>

    Backticks for consistency, here and below.



src/status_update_manager/status_update_manager_process.hpp
Lines 758-759 (patched)
<https://reviews.apache.org/r/64095/#comment271388>

    Ditto regarding formatting here.



src/status_update_manager/status_update_manager_process.hpp
Lines 777 (patched)
<https://reviews.apache.org/r/64095/#comment271389>

    Remove extra space.



src/status_update_manager/status_update_manager_process.hpp
Lines 791 (patched)
<https://reviews.apache.org/r/64095/#comment271390>

    s/duplicate/is a duplicate/



src/status_update_manager/status_update_manager_process.hpp
Lines 873-874 (patched)
<https://reviews.apache.org/r/64095/#comment271391>

    ```
    if (statusUuid !=
        UUID::fromBytes(update.status().status_uuid()).get()) {
    ```



src/status_update_manager/status_update_manager_process.hpp
Lines 906 (patched)
<https://reviews.apache.org/r/64095/#comment271392>

    Remove extra space.



src/status_update_manager/status_update_manager_process.hpp
Lines 987 (patched)
<https://reviews.apache.org/r/64095/#comment271394>

    Capitalize "This", here and elsewhere.



src/status_update_manager/status_update_manager_process.hpp
Lines 1026 (patched)
<https://reviews.apache.org/r/64095/#comment271393>

    s/MANAGER/MANAGER_PROCESS/


- Greg Mann


On Dec. 2, 2017, 2:44 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Dec. 2, 2017, 2:44 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/7/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review193204
-----------------------------------------------------------




src/status_update_manager/status_update_manager_process.hpp
Lines 117 (patched)
<https://reviews.apache.org/r/64095/#comment271752>

    Nit: let's add a `that` for consistency. I can fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 121 (patched)
<https://reviews.apache.org/r/64095/#comment271753>

    Nit: backticks. I'll fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 182 (patched)
<https://reviews.apache.org/r/64095/#comment271756>

    Let's get rid of "framework=" here and below. Will fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 361-371 (patched)
<https://reviews.apache.org/r/64095/#comment271759>

    Nit: Too many newlines between methods. I'll fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 377 (patched)
<https://reviews.apache.org/r/64095/#comment271760>

    Nit: Let's iterate with a reference here. Will fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 509 (patched)
<https://reviews.apache.org/r/64095/#comment271763>

    Nit: we can omit this comment. Will fix while committing.



src/status_update_manager/status_update_manager_process.hpp
Lines 805 (patched)
<https://reviews.apache.org/r/64095/#comment271767>

    Using underscores like this doesn't really follow our style guide. Let's follow up with a patch later to fix this style throughout the class.


- Greg Mann


On Dec. 7, 2017, 1:22 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Dec. 7, 2017, 1:22 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/slave/constants.hpp 2d07bce299e63b8492a5ecb08b29508a671bc14c 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/10/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/
-----------------------------------------------------------

(Updated Dec. 6, 2017, 5:22 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Addressed feedback.


Bugs: MESOS-8197
    https://issues.apache.org/jira/browse/MESOS-8197


Repository: mesos


Description
-------

This actor handles the checkpointing, recovery, and retry of status
updates.

It will initially be used by the offer operations status update
manager, but it was designed and implemented so that it can replace
the current implementation of the task status update manager.


Diffs (updated)
-----

  src/slave/constants.hpp 2d07bce299e63b8492a5ecb08b29508a671bc14c 
  src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64095/diff/8/

Changes: https://reviews.apache.org/r/64095/diff/7-8/


Testing
-------

Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.


Thanks,

Gaston Kleiman


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review193010
-----------------------------------------------------------




src/status_update_manager/status_update_manager_process.hpp
Lines 808-810 (patched)
<https://reviews.apache.org/r/64095/#comment271481>

    We should really only do something like this in test code, if ever (a `.get()` without a CHECK or `if` guard). Adding the CHECK will get us a stack trace if things go wrong. Instead, here let's do:
    
    ```
    Try<UUID> statusUuid = UUID::fromBytes(update.status().status_uuid());
    CHECK_SOME(statusUuid);
    if (acknowledged.contains(statusUuid.get())) {
    ```
    
    Here and elsewhere.


- Greg Mann


On Dec. 2, 2017, 2:44 a.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Dec. 2, 2017, 2:44 a.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/7/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/
-----------------------------------------------------------

(Updated Dec. 1, 2017, 6:44 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Changed some indentation to reduce "jaggedness".


Bugs: MESOS-8197
    https://issues.apache.org/jira/browse/MESOS-8197


Repository: mesos


Description
-------

This actor handles the checkpointing, recovery, and retry of status
updates.

It will initially be used by the offer operations status update
manager, but it was designed and implemented so that it can replace
the current implementation of the task status update manager.


Diffs (updated)
-----

  src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64095/diff/5/

Changes: https://reviews.apache.org/r/64095/diff/4-5/


Testing
-------

Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.


Thanks,

Gaston Kleiman


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/
-----------------------------------------------------------

(Updated Dec. 1, 2017, 6:12 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Addressed feedback.


Bugs: MESOS-8197
    https://issues.apache.org/jira/browse/MESOS-8197


Repository: mesos


Description
-------

This actor handles the checkpointing, recovery, and retry of status
updates.

It will initially be used by the offer operations status update
manager, but it was designed and implemented so that it can replace
the current implementation of the task status update manager.


Diffs (updated)
-----

  src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64095/diff/4/

Changes: https://reviews.apache.org/r/64095/diff/3-4/


Testing
-------

Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.


Thanks,

Gaston Kleiman


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/#review192409
-----------------------------------------------------------



Result of the review session with Graag.


src/status_update_manager/status_update_manager_process.hpp
Lines 55-56 (patched)
<https://reviews.apache.org/r/64095/#comment270524>

    Let's try making this a nested class. Maybe we can name it `StatusUpdateStream` and get rid of the typedef.



src/status_update_manager/status_update_manager_process.hpp
Lines 69 (patched)
<https://reviews.apache.org/r/64095/#comment270525>

    Reference here a JIRA for the migration of the `TaskStatusUpdateManager`.
    
    Say whether the manager will be paused or not on creation.



src/status_update_manager/status_update_manager_process.hpp
Lines 73 (patched)
<https://reviews.apache.org/r/64095/#comment270526>

    Check what we do in our codebase regarding indentation of template parameters.



src/status_update_manager/status_update_manager_process.hpp
Lines 76 (patched)
<https://reviews.apache.org/r/64095/#comment270528>

    Add a description.



src/status_update_manager/status_update_manager_process.hpp
Lines 87 (patched)
<https://reviews.apache.org/r/64095/#comment270530>

    s/state/streams/



src/status_update_manager/status_update_manager_process.hpp
Lines 119 (patched)
<https://reviews.apache.org/r/64095/#comment270533>

    s/getPath/_getPath/



src/status_update_manager/status_update_manager_process.hpp
Lines 122 (patched)
<https://reviews.apache.org/r/64095/#comment270539>

    s/forward/_forwardCallback/



src/status_update_manager/status_update_manager_process.hpp
Lines 130-132 (patched)
<https://reviews.apache.org/r/64095/#comment270540>

    s/If no stream ID/If checkpoint is `false`/



src/status_update_manager/status_update_manager_process.hpp
Lines 172-173 (patched)
<https://reviews.apache.org/r/64095/#comment270543>

    Keep just the beginning.



src/status_update_manager/status_update_manager_process.hpp
Lines 178-179 (patched)
<https://reviews.apache.org/r/64095/#comment270545>

    Forward the status update if this is at the front of the queue.
    Subsequent status updates will be sent in 'acknowledgement()'.



src/status_update_manager/status_update_manager_process.hpp
Lines 223-235 (patched)
<https://reviews.apache.org/r/64095/#comment270553>

    Move this to Stream::acknowledgement



src/status_update_manager/status_update_manager_process.hpp
Lines 246 (patched)
<https://reviews.apache.org/r/64095/#comment270554>

    s/Duplicate status acknowledgement/Duplicate status update acknowledgement/



src/status_update_manager/status_update_manager_process.hpp
Lines 276 (patched)
<https://reviews.apache.org/r/64095/#comment270556>

    Maybe "Recovers the status update manager's state using the supplied stream IDs."



src/status_update_manager/status_update_manager_process.hpp
Lines 280-281 (patched)
<https://reviews.apache.org/r/64095/#comment270559>

    - The recovered state if successful.
    - The recovered state, including the number of errors encountered, if `strict == false` and any of the streams couldn't be recovered.
    - A `Failure` if `strict == true` and any of the streams couldn't be recovered.



src/status_update_manager/status_update_manager_process.hpp
Lines 288 (patched)
<https://reviews.apache.org/r/64095/#comment270563>

    Let's call this just `state`.



src/status_update_manager/status_update_manager_process.hpp
Lines 307 (patched)
<https://reviews.apache.org/r/64095/#comment270569>

    We want to increment `state.errors` if `strict=false`.



src/status_update_manager/status_update_manager_process.hpp
Lines 309 (patched)
<https://reviews.apache.org/r/64095/#comment270567>

    // This can happen if the initial checkpoint of the stream didn't complete.



src/status_update_manager/status_update_manager_process.hpp
Lines 332 (patched)
<https://reviews.apache.org/r/64095/#comment270571>

    s/the//



src/status_update_manager/status_update_manager_process.hpp
Lines 334 (patched)
<https://reviews.apache.org/r/64095/#comment270572>

    I think that we don't need this one ;-)/



src/status_update_manager/status_update_manager_process.hpp
Lines 341-342 (patched)
<https://reviews.apache.org/r/64095/#comment270573>

    ```
        LOG(INFO) << "Closing status update streams for framework"
                  << " '" << frameworkId << "'";
    ```



src/status_update_manager/status_update_manager_process.hpp
Lines 346 (patched)
<https://reviews.apache.org/r/64095/#comment270574>

    Split onto two different lines.



src/status_update_manager/status_update_manager_process.hpp
Lines 366-373 (patched)
<https://reviews.apache.org/r/64095/#comment270575>

    We can use `stream->next()` here.



src/status_update_manager/status_update_manager_process.hpp
Lines 569 (patched)
<https://reviews.apache.org/r/64095/#comment270568>

    We can make this a `bool` called `error`.



src/status_update_manager/status_update_manager_process.hpp
Lines 724 (patched)
<https://reviews.apache.org/r/64095/#comment270570>

    If `state.updates.empty()` return `None()`.


- Gaston Kleiman


On Nov. 30, 2017, 5:42 p.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64095/
> -----------------------------------------------------------
> 
> (Updated Nov. 30, 2017, 5:42 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-8197
>     https://issues.apache.org/jira/browse/MESOS-8197
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This actor handles the checkpointing, recovery, and retry of status
> updates.
> 
> It will initially be used by the offer operations status update
> manager, but it was designed and implemented so that it can replace
> the current implementation of the task status update manager.
> 
> 
> Diffs
> -----
> 
>   src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64095/diff/3/
> 
> 
> Testing
> -------
> 
> Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Re: Review Request 64095: Added a generic actor to be used by status update managers.

Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64095/
-----------------------------------------------------------

(Updated Nov. 30, 2017, 5:42 p.m.)


Review request for mesos and Greg Mann.


Changes
-------

Result of the review session with Graag.


Bugs: MESOS-8197
    https://issues.apache.org/jira/browse/MESOS-8197


Repository: mesos


Description (updated)
-------

This actor handles the checkpointing, recovery, and retry of status
updates.

It will initially be used by the offer operations status update
manager, but it was designed and implemented so that it can replace
the current implementation of the task status update manager.


Diffs
-----

  src/status_update_manager/status_update_manager_process.hpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64095/diff/3/


Testing
-------

Added new tests at the end of the chain. They passed 5000 times on GNU/Linux.


Thanks,

Gaston Kleiman