You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2013/03/01 20:53:29 UTC

Re: Review Request: Slave Restart (Part Deux): Status Update Manager


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/common/type_utils.hpp, line 306
> > <https://reviews.apache.org/r/7212/diff/12/?file=261358#file261358line306>
> >
> >     can you quote these ids, given we currently don't enforce no whitespace?

Throughout the code base we only quote executor ids because we know they can have space in them (our command executor does). I would be keep task ids unquoted for now, to be consistent. May be we can do a sweep later on of the code base if we decide to fix it.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/messages/messages.proto, line 59
> > <https://reviews.apache.org/r/7212/diff/12/?file=261360#file261360line59>
> >
> >     Hm.. how is sequence used? Is it perpetually incremented across recoveries? If so let's make this a uint64.

i think this is not used anymore! thanks for catching this.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.hpp, line 69
> > <https://reviews.apache.org/r/7212/diff/12/?file=261361#file261361line69>
> >
> >     s/Whether the update is handled successfully (e.g. checkpointed)./A Future of the update operation./
> >     
> >     You can then describe more return semantics if you like separately but I think if we want to use annotations like @return we should actually describe what we're returning in the first sentence (which is what Jie originally did before I refactored his return types).
> >     
> >     Whether sounds like a boolean to me, I think this pattern is present in the cgroups isolation module because the functions there used to return booleans before I refactored them to return Try's.

Changed it to "A future indicating whether the update is handled successfully (e.g. checkpointed)". I think that is more clear.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.hpp, line 83
> > <https://reviews.apache.org/r/7212/diff/12/?file=261361#file261361line83>
> >
> >     Can you add a ticket or review to track this? If there is one.

there isn't one :/


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.hpp, line 153
> > <https://reviews.apache.org/r/7212/diff/12/?file=261361#file261361line153>
> >
> >     Can you maybe add a comment or note on what the uuid is for and why there can be a mismatch?

Actually I don't remember what the reasoning was. Made this a CHECK to be safe.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.hpp, line 221
> > <https://reviews.apache.org/r/7212/diff/12/?file=261361#file261361line221>
> >
> >     Why are you not using the protobuf write error? We've lost that information here!

good point. fixed.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.cpp, line 169
> > <https://reviews.apache.org/r/7212/diff/12/?file=261362#file261362line169>
> >
> >     Why are you checking the timeout is expired? Can you add a comment?

Because this is the first in the stream, as mentioned in the comment above. Timeout only gets set when we send an update. Let me know if thats not clear?


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.cpp, line 218
> > <https://reviews.apache.org/r/7212/diff/12/?file=261362#file261362line218>
> >
> >     s/might/will

see above.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.cpp, line 295
> > <https://reviews.apache.org/r/7212/diff/12/?file=261362#file261362line295>
> >
> >     Can you make this an Option, that way you can CHECK_NOTNULL when the option is some in the caller?

I think thats a bit of too much overhead. NULL is a perfectly valid sentinel value to check against. Making it an option, doesn't add additional value but adds overhead.


> On Feb. 27, 2013, 10:31 p.m., Ben Mahler wrote:
> > src/slave/status_update_manager.cpp, line 208
> > <https://reviews.apache.org/r/7212/diff/12/?file=261362#file261362line208>
> >
> >     s/might/will/
> >     
> >     Why do we allow this to happen? If it's hard to prevent, please elaborate in this comment.

We use 'might' for 'will', throughout our code base :)

While it might (:)) be easy to stop status updates reaching the SUM, there might be cases where the SUM can receive an ACK for a terminated stream. For eg: updates generated by the master. Added a comment for this.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7212/#review17166
-----------------------------------------------------------


On Feb. 23, 2013, 8:11 a.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7212/
> -----------------------------------------------------------
> 
> (Updated Feb. 23, 2013, 8:11 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Description
> -------
> 
> added shutdownFramework
> 
> 
> Minor fixes for open
> 
> 
> Ben's comments
> 
> 
> Refactoring SUM
> 
> 
> Bens' comments
> 
> 
> Status Update Manager
> 
> Rebased off latest trunk
> 
> Conflicts:
> 	src/Makefile.am
> 	src/common/protobuf_utils.hpp
> 	src/common/utils.hpp
> 	src/slave/slave.cpp
> 	src/tests/protobuf_io_tests.cpp
> 	src/tests/utils_tests.cpp
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am c94736df660a25b58dc47c07d9c56c3c26152a66 
>   src/common/protobuf_utils.hpp 69baab78c8db2d2d33ffbbe7f5e5fc80d65b0e1a 
>   src/common/type_utils.hpp fde69aeec403b3455839dca6b0b2e1507d81ba00 
>   src/local/local.cpp 3402029e07341212717e346a97e7cd30fa52ed51 
>   src/messages/messages.proto 815fcbbcb4a8643f50950a294cedf7281b2a187f 
>   src/slave/status_update_manager.hpp PRE-CREATION 
>   src/slave/status_update_manager.cpp PRE-CREATION 
>   third_party/libprocess/include/process/timeout.hpp cac996070359a3e7ecdd8077af83c8c4cf9735fd 
> 
> Diff: https://reviews.apache.org/r/7212/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>