Posted to dev@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2013/04/18 10:35:52 UTC

Review Request: Fixed slave to properly schedule executor directories for garbage collection.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/
-----------------------------------------------------------

Review request for mesos, Benjamin Hindman and Ben Mahler.


Description
-------

Refactored runTask() and some other pieces of the slave to, hopefully, make this clearer.

Also sneaked in some bug fixes for when executorStarted() is called.


Diffs
-----

  src/slave/slave.hpp 54c66863db217077a050dc414caf0976447500be 
  src/slave/slave.cpp 00b2375505e362959ac34061e3066cf8ace96adf 
  src/tests/allocator_zookeeper_tests.cpp 42faaa067bdfa0c7f33260eb5cb3b9e5956c3037 

Diff: https://reviews.apache.org/r/10604/diff/


Testing
-------

make check.

NOTE: The GarbageCollectorIntegrationTest.Unschedule test now correctly verifies that executors/frameworks are properly unscheduled despite tasks being added to 'pending'.
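As a rough illustration of the behavior the note describes, here is a minimal sketch, not the actual slave code. GcSketch, SlaveGcSketch, and the directory strings are hypothetical names invented for this example. It assumes that a task is recorded as pending first, and that the framework and executor directories it needs are taken off the garbage collection schedule at that point.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a garbage collector: it only remembers which
// directories are currently scheduled for removal.
class GcSketch
{
public:
  void schedule(const std::string& path) { scheduled_.insert(path); }

  // Returns true if the path was scheduled and is now unscheduled.
  bool unschedule(const std::string& path)
  {
    return scheduled_.erase(path) > 0;
  }

  bool isScheduled(const std::string& path) const
  {
    return scheduled_.count(path) > 0;
  }

private:
  std::set<std::string> scheduled_;
};

// Hypothetical slave-side bookkeeping (an assumption, not the real class).
struct SlaveGcSketch
{
  GcSketch gc;
  std::set<std::string> pending;

  // Records the task as pending, then makes sure the directories it will
  // use are no longer scheduled for garbage collection.
  void runTask(
      const std::string& taskId,
      const std::string& frameworkDir,
      const std::string& executorDir)
  {
    pending.insert(taskId);
    gc.unschedule(frameworkDir);
    gc.unschedule(executorDir);
  }
};
```

In this sketch, a test would schedule both directories, call runTask(), and then check that neither directory is still scheduled while the task sits in 'pending'.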


Thanks,

Vinod Kone


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Vinod Kone <vi...@gmail.com>.

> On April 19, 2013, 12:05 a.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 739
> > <https://reviews.apache.org/r/10604/diff/3/?file=282489#file282489line739>
> >
> >     Well, if the slave re-registers, we won't send TASK_LOST from the master; we'll be sending killTask to the slave. But that only holds once I implement the task consolidation in the master during re-registration.

adding a TODO, since there is no longer consolidation at the master.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/#review19417
-----------------------------------------------------------


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/#review19417
-----------------------------------------------------------



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40156>

    Well, if the slave re-registers, we won't send TASK_LOST from the master; we'll be sending killTask to the slave. But that only holds once I implement the task consolidation in the master during re-registration.



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40157>

    great!



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40163>

    Can you move this TODO to be for the statusUpdate() call?



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40158>

    s/guaranteed/guarantee/



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40159>

    s/would cause/causes/



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40160>

    RECOVERING



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40165>

    s/master/the master/



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40161>

    Unless it recovers and re-registers, in which case the master will send killTask to this slave.
    
    s/send send/send/



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40162>

    Same line as the CHECK?



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40166>

    Add a note as to why we don't send an update? Because we don't want to send conflicting status updates for this task, correct?



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40167>

    Bad sentence: "will be removed all those tasks"



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40172>

    Curious whether this is related, or did you just notice this bug as well and fix it here?



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40170>

    great!


- Ben Mahler


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/#review19424
-----------------------------------------------------------

Ship it!



- Ben Mahler


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/
-----------------------------------------------------------

(Updated April 19, 2013, 1:28 a.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

Addressed benm's comments. No need for review.


Thanks,

Vinod Kone


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/
-----------------------------------------------------------

(Updated April 18, 2013, 11:46 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

Addressed benm's comments.


Thanks,

Vinod Kone


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/#review19410
-----------------------------------------------------------


As discussed, we should add consolidation of tasks in the master against the tasks the slave re-registers with.


src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40144>

    Can you check against the slaveId in the TaskInfo to ensure it matches?
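    For illustration only, the kind of check being asked for might look like the sketch below. TaskInfoSketch and isForThisSlave() are hypothetical names, and the struct is a simplified stand-in rather than the real TaskInfo message.

```cpp
#include <cassert>
#include <string>

// Simplified, hypothetical stand-in for a task description that carries
// the ID of the slave it was launched on.
struct TaskInfoSketch
{
  std::string taskId;
  std::string slaveId;
};

// Returns true only if the task was addressed to this slave.
bool isForThisSlave(const TaskInfoSketch& task, const std::string& mySlaveId)
{
  return task.slaveId == mySlaveId;
}
```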



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40125>

    And now to also ensure the framework is not removed, right?



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40126>

    This NOTE is not possible any longer, right?
    
    We can CHECKNOTNULL on the framework, because a runTask would have inserted a pending task, thus preventing framework removal in the interim.
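    A minimal sketch of that reasoning, under the assumption that a framework is only removed once it has nothing pending. PendingSlaveSketch, FrameworkSketch, and all method names below are hypothetical and much simpler than the real slave.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical per-framework state: only the pending task IDs matter here.
struct FrameworkSketch
{
  std::set<std::string> pending;
};

class PendingSlaveSketch
{
public:
  // Inserting the pending task creates the framework entry if needed.
  void runTask(const std::string& frameworkId, const std::string& taskId)
  {
    frameworks_[frameworkId].pending.insert(taskId);
  }

  // Called once the task has been launched or dropped.
  void removePending(const std::string& frameworkId, const std::string& taskId)
  {
    auto it = frameworks_.find(frameworkId);
    if (it != frameworks_.end()) {
      it->second.pending.erase(taskId);
    }
  }

  // Removes the framework only if it has no pending tasks; returns whether
  // it was removed.
  bool maybeRemoveFramework(const std::string& frameworkId)
  {
    auto it = frameworks_.find(frameworkId);
    if (it == frameworks_.end() || !it->second.pending.empty()) {
      return false;
    }
    frameworks_.erase(it);
    return true;
  }

  // Returns nullptr if the framework is unknown.
  FrameworkSketch* getFramework(const std::string& frameworkId)
  {
    auto it = frameworks_.find(frameworkId);
    return it == frameworks_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, FrameworkSketch> frameworks_;
};
```

    Because the pending entry blocks removal, a later step that looks the framework up while the task is still pending can treat a null result as a programming error rather than an expected case.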



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40143>

    Check that we're not in RECOVERING here, and handle RECOVERING in runTask instead.



src/slave/slave.cpp
<https://reviews.apache.org/r/10604/#comment40124>

    Looks like createExecutor is not needed any longer, can you merge it into launchExecutor?


- Ben Mahler


Re: Review Request: Fixed slave to properly schedule executor directories for garbage collection.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10604/
-----------------------------------------------------------

(Updated April 18, 2013, 9:11 p.m.)


Review request for mesos, Benjamin Hindman and Ben Mahler.


Changes
-------

benm's offline comments.


Thanks,

Vinod Kone