You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Anand Mazumdar <an...@apache.org> on 2017/02/13 05:44:11 UTC

Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/
-----------------------------------------------------------

Review request for mesos and Joseph Wu.


Bugs: MESOS-7119
    https://issues.apache.org/jira/browse/MESOS-7119


Repository: mesos


Description
-------

The erroneous invariant check for `slaveId` can be trigerred when
the master accepts an invalid inverse offer or when the inverse offer
has been already rescinded.


Diffs
-----

  src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
  src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 

Diff: https://reviews.apache.org/r/56587/diff/


Testing
-------

make check + added a test that used to crash before.


Thanks,

Anand Mazumdar


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165296
-----------------------------------------------------------



Patch looks great!

Reviews applied: [56587]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Feb. 13, 2017, 5:44 a.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2017, 5:44 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Anand Mazumdar <an...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/
-----------------------------------------------------------

(Updated Feb. 15, 2017, 7:48 p.m.)


Review request for mesos and Joseph Wu.


Changes
-------

Review comments, NNFR.


Bugs: MESOS-7119
    https://issues.apache.org/jira/browse/MESOS-7119


Repository: mesos


Description
-------

The erroneous invariant check for `slaveId` can be trigerred when
the master accepts an invalid inverse offer or when the inverse offer
has been already rescinded.


Diffs (updated)
-----

  src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
  src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 

Diff: https://reviews.apache.org/r/56587/diff/


Testing
-------

make check + added a test that used to crash before.


Thanks,

Anand Mazumdar


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165594
-----------------------------------------------------------


Fix it, then Ship it!





src/master/master.cpp (line 4731)
<https://reviews.apache.org/r/56587/#comment237453>

    We don't use this `slaveId` after assigning it, anymore.  So remove this statement.  And the local variable.


- Joseph Wu


On Feb. 14, 2017, 12:22 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 14, 2017, 12:22 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165631
-----------------------------------------------------------



Patch looks great!

Reviews applied: [56587]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Feb. 14, 2017, 8:22 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 14, 2017, 8:22 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Anand Mazumdar <an...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/
-----------------------------------------------------------

(Updated Feb. 14, 2017, 8:22 p.m.)


Review request for mesos and Joseph Wu.


Changes
-------

Review comments.


Bugs: MESOS-7119
    https://issues.apache.org/jira/browse/MESOS-7119


Repository: mesos


Description
-------

The erroneous invariant check for `slaveId` can be trigerred when
the master accepts an invalid inverse offer or when the inverse offer
has been already rescinded.


Diffs (updated)
-----

  src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
  src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 

Diff: https://reviews.apache.org/r/56587/diff/


Testing
-------

make check + added a test that used to crash before.


Thanks,

Anand Mazumdar


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Joseph Wu <jo...@mesosphere.io>.

> On Feb. 13, 2017, 11:51 a.m., Joseph Wu wrote:
> > src/master/master.cpp, lines 4756-4777
> > <https://reviews.apache.org/r/56587/diff/1/?file=1631678#file1631678line4756>
> >
> >     Moving the error above the CHECK leaves out the log message in one case:
> >     
> >     * ACCEPT_INVERSE_OFFER is called with multiple offers (invalid-offer-id and valid-offer-id).
> >     * In the processing loop, we will LOG(WARNING) for `invalid-offer-id`.
> >     * After the processing loop, we _should_ LOG(INFO) for `valid-offer-id` with the agent and framework info too.  <-- This one is left out in your patch
> 
> Anand Mazumdar wrote:
>     hmm, the semantics that we have currently for the `accept()`/`acceptInverseOffers()` is that even if we have a _single_ invalid offer we treat the entire request as being invalid. This seems like the right behavior to me i.e., we log that the call used invalid offers if any of the offers are invalid. If we feel otherwise though, that seems like a separate orthogonal issue that we should tackle for both the handlers.

It feels right to reject the entire array of offers if any of them are invalid.

The problem with inverse offers is that there is no feedback to the framework.  If a framework accidentally formats all InverseOfferIDs incorrectly, we have no way of indicating this to the framework.  For normal offers, the master can send back a status update with TASK_DROPPED/TASK_LOST.
^ This is why we try to process all the valid InverseOfferIDs even if part of the request is invalid.


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165383
-----------------------------------------------------------


On Feb. 12, 2017, 9:44 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2017, 9:44 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Anand Mazumdar <an...@apache.org>.

> On Feb. 13, 2017, 7:51 p.m., Joseph Wu wrote:
> > src/master/master.cpp, lines 4756-4777
> > <https://reviews.apache.org/r/56587/diff/1/?file=1631678#file1631678line4756>
> >
> >     Moving the error above the CHECK leaves out the log message in one case:
> >     
> >     * ACCEPT_INVERSE_OFFER is called with multiple offers (invalid-offer-id and valid-offer-id).
> >     * In the processing loop, we will LOG(WARNING) for `invalid-offer-id`.
> >     * After the processing loop, we _should_ LOG(INFO) for `valid-offer-id` with the agent and framework info too.  <-- This one is left out in your patch

hmm, the semantics that we have currently for the `accept()`/`acceptInverseOffers()` is that even if we have a _single_ invalid offer we treat the entire request as being invalid. This seems like the right behavior to me i.e., we log that the call used invalid offers if any of the offers are invalid. If we feel otherwise though, that seems like a separate orthogonal issue that we should tackle for both the handlers.


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165383
-----------------------------------------------------------


On Feb. 13, 2017, 5:44 a.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2017, 5:44 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Re: Review Request 56587: Fixed a crash on the master upon receiving an invalid inverse offer.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56587/#review165383
-----------------------------------------------------------




src/master/master.cpp (lines 4754 - 4768)
<https://reviews.apache.org/r/56587/#comment237219>

    Moving the error above the CHECK leaves out the log message in one case:
    
    * ACCEPT_INVERSE_OFFER is called with multiple offers (invalid-offer-id and valid-offer-id).
    * In the processing loop, we will LOG(WARNING) for `invalid-offer-id`.
    * After the processing loop, we _should_ LOG(INFO) for `valid-offer-id` with the agent and framework info too.  <-- This one is left out in your patch


- Joseph Wu


On Feb. 12, 2017, 9:44 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56587/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2017, 9:44 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-7119
>     https://issues.apache.org/jira/browse/MESOS-7119
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The erroneous invariant check for `slaveId` can be trigerred when
> the master accepts an invalid inverse offer or when the inverse offer
> has been already rescinded.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp dff320acd1a13995d51cc0c303edea0cdfaa336d 
>   src/tests/master_maintenance_tests.cpp 9861d1186e046a4fd547e73825c02b71eedd5ce2 
> 
> Diff: https://reviews.apache.org/r/56587/diff/
> 
> 
> Testing
> -------
> 
> make check + added a test that used to crash before.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>