You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2017/09/02 00:38:04 UTC

Review Request 62050: Added a test to reproduce the OOM issue in MESOS-7934.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62050/
-----------------------------------------------------------

Review request for mesos, Chun-Hung Hsiao and Jie Yu.


Bugs: MESOS-7934
    https://issues.apache.org/jira/browse/MESOS-7934


Repository: mesos


Description
-------

See summary.


Diffs
-----

  3rdparty/libprocess/src/tests/ssl_tests.cpp 6affda8e07d76d925a87f08b297bb6ebb4e3257f 


Diff: https://reviews.apache.org/r/62050/diff/1/


Testing
-------

ran in repetition


Thanks,

Benjamin Mahler


Re: Review Request 62050: Added a test to reproduce the OOM issue in MESOS-7934.

Posted by Chun-Hung Hsiao <ch...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62050/#review184408
-----------------------------------------------------------




3rdparty/libprocess/src/tests/ssl_tests.cpp
Lines 932 (patched)
<https://reviews.apache.org/r/62050/#comment260596>

    Although we can reproduce this many times, this is still racy and might potentially be flaky in another environment. Is it possible to use SSL API to ensure that the client first reads a portion of the data, then we close the write end, then continue reading the remaining of the data. Not sure how much work this would take and if it is worth the cost, so I'll just leave a comment but not open an issue.


- Chun-Hung Hsiao


On Sept. 2, 2017, 12:38 a.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62050/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2017, 12:38 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jie Yu.
> 
> 
> Bugs: MESOS-7934
>     https://issues.apache.org/jira/browse/MESOS-7934
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/tests/ssl_tests.cpp 6affda8e07d76d925a87f08b297bb6ebb4e3257f 
> 
> 
> Diff: https://reviews.apache.org/r/62050/diff/1/
> 
> 
> Testing
> -------
> 
> ran in repetition
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 62050: Added a test to reproduce the OOM issue in MESOS-7934.

Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62050/
-----------------------------------------------------------

(Updated Sept. 2, 2017, 3:15 a.m.)


Review request for mesos, Chun-Hung Hsiao and Jie Yu.


Changes
-------

Updated to a minimal test based on jie's analysis.


Bugs: MESOS-7934
    https://issues.apache.org/jira/browse/MESOS-7934


Repository: mesos


Description (updated)
-------

Added a test to reproduce the OOM issue in MESOS-7934.


Diffs (updated)
-----

  3rdparty/libprocess/src/tests/ssl_tests.cpp 6affda8e07d76d925a87f08b297bb6ebb4e3257f 


Diff: https://reviews.apache.org/r/62050/diff/2/

Changes: https://reviews.apache.org/r/62050/diff/1-2/


Testing
-------

ran in repetition


Thanks,

Benjamin Mahler


Re: Review Request 62050: Added a test to reproduce the OOM issue in MESOS-7934.

Posted by Benjamin Mahler <bm...@apache.org>.

> On Sept. 2, 2017, 3:01 a.m., Jie Yu wrote:
> > 3rdparty/libprocess/src/tests/ssl_tests.cpp
> > Lines 939 (patched)
> > <https://reviews.apache.org/r/62050/diff/1/?file=1813747#file1813747line939>
> >
> >     This is not necessary to trigger the original oom issue, but i guess this more realistically simulate the real world scenario where the server recv returns 0 and triggers a shutdown of the server socket.
> >     
> >     I'd add some comments to explain here. Essentially, the bug might be triggered if send is called after shutdown. In that case, libevent will keep give us BEV_EVENT_ERROR with errno=0. Internally, SSL_write returns SSL_ERROR_SSL (`protocol is shutdown in SSL routines SSL_write`).
> >     https://github.com/libevent/libevent/blob/release-2.0.22-stable/bufferevent_openssl.c#L504

Good analysis, much appreciated!

Updated to a minimal test and updated the comments in the subsequent patch to reflect that BEV_EVENT_ERROR with errno=0 occurs for sends after a shutdown.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62050/#review184411
-----------------------------------------------------------


On Sept. 2, 2017, 12:38 a.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62050/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2017, 12:38 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jie Yu.
> 
> 
> Bugs: MESOS-7934
>     https://issues.apache.org/jira/browse/MESOS-7934
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/tests/ssl_tests.cpp 6affda8e07d76d925a87f08b297bb6ebb4e3257f 
> 
> 
> Diff: https://reviews.apache.org/r/62050/diff/1/
> 
> 
> Testing
> -------
> 
> ran in repetition
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 62050: Added a test to reproduce the OOM issue in MESOS-7934.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62050/#review184411
-----------------------------------------------------------


Fix it, then Ship it!





3rdparty/libprocess/src/tests/ssl_tests.cpp
Lines 939 (patched)
<https://reviews.apache.org/r/62050/#comment260597>

    This is not necessary to trigger the original oom issue, but i guess this more realistically simulate the real world scenario where the server recv returns 0 and triggers a shutdown of the server socket.
    
    I'd add some comments to explain here. Essentially, the bug might be triggered if send is called after shutdown. In that case, libevent will keep give us BEV_EVENT_ERROR with errno=0. Internally, SSL_write returns SSL_ERROR_SSL (`protocol is shutdown in SSL routines SSL_write`).
    https://github.com/libevent/libevent/blob/release-2.0.22-stable/bufferevent_openssl.c#L504


- Jie Yu


On Sept. 2, 2017, 12:38 a.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62050/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2017, 12:38 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jie Yu.
> 
> 
> Bugs: MESOS-7934
>     https://issues.apache.org/jira/browse/MESOS-7934
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/tests/ssl_tests.cpp 6affda8e07d76d925a87f08b297bb6ebb4e3257f 
> 
> 
> Diff: https://reviews.apache.org/r/62050/diff/1/
> 
> 
> Testing
> -------
> 
> ran in repetition
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>