You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Hindman <be...@berkeley.edu> on 2017/01/08 03:22:24 UTC

Re: Review Request 53802: Eliminated an EOF race condition in libprocess SSL socket.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53802/#review160824
-----------------------------------------------------------


Fix it, then Ship it!





3rdparty/libprocess/src/libevent_ssl_socket.hpp (line 143)
<https://reviews.apache.org/r/53802/#comment232051>

    I don't see any win in the name change from `recv_callback` to `perform_bev_read`. You also didn't delete the declaration of `recv_callback` above even though you changed the function name below.



3rdparty/libprocess/src/libevent_ssl_socket.hpp (line 155)
<https://reviews.apache.org/r/53802/#comment232052>

    Was your inititon that `received_eof` was going to be protected by `synchronized (bev)`? I want to make sure there isn't a race with the IO thread setting `received_eof` in the `event_callback` and another thread reading and missing `received_eof` in `recv`. If you're confident that the `synchronized (bev)` is a sufficient barrier then let's just document that and maybe even move this up closer to `bufferevent* bev;` and specify that it's protected by `synchronized (bev)` which it gets automatically in any of the callbacks (i.e., `recv_callback`, `send_callback`, `event_callback`).



3rdparty/libprocess/src/libevent_ssl_socket.cpp (line 228)
<https://reviews.apache.org/r/53802/#comment232053>

    One of the rasons I prefer calling this `recv_callback` is that it has symmetry with the others, `send_callback` and `event_callback`.



3rdparty/libprocess/src/libevent_ssl_socket.cpp (lines 399 - 402)
<https://reviews.apache.org/r/53802/#comment232054>

    This is a tricky comment. Here's my iteration on what you have to try and call out that even though we've received EOF there still might be data left in the buffer:
    
    While we've received EOF there is still the possibility that we have data left in the buffer that needs to get drained first. Thus, we pretend as though we've just received data and force a `recv_callback` explicitly which will either fulfill an existing `recv_request` with the data or if there is no data in the buffer it will complete the `recv_request` with a length of zero, representing EOF.


- Benjamin Hindman


On Dec. 15, 2016, 5:53 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53802/
> -----------------------------------------------------------
> 
> (Updated Dec. 15, 2016, 5:53 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Joseph Wu.
> 
> 
> Bugs: MESOS-6802
>     https://issues.apache.org/jira/browse/MESOS-6802
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, it was possible for an SSL socket to either:
> 1) Fail to receive an EOF if the EOF event was received when
>    there was no pending recv() request.
> 2) Fail to receive all data sent on the sending side if an
>    EOF event was received before all sent data was read.
> 
> This patch eliminates these race conditions to ensure reliable
> receipt of both sent data and EOFs.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/libevent_ssl_socket.hpp 57eaf4f607d0628db466cc1a139772eeeaa51136 
>   3rdparty/libprocess/src/libevent_ssl_socket.cpp dddd0e292a8b0d470f4e199db08f09a0c863d73c 
> 
> Diff: https://reviews.apache.org/r/53802/diff/
> 
> 
> Testing
> -------
> 
> Testing details are found at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Re: Review Request 53802: Eliminated an EOF race condition in libprocess SSL socket.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Jan. 8, 2017, 3:22 a.m., Benjamin Hindman wrote:
> > 3rdparty/libprocess/src/libevent_ssl_socket.hpp, line 143
> > <https://reviews.apache.org/r/53802/diff/10/?file=1581240#file1581240line143>
> >
> >     I don't see any win in the name change from `recv_callback` to `perform_bev_read`. You also didn't delete the declaration of `recv_callback` above even though you changed the function name below.

I liked the name perform_bev_read because this function is no longer simply a continuation of the libevent callback, it also has a callsite in event_callback. I also find it confusing that the XXX_callback functions all have continuations with the same name in this file.

However, you make a good point regarding consistency. We could rename this function `_recv_callback`, which matches our usual pattern for continuations, and I could follow up with another patch to similarly change the names of the relevant `send_callback` and `event_callback` functions. Thoughts?


> On Jan. 8, 2017, 3:22 a.m., Benjamin Hindman wrote:
> > 3rdparty/libprocess/src/libevent_ssl_socket.hpp, line 155
> > <https://reviews.apache.org/r/53802/diff/10/?file=1581240#file1581240line155>
> >
> >     Was your inititon that `received_eof` was going to be protected by `synchronized (bev)`? I want to make sure there isn't a race with the IO thread setting `received_eof` in the `event_callback` and another thread reading and missing `received_eof` in `recv`. If you're confident that the `synchronized (bev)` is a sufficient barrier then let's just document that and maybe even move this up closer to `bufferevent* bev;` and specify that it's protected by `synchronized (bev)` which it gets automatically in any of the callbacks (i.e., `recv_callback`, `send_callback`, `event_callback`).

I originally did have received_eof explicitly protected by synchronized (bev), but as BenM mentioned in our offline discussion we should be safe without this since all accesses of received_eof occur in the event loop. I'll move this declaration and add a comment.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53802/#review160824
-----------------------------------------------------------


On Dec. 15, 2016, 5:53 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53802/
> -----------------------------------------------------------
> 
> (Updated Dec. 15, 2016, 5:53 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Joseph Wu.
> 
> 
> Bugs: MESOS-6802
>     https://issues.apache.org/jira/browse/MESOS-6802
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, it was possible for an SSL socket to either:
> 1) Fail to receive an EOF if the EOF event was received when
>    there was no pending recv() request.
> 2) Fail to receive all data sent on the sending side if an
>    EOF event was received before all sent data was read.
> 
> This patch eliminates these race conditions to ensure reliable
> receipt of both sent data and EOFs.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/libevent_ssl_socket.hpp 57eaf4f607d0628db466cc1a139772eeeaa51136 
>   3rdparty/libprocess/src/libevent_ssl_socket.cpp dddd0e292a8b0d470f4e199db08f09a0c863d73c 
> 
> Diff: https://reviews.apache.org/r/53802/diff/
> 
> 
> Testing
> -------
> 
> Testing details are found at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>