You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Joseph Wu (JIRA)" <ji...@apache.org> on 2016/11/22 02:35:58 UTC

[jira] [Commented] (MESOS-6621) SSL downgrade path will CHECK-fail when using both temporary and persistent sockets

    [ https://issues.apache.org/jira/browse/MESOS-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685468#comment-15685468 ] 

Joseph Wu commented on MESOS-6621:
----------------------------------

This turns out to be clunky (but possible) to repro in a unit test.  I've attached a patch which includes an SSL test and a test-specific replacement for the {{3rdparty/libprocess/src/tests/test_linkee.cpp}} file which exists in the code base already.

The check failure can be reproduced with relatively few iterations:
{code}
3rdparty/libprocess/libprocess-tests --gtest_filter="SSLTest.TempDowngrade" --gtest_break_on_failure --gtest_repeat=10
{code}

> SSL downgrade path will CHECK-fail when using both temporary and persistent sockets
> -----------------------------------------------------------------------------------
>
>                 Key: MESOS-6621
>                 URL: https://issues.apache.org/jira/browse/MESOS-6621
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 1.0.2, 1.1.0
>         Environment: SSL with downgrade enabled
>            Reporter: Joseph Wu
>            Assignee: Joseph Wu
>            Priority: Critical
>              Labels: mesosphere
>         Attachments: test.patch, test_linkee.cpp
>
>
> The code path for downgrading sockets from SSL to non-SSL includes this code:
> {code}
>     // If this address is a temporary link.
>     if (temps.count(addresses[to_fd]) > 0) {
>       temps[addresses[to_fd]] = to_fd;
>       // No need to erase as we're changing the value, not the key.
>     }
>     // If this address is a persistent link.
>     if (persists.count(addresses[to_fd]) > 0) {
>       persists[addresses[to_fd]] = to_fd;
>       // No need to erase as we're changing the value, not the key.
>     }
> {code}
> https://github.com/apache/mesos/blob/1.1.x/3rdparty/libprocess/src/process.cpp#L2311-L2321
> It is possible for libprocess to hold both temporary and persistent sockets to the same address.  This can happen when a message is first sent ({{ProcessBase::send}}), and then a link is established ({{ProcessBase::link}}).  When the target of the message/link is a non-SSL socket, both temporary and persistent sockets go through the downgrade path.
> If a temporary socket is present while a persistent socket is being created, the above code will remap both temporary and persistent sockets to the same address (it should only remap the persistent socket).  This leads to some CHECK failures if those sockets are used or closed later:
> * {code}
>     bool persist = persists.count(address) > 0;
>     bool temp = temps.count(address) > 0;
>     if (persist || temp) {
>       int s = persist ? persists[address] : temps[address];
>       CHECK(sockets.count(s) > 0);
> socket = sockets.at(s);
> {code}
> https://github.com/apache/mesos/blob/1.1.x/3rdparty/libprocess/src/process.cpp#L1942
> * {code}
>         if (dispose.count(s) > 0) {
>           // This is either a temporary socket we created or it's a
>           // socket that we were receiving data from and possibly
>           // sending HTTP responses back on. Clean up either way.
>           if (addresses.count(s) > 0) {
>             const Address& address = addresses[s];
>             CHECK(temps.count(address) > 0 && temps[address] == s);
> temps.erase(address);
> {code}
> https://github.com/apache/mesos/blob/1.1.x/3rdparty/libprocess/src/process.cpp#L2044



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)