Posted to issues@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2016/05/11 01:06:12 UTC

[jira] [Comment Edited] (MESOS-5340) libevent builds may prevent new connections

    [ https://issues.apache.org/jira/browse/MESOS-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279337#comment-15279337 ] 

Till Toenshoff edited comment on MESOS-5340 at 5/11/16 1:05 AM:
----------------------------------------------------------------

In parallel, [~alexr] and I came up with a different, but also more intrusive approach: https://reviews.apache.org/r/47207/


was (Author: tillt):
I parallel, [~alexr] and I came up with a different, but also more intrusive approach: https://reviews.apache.org/r/47207/

> libevent builds may prevent new connections
> -------------------------------------------
>
>                 Key: MESOS-5340
>                 URL: https://issues.apache.org/jira/browse/MESOS-5340
>             Project: Mesos
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.29.0, 0.28.1
>            Reporter: Till Toenshoff
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>              Labels: mesosphere, security, ssl
>
> When using an SSL-enabled build of Mesos in combination with SSL-downgrading support, any connection that does not actually transmit data will hang the process (e.g. the master).
> To reproduce the issue (on any platform):
> Spin up a master with SSL downgrading enabled:
> {noformat}
> $ export SSL_ENABLED=true
> $ export SSL_SUPPORT_DOWNGRADE=true
> $ export SSL_KEY_FILE=/path/to/your/foo.key
> $ export SSL_CERT_FILE=/path/to/your/foo.crt
> $ export SSL_CA_FILE=/path/to/your/ca.crt
> $ ./bin/mesos-master.sh --work_dir=/tmp/foo
> {noformat}
> Create some artificial HTTP request load to quickly spot the problem in both the master logs and the output of curl itself:
> {noformat}
> $ while true; do sleep 0.1; echo $( date +">%H:%M:%S.%3N"; curl -s -k -A "SSL Debug" http://localhost:5050/master/slaves; echo ;date +"<%H:%M:%S.%3N"; echo); done
> {noformat}
> Now create a connection to the master that does not transmit any data:
> {noformat}
> $ telnet localhost 5050
> {noformat}
> You should now see the curl requests hang: the master stops responding to new connections. This persists until either some data is transmitted via the above telnet connection or that connection is closed.
> This problem was initially observed when running Mesos on an AWS cluster with a load balancer enabled for the master node; the load balancer keeps an idle, persistent connection open. Such a connection naturally does not transmit any data as long as no external requests are routed via the load balancer. AWS allows setting a timeout for those connections. In our test environment this was set to 60 seconds, so we saw our master repeatedly become unresponsive for 60 seconds, then get "unstuck" for a brief period until it got stuck again.
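
The idle connection described in the report can also be simulated without telnet, including the timed close that the load balancer performs. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and a master listening on localhost:5050 as in the reproduction steps. The helper name hold_idle_connection is hypothetical and not part of Mesos.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mesos): hold a TCP connection open
# without sending any data, then close it after the given number of
# seconds, mimicking an idle load-balancer connection with a timeout.
hold_idle_connection() {
  local host="$1" port="$2" seconds="$3"
  (
    # bash-only: redirecting to /dev/tcp/HOST/PORT opens a TCP socket on fd 3.
    exec 3<>"/dev/tcp/${host}/${port}" || exit 1
    # Keep the socket open and silent for the requested duration.
    sleep "${seconds}"
  )  # leaving the subshell closes fd 3 and with it the connection
}

# Usage against the master from the reproduction steps
# (60 seconds matches the timeout mentioned for the AWS test environment):
#   hold_idle_connection localhost 5050 60
```

Running the helper while the curl loop from the reproduction steps is active should, if the report's description holds, show the requests stalling for the chosen duration and resuming once the connection closes. The function returns a non-zero status if the connection cannot be opened.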


