Posted to issues@mesos.apache.org by "Marvin Frick (JIRA)" <ji...@apache.org> on 2016/05/02 12:26:12 UTC

[jira] [Commented] (MESOS-2043) framework auth fail with timeout error and never get authenticated

    [ https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266349#comment-15266349 ] 

Marvin Frick commented on MESOS-2043:
-------------------------------------

If I am not mistaken, we are discussing two different but apparently closely related issues here. The issue's title says it is about framework authentication, but in fact we are discussing slave authentication as well. In our case slave authentication works without problems; registering the Marathon framework, however, fails every time.

Marathon (1.1.1-1.0.472.ubuntu1404) log:
{code}
I0502 09:27:46.399130 24009 sched.cpp:382] Authenticating with master master@aa.bbb.cc.ddd:5050
I0502 09:27:46.399268 24009 sched.cpp:389] Using default CRAM-MD5 authenticatee
I0502 09:27:46.399502 24009 authenticatee.cpp:121] Creating new client SASL connection
W0502 09:27:51.400413 24011 sched.cpp:493] Authentication timed out
I0502 09:27:51.400933 24011 sched.cpp:451] Failed to authenticate with master master@aa.bbb.cc.ddd:5050: Authentication discarded
I0502 09:27:51.401150 24011 sched.cpp:382] Authenticating with master master@aa.bbb.cc.ddd:5050
I0502 09:27:51.401304 24011 sched.cpp:389] Using default CRAM-MD5 authenticatee
I0502 09:27:51.401659 24011 authenticatee.cpp:121] Creating new client SASL connection
W0502 09:27:56.406292 24007 sched.cpp:493] Authentication timed out
I0502 09:27:56.406658 24007 sched.cpp:451] Failed to authenticate with master master@aa.bbb.cc.ddd:5050: Authentication discarded
I0502 09:27:56.406826 24007 sched.cpp:382] Authenticating with master master@aa.bbb.cc.ddd:5050
I0502 09:27:56.406944 24007 sched.cpp:389] Using default CRAM-MD5 authenticatee
I0502 09:27:56.407167 24007 authenticatee.cpp:121] Creating new client SASL connection
W0502 09:28:01.411710 24008 sched.cpp:493] Authentication timed out
I0502 09:28:01.412058 24008 sched.cpp:451] Failed to authenticate with master master@aa.bbb.cc.ddd:5050: Authentication discarded
{code}

mesos (0.28.1-2.0.20.ubuntu1404) log:
{code}
I0502 09:27:46.402734 30601 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333
I0502 09:27:46.402963 30601 authenticator.cpp:98] Creating new server SASL connection
W0502 09:27:51.403853 30595 master.cpp:5541] Authentication timed out
W0502 09:27:51.404013 30595 master.cpp:5522] Failed to authenticate scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333: Authentication discarded
I0502 09:27:51.404860 30596 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333
I0502 09:27:51.405025 30596 authenticator.cpp:98] Creating new server SASL connection
W0502 09:27:56.406782 30597 master.cpp:5541] Authentication timed out
W0502 09:27:56.406919 30597 master.cpp:5522] Failed to authenticate scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333: Authentication discarded
I0502 09:27:56.409240 30597 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333
I0502 09:27:56.409400 30597 authenticator.cpp:98] Creating new server SASL connection
W0502 09:28:01.414705 30601 master.cpp:5541] Authentication timed out
W0502 09:28:01.414901 30601 master.cpp:5522] Failed to authenticate scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333: Authentication discarded
I0502 09:28:01.415805 30596 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333
I0502 09:28:01.415988 30596 authenticator.cpp:98] Creating new server SASL connection
W0502 09:28:06.428802 30602 master.cpp:5541] Authentication timed out
W0502 09:28:06.428957 30602 master.cpp:5522] Failed to authenticate scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333: Authentication discarded
{code}
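
For anyone comparing their own logs against the ones above: in both excerpts each attempt appears to be cut off roughly five seconds after it starts. The following is only a rough sketch for measuring that gap from a glog-style log. The helper names (parse, timeout_gaps), the sample lines and the year argument are my own assumptions for illustration and are not part of Mesos; glog lines carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, HH:MM:SS.microseconds, thread id, file:line]
GLOG = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)


def parse(line, year):
    """Return (timestamp, message) for a glog line, or None if it does not match."""
    m = GLOG.match(line.strip())
    if m is None:
        return None
    # The year is an assumption passed in by the caller; glog does not log it.
    stamp = f"{year}{m['date']} {m['time']}"
    return datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"), m["msg"]


def timeout_gaps(lines, year=2016):
    """Seconds between each 'Authenticating ...' line and the next timeout line."""
    start = None
    gaps = []
    for line in lines:
        parsed = parse(line, year)
        if parsed is None:
            continue  # skip wrapped or otherwise unparseable lines
        ts, msg = parsed
        if msg.startswith("Authenticating"):
            start = ts
        elif msg == "Authentication timed out" and start is not None:
            gaps.append((ts - start).total_seconds())
            start = None
    return gaps


# Sample lines taken from the master log above (host:port joined onto one line).
SAMPLE = [
    "I0502 09:27:46.402734 30601 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333",
    "I0502 09:27:46.402963 30601 authenticator.cpp:98] Creating new server SASL connection",
    "W0502 09:27:51.403853 30595 master.cpp:5541] Authentication timed out",
    "I0502 09:27:51.404860 30596 master.cpp:5495] Authenticating scheduler-b10b6b59-a00c-4c51-aa50-7ba5f7ee54f6@aa.bbb.cc.eee:41333",
    "I0502 09:27:51.405025 30596 authenticator.cpp:98] Creating new server SASL connection",
    "W0502 09:27:56.406782 30597 master.cpp:5541] Authentication timed out",
]

for gap in timeout_gaps(SAMPLE):
    print(f"{gap:.3f} s")
```

On the sample lines this reports gaps of about five seconds per attempt, which matches the pattern visible in both logs: the SASL connection is created on each side, and then nothing further is logged until the timeout fires.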

*Is there any known workaround for this issue?* I guess downgrading to Mesos 0.27.1 and Marathon 0.15.3 (which we are running with the exact same config in another cluster) is only a temporary solution.

> framework auth fail with timeout error and never get authenticated
> ------------------------------------------------------------------
>
>                 Key: MESOS-2043
>                 URL: https://issues.apache.org/jira/browse/MESOS-2043
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, scheduler driver, security, slave
>    Affects Versions: 0.21.0
>            Reporter: Bhuvan Arumugam
>            Assignee: Greg Mann
>            Priority: Critical
>              Labels: mesosphere, security
>             Fix For: 0.29.0
>
>         Attachments: aurora-scheduler.20141104-1606-1706.log, master.log, mesos-master.20141104-1606-1706.log, slave.log
>
>
> I'm facing this issue in master as of https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4
> As [~adam-mesos] mentioned in IRC, this sounds similar to MESOS-1866. I'm running 1 master and 1 scheduler (aurora). The framework authentication fails due to a timeout:
> error on mesos master:
> {code}
> I1104 19:37:17.741449  8329 master.cpp:3874] Authenticating scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083
> I1104 19:37:17.741585  8329 master.cpp:3885] Using default CRAM-MD5 authenticator
> I1104 19:37:17.742106  8336 authenticator.hpp:169] Creating new server SASL connection
> W1104 19:37:22.742959  8329 master.cpp:3953] Authentication timed out
> W1104 19:37:22.743548  8329 master.cpp:3930] Failed to authenticate scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083: Authentication discarded
> {code}
> scheduler error:
> {code}
> I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master master@MASTER_IP:PORT
> I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL connection
> I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL authentication mechanisms: CRAM-MD5
> I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate with mechanism 'CRAM-MD5'
> W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out
> I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master master@MASTER_IP:PORT: Authentication discarded
> {code}
> It looks like 2 instances, {{scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94}} & {{scheduler-d2d4437b-d375-4467-a583-362152fe065a}}, of the same framework are trying to authenticate and failing.
> {code}
> W1104 19:36:30.769420  8319 master.cpp:3930] Failed to authenticate scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to communicate with authenticatee
> I1104 19:36:42.701441  8328 master.cpp:3860] Queuing up authentication request from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 because authentication is still in progress
> {code}
> Restarting the master and the scheduler didn't fix it.
> This particular issue happens with 1 master and 1 scheduler after MESOS-1866 was fixed.


