Posted to user@mesos.apache.org by ai...@accenture.com on 2016/04/12 10:02:22 UTC

Slaves not getting registered

Hi,

I’m unable to get the slave registered with the master node. I’ve configured both the masters and slave machines but am unable to get it registered.

Thank you.

________________________________

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy.
______________________________________________________________________________________

www.accenture.com

Re: Slaves not getting registered

Posted by haosdent <ha...@gmail.com>.
>Is there a way to find out the number of masters that are present in the
environment together through CLI/GUI?
There is a command line tool `mesos-resolve` which can show master info in
the terminal. But I think it may not be related to your problem here. If your
Mesos doesn't depend on ZooKeeper, it should only have 1 master in your
Mesos cluster.

On Tue, Apr 12, 2016 at 8:36 PM, June Taylor <ju...@umn.edu> wrote:

> Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING,
> mesos-slave.INFO, mesos-slave.ERROR
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net>
> wrote:
>
>> There's no mention of a slave there, have a look at the logs on the
>> slaves filesystem and see if it is giving any errors.
>>
>> On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
>> > The GUI log shows like this:
>> >
>> >
>> >
>> > I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
>> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> > I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for
>> /master/state.json
>> > from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
>> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>> >
>> > I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to
>> framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> > I0412 08:45:57.385325 3613 master.cpp:3673] Processing DECLINE call for
>> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> > I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to
>> framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> > I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
>> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> > I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for
>> /master/state.json
>> > from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
>> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>> >
>> > I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to
>> framework
>> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>> >
>> >
>> >
>> >
>> >
>> > Is there a way to find out the number of masters that are present in the
>> > environment together through CLI/GUI?
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > From: haosdent [mailto:haosdent@gmail.com]
>> > Sent: 12 April 2016 13:37
>> > To: user <us...@mesos.apache.org>
>> > Subject: Re: Slaves not getting registered
>> >
>> >
>> >
>> >>but am unable to get it registered.
>> >
>> > Hi, @aishwarya Could you post master and slave log to provide more
>> details?
>> > Usually it is because of network problem.
>> >
>> >
>> >
>> > On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com>
>> wrote:
>> >
>> > Hi,
>> >
>> >
>> >
>> > I’m unable to get the slave registered with the master node. I’ve
>> configured
>> > both the masters and slave machines but am unable to get it registered.
>> >
>> >
>> >
>> > Thank you.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> > Best Regards,
>> >
>> > Haosdent Huang
>>
>
>


-- 
Best Regards,
Haosdent Huang

Re: Slaves not getting registered

Posted by Dick Davies <di...@hellooperator.net>.
The masters are losing their zookeeper connection too, which is
forcing an election:

I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to
ZooKeeper, attempting to reconnect ...

I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired

I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None

I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is None

You need to tune your zookeeper cluster I'd guess, there's something
not right there.
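
One way to script the ZooKeeper check is sketched below in Python. It sends the same `stat` four-letter command that `echo stat | nc <host> 2181` sends and reads back the `Mode:` line. The function names, the 5-second timeout, and the "exactly one leader" rule are illustrative assumptions, not part of Mesos or ZooKeeper. A server that answers `stat` can still be dropping client sessions, so treat this as a first check only.

```python
import socket


def zk_four_letter(host, port=2181, command=b"stat", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_reply):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in stat_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def ensemble_looks_healthy(modes):
    """True when every server reported a mode and exactly one is the leader."""
    return None not in modes and modes.count("leader") == 1
```

For the ensemble in this thread the modes could be collected with `[parse_mode(zk_four_letter(h)) for h in ("30.30.30.52", "30.30.30.53", "30.30.30.54")]`. It is worth running from every master and from the slave, since the lost connections were logged on the Mesos side.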

On 13 April 2016 at 06:09,  <ai...@accenture.com> wrote:
> Hi,
>
>
>
> I configured the zookeeper file in slave machine by adding the master
> details and now the slave is getting registered.
>
>
>
> But I don’t know why, the three masters keep fluctuating among themselves to be
> the leader when I try accessing the master IP in the GUI.
>
>
>
> Thank you.
>
>
>
>
>
> From: haosdent [mailto:haosdent@gmail.com]
> Sent: 13 April 2016 09:25
> To: user <us...@mesos.apache.org>
> Cc: Kumari, Suruchi <su...@accenture.com>
>
>
> Subject: Re: Slaves not getting registered
>
>
>
>>I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
>> group
>
>
>
> According to this, master 1 should connect to zk successfully.
>
>
>
>>root@slave1:/var/log/mesos# tail -f
>> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>
>>I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>
>>I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
>> allowed age: 5.848917453828577days
>
>>W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to
>> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> How about checking whether you can connect to zk from slave1?
>
>
>
> On Wed, Apr 13, 2016 at 11:49 AM, <ai...@accenture.com>
> wrote:
>
> I checked the zookeeper status by running the command:
>
>
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
>
> Mode: follower
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
>
> Mode: leader
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
>
> Mode: follower
>
>
>
> And it seems like it’s working fine. Is there another way to check the
> health status?
>
>
>
>
>
> From: Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
> Sent: 13 April 2016 09:10
>
>
> To: user@mesos.apache.org
> Subject: Re: Slaves not getting registered
>
>
>
> Have you checked whether your ZooKeeper cluster is healthy and accessible
> from the Mesos masters?
>
>
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> It seems the Mesos masters are not able to communicate with ZooKeeper.
>
>
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, aishwarya.adyanthaya@accenture.com wrote:
>
>
>
> Hi,
>
>
>
> I have been following the DigitalOcean guide (mesos-doc-link),
> where I have set up 3 masters and one slave. Below are the log details:
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
>
> I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response
> from a replica in VOTING status
>
> I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to
> RECOVERING
>
> I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 3.154399ms
>
> I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to
> RECOVERING
>
> I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from position
> 1 to 2
>
> I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to
> VOTING
>
> I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 2.540703ms
>
> I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to
> VOTING
>
> I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
> group
>
> I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
>
> Log file created at: 2016/04/12 11:01:49
>
> Running on machine: master1
>
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
>
> W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials provided,
> authentication requests will be refused
>
>
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
>
> tail: cannot open
> ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No
> such file or directory
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
>
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
>
> I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (5)@30.30.30.53:5050
>
> I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (8)@30.30.30.53:5050
>
> I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (11)@30.30.30.53:5050
>
> I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (14)@30.30.30.53:5050
>
> I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
>
> I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
>
> I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is
> None
>
>
>
>
>
> root@slave1:/var/log/mesos# tail -f
> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>
> I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>
> I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
> allowed age: 5.848917453828577days
>
> W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
>
> W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired
>
>
>
> root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> Thank you.
>
>
>
>
>
> From: June Taylor [mailto:june@umn.edu]
> Sent: 12 April 2016 18:06
> To: user@mesos.apache.org
> Subject: Re: Slaves not getting registered
>
>
>
> Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING,
> mesos-slave.INFO, mesos-slave.ERROR
>
>
>
>
> Thanks,
>
> June Taylor
>
> System Administrator, Minnesota Population Center
>
> University of Minnesota
>
>
>
> On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net> wrote:
>
> There's no mention of a slave there, have a look at the logs on the
> slaves filesystem and see if it is giving any errors.
>
>
> On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
>> The GUI log shows like this:
>>
>>
>>
>> I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
>> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>> I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
>> from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
>> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>>
>> I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>> I0412 08:45:57.385325 3613 master.cpp:3673] Processing DECLINE call for
>> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>> I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>> I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
>> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>> I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
>> from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
>> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>>
>> I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
>> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
>> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>>
>>
>>
>>
>>
>> Is there a way to find out the number of masters that are present in the
>> environment together through CLI/GUI?
>>
>>
>>
>>
>>
>>
>>
>> From: haosdent [mailto:haosdent@gmail.com]
>> Sent: 12 April 2016 13:37
>> To: user <us...@mesos.apache.org>
>> Subject: Re: Slaves not getting registered
>>
>>
>>
>>>but am unable to get it registered.
>>
>> Hi, @aishwarya Could you post master and slave log to provide more
>> details?
>> Usually it is because of network problem.
>>
>>
>>
>> On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com>
>> wrote:
>>
>> Hi,
>>
>>
>>
>> I’m unable to get the slave registered with the master node. I’ve
>> configured
>> both the masters and slave machines but am unable to get it registered.
>>
>>
>>
>> Thank you.
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Best Regards,
>>
>> Haosdent Huang
>
>
>
>
>
>
>
>
>
>
>
>
> --
>
> Best Regards,
>
> Haosdent Huang

Re: Mesos-master url in HA

Posted by Steven Schlansker <ss...@opentable.com>.
I personally believe that this is not a sufficient workaround -- what
if the master is failing over, and your autoscaler happens to redirect
to a master which just lost leadership?

This solution is inherently racy and leads to the end user writing extra code
to work around it, and even then it can still result in bugs that are
extremely difficult to diagnose.

I'd filed an issue on this a while ago (0.20 days):
https://issues.apache.org/jira/browse/MESOS-1865

but unfortunately it is still not resolved.
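
To make the race concrete, here is a minimal Python sketch of the kind of extra client code this implies. It asks each master for its state and only trusts a reply in which the master names itself as leader, taking the slave list from that same reply. It assumes the state document is served at `/master/state.json` and exposes `pid`, `leader`, and `slaves` fields; `http_get_json`, `claims_leadership`, and `slaves_from_leader` are hypothetical helper names. A master that has just lost leadership may still name itself for a moment, so this narrows the window rather than closing it.

```python
import json
import urllib.request


def http_get_json(url, timeout=5.0):
    """Fetch a URL and decode the body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def claims_leadership(state):
    """True when the master's own pid equals the leader it reports (assumed fields)."""
    pid = state.get("pid")
    return pid is not None and pid == state.get("leader")


def slaves_from_leader(master_urls, get_json=http_get_json):
    """Return the slave list from the first master that claims leadership.

    Unreachable masters and followers are skipped. Returns None when no
    master claims leadership, e.g. in the middle of an election.
    """
    for base in master_urls:
        try:
            state = get_json(base + "/master/state.json")
        except (OSError, ValueError):  # connection errors or a non-JSON body
            continue
        if claims_leadership(state):
            return state.get("slaves", [])
    return None
```

A caller would still have to decide what to do with `None`, for example retry after a short delay instead of scaling on an empty list.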

> On Apr 13, 2016, at 12:44 AM, Alexander Rojas <al...@mesosphere.io> wrote:
> 
> Hi guillermo,
> 
> The master has the `/redirect` endpoint which should point you to the current leader.
> 
>> On 13 Apr 2016, at 08:20, Guillermo Rodriguez <gu...@spritekin.com> wrote:
>> 
>> Hi,
>> 
>> I have 3 mesos master setup for HA. One has the lead.
>> 
>> http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
>> http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
>> http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
>> 
>> I have an URL mesos-master.mydomain.com pointing to the leader and that works fine because it returns the slave list which I need for my autoscaler. But I'm afraid if the master fails the URL will no longer be valid. So I added the three IPs to the router (AWS Route53)  so it would round robin, but of course this will return an empty list sometimes because it hits a follower which returns empty.
>> 
>> So my question is, is it possible to redirect the call from the followers to the leader master?
>> 
>> Thanks.
>> 
> 


Re: Mesos-master url in HA

Posted by Guillermo Rodriguez <gu...@spritekin.com>.
OK, got my solution: the autoscaler now uses the mesos-dns record for
leader.mesos instead of the Route53 record.
  
 Thanks!
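
For anyone wiring this into their own autoscaler, a rough Python sketch follows. It builds the slaves URL from the `leader.mesos` name and pulls hostnames out of the reply. Port 5050 is taken from the URLs in this thread; the response shape (a `slaves` list whose entries carry `hostname` and `active`) and the helper names are assumptions. DNS answers can be cached for the record's TTL, so a brief stale window after a failover is still possible.

```python
import json

LEADER_HOST = "leader.mesos"  # Mesos-DNS record named in the message above
MASTER_PORT = 5050            # assumed: same port as the master URLs in this thread


def slaves_url(host=LEADER_HOST, port=MASTER_PORT):
    """Build the URL of the slave list on the current leader."""
    return f"http://{host}:{port}/master/slaves"


def active_hostnames(slaves_json):
    """Return sorted hostnames of slaves not marked inactive (assumed fields)."""
    doc = json.loads(slaves_json)
    return sorted(
        s["hostname"] for s in doc.get("slaves", []) if s.get("active", True)
    )
```

The body fetched from `slaves_url()` would be passed to `active_hostnames` as text.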
  
  

----------------------------------------
From: "Alexander Rojas" <al...@mesosphere.io>
Sent: Wednesday, April 13, 2016 5:45 PM
To: user@mesos.apache.org, guimo@spritekin.com
Subject: Re: Mesos-master url in HA

> Hi guillermo,
>
> The master has the `/redirect` endpoint which should point you to the
> current leader.
>
>> On 13 Apr 2016, at 08:20, Guillermo Rodriguez <gu...@spritekin.com> wrote:
>>
>> Hi,
>>
>> I have 3 mesos master setup for HA. One has the lead.
>>
>> http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
>> http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
>> http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
>>
>> I have an URL mesos-master.mydomain.com pointing to the leader and that
>> works fine because it returns the slave list which I need for my
>> autoscaler. But I'm afraid if the master fails the URL will no longer be
>> valid. So I added the three IPs to the router (AWS Route53)  so it would
>> round robin, but of course this will return an empty list sometimes because
>> it hits a follower which returns empty.
>>
>> So my question is, is it possible to redirect the call from the followers
>> to the leader master?
>>
>> Thanks.
  

 


Re: Mesos-master url in HA

Posted by Alexander Rojas <al...@mesosphere.io>.
Hi guillermo,

The master has the `/redirect` endpoint which should point you to the current leader.
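
A minimal Python sketch of using it from a client is below. It assumes the endpoint is reachable at `/master/redirect` on port 5050 and answers with a `Location` header that is either absolute or scheme-relative; `leader_base_url` and `find_leader` are made-up helper names. The redirect is read rather than followed, so the caller can append whatever path it needs.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def leader_base_url(location, asked_url):
    """Resolve a Location header against the URL that was asked.

    Handles both an absolute value ("http://host:5050") and a
    scheme-relative one ("//host:5050").
    """
    parts = urlsplit(urljoin(asked_url, location))
    return f"{parts.scheme}://{parts.netloc}"


def find_leader(master_host, port=5050, timeout=5.0):
    """Ask any master where the leader is, without following the redirect."""
    conn = http.client.HTTPConnection(master_host, port, timeout=timeout)
    try:
        conn.request("GET", "/master/redirect")
        response = conn.getresponse()
        location = response.getheader("Location")
        response.read()  # drain the body before closing
    finally:
        conn.close()
    if not location:
        raise RuntimeError(f"{master_host}:{port} returned no Location header")
    return leader_base_url(location, f"http://{master_host}:{port}/master/redirect")
```

With the addresses from the question, `find_leader("172.31.35.91") + "/master/slaves"` would give the URL to query.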

> On 13 Apr 2016, at 08:20, Guillermo Rodriguez <gu...@spritekin.com> wrote:
> 
> Hi,
>  
> I have 3 mesos master setup for HA. One has the lead.
>  
> http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
> http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
> http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
>  
> I have an URL mesos-master.mydomain.com pointing to the leader and that works fine because it returns the slave list which I need for my autoscaler. But I'm afraid if the master fails the URL will no longer be valid. So I added the three IPs to the router (AWS Route53)  so it would round robin, but of course this will return an empty list sometimes because it hits a follower which returns empty.
>  
> So my question is, is it possible to redirect the call from the followers to the leader master?
>  
> Thanks.
>  


Mesos-master url in HA

Posted by Guillermo Rodriguez <gu...@spritekin.com>.
Hi,
  
I have a 3-master Mesos setup for HA. One has the lead.

http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list

I have a URL, mesos-master.mydomain.com, pointing to the leader, and that
works fine because it returns the slave list, which I need for my
autoscaler. But I'm afraid that if the master fails, the URL will no longer
be valid. So I added the three IPs to the router (AWS Route53) so it would
round-robin, but of course this sometimes returns an empty list because
it hits a follower.
  
 So my question is, is it possible to redirect the call from the followers 
to the leader master?
  
 Thanks.
  


Re: Slaves not getting registered

Posted by haosdent <ha...@gmail.com>.
>the three masters keep fluctuating among themselves to be the leader.

Are all the network links between the 3 Mesos masters and ZooKeeper stable?
Do you see packet loss when you ping the ZooKeeper servers from each Mesos master?
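
Since the quorum setting comes up in the reply quoted just below, a quick sanity check may help. The Mesos master `--quorum` value has to be a strict majority of the masters, so three masters need a quorum of 2. The helper names in this Python sketch are made up for illustration.

```python
def majority(master_count):
    """Smallest quorum that is a strict majority of the masters."""
    if master_count < 1:
        raise ValueError("need at least one master")
    return master_count // 2 + 1


def quorum_is_valid(master_count, quorum):
    """True when the quorum is a strict majority and not more than the masters."""
    return 2 * quorum > master_count and quorum <= master_count
```

For this cluster, `majority(3)` is 2, and every master should be started with the same value.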

On Wed, Apr 13, 2016 at 1:15 PM, Abhishek Amralkar <
Abhishek.Amralkar@talentica.com> wrote:

> Not sure, but try to change the quorum and check.
>
>
>
> On 13-Apr-2016, at 10:39 AM, aishwarya.adyanthaya@accenture.com wrote:
>
> Hi,
>
> I configured the zookeeper file in slave machine by adding the master
> details and now the slave is getting registered.
>
> But I don’t know why, the three masters keep fluctuating among themselves to be
> the leader when I try accessing the master IP in the GUI.
>
> Thank you.
>
>
> *From:* haosdent [mailto:haosdent@gmail.com]
> *Sent:* 13 April 2016 09:25
> *To:* user <us...@mesos.apache.org>
> *Cc:* Kumari, Suruchi <su...@accenture.com>
> *Subject:* Re: Slaves not getting registered
>
> >I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the
> Paxos group
>
> According to this, master 1 should connect to zk successfully.
>
> >root@slave1:/var/log/mesos# tail -f
> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
> >I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
> >I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
> allowed age: 5.848917453828577days
> >W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect
> to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> How about checking whether you can connect to zk from slave1?
>
> On Wed, Apr 13, 2016 at 11:49 AM, <ai...@accenture.com>
> wrote:
>
> I checked the zookeeper status by running the command:
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
> Mode: follower
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
> Mode: leader
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
> Mode: follower
>
> And it seems like it’s working fine. Is there another way to check the
> health status?
>
>
> *From:* Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
> *Sent:* 13 April 2016 09:10
>
> *To:* user@mesos.apache.org
> *Subject:* Re: Slaves not getting registered
>
> Have you checked whether your ZooKeeper cluster is healthy and accessible
> from the Mesos masters?
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> It seems the Mesos masters are not able to communicate with ZooKeeper.
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, aishwarya.adyanthaya@accenture.com wrote:
>
> Hi,
>
> I have been following the DigitalOcean guide (mesos-doc-link
> <https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04>),
> where I have set up 3 masters and one slave. Below are the log details:
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
> I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response
> from a replica in VOTING status
> I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to
> RECOVERING
> I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 3.154399ms
> I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to
> RECOVERING
> I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from
> position 1 to 2
> I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to
> VOTING
> I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 2.540703ms
> I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to
> VOTING
> I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
> group
> I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
> Log file created at: 2016/04/12 11:01:49
> Running on machine: master1
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials
> provided, authentication requests will be refused
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
> tail: cannot open
> ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No
> such file or directory
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (5)@30.30.30.53:5050
> I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (8)@30.30.30.53:5050
> I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (11)@30.30.30.53:5050
> I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (14)@30.30.30.53:5050
> I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
> I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
> I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
> I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
> I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
> I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is
> None
>
>
> root@slave1:/var/log/mesos# tail -f
> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
> I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
> I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
> allowed age: 5.848917453828577days
> W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
> W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired
>
> root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> Thank you.
>
>
> *From:* June Taylor [mailto:june@umn.edu]
> *Sent:* 12 April 2016 18:06
> *To:* user@mesos.apache.org
> *Subject:* Re: Slaves not getting registered
>
> Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING,
> mesos-slave.INFO, mesos-slave.ERROR
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net>
> wrote:
>
> There's no mention of a slave there, have a look at the logs on the
> slaves filesystem and see if it is giving any errors.
>
> On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> > The GUI log shows like this:
> >
> >
> >
> > I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> >
> >
> >
> >
> > Is there a way to find out the number of masters that are present in the
> > environment together through CLI/GUI?
> >
> >
> >
> >
> >
> >
> >
> > From: haosdent [mailto:haosdent@gmail.com]
> > Sent: 12 April 2016 13:37
> > To: user <us...@mesos.apache.org>
> > Subject: Re: Slaves not getting registered
> >
> >
> >
> >>but am unable to get it registered.
> >
> > Hi, @aishwarya Could you post master and slave log to provide more
> details?
> > Usually it is because of network problem.
> >
> >
> >
> > On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com>
> wrote:
> >
> > Hi,
> >
> >
> >
> > I’m unable to get the slave registered with the master node. I’ve
> configured
> > both the masters and slave machines but am unable to get it registered.
> >
> >
> >
> > Thank you.
> >
> >
> >
> >
> >
> >
> >
> >
> > --
> >
> > Best Regards,
> >
> > Haosdent Huang
>
>
>
>
>
>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>
>
>


-- 
Best Regards,
Haosdent Huang

RE: Slaves not getting registered

Posted by ai...@accenture.com.
Hi,

I configured the zookeeper file on the slave machine by adding the master details, and now the slave is getting registered.
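For readers of the archive, here is a minimal Python sketch of what such a slave-side ZooKeeper setting usually looks like and how it can be sanity-checked. The thread does not show the actual file contents, so the file location (/etc/mesos/zk), the /mesos znode path, and the helper name parse_zk_url are assumptions for illustration. The three addresses are the ZooKeeper servers queried elsewhere in this thread, and the URL follows the zk://host:port,host:port/path form that Mesos accepts.

```python
import re

# zk://host1:port1,host2:port2,.../path  (credentials in the URL are not handled here)
ZK_URL = re.compile(r"^zk://(?P<hosts>[^/]+)(?P<path>/.*)$")


def parse_zk_url(url):
    """Split a zk:// URL into a list of (host, port) pairs and the znode path."""
    match = ZK_URL.match(url.strip())
    if match is None:
        raise ValueError("not a zk:// URL: %r" % url)
    servers = []
    for entry in match.group("hosts").split(","):
        host, _, port = entry.partition(":")
        # Fall back to ZooKeeper's default client port when none is given.
        servers.append((host, int(port) if port else 2181))
    return servers, match.group("path")


if __name__ == "__main__":
    # Assumed example value; on a real slave, read it from the file instead,
    # e.g. open("/etc/mesos/zk").read()
    example = "zk://30.30.30.52:2181,30.30.30.53:2181,30.30.30.54:2181/mesos"
    servers, path = parse_zk_url(example)
    print(servers, path)
```

If the parsed server list on the slave differs from the one on the masters, that mismatch is a likely place to look first.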

But I don’t know why the three masters keep switching leadership among themselves when I try accessing the master IP in the GUI.

Thank you.


From: haosdent [mailto:haosdent@gmail.com]
Sent: 13 April 2016 09:25
To: user <us...@mesos.apache.org>
Cc: Kumari, Suruchi <su...@accenture.com>
Subject: Re: Slaves not getting registered

>I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos group

According to this, master 1 should connect to zk successfully.

>root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
>W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

How about checking whether you can connect to zk from slave1?

On Wed, Apr 13, 2016 at 11:49 AM, <ai...@accenture.com> wrote:
I checked the zookeeper status by running the command:

root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
Mode: follower
root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
Mode: leader
root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
Mode: follower

And it seems like it’s working fine. Is there another way to check the health status?


From: Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
Sent: 13 April 2016 09:10
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

Have you checked whether your ZooKeeper cluster is healthy and accessible from the Mesos masters?

W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

It seems the Mesos masters are not able to communicate with ZooKeeper.

-Abhishek
On 13-Apr-2016, at 9:06 AM, aishwarya.adyanthaya@accenture.com wrote:

Hi,

I have been following the DigitalOcean tutorial (mesos-doc-link<https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04>), where I have set up 3 masters and one slave. Below are the log details:

root@master1:/var/log/mesos# tail -f mesos-master.INFO
I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response from a replica in VOTING status
I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to RECOVERING
I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 3.154399ms
I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to RECOVERING
I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from position 1 to 2
I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to VOTING
I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 2.540703ms
I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to VOTING
I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos group
I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated

root@master1:/var/log/mesos# tail -f mesos-master.WARNING
Log file created at: 2016/04/12 11:01:49
Running on machine: master1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials provided, authentication requests will be refused

root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
tail: cannot open ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No such file or directory
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651  mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (5)@30.30.30.53:5050
I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (8)@30.30.30.53:5050
I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (11)@30.30.30.53:5050
I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (14)@30.30.30.53:5050
I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is None


root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired

root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
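For readers of the archive, here is a minimal Python sketch for summarising log excerpts like the ones above instead of reading them line by line. It relies only on the line layout stated in the log header quoted in this message ("[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg"). The names parse_glog and count_by_location are hypothetical helpers, not part of Mesos.

```python
import re

# One log line: level letter, mmdd, time with microseconds, thread id, file:line] message
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog(line):
    """Return a dict of fields for a log line, or None if the line does not match."""
    match = GLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    return record


def count_by_location(lines, level="W"):
    """Count lines of the given level, grouped by (source file, source line)."""
    counts = {}
    for line in lines:
        record = parse_glog(line)
        if record is not None and record["level"] == level:
            key = (record["file"], record["line"])
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Run over mesos-slave.WARNING, a count dominated by group.cpp:503 would point at the ZooKeeper connection timeouts shown above rather than at some other warning.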

Thank you.


From: June Taylor [mailto:june@umn.edu]
Sent: 12 April 2016 18:06
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING, mesos-slave.INFO, mesos-slave.ERROR


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net> wrote:
There's no mention of a slave there, have a look at the logs on the
slaves filesystem and see if it is giving any errors.

On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> The GUI log shows like this:
>
>
>
> I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
>
>
>
>
> Is there a way to find out the number of masters that are present in the
> environment together through CLI/GUI?
>
>
>
>
>
>
>
> From: haosdent [mailto:haosdent@gmail.com]
> Sent: 12 April 2016 13:37
> To: user <us...@mesos.apache.org>
> Subject: Re: Slaves not getting registered
>
>
>
>>but am unable to get it registered.
>
> Hi, @aishwarya Could you post master and slave log to provide more details?
> Usually it is because of network problem.
>
>
>
> On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
>
> Hi,
>
>
>
> I’m unable to get the slave registered with the master node. I’ve configured
> both the masters and slave machines but am unable to get it registered.
>
>
>
> Thank you.
>
>
>
>
>
>
>
>
> --
>
> Best Regards,
>
> Haosdent Huang






--
Best Regards,
Haosdent Huang

Re: Slaves not getting registered

Posted by haosdent <ha...@gmail.com>.
>I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos group

According to this, master 1 should connect to zk successfully.

>root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
>W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

How about checking whether you can connect to zk from slave1?
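For readers of the archive, here is a minimal Python sketch of one way to test the ZooKeeper connection from slave1. It opens a TCP connection to each ZooKeeper server and sends the four-letter command "ruok", to which a healthy server replies "imok". The helper names zk_four_letter and check_servers are hypothetical, the port 2181 is the one used elsewhere in this thread, and newer ZooKeeper releases may require four-letter commands to be whitelisted before they answer.

```python
import socket


def zk_four_letter(host, port=2181, command="ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            # The server closes the connection after replying, which ends the loop.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def check_servers(servers, probe=zk_four_letter):
    """Return {(host, port): status string} for each ZooKeeper server."""
    results = {}
    for host, port in servers:
        try:
            reply = probe(host, port)
        except OSError as exc:  # refused, unreachable, or timed out
            results[(host, port)] = "unreachable: %s" % exc
            continue
        if reply.strip() == "imok":
            results[(host, port)] = "ok"
        else:
            results[(host, port)] = "unexpected reply: %r" % reply
    return results


if __name__ == "__main__":
    # Addresses taken from this thread; run this on slave1, not on a master.
    zk_servers = [("30.30.30.52", 2181), ("30.30.30.53", 2181), ("30.30.30.54", 2181)]
    for server, status in check_servers(zk_servers).items():
        print(server, status)
```

An "unreachable" result from slave1 for servers that answer from master1 would suggest a network, firewall, or address problem between the slave and ZooKeeper rather than a ZooKeeper fault.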

On Wed, Apr 13, 2016 at 11:49 AM, <ai...@accenture.com>
wrote:

> I checked the zookeeper status by running the command:
>
>
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
>
> Mode: follower
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
>
> Mode: leader
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
>
> Mode: follower
>
>
>
> And it seems like it’s working fine. Is there another way to check the
> health status?
>
>
>
>
>
> *From:* Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
> *Sent:* 13 April 2016 09:10
>
> *To:* user@mesos.apache.org
> *Subject:* Re: Slaves not getting registered
>
>
>
> Have you checked whether your ZooKeeper cluster is healthy and accessible
> from the Mesos masters?
>
>
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> It seems Mesos masters are not able to communicate to Zookeeper.
>
>
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, aishwarya.adyanthaya@accenture.com wrote:
>
>
>
> Hi,
>
>
>
> I have been following the document from the digitalocean (mesos-doc-link
> <https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04>)
> where I have set 3 masters and one slave. Below are the log details:
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
>
> I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response
> from a replica in VOTING status
>
> I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to
> RECOVERING
>
> I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 3.154399ms
>
> I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to
> RECOVERING
>
> I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from
> position 1 to 2
>
> I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to
> VOTING
>
> I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 2.540703ms
>
> I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to
> VOTING
>
> I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
> group
>
> I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
>
> Log file created at: 2016/04/12 11:01:49
>
> Running on machine: master1
>
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
>
> W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials
> provided, authentication requests will be refused
>
>
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
>
> tail: cannot open
> ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No
> such file or directory
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
>
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
>
> I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (5)@30.30.30.53:5050
>
> I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (8)@30.30.30.53:5050
>
> I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (11)@30.30.30.53:5050
>
> I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status
> received a broadcasted recover request from (14)@30.30.30.53:5050
>
> I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper,
> attempting to reconnect ...
>
> I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
>
> I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
>
> I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is
> None
>
>
>
>
>
> root@slave1:/var/log/mesos# tail -f
> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>
> I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>
> I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
> allowed age: 5.848917453828577days
>
> W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
>
> W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired
>
>
>
> root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> Thank you.
>
>
>
>
>
> *From:* June Taylor [mailto:june@umn.edu]
> *Sent:* 12 April 2016 18:06
> *To:* user@mesos.apache.org
> *Subject:* Re: Slaves not getting registered
>
>
>
> Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING,
> mesos-slave.INFO, mesos-slave.ERROR
>
>
>
>
> Thanks,
>
> June Taylor
>
> System Administrator, Minnesota Population Center
>
> University of Minnesota
>
>
>
> On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net>
> wrote:
>
> There's no mention of a slave there, have a look at the logs on the
> slaves filesystem and see if it is giving any errors.
>
>
> On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> > The GUI log shows like this:
> >
> >
> >
> > I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to
> framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> >
> >
> >
> >
> > Is there a way to find out the number of masters that are present in the
> > environment together through CLI/GUI?
> >
> >
> >
> >
> >
> >
> >
> > From: haosdent [mailto:haosdent@gmail.com]
> > Sent: 12 April 2016 13:37
> > To: user <us...@mesos.apache.org>
> > Subject: Re: Slaves not getting registered
> >
> >
> >
> >>but am unable to get it registered.
> >
> > Hi, @aishwarya Could you post master and slave log to provide more
> details?
> > Usually it is because of network problem.
> >
> >
> >
> > On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com>
> wrote:
> >
> > Hi,
> >
> >
> >
> > I’m unable to get the slave registered with the master node. I’ve
> configured
> > both the masters and slave machines but am unable to get it registered.
> >
> >
> >
> > Thank you.
> >
> >
> >
> >
> >
> >
> >
> >
> > --
> >
> > Best Regards,
> >
> > Haosdent Huang
>
>
>
>
>
>
>



-- 
Best Regards,
Haosdent Huang

RE: Slaves not getting registered

Posted by ai...@accenture.com.
I checked the zookeeper status by running the command:

root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
Mode: follower
root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
Mode: leader
root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
Mode: follower

And it seems like it’s working fine. Is there another way to check the health status?
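For readers of the archive, here is a minimal Python sketch that turns the "stat | grep Mode" check above into a single verdict for the whole ensemble. It extracts the "Mode:" line from each server's stat output and reports the ensemble as healthy only if every server answered and exactly one is the leader. The helper names parse_mode and ensemble_looks_healthy are hypothetical, and the sketch assumes a plain leader/follower ensemble (no observers or standalone servers).

```python
def parse_mode(stat_output):
    """Return the value of the 'Mode:' line from ZooKeeper `stat` output, or None."""
    for line in stat_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Mode":
            return value.strip()
    return None


def ensemble_looks_healthy(modes):
    """modes maps server -> mode string (or None when the server gave no answer).

    True only when every server reported leader or follower and there is
    exactly one leader.
    """
    values = list(modes.values())
    return (
        len(values) > 0
        and all(value in ("leader", "follower") for value in values)
        and values.count("leader") == 1
    )


if __name__ == "__main__":
    # Modes as reported in this message.
    observed = {
        "30.30.30.52": "follower",
        "30.30.30.53": "leader",
        "30.30.30.54": "follower",
    }
    print(ensemble_looks_healthy(observed))
```

Note that the commands above were run on master1, so this only shows the ensemble is reachable from that host. The same check run on slave1 would exercise the path that is timing out in the slave log.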


From: Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
Sent: 13 April 2016 09:10
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

Have you checked whether your ZooKeeper cluster is healthy and accessible from the Mesos masters?

W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

It seems the Mesos masters are not able to communicate with ZooKeeper.

-Abhishek
On 13-Apr-2016, at 9:06 AM, aishwarya.adyanthaya@accenture.com wrote:

Hi,

I have been following the DigitalOcean tutorial (mesos-doc-link<https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04>), where I have set up 3 masters and one slave. Below are the log details:

root@master1:/var/log/mesos# tail -f mesos-master.INFO
I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response from a replica in VOTING status
I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to RECOVERING
I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 3.154399ms
I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to RECOVERING
I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from position 1 to 2
I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to VOTING
I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 2.540703ms
I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to VOTING
I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos group
I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated

root@master1:/var/log/mesos# tail -f mesos-master.WARNING
Log file created at: 2016/04/12 11:01:49
Running on machine: master1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials provided, authentication requests will be refused

root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
tail: cannot open ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No such file or directory
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651  mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (5)@30.30.30.53:5050
I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (8)@30.30.30.53:5050
I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (11)@30.30.30.53:5050
I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (14)@30.30.30.53:5050
I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is None


root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired

root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

Thank you.


From: June Taylor [mailto:june@umn.edu]
Sent: 12 April 2016 18:06
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING, mesos-slave.INFO, mesos-slave.ERROR


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net> wrote:
There's no mention of a slave there, have a look at the logs on the
slaves filesystem and see if it is giving any errors.

On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> The GUI log shows like this:
>
>
>
> I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
>
>
>
>
> Is there a way to find out the number of masters that are present in the
> environment together through CLI/GUI?
>
>
>
>
>
>
>
> From: haosdent [mailto:haosdent@gmail.com]
> Sent: 12 April 2016 13:37
> To: user <us...@mesos.apache.org>
> Subject: Re: Slaves not getting registered
>
>
>
>>but am unable to get it registered.
>
> Hi, @aishwarya Could you post master and slave log to provide more details?
> Usually it is because of network problem.
>
>
>
> On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
>
> Hi,
>
>
>
> I’m unable to get the slave registered with the master node. I’ve configured
> both the masters and slave machines but am unable to get it registered.
>
>
>
> Thank you.
>
>
> --
>
> Best Regards,
>
> Haosdent Huang




RE: Slaves not getting registered

Posted by ai...@accenture.com.
Hi,

I have been following the DigitalOcean tutorial (https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04) and have set up 3 masters and one slave. The log details are below:

root@master1:/var/log/mesos# tail -f mesos-master.INFO
I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response from a replica in VOTING status
I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to RECOVERING
I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 3.154399ms
I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to RECOVERING
I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from position 1 to 2
I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to VOTING
I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 2.540703ms
I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to VOTING
I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos group
I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated

root@master1:/var/log/mesos# tail -f mesos-master.WARNING
Log file created at: 2016/04/12 11:01:49
Running on machine: master1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials provided, authentication requests will be refused

root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
tail: cannot open ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No such file or directory
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651  mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
I0412 11:01:46.424433  3676 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (5)@30.30.30.53:5050
I0412 11:01:47.068586  3675 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (8)@30.30.30.53:5050
I0412 11:01:47.592926  3677 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (11)@30.30.30.53:5050
I0412 11:01:48.188248  3680 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (14)@30.30.30.53:5050
I0412 11:01:48.887104  3678 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887177  3674 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired
I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None
I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is None


root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired
W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired

root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
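
The log line format given in the WARNING file header above (`[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg`) makes tails like these easy to condense. The sketch below is only an illustration under that assumption: it parses lines in that format and counts how often each message repeats, so a recurring problem such as the ZooKeeper timeout stands out. The function names are made up for the example, and the sample lines are copied from the slave log above.

```python
import re
from collections import Counter

# Matches the header's format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for a log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_messages(lines):
    """Count occurrences of each (severity, file:line, message) triple."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue  # skip file headers, shell prompts, and blank lines
        source = fields["file"] + ":" + fields["line"]
        counts[(fields["severity"], source, fields["msg"])] += 1
    return counts


sample = [
    "W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration",
    "I0413 03:13:04.539798  1715 group.cpp:519] ZooKeeper session expired",
    "W0413 03:13:14.542245  1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration",
    "I0413 03:13:14.542434  1713 group.cpp:519] ZooKeeper session expired",
]

for (severity, source, msg), n in count_messages(sample).most_common():
    print(n, severity, source, msg)
```

To run it over a whole file, pass an open file object instead of the sample list, for example `count_messages(open("/var/log/mesos/mesos-slave.WARNING"))`.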

Thank you.


From: June Taylor [mailto:june@umn.edu]
Sent: 12 April 2016 18:06
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING, mesos-slave.INFO, mesos-slave.ERROR


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net> wrote:
There's no mention of a slave there, have a look at the logs on the
slaves filesystem and see if it is giving any errors.

On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> The GUI log shows like this:
>
>
>
> I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
>
>
>
>
> Is there a way to find out the number of masters that are present in the
> environment together through CLI/GUI?
>
>
>
>
>
>
>
> From: haosdent [mailto:haosdent@gmail.com]
> Sent: 12 April 2016 13:37
> To: user <us...@mesos.apache.org>
> Subject: Re: Slaves not getting registered
>
>
>
>>but am unable to get it registered.
>
> Hi, @aishwarya Could you post master and slave log to provide more details?
> Usually it is because of network problem.
>
>
>
> On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
>
> Hi,
>
>
>
> I’m unable to get the slave registered with the master node. I’ve configured
> both the masters and slave machines but am unable to get it registered.
>
>
>
> Thank you.
>
>
> --
>
> Best Regards,
>
> Haosdent Huang



Re: Slaves not getting registered

Posted by June Taylor <ju...@umn.edu>.
Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING,
mesos-slave.INFO, mesos-slave.ERROR


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <di...@hellooperator.net> wrote:

> There's no mention of a slave there, have a look at the logs on the
> slaves filesystem and see if it is giving any errors.
>
> On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> > The GUI log shows like this:
> >
> >
> >
> > I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:45:57.385325 3613 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> > offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> > I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> > from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> > WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> >
> > I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
> > 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> > scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
> >
> >
> >
> >
> >
> > Is there a way to find out the number of masters that are present in the
> > environment together through CLI/GUI?
> >
> >
> >
> >
> >
> >
> >
> > From: haosdent [mailto:haosdent@gmail.com]
> > Sent: 12 April 2016 13:37
> > To: user <us...@mesos.apache.org>
> > Subject: Re: Slaves not getting registered
> >
> >
> >
> >>but am unable to get it registered.
> >
> > Hi, @aishwarya Could you post master and slave log to provide more details?
> > Usually it is because of network problem.
> >
> >
> >
> > On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
> >
> > Hi,
> >
> >
> >
> > I’m unable to get the slave registered with the master node. I’ve configured
> > both the masters and slave machines but am unable to get it registered.
> >
> >
> >
> > Thank you.
> >
> >
> > --
> >
> > Best Regards,
> >
> > Haosdent Huang
>

Re: Slaves not getting registered

Posted by Dick Davies <di...@hellooperator.net>.
There's no mention of a slave there; have a look at the logs on the
slave's filesystem and see if it is giving any errors.

On 12 April 2016 at 10:17,  <ai...@accenture.com> wrote:
> The GUI log shows like this:
>
>
>
> I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for
> offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
> I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json
> from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0;
> WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
>
> I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework
> 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at
> scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
>
>
>
>
>
> Is there a way to find out the number of masters that are present in the
> environment together through CLI/GUI?
>
>
>
>
>
>
>
> From: haosdent [mailto:haosdent@gmail.com]
> Sent: 12 April 2016 13:37
> To: user <us...@mesos.apache.org>
> Subject: Re: Slaves not getting registered
>
>
>
>>but am unable to get it registered.
>
> Hi, @aishwarya Could you post master and slave log to provide more details?
> Usually it is because of network problem.
>
>
>
> On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
>
> Hi,
>
>
>
> I’m unable to get the slave registered with the master node. I’ve configured
> both the masters and slave machines but am unable to get it registered.
>
>
>
> Thank you.
>
>
> --
>
> Best Regards,
>
> Haosdent Huang

RE: Slaves not getting registered

Posted by ai...@accenture.com.
The log in the GUI shows this:

I0412 08:45:51.379609  3616 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
I0412 08:45:54.637461  3612 http.cpp:501] HTTP GET for /master/state.json from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
I0412 08:45:57.376288  3619 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
I0412 08:45:57.385325  3613 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
I0412 08:46:03.383728  3614 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
I0412 08:46:03.396531  3612 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208
I0412 08:46:04.665582  3612 http.cpp:501] HTTP GET for /master/state.json from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
I0412 08:46:09.389493  3616 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at scheduler-15022696-44ec-43d2-b193-a3cc4021d20e@30.30.30.48:42208


Is there a way to find out, through the CLI or GUI, how many masters are present in the environment?
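
As far as I can tell, the `/master/state.json` endpoint that the browser requests in the log above does not list every master, but it can at least show which master is currently seen as the leader and which slaves have registered. The following is a minimal sketch, assuming the JSON contains `leader`, `activated_slaves`, and a `slaves` list whose entries have a `hostname` (field names from memory, so please verify them against your Mesos version). The sample dictionary and the function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_state(master, timeout=5.0):
    """Fetch and decode /master/state.json from a master given as 'host:port'."""
    url = "http://%s/master/state.json" % master
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_state(state):
    """Pick out the leader and registered slaves; missing fields become None/[]."""
    slaves = state.get("slaves", [])
    return {
        "leader": state.get("leader"),
        "activated_slaves": state.get("activated_slaves"),
        "slave_hostnames": [s.get("hostname") for s in slaves],
    }


# Made-up sample (assumption) shaped like the fields described above.
sample_state = {
    "leader": "master@192.0.2.1:5050",
    "activated_slaves": 1,
    "slaves": [{"hostname": "slave1"}],
}
print(summarize_state(sample_state))
```

Against a live cluster this would be called as `summarize_state(fetch_state("<master-ip>:5050"))`, once per master. If the masters disagree about the leader, or report none, the masters themselves have not finished electing one.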



From: haosdent [mailto:haosdent@gmail.com]
Sent: 12 April 2016 13:37
To: user <us...@mesos.apache.org>
Subject: Re: Slaves not getting registered

>but am unable to get it registered.
Hi, @aishwarya Could you post master and slave log to provide more details? Usually it is because of network problem.

On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:
Hi,

I’m unable to get the slave registered with the master node. I’ve configured both the masters and slave machines but am unable to get it registered.

Thank you.


--
Best Regards,
Haosdent Huang

Re: Slaves not getting registered

Posted by haosdent <ha...@gmail.com>.
>but am unable to get it registered.
Hi @aishwarya, could you post the master and slave logs to provide more details?
Usually this is caused by a network problem.

On Tue, Apr 12, 2016 at 4:02 PM, <ai...@accenture.com> wrote:

> Hi,
>
>
>
> I’m unable to get the slave registered with the master node. I’ve
> configured both the masters and slave machines but am unable to get it
> registered.
>
>
>
> Thank you.
>
>



-- 
Best Regards,
Haosdent Huang

RE: Slaves not getting registered

Posted by ai...@accenture.com.
Hi,

There is no error when starting the slave machine. I have two masters and one slave configured, and the slave isn’t getting registered in the GUI.

Previously, when I worked with a single master and a single slave, the GUI showed the slave as activated.

Thank you.

From: Abhishek Amralkar [mailto:Abhishek.Amralkar@talentica.com]
Sent: 12 April 2016 13:37
To: user@mesos.apache.org
Subject: Re: Slaves not getting registered

What error are you getting when you start the Mesos slaves?
On 12-Apr-2016, at 1:32 PM, aishwarya.adyanthaya@accenture.com wrote:

Hi,

I’m unable to get the slave registered with the master node. I’ve configured both the masters and slave machines but am unable to get it registered.

Thank you.
