Posted to dev@mesos.apache.org by Eugenia Gabrielova <ga...@llnl.gov> on 2012/09/14 00:33:44 UTC

Mesos-Master on Zookeeper Failover

Dear Mesos Development Team,

I'm running into an interesting failover bug when running Mesos on 
Zookeeper. It seems to replicate this known failover bug, which is open: 
https://issues.apache.org/jira/browse/MESOS-246. Would you be willing to 
suggest a possible workaround or a fix I can make locally? I've tried a 
few configuration changes but am unable to find a fix.

Please find below the output from a purely local setup: a local 
standalone Zookeeper + Mesos Master + Slave. I am running the latest 
Master, and Zookeeper 3.3.6 (latest stable); the operating system is 
CentOS 6.3. I do have Mesos running without issues on Zookeeper in one 
local development environment (RHEL 6.2) with an identical 
configuration, but in each of the other environments (one CentOS 6.3, 
the other RHEL 6.2), fresh installs show the same behaviour: nothing 
but ZOO_DEBUG "Got ping response in 0 ms" lines. Please let me know if 
I can provide further configuration information.

Sincerely,
Eugenia

Outputs:
*Master*
[...] > ./bin/mesos-master.sh --zk=zk://localhost:2181/znode
I0913 22:19:44.869660 26895 main.cpp:115] Build: 2012-09-13 19:33:05 by root
I0913 22:19:44.870185 26895 main.cpp:116] Starting Mesos master
I0913 22:19:44.871429 26910 master.cpp:299] Master started on 10.180.4.184:5050
I0913 22:19:44.871531 26910 master.cpp:314] Master ID: 201209132219-3087315978-5050-26895
W0913 22:19:44.874285 26910 master.cpp:77] No whitelist given. Advertising offers for all slaves
2012-09-13 22:19:44,874:26895(0x7f04a41fb720):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.4
2012-09-13 22:19:44,874:26895(0x7f04a41fb720):ZOO_INFO@log_env@662: Client environment:host.name=science-falcon
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@669: Client environment:os.name=Linux
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@670: Client environment:os.arch=2.6.32-279.el6.x86_64
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@671: Client environment:os.version=#1 SMP Fri Jun 22 12:19:21 UTC 2012
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@679: Client environment:user.name=root
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@687: Client environment:user.home=/root
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@log_env@699: Client environment:user.dir=/usr/local/mesos
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x7f04a35e0450 sessionId=0 sessionPasswd=<null> context=0x849ea0 flags=0
2012-09-13 22:19:44,875:26895(0x7f04a41fb720):ZOO_DEBUG@start_threads@152: starting threads...
2012-09-13 22:19:44,876:26895(0x7f049fb1d700):ZOO_DEBUG@do_io@279: started IO thread
2012-09-13 22:19:44,878:26895(0x7f049f11c700):ZOO_DEBUG@do_completion@326: started completion thread
2012-09-13 22:19:44,878:26895(0x7f049fb1d700):ZOO_INFO@check_events@1585: initiated connection to server [::1:2181]
I0913 22:19:44.899587 26914 webui.cpp:61] Loading webui script at '/usr/local/mesos/src/webui/master/webui.py'
2012-09-13 22:19:44,974:26895(0x7f049fb1d700):ZOO_INFO@check_events@1632: session establishment complete on server [::1:2181], sessionId=0x139c14eb1dc0005, negotiated timeout=10000
2012-09-13 22:19:44,974:26895(0x7f049fb1d700):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2012-09-13 22:19:44,974:26895(0x7f049f11c700):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
I0913 22:19:44.975180 26910 detector.cpp:287] Master detector connected to ZooKeeper ...
I0913 22:19:44.975307 26910 detector.cpp:316] Trying to create znode '/znode' in ZooKeeper
2012-09-13 22:19:44,975:26895(0x7f04a1134700):ZOO_DEBUG@zoo_acreate@2503: Sending request xid=0x50525c01 for path [/znode] to ::1:2181
2012-09-13 22:19:44,982:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1989: Queueing asynchronous response
2012-09-13 22:19:44,982:26895(0x7f049f11c700):ZOO_DEBUG@process_completions@1817: Calling COMPLETION_STRING for xid=0x50525c01 rc=-110
Bottle server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:8080/
Use Ctrl-C to quit.

2012-09-13 22:19:48,313:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:19:51,662:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 11 ms
2012-09-13 22:19:54,988:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:19:58,325:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:01,662:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:04,999:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:08,336:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:11,674:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:15,013:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 3 ms
2012-09-13 22:20:18,348:26895(0x7f049fb1d700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
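
A note on the "rc=-110" in the completion line above: in the ZooKeeper C client header (zookeeper.h), -110 is ZNODEEXISTS, while an ACL rejection would be ZNOAUTH (-102). Read that way, the create of '/znode' seems to have been answered with "node already exists" rather than a permission error; whether Mesos treats that as harmless is not something this log shows. The sketch below only decodes such codes from log text. The table is a hand-copied subset of the header constants, and the function name is made up for illustration.

```python
import re

# Hand-copied subset of the error codes from the ZooKeeper C client
# header (zookeeper.h). Not exhaustive.
ZOO_ERRORS = {
    0: "ZOK",
    -4: "ZCONNECTIONLOSS",
    -7: "ZOPERATIONTIMEOUT",
    -101: "ZNONODE",
    -102: "ZNOAUTH",
    -110: "ZNODEEXISTS",
    -112: "ZSESSIONEXPIRED",
    -114: "ZINVALIDACL",
}

# Matches "xid=0x... rc=N", tolerating a line wrap between the two fields.
RC_PATTERN = re.compile(r"xid=(0x[0-9a-f]+)\s+rc=(-?\d+)")


def decode_completions(log_text):
    """Return (xid, rc, name) for every completion found in log_text.

    Hypothetical helper: codes missing from ZOO_ERRORS are reported
    as "UNKNOWN" instead of raising.
    """
    results = []
    for xid, rc in RC_PATTERN.findall(log_text):
        code = int(rc)
        results.append((xid, code, ZOO_ERRORS.get(code, "UNKNOWN")))
    return results


sample = (
    "2012-09-13 22:19:44,982:26895(0x7f049f11c700):"
    "ZOO_DEBUG@process_completions@1817: "
    "Calling COMPLETION_STRING for xid=0x50525c01 rc=-110"
)
print(decode_completions(sample))
```

The same function can be pointed at the slave output, which reports the same code for its own create request.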

*Slave*
[...] > ./bin/mesos-slave.sh --master=zk://localhost:2181/znode "--resources=cpus:2;mem:1024"
I0913 22:20:31.852876 26922 main.cpp:123] Creating "process" isolation module
I0913 22:20:31.853536 26922 main.cpp:131] Build: 2012-09-13 19:33:05 by root
I0913 22:20:31.853575 26922 main.cpp:132] Starting Mesos slave
I0913 22:20:31.887912 26937 slave.cpp:172] Slave started on 1)@10.180.4.184:60711
I0913 22:20:31.887969 26937 slave.cpp:173] Slave resources: cpus=2; mem=1024
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.4
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@662: Client environment:host.name=science-falcon
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@669: Client environment:os.name=Linux
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@670: Client environment:os.arch=2.6.32-279.el6.x86_64
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@671: Client environment:os.version=#1 SMP Fri Jun 22 12:19:21 UTC 2012
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@679: Client environment:user.name=root
2012-09-13 22:20:31,890:26922(0x7fb78eb13720):ZOO_INFO@log_env@687: Client environment:user.home=/root
2012-09-13 22:20:31,891:26922(0x7fb78eb13720):ZOO_INFO@log_env@699: Client environment:user.dir=/usr/local/mesos
2012-09-13 22:20:31,891:26922(0x7fb78eb13720):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x7fb78def8450 sessionId=0 sessionPasswd=<null> context=0x11616c0 flags=0
2012-09-13 22:20:31,891:26922(0x7fb78eb13720):ZOO_DEBUG@start_threads@152: starting threads...
2012-09-13 22:20:31,892:26922(0x7fb78a435700):ZOO_DEBUG@do_io@279: started IO thread
2012-09-13 22:20:31,893:26922(0x7fb789a34700):ZOO_DEBUG@do_completion@326: started completion thread
2012-09-13 22:20:31,893:26922(0x7fb78a435700):ZOO_INFO@check_events@1585: initiated connection to server [127.0.0.1:2181]
I0913 22:20:31.960598 26941 webui.cpp:61] Loading webui script at '/usr/local/mesos/src/webui/slave/webui.py'
2012-09-13 22:20:31,972:26922(0x7fb78a435700):ZOO_INFO@check_events@1632: session establishment complete on server [127.0.0.1:2181], sessionId=0x139c14eb1dc0006, negotiated timeout=10000
2012-09-13 22:20:31,972:26922(0x7fb78a435700):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2012-09-13 22:20:31,972:26922(0x7fb789a34700):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
I0913 22:20:31.972904 26937 detector.cpp:287] Master detector connected to ZooKeeper ...
I0913 22:20:31.973017 26937 detector.cpp:316] Trying to create znode '/znode' in ZooKeeper
2012-09-13 22:20:31,973:26922(0x7fb78ba4c700):ZOO_DEBUG@zoo_acreate@2503: Sending request xid=0x50525c30 for path [/znode] to 127.0.0.1:2181
2012-09-13 22:20:31,980:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1989: Queueing asynchronous response
2012-09-13 22:20:31,980:26922(0x7fb789a34700):ZOO_DEBUG@process_completions@1817: Calling COMPLETION_STRING for xid=0x50525c30 rc=-110
Bottle server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:8081/
Use Ctrl-C to quit.

2012-09-13 22:20:35,311:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:38,648:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:41,986:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:45,323:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:48,660:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:51,997:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:55,334:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:20:58,672:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 1 ms
2012-09-13 22:21:02,008:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:21:05,346:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:21:08,683:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
2012-09-13 22:21:12,020:26922(0x7fb78a435700):ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms
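
A note on the repeated "Got ping response" lines: as far as the ZooKeeper C client is concerned, these look like ordinary session keepalives. The client pings the server after roughly a third of the negotiated session timeout has passed without traffic, and with "negotiated timeout=10000" that works out to about 3.3 seconds, which matches the spacing in both outputs. The "0 ms" is the round-trip time to a server on localhost. The sketch below measures the spacing from log text; the function name and the 0.1 s tolerance are assumptions made for illustration, and it says nothing about why the master and slave do not progress further.

```python
import re
from datetime import datetime

# Captures date, clock time and milliseconds of each ping-response line.
# \s+ is used between tokens so that hard-wrapped log lines still match.
PING_PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2}),(\d{3}):\d+\(0x[0-9a-f]+\):"
    r"ZOO_DEBUG@zookeeper_process@\d+:\s+Got\s+ping\s+response"
)


def ping_intervals(log_text):
    """Return the gaps, in seconds, between consecutive ping responses."""
    stamps = [
        datetime.strptime(f"{date} {clock}.{millis}", "%Y-%m-%d %H:%M:%S.%f")
        for date, clock, millis in PING_PATTERN.findall(log_text)
    ]
    return [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]


# First three ping lines from the master output.
sample = "\n".join([
    "2012-09-13 22:19:48,313:26895(0x7f049fb1d700):"
    "ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms",
    "2012-09-13 22:19:51,662:26895(0x7f049fb1d700):"
    "ZOO_DEBUG@zookeeper_process@1983: Got ping response in 11 ms",
    "2012-09-13 22:19:54,988:26895(0x7f049fb1d700):"
    "ZOO_DEBUG@zookeeper_process@1983: Got ping response in 0 ms",
])

session_timeout_ms = 10000                      # "negotiated timeout" in the log
expected = session_timeout_ms / 3 / 1000.0      # about 3.333 seconds

intervals = ping_intervals(sample)
# 0.1 s is an arbitrary tolerance chosen for this illustration.
looks_like_keepalive = all(abs(gap - expected) < 0.1 for gap in intervals)
print(intervals, looks_like_keepalive)
```

If the intervals track a third of the session timeout like this, the ping lines themselves are probably not the symptom, and the more interesting part of the log is what is missing after the znode create.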



Re: Mesos-Master on Zookeeper Failover

Posted by Eugenia Gabrielova <ga...@llnl.gov>.
Hi,

Thanks very much for your help; here is the issue: 
https://issues.apache.org/jira/browse/MESOS-278

I don't think I have the authority to assign one (I'm not a committer). 
Let me know if there are any other tests I should run.

Sincerely,
Eugenia
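
For reference, the getAcl output quoted further down in this message ('world,'anyone with cdrwa) can be decoded mechanically. In ZooKeeper's ACL model the letters stand for create, delete, read, write and admin, so that entry grants every permission to every client, which supports the reading that ACLs are not what is blocking the master and slave. The sketch below parses the two-line format printed by the zkCli shell; the function name is made up for illustration, and the parser assumes exactly the identity-line / permission-line layout shown in the quote.

```python
# Permission letters used by ZooKeeper ACLs.
PERMISSIONS = {
    "c": "create",
    "d": "delete",
    "r": "read",
    "w": "write",
    "a": "admin",
}


def parse_getacl(output):
    """Parse zkCli 'getAcl' output into a list of ACL entries.

    Hypothetical helper: assumes each entry is an identity line such as
    "'world,'anyone" followed by a permission line such as ": cdrwa".
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    entries = []
    for ident, perms in zip(lines[0::2], lines[1::2]):
        scheme, _, who = ident.partition(",")
        entries.append({
            "scheme": scheme.lstrip("'"),
            "id": who.lstrip("'"),
            "permissions": [PERMISSIONS[p] for p in perms.lstrip(": ")],
        })
    return entries


sample = "'world,'anyone\n: cdrwa\n"
acl = parse_getacl(sample)
print(acl)
```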

On 09/13/2012 04:58 PM, Vinod Kone wrote:
> Can you file a ticket please and assign it to me? I will take a look.
>
> @vinodkone
>
>
> On Thu, Sep 13, 2012 at 4:39 PM, Eugenia Gabrielova <ga...@llnl.gov> wrote:
>
>> Hi Vinod,
>>
>> At this time I believe I am on the up-to-date Mesos trunk version; the
>> latest commit in the log is "Short term fix for ignoring executor resources
>> during launch".
>>
>> Thanks for the ACLs tip, I hadn't thought of that. I took a look at my
>> Zookeeper ACL permissions, and according to the documentation here
>> (http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl),
>> they should be permissive enough. Is there a preferred ACL configuration
>> for Mesos?
>>
>> [zk: localhost:2181(CONNECTED) 0] getAcl /znode
>> 'world,'anyone
>> : cdrwa
>> [zk: localhost:2181(CONNECTED) 1]
>>
>> Thank you,
>> Eugenia
>>
>>
>> On 09/13/2012 03:54 PM, Vinod Kone wrote:
>>
>>> Hi Eugenia,
>>>
>>> What version of mesos are you running? I would suggest building and
>>> running the latest trunk version.
>>>
>>> Also, is your local zookeeper set up with proper ACLs? I would look at
>>> the zookeeper logs to see if they show anything suspicious, because, to
>>> me, it looks like both the master and slave are having a hard time
>>> creating the znodes.
>>>
>>> @vinodkone


Re: Mesos-Master on Zookeeper Failover

Posted by Vinod Kone <vi...@twitter.com>.
Can you file a ticket please and assign it to me? I will take a look.

@vinodkone


>>>
>>> check_events@1585: initiated connection to server [127.0.0.1:2181]
>>> I0913 22:20:31.960598 26941 webui.cpp:61] Loading webui script at
>>> '/usr/local/mesos/src/webui/****slave/webui.py'
>>> 2012-09-13 22:20:31,972:26922(****0x7fb78a435700):ZOO_INFO@**
>>>
>>> check_events@1632: session establishment complete on server [
>>> 127.0.0.1:2181], sessionId=0x139c14eb1dc0006, negotiated timeout=10000
>>> 2012-09-13 22:20:31,972:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the
>>> state=ZOO_CONNECTED_STATE
>>> 2012-09-13 22:20:31,972:26922(****0x7fb789a34700):ZOO_DEBUG@**
>>>
>>> process_completions@1765: Calling a watcher for node [], type = -1
>>> event=ZOO_SESSION_EVENT
>>> I0913 22:20:31.972904 26937 detector.cpp:287] Master detector connected
>>> to
>>> ZooKeeper ...
>>> I0913 22:20:31.973017 26937 detector.cpp:316] Trying to create znode
>>> '/znode' in ZooKeeper
>>> 2012-09-13 22:20:31,973:26922(****0x7fb78ba4c700):ZOO_DEBUG@zoo_****
>>>
>>> acreate@2503: Sending request xid=0x50525c30 for path [/znode] to
>>> 127.0.0.1:2181
>>> 2012-09-13 22:20:31,980:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>> zookeeper_process@1989: Queueing asynchronous response
>>> 2012-09-13 22:20:31,980:26922(****0x7fb789a34700):ZOO_DEBUG@**
>>>
>>> process_completions@1817: Calling COMPLETION_STRING for xid=0x50525c30
>>> rc=-110
>>> Bottle server starting up (using WSGIRefServer())...
>>> Listening on http://0.0.0.0:8081/
>>> Use Ctrl-C to quit.
>>>
>>> 2012-09-13 22:20:35,311:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:38,648:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:41,986:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:45,323:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:48,660:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:51,997:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:55,334:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:20:58,672:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 1 ms
>>> 2012-09-13 22:21:02,008:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:21:05,346:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:21:08,683:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>> 2012-09-13 22:21:12,020:26922(****0x7fb78a435700):ZOO_DEBUG@**
>>>
>>> zookeeper_process@1983: Got ping response in 0 ms
>>>
>>>
>>>
>>>
>

Re: Mesos-Master on Zookeeper Failover

Posted by Eugenia Gabrielova <ga...@llnl.gov>.
Hi Vinod,

I believe I am on the current Mesos trunk; the latest commit in my log
is "Short term fix for ignoring executor resources during launch".

Thanks for the ACL tip; I hadn't thought of that. I took a look at my
ZooKeeper ACL permissions, and according to the documentation here
(http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl),
they should be permissive enough. Is there a preferred ACL configuration
for Mesos?

[zk: localhost:2181(CONNECTED) 0] getAcl /znode
'world,'anyone
: cdrwa
[zk: localhost:2181(CONNECTED) 1]
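
For readers less familiar with the `getAcl` output above, here is a minimal Python sketch that expands the permission letters (c, d, r, w, a stand for create, delete, read, write, admin) and checks whether `world:anyone` may create children under the znode. The helper names `parse_acl` and `anyone_can_create` are hypothetical and exist only for this illustration; the parser assumes the two-line layout printed by the CLI session above and is not part of Mesos or ZooKeeper.

```python
# Permission letters as printed by the ZooKeeper CLI `getAcl` command.
PERMISSIONS = {
    "c": "create",
    "d": "delete",
    "r": "read",
    "w": "write",
    "a": "admin",
}


def parse_acl(output):
    """Parse `getAcl` text into a list of (scheme, id, permissions) tuples.

    Hypothetical helper: assumes a line such as 'world,'anyone is followed
    by a line such as ": cdrwa", as in the CLI output quoted above.
    """
    entries = []
    scheme_id = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("'"):
            # Example: 'world,'anyone -> ("world", "anyone")
            scheme, _, ident = line.partition(",")
            scheme_id = (scheme.strip("'"), ident.strip("'"))
        elif line.startswith(":") and scheme_id is not None:
            # Example: ": cdrwa" -> ["create", "delete", "read", ...]
            letters = line[1:].strip()
            perms = [PERMISSIONS[ch] for ch in letters if ch in PERMISSIONS]
            entries.append((scheme_id[0], scheme_id[1], perms))
            scheme_id = None
    return entries


def anyone_can_create(entries):
    """Return True if the world:anyone entry includes the create permission."""
    return any(
        scheme == "world" and ident == "anyone" and "create" in perms
        for scheme, ident, perms in entries
    )


acl_text = "'world,'anyone\n: cdrwa\n"
entries = parse_acl(acl_text)
print(entries)
print(anyone_can_create(entries))
```

With the ACL shown above, every permission is granted to every client, so an ACL restriction on `/znode` itself seems an unlikely cause here.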

Thank you,
Eugenia

On 09/13/2012 03:54 PM, Vinod Kone wrote:
> Hi Eugenia,
>
> What version of Mesos are you running? I would suggest building and running
> the latest trunk version.
>
> Also, is your local ZooKeeper set up with proper ACLs? I would look at the
> ZooKeeper logs to see whether they show anything suspicious, because to me it
> looks like both the master and the slave are having a hard time creating the
> znodes.
>
> @vinodkone


Re: Mesos-Master on Zookeeper Failover

Posted by Vinod Kone <vi...@twitter.com>.
Hi Eugenia,

What version of Mesos are you running? I would suggest building and running
the latest trunk version.

Also, is your local ZooKeeper set up with proper ACLs? I would look at the
ZooKeeper logs to see whether they show anything suspicious, because to me it
looks like both the master and the slave are having a hard time creating the
znodes.

@vinodkone
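
One thing that may help when reading the logs from the original post: the completion for the create request on `/znode` reports `rc=-110`. In the ZooKeeper C client header (`zookeeper.h`) that value is `ZNODEEXISTS`, which would suggest the znode was already present rather than that the create was refused; an ACL refusal would more likely appear as `ZNOAUTH` (-102). Below is a minimal Python sketch for decoding such codes from a log excerpt. The table is only a subset of the header's error codes, and `find_return_codes` is a hypothetical helper written for this illustration, not anything shipped with Mesos or ZooKeeper.

```python
import re

# Subset of return codes defined in the ZooKeeper C client's zookeeper.h.
ZOO_ERRORS = {
    0: "ZOK",
    -4: "ZCONNECTIONLOSS",
    -7: "ZOPERATIONTIMEOUT",
    -101: "ZNONODE",
    -102: "ZNOAUTH",
    -110: "ZNODEEXISTS",
    -112: "ZSESSIONEXPIRED",
    -114: "ZINVALIDACL",
    -115: "ZAUTHFAILED",
}

# Matches "rc=<integer>" as printed by the client's process_completions log.
RC_PATTERN = re.compile(r"rc=(-?\d+)")


def find_return_codes(log_text):
    """Return (code, name) pairs for every rc=<n> found in log_text.

    Codes missing from the subset above are reported as "UNKNOWN".
    """
    results = []
    for match in RC_PATTERN.finditer(log_text):
        code = int(match.group(1))
        results.append((code, ZOO_ERRORS.get(code, "UNKNOWN")))
    return results


excerpt = (
    "process_completions@1817: Calling COMPLETION_STRING for xid=0x50525c01\n"
    "rc=-110\n"
)
for code, name in find_return_codes(excerpt):
    print(code, name)
```

If the code really is `ZNODEEXISTS`, the create attempt itself is probably not the failure, and the ZooKeeper server log is the next place to look.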

