Posted to dev@mesos.apache.org by "Umesh Batra (JIRA)" <ji...@apache.org> on 2014/06/26 06:51:38 UTC

[jira] [Commented] (MESOS-1541) Mesos slave continuous disconnection

    [ https://issues.apache.org/jira/browse/MESOS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044347#comment-14044347 ] 

Umesh Batra commented on MESOS-1541:
------------------------------------

Here are the logs, Benjamin:
 
Slave: 
 
root@pod1-08.mylab.com:/mesos/logs/slave>cat mesos-slave.INFO 
Log file created at: 2014/06/26 00:38:07
Running on machine: pod1-08.mylab.com
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0626 00:38:07.825687 30277 logging.cpp:167] INFO level logging started!
I0626 00:38:07.826467 30277 main.cpp:126] Build: 2014-06-24 16:52:17 by root
I0626 00:38:07.826484 30277 main.cpp:128] Version: 0.19.0
I0626 00:38:07.826505 30277 mesos_containerizer.cpp:124] Using isolation: cgroups/cpu,cgroups/mem
I0626 00:38:07.907639 30277 linux_launcher.cpp:66] Using /cgroup/freezer as the freezer hierarchy for the Linux launcher
I0626 00:38:07.907991 30277 main.cpp:149] Starting Mesos slave
I0626 00:38:07.910575 30292 slave.cpp:143] Slave started on 1)@12.12.249.207:5051
I0626 00:38:07.911753 30292 slave.cpp:255] Slave resources: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0626 00:38:07.911836 30292 slave.cpp:283] Slave hostname: pod1-08.mylab.com
I0626 00:38:07.911859 30292 slave.cpp:284] Slave checkpoint: true
I0626 00:38:07.915976 30280 state.cpp:33] Recovering state from '/mesos/data/slave/meta'
I0626 00:38:07.916275 30280 state.cpp:62] Failed to find the latest slave from '/mesos/data/slave/meta'
I0626 00:38:07.916532 30290 status_update_manager.cpp:193] Recovering status update manager
I0626 00:38:07.917157 30293 mesos_containerizer.cpp:281] Recovering containerizer
I0626 00:38:07.920425 30309 slave.cpp:3018] Finished recovery
I0626 00:38:07.924598 30306 group.cpp:310] Group process ((6)@12.12.249.207:5051) connected to ZooKeeper
I0626 00:38:07.924693 30306 group.cpp:784] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0626 00:38:07.924731 30306 group.cpp:382] Trying to create path '/mesos' in ZooKeeper
I0626 00:38:07.927206 30297 detector.cpp:135] Detected a new leader: (id='27')
I0626 00:38:07.927669 30296 group.cpp:655] Trying to get '/mesos/info_0000000027' in ZooKeeper
I0626 00:38:07.928911 30283 detector.cpp:377] A new leading master (UPID=master@11.11.42.73:5050) is detected
I0626 00:38:07.929169 30293 slave.cpp:536] New master detected at master@11.11.42.73:5050
I0626 00:38:07.929954 30293 slave.cpp:572] No credentials provided. Attempting to register without authentication
I0626 00:38:07.929981 30305 status_update_manager.cpp:167] New master detected at master@11.11.42.73:5050
I0626 00:38:07.930022 30293 slave.cpp:585] Detecting new master
I0626 00:39:07.929234 30284 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449802484743090days
I0626 00:40:07.960175 30306 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449799771887604days
I0626 00:41:07.991479 30302 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449800857029803days
I0626 00:42:08.043979 30296 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449798686745405days
I0626 00:43:08.044744 30285 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449799771887604days
I0626 00:44:08.052268 30296 slave.cpp:2873] Current usage 12.15%. Max allowed age: 5.449799771887604days 
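
For anyone who wants to line these entries up by timestamp, here is a minimal Python sketch that splits a line according to the "Log line format" header shown above ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg). The function name parse_glog_line and the year argument are assumptions made for illustration (the log lines themselves carry no year); this is not part of Mesos.

```python
import re
from datetime import datetime

# Pattern for the header format stated in the log file itself:
#   [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog_line(text, year=2014):
    """Split one log line into its fields, or return None if it does not match.

    The year is an assumption supplied by the caller, because the
    line format only records month and day.
    """
    m = GLOG_LINE.match(text.strip())
    if m is None:
        return None
    timestamp = datetime.strptime(
        f"{year}-{m['month']}-{m['day']} {m['time']}",
        "%Y-%m-%d %H:%M:%S.%f",
    )
    return {
        "level": m["level"],
        "timestamp": timestamp,
        "thread": int(m["thread"]),
        "file": m["file"],
        "line": int(m["line"]),
        "msg": m["msg"],
    }
```

Lines that are not log entries, such as the "Log file created at" header, return None and can simply be skipped. The slave excerpt is stamped 0626 00:38 onward while the master excerpt is stamped 0625 21:52 onward, so any side-by-side comparison has to account for whatever clock or time zone difference exists between the two hosts.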
 
Master: 
 
I0625 21:52:55.546758 20052 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-87 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:52:55.547124 20052 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-87 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:52:55.547675 20053 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-87
I0625 21:52:55.547724 20052 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-88
I0625 21:52:55.548032 20053 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:52:55.551023 20048 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:52:55.551378 20048 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1797
I0625 21:52:55.552207 20052 replica.cpp:508] Replica received write request for position 1797
I0625 21:52:55.553515 20052 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 1.000203ms
I0625 21:52:55.553822 20052 replica.cpp:676] Persisted action at 1797
I0625 21:52:55.554349 20049 replica.cpp:655] Replica received learned notice for position 1797
I0625 21:52:55.555562 20049 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 848026ns
I0625 21:52:55.555804 20049 replica.cpp:676] Persisted action at 1797
I0625 21:52:55.555995 20049 replica.cpp:661] Replica learned APPEND action at position 1797
I0625 21:52:55.556576 20049 registrar.cpp:479] Successfully updated 'registry'
I0625 21:52:55.556704 20050 log.cpp:699] Attempting to truncate the log to 1797
I0625 21:52:55.556869 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:52:55.556911 20054 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-87 (pod1-08.mylab.com)
I0625 21:52:55.557059 20050 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1798
I0625 21:52:55.557425 20054 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-87 (pod1-08.mylab.com) after recovering
I0625 21:52:55.558015 20050 replica.cpp:508] Replica received write request for position 1798
I0625 21:52:55.563534 20050 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 836154ns
I0625 21:52:55.563750 20050 replica.cpp:676] Persisted action at 1798
I0625 21:52:55.563952 20050 replica.cpp:655] Replica received learned notice for position 1798
I0625 21:52:55.565023 20050 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 916638ns
I0625 21:52:55.565273 20050 leveldb.cpp:401] Deleting ~2 keys from leveldb took 65712ns
I0625 21:52:55.565481 20050 replica.cpp:676] Persisted action at 1798
I0625 21:52:55.565635 20050 replica.cpp:661] Replica learned TRUNCATE action at position 1798
I0625 21:52:55.566089 20050 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:52:55.566370 20055 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1799
I0625 21:52:55.567049 20052 replica.cpp:508] Replica received write request for position 1799
I0625 21:52:55.572720 20052 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 5.351867ms
I0625 21:52:55.572986 20052 replica.cpp:676] Persisted action at 1799
I0625 21:52:55.573210 20052 replica.cpp:655] Replica received learned notice for position 1799
I0625 21:52:55.574225 20052 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 772820ns
I0625 21:52:55.574499 20052 replica.cpp:676] Persisted action at 1799
I0625 21:52:55.574707 20052 replica.cpp:661] Replica learned APPEND action at position 1799
I0625 21:52:55.575312 20048 registrar.cpp:479] Successfully updated 'registry'
I0625 21:52:55.575431 20050 log.cpp:699] Attempting to truncate the log to 1799
I0625 21:52:55.575680 20048 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:52:55.575844 20050 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1800
I0625 21:52:55.576026 20048 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:52:55.576817 20054 replica.cpp:508] Replica received write request for position 1800
I0625 21:52:55.576797 20048 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-88 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:52:55.577381 20053 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:52:55.580029 20053 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-88
W0625 21:52:55.580226 20053 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:52:55.580258 20054 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 3.215771ms
I0625 21:52:55.581073 20054 replica.cpp:676] Persisted action at 1800
I0625 21:52:55.581251 20054 replica.cpp:655] Replica received learned notice for position 1800
I0625 21:52:55.580332 20049 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-88 disconnected
I0625 21:52:55.582268 20054 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 872554ns
I0625 21:52:55.582475 20049 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-88 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:52:55.582594 20054 leveldb.cpp:401] Deleting ~2 keys from leveldb took 54572ns
I0625 21:52:55.583014 20054 replica.cpp:676] Persisted action at 1800
I0625 21:52:55.583153 20054 replica.cpp:661] Replica learned TRUNCATE action at position 1800
I0625 21:52:56.417865 20054 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:53:26.423295 20054 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:53:49.317718 20050 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:53:49.318001 20050 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-88 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:53:49.318608 20050 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-89
I0625 21:53:49.319030 20050 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-88
I0625 21:53:49.319375 20050 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:53:49.322335 20055 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:53:49.322788 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1801
I0625 21:53:49.323953 20053 replica.cpp:508] Replica received write request for position 1801
I0625 21:53:49.325181 20053 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 1.006008ms
I0625 21:53:49.325368 20053 replica.cpp:676] Persisted action at 1801
I0625 21:53:49.325958 20048 replica.cpp:655] Replica received learned notice for position 1801
I0625 21:53:49.326925 20048 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 793047ns
I0625 21:53:49.327106 20048 replica.cpp:676] Persisted action at 1801
I0625 21:53:49.327313 20048 replica.cpp:661] Replica learned APPEND action at position 1801
I0625 21:53:49.327875 20049 registrar.cpp:479] Successfully updated 'registry'
I0625 21:53:49.328130 20054 log.cpp:699] Attempting to truncate the log to 1801
I0625 21:53:49.328182 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:53:49.328232 20053 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-88 (pod1-08.mylab.com)
I0625 21:53:49.328500 20048 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1802
I0625 21:53:49.330083 20053 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-88 (pod1-08.mylab.com) after recovering
I0625 21:53:49.330658 20054 replica.cpp:508] Replica received write request for position 1802
I0625 21:53:49.337728 20054 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 750263ns
I0625 21:53:49.338016 20054 replica.cpp:676] Persisted action at 1802
I0625 21:53:49.338206 20054 replica.cpp:655] Replica received learned notice for position 1802
I0625 21:53:49.339146 20054 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 742839ns
I0625 21:53:49.339395 20054 leveldb.cpp:401] Deleting ~2 keys from leveldb took 40284ns
I0625 21:53:49.339622 20054 replica.cpp:676] Persisted action at 1802
I0625 21:53:49.339805 20054 replica.cpp:661] Replica learned TRUNCATE action at position 1802
I0625 21:53:49.340327 20049 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:53:49.340642 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1803
I0625 21:53:49.341369 20050 replica.cpp:508] Replica received write request for position 1803
I0625 21:53:49.341981 20050 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 264713ns
I0625 21:53:49.342250 20050 replica.cpp:676] Persisted action at 1803
I0625 21:53:49.343209 20049 replica.cpp:655] Replica received learned notice for position 1803
I0625 21:53:49.344527 20049 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 921224ns
I0625 21:53:49.344717 20049 replica.cpp:676] Persisted action at 1803
I0625 21:53:49.352473 20049 replica.cpp:661] Replica learned APPEND action at position 1803
I0625 21:53:49.353122 20055 registrar.cpp:479] Successfully updated 'registry'
I0625 21:53:49.353245 20050 log.cpp:699] Attempting to truncate the log to 1803
I0625 21:53:49.353533 20051 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:53:49.353744 20050 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1804
I0625 21:53:49.353850 20051 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:53:49.354635 20051 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-89 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:53:49.354857 20055 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:53:49.354599 20048 replica.cpp:508] Replica received write request for position 1804
I0625 21:53:49.355104 20055 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-89
W0625 21:53:49.355703 20055 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:53:49.355723 20051 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-89 disconnected
I0625 21:53:49.356346 20048 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 977908ns
I0625 21:53:49.356597 20051 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-89 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:53:49.356763 20048 replica.cpp:676] Persisted action at 1804
I0625 21:53:49.357231 20048 replica.cpp:655] Replica received learned notice for position 1804
I0625 21:53:49.358211 20048 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 694676ns
I0625 21:53:49.358569 20048 leveldb.cpp:401] Deleting ~2 keys from leveldb took 76166ns
I0625 21:53:49.358765 20048 replica.cpp:676] Persisted action at 1804
I0625 21:53:49.358944 20048 replica.cpp:661] Replica learned TRUNCATE action at position 1804
I0625 21:53:56.416980 20048 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:54:17.717218 20051 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:54:17.717804 20051 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-89 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:17.718205 20048 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-89
I0625 21:54:17.718297 20051 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-90
I0625 21:54:17.718385 20055 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:17.723970 20050 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:54:17.724277 20054 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1805
I0625 21:54:17.725200 20053 replica.cpp:508] Replica received write request for position 1805
I0625 21:54:17.725852 20053 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 366684ns
I0625 21:54:17.726052 20053 replica.cpp:676] Persisted action at 1805
I0625 21:54:17.726740 20050 replica.cpp:655] Replica received learned notice for position 1805
I0625 21:54:17.727856 20050 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 924192ns
I0625 21:54:17.728049 20050 replica.cpp:676] Persisted action at 1805
I0625 21:54:17.728222 20050 replica.cpp:661] Replica learned APPEND action at position 1805
I0625 21:54:17.728831 20052 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:17.728973 20053 log.cpp:699] Attempting to truncate the log to 1805
I0625 21:54:17.729123 20052 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:17.729173 20054 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-89 (pod1-08.mylab.com)
I0625 21:54:17.732071 20054 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-89 (pod1-08.mylab.com) after recovering
I0625 21:54:17.729365 20048 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1806
I0625 21:54:17.732893 20048 replica.cpp:508] Replica received write request for position 1806
I0625 21:54:17.736526 20048 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 3.439225ms
I0625 21:54:17.736723 20048 replica.cpp:676] Persisted action at 1806
I0625 21:54:17.736932 20048 replica.cpp:655] Replica received learned notice for position 1806
I0625 21:54:17.737862 20048 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 788456ns
I0625 21:54:17.738101 20048 leveldb.cpp:401] Deleting ~2 keys from leveldb took 64599ns
I0625 21:54:17.738296 20048 replica.cpp:676] Persisted action at 1806
I0625 21:54:17.738484 20048 replica.cpp:661] Replica learned TRUNCATE action at position 1806
I0625 21:54:17.738986 20048 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:54:17.739248 20054 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1807
I0625 21:54:17.739961 20048 replica.cpp:508] Replica received write request for position 1807
I0625 21:54:17.745477 20048 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 5.346744ms
I0625 21:54:17.745677 20048 replica.cpp:676] Persisted action at 1807
I0625 21:54:17.747091 20048 replica.cpp:655] Replica received learned notice for position 1807
I0625 21:54:17.747980 20048 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 725287ns
I0625 21:54:17.748247 20048 replica.cpp:676] Persisted action at 1807
I0625 21:54:17.748549 20048 replica.cpp:661] Replica learned APPEND action at position 1807
I0625 21:54:17.749171 20052 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:17.749292 20049 log.cpp:699] Attempting to truncate the log to 1807
I0625 21:54:17.749554 20052 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:17.749773 20049 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1808
I0625 21:54:17.749909 20052 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:54:17.750484 20053 replica.cpp:508] Replica received write request for position 1808
I0625 21:54:17.750855 20050 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-90 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:54:17.750907 20052 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:54:17.751941 20052 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-90
W0625 21:54:17.752193 20052 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:54:17.752217 20054 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-90 disconnected
I0625 21:54:17.753604 20053 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 2.945508ms
I0625 21:54:17.755303 20054 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-90 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:54:17.755372 20053 replica.cpp:676] Persisted action at 1808
I0625 21:54:17.755802 20053 replica.cpp:655] Replica received learned notice for position 1808
I0625 21:54:17.756659 20053 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 713292ns
I0625 21:54:17.756892 20053 leveldb.cpp:401] Deleting ~2 keys from leveldb took 61705ns
I0625 21:54:17.757042 20053 replica.cpp:676] Persisted action at 1808
I0625 21:54:17.757237 20053 replica.cpp:661] Replica learned TRUNCATE action at position 1808
I0625 21:54:26.417659 20053 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:54:46.614953 20048 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:54:46.615293 20048 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-90 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:46.615759 20053 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-90
I0625 21:54:46.615825 20048 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-91
I0625 21:54:46.615871 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:46.619216 20049 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:54:46.619572 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1809
I0625 21:54:46.620304 20051 replica.cpp:508] Replica received write request for position 1809
I0625 21:54:46.622472 20051 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 1.85629ms
I0625 21:54:46.622692 20051 replica.cpp:676] Persisted action at 1809
I0625 21:54:46.622875 20051 replica.cpp:655] Replica received learned notice for position 1809
I0625 21:54:46.623675 20051 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 647827ns
I0625 21:54:46.623878 20051 replica.cpp:676] Persisted action at 1809
I0625 21:54:46.624023 20051 replica.cpp:661] Replica learned APPEND action at position 1809
I0625 21:54:46.624651 20051 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:46.624827 20052 log.cpp:699] Attempting to truncate the log to 1809
I0625 21:54:46.624910 20051 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:46.624943 20053 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-90 (pod1-08.mylab.com)
I0625 21:54:46.625134 20052 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1810
I0625 21:54:46.627832 20053 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-90 (pod1-08.mylab.com) after recovering
I0625 21:54:46.628622 20053 replica.cpp:508] Replica received write request for position 1810
I0625 21:54:46.629575 20053 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 767188ns
I0625 21:54:46.629782 20053 replica.cpp:676] Persisted action at 1810
I0625 21:54:46.630257 20048 replica.cpp:655] Replica received learned notice for position 1810
I0625 21:54:46.632932 20048 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 2.417318ms
I0625 21:54:46.633255 20048 leveldb.cpp:401] Deleting ~2 keys from leveldb took 73033ns
I0625 21:54:46.633463 20048 replica.cpp:676] Persisted action at 1810
I0625 21:54:46.633656 20048 replica.cpp:661] Replica learned TRUNCATE action at position 1810
I0625 21:54:46.634171 20048 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:54:46.634464 20052 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1811
I0625 21:54:46.635088 20055 replica.cpp:508] Replica received write request for position 1811
I0625 21:54:46.639701 20055 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 4.4299ms
I0625 21:54:46.639931 20055 replica.cpp:676] Persisted action at 1811
I0625 21:54:46.640147 20055 replica.cpp:655] Replica received learned notice for position 1811
I0625 21:54:46.641126 20055 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 810104ns
I0625 21:54:46.641340 20055 replica.cpp:676] Persisted action at 1811
I0625 21:54:46.641566 20055 replica.cpp:661] Replica learned APPEND action at position 1811
I0625 21:54:46.642177 20055 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:46.642319 20050 log.cpp:699] Attempting to truncate the log to 1811
I0625 21:54:46.642551 20055 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:46.642680 20048 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1812
I0625 21:54:46.642802 20055 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:54:46.643735 20052 replica.cpp:508] Replica received write request for position 1812
I0625 21:54:46.643731 20054 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-91 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:54:46.643932 20053 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:54:46.649900 20053 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-91
I0625 21:54:46.649981 20052 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 5.963961ms
I0625 21:54:46.651021 20050 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-91 disconnected
W0625 21:54:46.651022 20053 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:54:46.651224 20052 replica.cpp:676] Persisted action at 1812
I0625 21:54:46.652154 20052 replica.cpp:655] Replica received learned notice for position 1812
I0625 21:54:46.652142 20053 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-91 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:54:46.653208 20052 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 854748ns
I0625 21:54:46.653688 20052 leveldb.cpp:401] Deleting ~2 keys from leveldb took 93010ns
I0625 21:54:46.653884 20052 replica.cpp:676] Persisted action at 1812
I0625 21:54:46.654069 20052 replica.cpp:661] Replica learned TRUNCATE action at position 1812
I0625 21:54:48.421460 20049 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:54:48.421783 20049 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-91 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:48.422297 20052 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-91
I0625 21:54:48.422487 20049 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-92
I0625 21:54:48.422516 20054 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:48.425720 20054 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:54:48.426017 20055 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1813
I0625 21:54:48.426791 20055 replica.cpp:508] Replica received write request for position 1813
I0625 21:54:48.427695 20055 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 719227ns
I0625 21:54:48.427898 20055 replica.cpp:676] Persisted action at 1813
I0625 21:54:48.428540 20055 replica.cpp:655] Replica received learned notice for position 1813
I0625 21:54:48.429458 20055 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 725250ns
I0625 21:54:48.429620 20055 replica.cpp:676] Persisted action at 1813
I0625 21:54:48.429788 20055 replica.cpp:661] Replica learned APPEND action at position 1813
I0625 21:54:48.430348 20049 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:48.430647 20051 log.cpp:699] Attempting to truncate the log to 1813
I0625 21:54:48.430651 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:54:48.430719 20052 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-91 (pod1-08.mylab.com)
I0625 21:54:48.431011 20055 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1814
I0625 21:54:48.433953 20052 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-91 (pod1-08.mylab.com) after recovering
I0625 21:54:48.434762 20051 replica.cpp:508] Replica received write request for position 1814
I0625 21:54:48.435657 20051 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 661527ns
I0625 21:54:48.435930 20051 replica.cpp:676] Persisted action at 1814
I0625 21:54:48.436712 20054 replica.cpp:655] Replica received learned notice for position 1814
I0625 21:54:48.437789 20054 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 873878ns
I0625 21:54:48.438102 20054 leveldb.cpp:401] Deleting ~2 keys from leveldb took 66176ns
I0625 21:54:48.438292 20054 replica.cpp:676] Persisted action at 1814
I0625 21:54:48.438510 20054 replica.cpp:661] Replica learned TRUNCATE action at position 1814
I0625 21:54:48.439051 20050 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:54:48.439324 20050 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1815
I0625 21:54:48.439961 20050 replica.cpp:508] Replica received write request for position 1815
I0625 21:54:48.443008 20050 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 2.854463ms
I0625 21:54:48.443215 20050 replica.cpp:676] Persisted action at 1815
I0625 21:54:48.445099 20050 replica.cpp:655] Replica received learned notice for position 1815
I0625 21:54:48.445904 20050 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 643631ns
I0625 21:54:48.446084 20050 replica.cpp:676] Persisted action at 1815
I0625 21:54:48.446250 20050 replica.cpp:661] Replica learned APPEND action at position 1815
I0625 21:54:48.446873 20050 registrar.cpp:479] Successfully updated 'registry'
I0625 21:54:48.447094 20053 log.cpp:699] Attempting to truncate the log to 1815
I0625 21:54:48.447196 20050 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:54:48.447379 20053 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1816
I0625 21:54:48.447540 20050 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:54:48.448180 20050 replica.cpp:508] Replica received write request for position 1816
I0625 21:54:48.448256 20053 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-92 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:54:48.448741 20055 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:54:48.448927 20055 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-92
I0625 21:54:48.449093 20053 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-92 disconnected
W0625 21:54:48.449093 20055 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:54:48.450139 20050 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 1.755146ms
I0625 21:54:48.450685 20050 replica.cpp:676] Persisted action at 1816
I0625 21:54:48.450882 20050 replica.cpp:655] Replica received learned notice for position 1816
I0625 21:54:48.450678 20055 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-92 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:54:48.451762 20050 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 725595ns
I0625 21:54:48.452030 20050 leveldb.cpp:401] Deleting ~2 keys from leveldb took 80072ns
I0625 21:54:48.452186 20050 replica.cpp:676] Persisted action at 1816
I0625 21:54:48.452349 20050 replica.cpp:661] Replica learned TRUNCATE action at position 1816
I0625 21:54:56.417362 20048 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:55:26.417702 20051 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:55:40.315969 20049 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:55:40.316464 20049 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-92 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:55:40.316927 20049 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-93
I0625 21:55:40.317276 20049 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-92
I0625 21:55:40.317589 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:55:40.320732 20048 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:55:40.321048 20054 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1817
I0625 21:55:40.321918 20053 replica.cpp:508] Replica received write request for position 1817
I0625 21:55:40.323180 20053 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 1.045137ms
I0625 21:55:40.323360 20053 replica.cpp:676] Persisted action at 1817
I0625 21:55:40.323654 20053 replica.cpp:655] Replica received learned notice for position 1817
I0625 21:55:40.324687 20053 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 869276ns
I0625 21:55:40.324889 20053 replica.cpp:676] Persisted action at 1817
I0625 21:55:40.325039 20053 replica.cpp:661] Replica learned APPEND action at position 1817
I0625 21:55:40.325628 20053 registrar.cpp:479] Successfully updated 'registry'
I0625 21:55:40.325767 20051 log.cpp:699] Attempting to truncate the log to 1817
I0625 21:55:40.325896 20053 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:55:40.325944 20054 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-92 (pod1-08.mylab.com)
I0625 21:55:40.326213 20055 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1818
I0625 21:55:40.328888 20054 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-92 (pod1-08.mylab.com) after recovering
I0625 21:55:40.331310 20048 replica.cpp:508] Replica received write request for position 1818
I0625 21:55:40.332319 20048 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 780660ns
I0625 21:55:40.332552 20048 replica.cpp:676] Persisted action at 1818
I0625 21:55:40.332980 20051 replica.cpp:655] Replica received learned notice for position 1818
I0625 21:55:40.335948 20051 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 864235ns
I0625 21:55:40.336189 20051 leveldb.cpp:401] Deleting ~2 keys from leveldb took 62165ns
I0625 21:55:40.336352 20051 replica.cpp:676] Persisted action at 1818
I0625 21:55:40.336551 20051 replica.cpp:661] Replica learned TRUNCATE action at position 1818
I0625 21:55:40.336997 20051 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:55:40.337301 20050 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1819
I0625 21:55:40.337954 20054 replica.cpp:508] Replica received write request for position 1819
I0625 21:55:40.338841 20054 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 700190ns
I0625 21:55:40.339037 20054 replica.cpp:676] Persisted action at 1819
I0625 21:55:40.339540 20055 replica.cpp:655] Replica received learned notice for position 1819
I0625 21:55:40.343205 20055 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 3.498183ms
I0625 21:55:40.343443 20055 replica.cpp:676] Persisted action at 1819
I0625 21:55:40.345657 20055 replica.cpp:661] Replica learned APPEND action at position 1819
I0625 21:55:40.346269 20051 registrar.cpp:479] Successfully updated 'registry'
I0625 21:55:40.346647 20050 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:55:40.346755 20052 log.cpp:699] Attempting to truncate the log to 1819
I0625 21:55:40.346916 20050 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:55:40.347260 20052 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1820
I0625 21:55:40.347733 20050 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-93 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:55:40.347909 20052 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:55:40.348285 20052 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-93
W0625 21:55:40.348558 20052 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:55:40.348589 20053 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-93 disconnected
I0625 21:55:40.348074 20048 replica.cpp:508] Replica received write request for position 1820
I0625 21:55:40.349493 20053 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-93 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:55:40.350553 20048 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 1.013407ms
I0625 21:55:40.350749 20048 replica.cpp:676] Persisted action at 1820
I0625 21:55:40.352829 20048 replica.cpp:655] Replica received learned notice for position 1820
I0625 21:55:40.355903 20048 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 2.913277ms
I0625 21:55:40.356174 20048 leveldb.cpp:401] Deleting ~2 keys from leveldb took 82308ns
I0625 21:55:40.356345 20048 replica.cpp:676] Persisted action at 1820
I0625 21:55:40.356544 20048 replica.cpp:661] Replica learned TRUNCATE action at position 1820
I0625 21:55:56.417219 20050 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:56:26.417649 20052 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:56:34.027531 20050 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:56:34.027839 20050 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-93 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:56:34.028262 20048 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-93
I0625 21:56:34.028465 20050 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-94
I0625 21:56:34.028522 20052 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:56:34.031672 20048 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:56:34.032019 20049 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1821
I0625 21:56:34.032757 20049 replica.cpp:508] Replica received write request for position 1821
I0625 21:56:34.033812 20049 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 846659ns
I0625 21:56:34.034023 20049 replica.cpp:676] Persisted action at 1821
I0625 21:56:34.034658 20054 replica.cpp:655] Replica received learned notice for position 1821
I0625 21:56:34.035812 20054 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 854822ns
I0625 21:56:34.036031 20054 replica.cpp:676] Persisted action at 1821
I0625 21:56:34.036193 20054 replica.cpp:661] Replica learned APPEND action at position 1821
I0625 21:56:34.036780 20053 registrar.cpp:479] Successfully updated 'registry'
I0625 21:56:34.036876 20050 log.cpp:699] Attempting to truncate the log to 1821
I0625 21:56:34.037098 20053 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:56:34.037139 20054 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-93 (pod1-08.mylab.com)
I0625 21:56:34.037284 20050 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1822
I0625 21:56:34.037675 20054 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-93 (pod1-08.mylab.com) after recovering
I0625 21:56:34.038220 20054 replica.cpp:508] Replica received write request for position 1822
I0625 21:56:34.040055 20054 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 1.613171ms
I0625 21:56:34.040278 20054 replica.cpp:676] Persisted action at 1822
I0625 21:56:34.040505 20054 replica.cpp:655] Replica received learned notice for position 1822
I0625 21:56:34.041554 20054 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 886916ns
I0625 21:56:34.041803 20054 leveldb.cpp:401] Deleting ~2 keys from leveldb took 65517ns
I0625 21:56:34.041965 20054 replica.cpp:676] Persisted action at 1822
I0625 21:56:34.042132 20054 replica.cpp:661] Replica learned TRUNCATE action at position 1822
I0625 21:56:34.042722 20050 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:56:34.043014 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1823
I0625 21:56:34.043666 20048 replica.cpp:508] Replica received write request for position 1823
I0625 21:56:34.047425 20048 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 3.438476ms
I0625 21:56:34.047713 20048 replica.cpp:676] Persisted action at 1823
I0625 21:56:34.049823 20048 replica.cpp:655] Replica received learned notice for position 1823
I0625 21:56:34.051959 20048 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 1.85996ms
I0625 21:56:34.052259 20048 replica.cpp:676] Persisted action at 1823
I0625 21:56:34.052580 20048 replica.cpp:661] Replica learned APPEND action at position 1823
I0625 21:56:34.053241 20048 registrar.cpp:479] Successfully updated 'registry'
I0625 21:56:34.053366 20052 log.cpp:699] Attempting to truncate the log to 1823
I0625 21:56:34.053836 20049 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:56:34.053849 20048 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1824
I0625 21:56:34.054078 20049 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:56:34.054934 20053 replica.cpp:508] Replica received write request for position 1824
I0625 21:56:34.054976 20051 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-94 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:56:34.055425 20052 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:56:34.057833 20052 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-94
I0625 21:56:34.058056 20049 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-94 disconnected
W0625 21:56:34.058061 20052 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:56:34.058899 20052 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-94 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:56:34.060380 20053 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 5.155174ms
I0625 21:56:34.060703 20053 replica.cpp:676] Persisted action at 1824
I0625 21:56:34.060982 20053 replica.cpp:655] Replica received learned notice for position 1824
I0625 21:56:34.061979 20053 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 749528ns
I0625 21:56:34.062309 20053 leveldb.cpp:401] Deleting ~2 keys from leveldb took 47898ns
I0625 21:56:34.062608 20053 replica.cpp:676] Persisted action at 1824
I0625 21:56:34.062854 20053 replica.cpp:661] Replica learned TRUNCATE action at position 1824
I0625 21:56:56.417692 20049 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:57:26.417911 20055 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:57:32.401031 20048 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:57:32.401304 20048 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-94 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:57:32.401707 20050 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-94
I0625 21:57:32.401801 20048 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-95
I0625 21:57:32.401913 20054 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:57:32.405450 20049 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:57:32.405791 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1825
I0625 21:57:32.406589 20053 replica.cpp:508] Replica received write request for position 1825
I0625 21:57:32.408296 20053 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 1.421669ms
I0625 21:57:32.408568 20053 replica.cpp:676] Persisted action at 1825
I0625 21:57:32.410045 20053 replica.cpp:655] Replica received learned notice for position 1825
I0625 21:57:32.412384 20053 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 2.123402ms
I0625 21:57:32.412621 20053 replica.cpp:676] Persisted action at 1825
I0625 21:57:32.412801 20053 replica.cpp:661] Replica learned APPEND action at position 1825
I0625 21:57:32.413378 20051 registrar.cpp:479] Successfully updated 'registry'
I0625 21:57:32.413563 20054 log.cpp:699] Attempting to truncate the log to 1825
I0625 21:57:32.413714 20051 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:57:32.413975 20050 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1826
I0625 21:57:32.413821 20048 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-94 (pod1-08.mylab.com)
I0625 21:57:32.414695 20048 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-94 (pod1-08.mylab.com) after recovering
I0625 21:57:32.414782 20052 replica.cpp:508] Replica received write request for position 1826
I0625 21:57:32.418505 20052 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 905704ns
I0625 21:57:32.418705 20052 replica.cpp:676] Persisted action at 1826
I0625 21:57:32.418895 20052 replica.cpp:655] Replica received learned notice for position 1826
I0625 21:57:32.419962 20052 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 815326ns
I0625 21:57:32.420199 20052 leveldb.cpp:401] Deleting ~2 keys from leveldb took 65046ns
I0625 21:57:32.420367 20052 replica.cpp:676] Persisted action at 1826
I0625 21:57:32.420534 20052 replica.cpp:661] Replica learned TRUNCATE action at position 1826
I0625 21:57:32.421062 20053 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:57:32.421318 20053 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1827
I0625 21:57:32.424969 20055 replica.cpp:508] Replica received write request for position 1827
I0625 21:57:32.427158 20055 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 1.862604ms
I0625 21:57:32.427441 20055 replica.cpp:676] Persisted action at 1827
I0625 21:57:32.427695 20055 replica.cpp:655] Replica received learned notice for position 1827
I0625 21:57:32.428751 20055 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 757534ns
I0625 21:57:32.428980 20055 replica.cpp:676] Persisted action at 1827
I0625 21:57:32.429169 20055 replica.cpp:661] Replica learned APPEND action at position 1827
I0625 21:57:32.429812 20050 registrar.cpp:479] Successfully updated 'registry'
I0625 21:57:32.430069 20053 log.cpp:699] Attempting to truncate the log to 1827
I0625 21:57:32.430133 20050 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:57:32.430469 20053 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1828
I0625 21:57:32.430578 20050 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:57:32.431283 20048 replica.cpp:508] Replica received write request for position 1828
I0625 21:57:32.431483 20050 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-95 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:57:32.431704 20054 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:57:32.434784 20048 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 3.294383ms
I0625 21:57:32.434903 20054 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-95
I0625 21:57:32.435056 20048 replica.cpp:676] Persisted action at 1828
W0625 21:57:32.435480 20054 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:57:32.435531 20048 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-95 disconnected
I0625 21:57:32.435789 20053 replica.cpp:655] Replica received learned notice for position 1828
I0625 21:57:32.436233 20048 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-95 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:57:32.437180 20053 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 827289ns
I0625 21:57:32.437464 20053 leveldb.cpp:401] Deleting ~2 keys from leveldb took 77908ns
I0625 21:57:32.437659 20053 replica.cpp:676] Persisted action at 1828
I0625 21:57:32.437844 20053 replica.cpp:661] Replica learned TRUNCATE action at position 1828
I0625 21:57:56.418110 20051 master.cpp:2770] Performing task state reconciliation for 0 task statuses of framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:58:01.436463 20050 master.cpp:2272] Removing old disconnected slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) because a registration attempt is being made from slave(1)@12.12.249.207:5051
I0625 21:58:01.437204 20050 master.cpp:3605] Removing slave 20140624-220751-1227492106-5050-20047-95 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:58:01.437597 20049 hierarchical_allocator_process.hpp:469] Removed slave 20140624-220751-1227492106-5050-20047-95
I0625 21:58:01.437765 20055 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:58:01.437700 20050 master.cpp:2302] Registering slave at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with id 20140624-220751-1227492106-5050-20047-96
I0625 21:58:01.440805 20050 log.cpp:680] Attempting to append 155 bytes to the log
I0625 21:58:01.441083 20051 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1829
I0625 21:58:01.443637 20051 replica.cpp:508] Replica received write request for position 1829
I0625 21:58:01.444200 20051 leveldb.cpp:343] Persisting action (175 bytes) to leveldb took 357206ns
I0625 21:58:01.444353 20051 replica.cpp:676] Persisted action at 1829
I0625 21:58:01.444967 20049 replica.cpp:655] Replica received learned notice for position 1829
I0625 21:58:01.446091 20049 leveldb.cpp:343] Persisting action (177 bytes) to leveldb took 953663ns
I0625 21:58:01.446291 20049 replica.cpp:676] Persisted action at 1829
I0625 21:58:01.446494 20049 replica.cpp:661] Replica learned APPEND action at position 1829
I0625 21:58:01.447037 20049 registrar.cpp:479] Successfully updated 'registry'
I0625 21:58:01.447120 20055 log.cpp:699] Attempting to truncate the log to 1829
I0625 21:58:01.447506 20055 master.cpp:3707] Removed slave 20140624-220751-1227492106-5050-20047-95 (pod1-08.mylab.com)
I0625 21:58:01.447648 20054 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1830
I0625 21:58:01.447707 20055 master.cpp:3722] Notifying framework 20140618-144705-1227492106-5050-7546-0000 of lost slave 20140624-220751-1227492106-5050-20047-95 (pod1-08.mylab.com) after recovering
I0625 21:58:01.447315 20049 registrar.cpp:422] Attempting to update the 'registry'
I0625 21:58:01.449792 20055 replica.cpp:508] Replica received write request for position 1830
I0625 21:58:01.455148 20055 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 5.032017ms
I0625 21:58:01.455442 20055 replica.cpp:676] Persisted action at 1830
I0625 21:58:01.455682 20055 replica.cpp:655] Replica received learned notice for position 1830
I0625 21:58:01.456747 20055 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 850925ns
I0625 21:58:01.457213 20055 leveldb.cpp:401] Deleting ~2 keys from leveldb took 48884ns
I0625 21:58:01.457458 20055 replica.cpp:676] Persisted action at 1830
I0625 21:58:01.457666 20055 replica.cpp:661] Replica learned TRUNCATE action at position 1830
I0625 21:58:01.458240 20053 log.cpp:680] Attempting to append 379 bytes to the log
I0625 21:58:01.458595 20053 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1831
I0625 21:58:01.459229 20048 replica.cpp:508] Replica received write request for position 1831
I0625 21:58:01.461084 20048 leveldb.cpp:343] Persisting action (399 bytes) to leveldb took 1.625313ms
I0625 21:58:01.461307 20048 replica.cpp:676] Persisted action at 1831
I0625 21:58:01.462285 20052 replica.cpp:655] Replica received learned notice for position 1831
I0625 21:58:01.463505 20052 leveldb.cpp:343] Persisting action (401 bytes) to leveldb took 730476ns
I0625 21:58:01.463745 20052 replica.cpp:676] Persisted action at 1831
I0625 21:58:01.463937 20052 replica.cpp:661] Replica learned APPEND action at position 1831
I0625 21:58:01.464560 20048 registrar.cpp:479] Successfully updated 'registry'
I0625 21:58:01.464893 20048 master.cpp:2342] Registered slave 20140624-220751-1227492106-5050-20047-96 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com)
I0625 21:58:01.465069 20048 master.cpp:3472] Adding slave 20140624-220751-1227492106-5050-20047-96 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]
I0625 21:58:01.465831 20053 master.cpp:712] Slave 20140624-220751-1227492106-5050-20047-96 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) disconnected
I0625 21:58:01.465868 20052 hierarchical_allocator_process.hpp:444] Added slave 20140624-220751-1227492106-5050-20047-96 (pod1-08.mylab.com) with cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (and cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] available)
I0625 21:58:01.466024 20053 master.cpp:1344] Disconnecting slave 20140624-220751-1227492106-5050-20047-96
I0625 21:58:01.466089 20048 log.cpp:699] Attempting to truncate the log to 1831
I0625 21:58:01.466577 20052 hierarchical_allocator_process.hpp:483] Slave 20140624-220751-1227492106-5050-20047-96 disconnected
W0625 21:58:01.466598 20049 master.cpp:2895] Master returning resources offered because slave 20140624-220751-1227492106-5050-20047-96 at slave(1)@12.12.249.207:5051 (pod1-08.mylab.com) is disconnected
I0625 21:58:01.468822 20055 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 1832
I0625 21:58:01.470536 20049 hierarchical_allocator_process.hpp:636] Recovered cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000] (total allocatable: cpus(*):32; mem(*):257417; disk(*):45276; ports(*):[31000-32000]) on slave 20140624-220751-1227492106-5050-20047-96 from framework 20140618-144705-1227492106-5050-7546-0000
I0625 21:58:01.470827 20049 replica.cpp:508] Replica received write request for position 1832
I0625 21:58:01.472054 20049 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 815861ns
I0625 21:58:01.472293 20049 replica.cpp:676] Persisted action at 1832
I0625 21:58:01.472831 20053 replica.cpp:655] Replica received learned notice for position 1832
I0625 21:58:01.473812 20053 leveldb.cpp:343] Persisting action (20 bytes) to leveldb took 763118ns
I0625 21:58:01.474059 20053 leveldb.cpp:401] Deleting ~2 keys from leveldb took 64923ns
I0625 21:58:01.476292 20053 replica.cpp:676] Persisted action at 1832
I0625 21:58:01.476521 20053 replica.cpp:661] Replica learned TRUNCATE action at position 1832
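
A rough sketch for reading the master log above, not part of Mesos itself: it measures how long each slave ID stays registered before the master logs it as disconnected. The function name `registration_lifetimes`, the regular expressions, and the default year are assumptions based only on the log line format shown in this thread.

```python
import re
from datetime import datetime

# glog prefix as seen above: severity, mmdd, hh:mm:ss.uuuuuu, thread id, file:line]
LINE = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) +\d+ [\w.]+:\d+\] (.*)$")
# Message shapes copied from the master.cpp lines in this log (assumed stable).
REGISTERED = re.compile(r"^Registered slave (\S+) at ")
DISCONNECTED = re.compile(r"^Slave (\S+) at .* disconnected$")


def registration_lifetimes(lines, year=2014):
    """Return {slave_id: seconds between 'Registered slave' and 'disconnected'}."""
    registered = {}
    lifetimes = {}
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a glog-formatted line
        mmdd, clock, msg = m.groups()
        # glog lines carry no year, so it has to be supplied by the caller.
        ts = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
        reg = REGISTERED.match(msg)
        if reg:
            registered[reg.group(1)] = ts
            continue
        dis = DISCONNECTED.match(msg)
        if dis and dis.group(1) in registered:
            slave_id = dis.group(1)
            lifetimes[slave_id] = (ts - registered[slave_id]).total_seconds()
    return lifetimes
```

Fed the lines above, this pairs, for example, the "Registered slave ...-92" entry at 21:54:48.447196 with the "Slave ...-92 ... disconnected" entry at 21:54:48.448741, so the master marks the slave as disconnected within a couple of milliseconds of each registration.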


> Mesos slave continuous disconnection 
> -------------------------------------
>
>                 Key: MESOS-1541
>                 URL: https://issues.apache.org/jira/browse/MESOS-1541
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.19.0
>         Environment: Oracle Enterprise Linux 
>            Reporter: Umesh Batra
>            Priority: Blocker
>              Labels: disconnection, master, mesos, offer, rejection, slave
>
> I am seeing continuous disconnections and offer rejections in the master's logs; it's happening almost every 10-20 seconds.
>  
> Following suggestions from various sources, I tried two approaches:
>  
> 1. Setting --ip, --hostname and --port flags on the master and slave processes 
>  
> Here are the details of one master/slave process pair (ps -ef output):
>  
> master: 
>  
> /usr/local/sbin/mesos-master --ip=11.11.42.73 --hostname=mesosd-lapp01.mylab.com --port=5050 --work_dir=/mesos/data/master --log_dir=/mesos/logs/master --quorum=1 --zk=zk://11.11.42.73:2181,11.11.42.78:2181,11.11.42.79:2181/mesos
>  
> slave: 
>  
> /usr/local/sbin/mesos-slave --ip=12.12.249.207 --hostname=pod1-08.mylab.com --port 5051 --work_dir=/mesos/data/slave --log_dir=/mesos/logs/slave --master=zk://11.11.42.73:2181,11.11.42.78:2181,11.11.42.79:2181/mesos --isolation=cgroups/cpu,cgroups/mem --cgroups_hierarchy=/cgroup --cgroups_root=/cgroup
>  
> 2. I also tried specifying the external IP and real hostname in the /etc/hosts file, e.g.
>
> 10.11.42.73 mesosd-lapp01.mylab.com on the master host and
> 12.12.249.207 pod1-08.mylab.com on the slave host
>  
> I would appreciate a timely response.
>  
> - rgds, 
> Umesh 
>  
>  


