Posted to issues@mesos.apache.org by "Matthias Veit (JIRA)" <ji...@apache.org> on 2015/11/02 10:42:27 UTC
[jira] [Commented] (MESOS-3793) Cannot start mesos local on a Debian GNU/Linux 8 docker machine
[ https://issues.apache.org/jira/browse/MESOS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984970#comment-14984970 ]
Matthias Veit commented on MESOS-3793:
--------------------------------------
Passing --launcher=posix to mesos local has no effect.
Setting the environment variable instead (export MESOS_LAUNCHER=posix) lets me start mesos local.
Mounting /sys/fs/cgroup into the container and then starting mesos local fails with this error:
{noformat}
➔ docker run -v /sys/fs/cgroup:/sys/fs/cgroup:rw -it marathon-buildbase:test sh
# mesos local
I1102 09:35:15.839287 5 leveldb.cpp:176] Opened db in 4.975612ms
I1102 09:35:15.840312 5 leveldb.cpp:183] Compacted db in 981189ns
I1102 09:35:15.840348 5 leveldb.cpp:198] Created db iterator in 9033ns
I1102 09:35:15.840353 5 leveldb.cpp:204] Seeked to beginning of db in 1414ns
I1102 09:35:15.840358 5 leveldb.cpp:273] Iterated through 0 keys in the db in 1025ns
I1102 09:35:15.840389 5 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1102 09:35:15.840790 9 recover.cpp:449] Starting replica recovery
I1102 09:35:15.840991 10 recover.cpp:475] Replica is in EMPTY status
I1102 09:35:15.841492 9 replica.cpp:641] Replica in EMPTY status received a broadcasted recover request
I1102 09:35:15.841908 6 recover.cpp:195] Received a recover response from a replica in EMPTY status
I1102 09:35:15.842003 6 recover.cpp:566] Updating replica status to STARTING
I1102 09:35:15.843122 7 master.cpp:376] Master af8c1547-e308-4348-99d4-93879f06d853 (833b280a4c4a) started on 172.17.0.7:5050
I1102 09:35:15.843327 7 master.cpp:378] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" --work_dir="/tmp/mesos/local/JU6SZj" --zk_session_timeout="10secs"
I1102 09:35:15.843575 7 master.cpp:425] Master allowing unauthenticated frameworks to register
I1102 09:35:15.843822 7 master.cpp:430] Master allowing unauthenticated slaves to register
I1102 09:35:15.843950 7 master.cpp:467] Using default 'crammd5' authenticator
W1102 09:35:15.844105 7 authenticator.cpp:505] No credentials provided, authentication requests will be refused
I1102 09:35:15.844224 7 authenticator.cpp:512] Initializing server SASL
I1102 09:35:15.843875 5 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
I1102 09:35:15.843231 11 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 1.186846ms
I1102 09:35:15.844820 11 replica.cpp:323] Persisted replica status to STARTING
I1102 09:35:15.845212 11 recover.cpp:475] Replica is in STARTING status
I1102 09:35:15.845577 11 replica.cpp:641] Replica in STARTING status received a broadcasted recover request
I1102 09:35:15.845881 11 recover.cpp:195] Received a recover response from a replica in STARTING status
I1102 09:35:15.846217 11 recover.cpp:566] Updating replica status to VOTING
I1102 09:35:15.846650 11 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 265224ns
I1102 09:35:15.846683 11 replica.cpp:323] Persisted replica status to VOTING
I1102 09:35:15.846721 11 recover.cpp:580] Successfully joined the Paxos group
I1102 09:35:15.846835 11 recover.cpp:464] Recover process terminated
I1102 09:35:15.849839 7 master.cpp:1603] The newly elected leader is master@172.17.0.7:5050 with id af8c1547-e308-4348-99d4-93879f06d853
I1102 09:35:15.853528 7 master.cpp:1616] Elected as the leading master!
I1102 09:35:15.853793 7 master.cpp:1376] Recovering from registrar
I1102 09:35:15.854033 13 registrar.cpp:309] Recovering registrar
I1102 09:35:15.854266 9 log.cpp:661] Attempting to start the writer
I1102 09:35:15.854802 9 replica.cpp:477] Replica received implicit promise request with proposal 1
I1102 09:35:15.853359 5 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1102 09:35:15.856086 9 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 1.148617ms
I1102 09:35:15.856168 9 replica.cpp:345] Persisted promised to 1
I1102 09:35:15.857818 6 coordinator.cpp:231] Coordinator attemping to fill missing position
I1102 09:35:15.858723 13 replica.cpp:378] Replica received explicit promise request for position 0 with proposal 2
I1102 09:35:15.859380 13 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 599989ns
I1102 09:35:15.859414 13 replica.cpp:679] Persisted action at 0
I1102 09:35:15.859788 9 replica.cpp:511] Replica received write request for position 0
I1102 09:35:15.859863 9 leveldb.cpp:438] Reading position from leveldb took 16229ns
I1102 09:35:15.860203 9 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 317011ns
I1102 09:35:15.860257 9 replica.cpp:679] Persisted action at 0
I1102 09:35:15.860366 9 replica.cpp:658] Replica received learned notice for position 0
I1102 09:35:15.861297 9 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 789105ns
I1102 09:35:15.861330 9 replica.cpp:679] Persisted action at 0
I1102 09:35:15.861371 9 replica.cpp:664] Replica learned NOP action at position 0
I1102 09:35:15.861457 9 log.cpp:677] Writer started with ending position 0
I1102 09:35:15.861711 9 leveldb.cpp:438] Reading position from leveldb took 7791ns
I1102 09:35:15.862535 9 registrar.cpp:342] Successfully fetched the registry (0B) in 8.40192ms
I1102 09:35:15.862589 9 registrar.cpp:441] Applied 1 operations in 4352ns; attempting to update the 'registry'
I1102 09:35:15.862763 9 log.cpp:685] Attempting to append 165 bytes to the log
I1102 09:35:15.862846 9 coordinator.cpp:341] Coordinator attempting to write APPEND action at position 1
I1102 09:35:15.863004 9 replica.cpp:511] Replica received write request for position 1
I1102 09:35:15.863351 9 leveldb.cpp:343] Persisting action (184 bytes) to leveldb took 282975ns
I1102 09:35:15.863426 9 replica.cpp:679] Persisted action at 1
I1102 09:35:15.863567 10 replica.cpp:658] Replica received learned notice for position 1
I1102 09:35:15.863859 10 leveldb.cpp:343] Persisting action (186 bytes) to leveldb took 267957ns
I1102 09:35:15.863886 10 replica.cpp:679] Persisted action at 1
I1102 09:35:15.863898 10 replica.cpp:664] Replica learned APPEND action at position 1
I1102 09:35:15.864140 9 registrar.cpp:486] Successfully updated the 'registry' in 1.516032ms
I1102 09:35:15.864183 10 log.cpp:704] Attempting to truncate the log to 1
I1102 09:35:15.864302 10 coordinator.cpp:341] Coordinator attempting to write TRUNCATE action at position 2
I1102 09:35:15.864197 9 registrar.cpp:372] Successfully recovered registrar
I1102 09:35:15.864425 9 replica.cpp:511] Replica received write request for position 2
I1102 09:35:15.864423 10 master.cpp:1413] Recovered 0 slaves from the Registry (127B) ; allowing 10mins for slaves to re-register
I1102 09:35:15.866138 9 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 1.676671ms
I1102 09:35:15.866181 9 replica.cpp:679] Persisted action at 2
I1102 09:35:15.866294 9 replica.cpp:658] Replica received learned notice for position 2
I1102 09:35:15.866595 5 systemd.cpp:128] systemd version `215` detected
W1102 09:35:15.866622 5 systemd.cpp:136] Required functionality `Delegate` was introduced in Version `218`. Your system may not function properly; however since some distributions have patched systemd packages, your system may still be functional. This is why we keep running. See MESOS-3352 for more information
I1102 09:35:15.866664 9 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 294030ns
I1102 09:35:15.866722 9 leveldb.cpp:401] Deleting ~1 keys from leveldb took 13730ns
I1102 09:35:15.866750 9 replica.cpp:679] Persisted action at 2
I1102 09:35:15.866780 9 replica.cpp:664] Replica learned TRUNCATE action at position 2
Failed to create a containerizer: Could not create MesosContainerizer: Failed to create launcher: Failed to initialize systemd: Failed to locate systemd runtime directory: /run/systemd/system
{noformat}
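The workaround described above (setting MESOS_LAUNCHER=posix in the environment) can be sketched in a small test-harness helper. This is only an illustration, not part of Mesos or Marathon: the function names (cgroup_mounts, posix_launcher_env, describe_host, launch_mesos_local) are hypothetical, and the only facts taken from this report are the environment variable, the mesos local command, and the /run/systemd/system path from the error message.

```python
import os
import subprocess

# Path named in the "Failed to locate systemd runtime directory" error above.
SYSTEMD_RUNTIME_DIR = "/run/systemd/system"


def cgroup_mounts(mounts_text):
    """Return mount points of all cgroup filesystems in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cgroup":
            points.append(fields[1])
    return points


def posix_launcher_env(base_env=None):
    """Copy an environment and set MESOS_LAUNCHER=posix in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESOS_LAUNCHER"] = "posix"
    return env


def describe_host(mounts_path="/proc/mounts"):
    """Collect the two facts this report hinges on (hypothetical helper)."""
    with open(mounts_path) as f:
        mounts = cgroup_mounts(f.read())
    return {
        "systemd_runtime_dir": os.path.isdir(SYSTEMD_RUNTIME_DIR),
        "cgroup_mounts": mounts,
    }


def launch_mesos_local():
    """Run `mesos local` with the posix launcher selected via the environment."""
    return subprocess.call(["mesos", "local"], env=posix_launcher_env())
```

Calling describe_host() inside the container before launch_mesos_local() would show whether any cgroup hierarchy is mounted and whether the systemd runtime directory exists, which are the two conditions the errors in this issue complain about.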
> Cannot start mesos local on a Debian GNU/Linux 8 docker machine
> ---------------------------------------------------------------
>
> Key: MESOS-3793
> URL: https://issues.apache.org/jira/browse/MESOS-3793
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.25.0
> Environment: Debian GNU/Linux 8 docker machine
> Reporter: Matthias Veit
> Assignee: Jojy Varghese
> Labels: mesosphere
>
> We updated the mesos version to 0.25.0 in our Marathon docker image, which runs our integration tests.
> We use mesos local for those tests. This fails with this message:
> {noformat}
> root@a06e4b4eb776:/marathon# mesos local
> I1022 18:42:26.852485 136 leveldb.cpp:176] Opened db in 6.103258ms
> I1022 18:42:26.853302 136 leveldb.cpp:183] Compacted db in 765740ns
> I1022 18:42:26.853343 136 leveldb.cpp:198] Created db iterator in 9001ns
> I1022 18:42:26.853355 136 leveldb.cpp:204] Seeked to beginning of db in 1287ns
> I1022 18:42:26.853366 136 leveldb.cpp:273] Iterated through 0 keys in the db in 1111ns
> I1022 18:42:26.853406 136 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> I1022 18:42:26.853775 141 recover.cpp:449] Starting replica recovery
> I1022 18:42:26.853862 141 recover.cpp:475] Replica is in EMPTY status
> I1022 18:42:26.854751 138 replica.cpp:641] Replica in EMPTY status received a broadcasted recover request
> I1022 18:42:26.854856 140 recover.cpp:195] Received a recover response from a replica in EMPTY status
> I1022 18:42:26.855002 140 recover.cpp:566] Updating replica status to STARTING
> I1022 18:42:26.855655 138 master.cpp:376] Master a3f39818-1bda-4710-b96b-2a60ed4d12b8 (a06e4b4eb776) started on 172.17.0.14:5050
> I1022 18:42:26.855680 138 master.cpp:378] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" --work_dir="/tmp/mesos/local/AK0XpG" --zk_session_timeout="10secs"
> I1022 18:42:26.855790 138 master.cpp:425] Master allowing unauthenticated frameworks to register
> I1022 18:42:26.855803 138 master.cpp:430] Master allowing unauthenticated slaves to register
> I1022 18:42:26.855815 138 master.cpp:467] Using default 'crammd5' authenticator
> W1022 18:42:26.855829 138 authenticator.cpp:505] No credentials provided, authentication requests will be refused
> I1022 18:42:26.855840 138 authenticator.cpp:512] Initializing server SASL
> I1022 18:42:26.856442 136 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix
> I1022 18:42:26.856943 140 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 1.888185ms
> I1022 18:42:26.856987 140 replica.cpp:323] Persisted replica status to STARTING
> I1022 18:42:26.857115 140 recover.cpp:475] Replica is in STARTING status
> I1022 18:42:26.857270 140 replica.cpp:641] Replica in STARTING status received a broadcasted recover request
> I1022 18:42:26.857312 140 recover.cpp:195] Received a recover response from a replica in STARTING status
> I1022 18:42:26.857368 140 recover.cpp:566] Updating replica status to VOTING
> I1022 18:42:26.857781 140 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 371121ns
> I1022 18:42:26.857841 140 replica.cpp:323] Persisted replica status to VOTING
> I1022 18:42:26.857895 140 recover.cpp:580] Successfully joined the Paxos group
> I1022 18:42:26.857928 140 recover.cpp:464] Recover process terminated
> I1022 18:42:26.862455 137 master.cpp:1603] The newly elected leader is master@172.17.0.14:5050 with id a3f39818-1bda-4710-b96b-2a60ed4d12b8
> I1022 18:42:26.862498 137 master.cpp:1616] Elected as the leading master!
> I1022 18:42:26.862511 137 master.cpp:1376] Recovering from registrar
> I1022 18:42:26.862560 137 registrar.cpp:309] Recovering registrar
> Failed to create a containerizer: Could not create MesosContainerizer: Failed to create launcher: Failed to create Linux launcher: Failed to mount cgroups hierarchy at '/sys/fs/cgroup/freezer': 'freezer' is already attached to another hierarchy
> {noformat}
> The setup worked with mesos 0.24.0.
> The Dockerfile is here: https://github.com/mesosphere/marathon/blob/mv/mesos_0.25/Dockerfile
> {noformat}
> root@a06e4b4eb776:/marathon# ls /sys/fs/cgroup/
> root@a06e4b4eb776:/marathon#
> {noformat}
> {noformat}
> root@a06e4b4eb776:/marathon# cat /proc/mounts
> none / aufs rw,relatime,si=6e7ac87f36042e03,dio,dirperm1 0 0
> proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
> tmpfs /dev tmpfs rw,nosuid,mode=755 0 0
> devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666 0 0
> shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k 0 0
> mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
> sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
> /dev/sda1 /etc/resolv.conf ext4 rw,relatime,data=ordered 0 0
> /dev/sda1 /etc/hostname ext4 rw,relatime,data=ordered 0 0
> /dev/sda1 /etc/hosts ext4 rw,relatime,data=ordered 0 0
> devpts /dev/console devpts rw,relatime,mode=600,ptmxmode=000 0 0
> proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
> tmpfs /proc/kcore tmpfs rw,nosuid,mode=755 0 0
> tmpfs /proc/timer_stats tmpfs rw,nosuid,mode=755 0 0
> {noformat}
> [~bernd-mesos] Can you please assign to the correct person?