Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/05/07 00:20:15 UTC

[jira] [Created] (MESOS-463) Detector ZNode creation failure.

Benjamin Mahler created MESOS-463:
-------------------------------------

             Summary: Detector ZNode creation failure.
                 Key: MESOS-463
                 URL: https://issues.apache.org/jira/browse/MESOS-463
             Project: Mesos
          Issue Type: Bug
            Reporter: Benjamin Mahler


The following failure message occurred in a test cluster at Twitter:

    // We fail all non-OK return codes except ZNODEEXISTS (since that
    // means the path we were trying to create exists) and ZNOAUTH
    // (since it's possible that the ACLs on 'dirname(url.path)' don't
    // allow us to create a child znode but we are allowed to create
    // children of 'url.path' itself, which will be determined below
    // if we are contending). Note that it's also possible we got back
    // a ZNONODE because we could not create one of the intermediate
    // znodes (in which case we'll abort in the 'else' below since
    // ZNONODE is non-retryable). TODO(benh): Need to check that we
    // also can put a watch on the children of 'url.path'.
    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
      LOG(FATAL) << "Failed to create '" << url.path
                 << "' in ZooKeeper: " << zk->message(code);
    }
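
A minimal sketch of how the return codes discussed in that comment could be sorted into buckets, in case it helps the discussion. This is only an illustration, not the detector's actual logic and not a proposed fix: the Action enum and the classify() helper are hypothetical names, and treating ZINVALIDSTATE as "re-establish the session" is an assumption about one possible policy. The numeric values are redeclared locally (they are meant to mirror the ZooKeeper C client's zookeeper.h) so the sketch builds without the library.

```cpp
#include <string>

// Redeclared here so the sketch is self-contained; the values are intended
// to mirror the ZooKeeper C client's zookeeper.h.
enum ZkCode {
  ZOK = 0,
  ZCONNECTIONLOSS = -4,
  ZOPERATIONTIMEOUT = -7,
  ZINVALIDSTATE = -9,     // reported as "invalid zhandle state"
  ZNONODE = -101,
  ZNOAUTH = -102,
  ZNODEEXISTS = -110,
  ZSESSIONEXPIRED = -112
};

// Hypothetical policy buckets, not part of Mesos.
enum class Action { PROCEED, RETRY, RECONNECT, ABORT };

// Hypothetical helper: decide what to do after a znode create attempt.
Action classify(int code) {
  switch (code) {
    case ZOK:
    case ZNODEEXISTS:  // The path already exists.
    case ZNOAUTH:      // ACLs on the parent may still allow children of the path.
      return Action::PROCEED;
    case ZCONNECTIONLOSS:
    case ZOPERATIONTIMEOUT:
      return Action::RETRY;      // Assumption: transient, same handle may work.
    case ZINVALIDSTATE:
    case ZSESSIONEXPIRED:
      return Action::RECONNECT;  // Assumption: the handle is unusable.
    default:
      return Action::ABORT;      // E.g. ZNONODE, which the comment calls non-retryable.
  }
}

// Human-readable name for logging.
std::string name(Action action) {
  switch (action) {
    case Action::PROCEED:   return "PROCEED";
    case Action::RETRY:     return "RETRY";
    case Action::RECONNECT: return "RECONNECT";
    case Action::ABORT:     return "ABORT";
  }
  return "UNKNOWN";
}
```

Under this sketch, classify(ZINVALIDSTATE) yields Action::RECONNECT rather than falling through to the fatal branch, which is the case hit in the log below.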

It's interesting that there was a delay before the slave crashed:

F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create '/home/mesos/test/master' in ZooKeeper: invalid zhandle state
*** Check failure stack trace: ***
I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag
_c6f62383-28d1-42c2-aa65-15f9ec42db57
I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task 1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d for framework 201103282247-0000000019-0000
W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc' of framework '201103282247-000000001
9-0000': 1
W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b
26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_
6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48
f50c4a of framework 201103282247-0000000019-0000
W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840' of framework '201103282247-0000000019-
0000': 1
I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767ac
b0d90 of framework 201103282247-0000000019-0000
I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82
c86f6 of framework 201103282247-0000000019-0000
W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring unexpected status update acknowledgment 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5
fa-d754a82c86f6 of framework 201103282247-0000000019-0000
I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000
W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
f9dd7c22 of framework 201103282247-0000000019-0000
I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
0f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc
_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_
tag_b26ba555-a6bd-4448-a5b4-439beb442820
I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_t
ag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0' of framework '201103282247-0000000019
-0000': 1
I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000 terminated with status 0
I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e
1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-me
ta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486'
I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
2_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_ta
g_b26ba555-a6bd-4448-a5b4-439beb442820
I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000
I0506 22:00:09.794438 54948 slave.cpp:2031] Executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000 has exited with status '0'
I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000
    @     0x7f050768dbcd  google::LogMessage::Fail()
I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
    @     0x7f0507693837  google::LogMessage::SendToLog()
I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
I0506 22:00:10.591079 54948 slave.cpp:819] Launching task 1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a for framework 201103282247-0000000019-0000
I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d9
0_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
I0506 22:00:11.635795 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272/runs/c562a33
e-6870-47d8-ae53-8faec70ad328'
W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2
407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_
720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
f9dd7c22 of framework 201103282247-0000000019-0000
I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab558801
9_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
0f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877549319-mesos-me
ta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-44a9-ae10-25914c460689'
I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task '1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a' for executor thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac
7e-d365c0562d3a of framework '201103282247-0000000019-0000
I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000 terminated with status 0
I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement for status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
400f9dd7c22 of framework 201103282247-0000000019-0000 to executor(1)@10.34.124.132:55886
I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000
    @     0x7f050768f47c  google::LogMessage::Flush()
I0506 22:00:21.377487 54935 gc.cpp:134] Deleting /var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272
I0506 22:00:21.377707 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272'
    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f050741063d  mesos::internal::ZooKeeperMasterDetectorProcess::connected()
    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
    @     0x7f050758d99a  process::ProcessManager::resume()
    @     0x7f050758e9af  process::schedule()
    @     0x7f0506d2773d  start_thread
I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
I0506 22:00:21.519721 54935 gc.cpp:134] Deleting /var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_slave_14-0-a67ae3d4
-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c
4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3 for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9d
d7c22 of framework 201103282247-0000000019-0000
I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task 1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f8091f0 for framework 201103282247-0000000019-0000
I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
_tag_720a3ca1-ea57-421f-9263-d47e347b866b
I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 (./thermos_executor) in /var/lib/mesos/slaves/201304262233-1937777
162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=
0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262
fff-5bc0-412b-8c86-799354704486
I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb27559923
3_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
I0506 22:00:22.244297 54946 status_update_manager.cpp:480] Cleaning up status update stream for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-
0000
I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_t
ag_720a3ca1-ea57-421f-9263-d47e347b866b
I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_ta
g_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing cgroup controls for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0
000 with resources cpus=0.25; mem=128
I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated 'cpu.shares' to 256 for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-000000001
9-0000
I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated 'memory.limit_in_bytes' to 134217728 for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000
I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for '/slave(1)/stats.json'
I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started listening for OOM events for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000
I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked executor at = 20045
    @     0x7f050570bf6d  clone
Fetching resources into '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486'
Fetching resource '/usr/local/bin/thermos_executor'
Copying resource from '/usr/local/bin/thermos_executor' to .
/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted                 (core dumped) /usr/local/sbin/mesos-slave --port=5051 --resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}" --master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR} ${CGROUPS_ISOLATION} "$@"
Slave Exit Status: 134
I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups" isolator
I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26 19:49:25 by bmahler
I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave
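
For anyone who wants to put a number on the delay mentioned above: the log lines are interleaved out of order, so it is easier to compare the glog timestamp prefixes directly. The following is a rough sketch only. The helper names microsSinceMidnight() and delayMicros() are made up for illustration, and the code assumes both lines fall on the same day (no midnight rollover handling).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Parses a glog prefix of the form "LMMDD HH:MM:SS.uuuuuu" (L is one of
// I, W, E, F) and returns microseconds since midnight, or -1 if the line
// does not start with such a prefix (e.g. stack-trace frames).
long long microsSinceMidnight(const std::string& line) {
  char level = 0;
  int month = 0, day = 0, hour = 0, minute = 0, second = 0, micros = 0;
  int matched = std::sscanf(line.c_str(), "%c%2d%2d %2d:%2d:%2d.%6d",
                            &level, &month, &day,
                            &hour, &minute, &second, &micros);
  if (matched != 7) {
    return -1;
  }
  if (level != 'I' && level != 'W' && level != 'E' && level != 'F') {
    return -1;
  }
  return ((hour * 60LL + minute) * 60LL + second) * 1000000LL + micros;
}

// Returns the time elapsed from 'first' to 'second' in microseconds.
// Precondition: both lines carry a glog prefix and are from the same day.
long long delayMicros(const std::string& first, const std::string& second) {
  long long start = microsSinceMidnight(first);
  long long end = microsSinceMidnight(second);
  assert(start >= 0 && end >= 0);
  return end - start;
}
```

Feeding it the FATAL line (21:59:46.769142) and the last timestamped slave line before the abort (22:00:23.296195) gives 36527053 microseconds, i.e. roughly 36.5 seconds between the failed check and the last output of the old process.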


Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
Hey Vinod,

Yep unreleased for now b/c I have to go back and track down all the issues
that were associated with 0.10.0 (mainly will try and use date as a
delimiter unless I hear something better from you guys). I'll leave it as
unreleased until I do the pathology, then will click "release" on it to
solidify the changelog.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Vinod Kone <vi...@gmail.com>
Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Date: Monday, May 6, 2013 4:03 PM
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Cc: Ben Hindman <be...@twitter.com>
Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

>Great. Thanks for getting this done. Definitely makes writing CHANGELOG
>easier.
>
>Curiously, I see that 0.10.0 is mentioned under "unreleased versions", even
>though we released it?
>
>
>On Mon, May 6, 2013 at 3:39 PM, Mattmann, Chris A (398J) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> OK Gav hooked me up so we now have versions for 0.10.0, 0.11.0 and 0.12.0.
>>
>> I'll continue my pathology for 0.10.0 and 0.11.0 to get a decent change
>> log going (based on dates and SVN tags), but it's going to take a week
>> or so to get this in order.
>>
>> Will report back and update the CHANGELOG when I'm done. In the meanwhile
>> when you guys are creating new issues please set the appropriate Fix
>> Version.
>>
>> Thanks all!
>>
>> Cheers,
>> Chris
>>
>> -----Original Message-----
>> From: <Mattmann>, jpluser <ch...@jpl.nasa.gov>
>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>> Date: Monday, May 6, 2013 3:30 PM
>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
>> Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>>
>> >Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
>> >infra@ too for my admin access to JIRA..
>> >
>> >-----Original Message-----
>> >From: Benjamin Mahler <be...@gmail.com>
>> >Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>> >Date: Monday, May 6, 2013 3:26 PM
>> >To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
>> >Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>> >
>> >>Yeah, I had tried doing it for this ticket but the versions don't yet
>> >>exist in JIRA =/
>> >>
>> >>Added benh to this email.
>> >>
>> >>
>> >>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
>> >>chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>
>> >>> Guys, FYI JIRA issues like this below should have a Fix version.
>> >>>
>> >>> Ben H, or someone with permissions, can you please create versions
>> >>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
>> >>> grouping). Then, once they are created, we can set the Fix version
>> >>> and start generating some boss-hog change logs around here.
>> >>>
>> >>> Cheers,
>> >>> Chris
>> >>>
>> >>> -----Original Message-----
>> >>> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
>> >>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>> >>> Date: Monday, May 6, 2013 3:20 PM
>> >>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>> >>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>> >>>
>> >>> >Benjamin Mahler created MESOS-463:
>> >>> >-------------------------------------
>> >>> >
>> >>> >             Summary: Detector ZNode creation failure.
>> >>> >                 Key: MESOS-463
>> >>> >                 URL:
>>https://issues.apache.org/jira/browse/MESOS-463
>> >>> >             Project: Mesos
>> >>> >          Issue Type: Bug
>> >>> >            Reporter: Benjamin Mahler
>> >>> >
>> >>> >
>> >>> >The following failure message occured in a test cluster at Twitter:
>> >>> >
>> >>> >    // We fail all non-OK return codes except ZNODEEXISTS (since
>>that
>> >>> >    // means the path we were trying to create exists) and ZNOAUTH
>> >>> >    // (since it's possible that the ACLs on 'dirname(url.path)'
>>don't
>> >>> >    // allow us to create a child znode but we are allowed to
>>create
>> >>> >    // children of 'url.path' itself, which will be determined
>>below
>> >>> >    // if we are contending). Note that it's also possible we got
>>back
>> >>> >    // a ZNONODE because we could not create one of the
>>intermediate
>> >>> >    // znodes (in which case we'll abort in the 'else' below since
>> >>> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
>> >>> >    // also can put a watch on the children of 'url.path'.
>> >>> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
>> >>> >      LOG(FATAL) << "Failed to create '" << url.path
>> >>> >                 << "' in ZooKeeper: " << zk->message(code);
>> >>> >    }
>> >>> >
>> >>> >It's interesting that there was a delay before the slave crashed:
>> >>> >
>> >>> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create
>> >>> >'/home/mesos/test/master' in ZooKeeper: invalid zhandle state
>> >>> >*** Check failure stack trace: ***
>> >>> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_t
>>>>>>ag
>> >>> >_c6f62383-28d1-42c2-aa65-15f9ec42db57
>> >>> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor
>> >>>
>> 
>>>>>>thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
>>>>>>40
>> >>>>0
>> >>>>f9
>> >>> >dd7c22 of framework 201103282247-0000000019-0000 terminated with
>> >>>status 0
>> >>> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing
>>executor
>> >>>
>> 
>>>>>>thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
>>>>>>40
>> >>>>0
>> >>>>f9
>> >>> >dd7c22 of framework 201103282247-0000000019-0000
>> >>> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task
>> >>>
>> 
>>>>>>1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb9
>>>>>>1d
>> >>> >for framework 201103282247-0000000019-0000
>> >>> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect
>> >>>resource
>> >>> >usage for executor
>> >>>
>> 
>>>>>>'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de
>>>>>>-7
>> >>>>e
>> >>>>39
>> >>> >564e5bfc' of framework '201103282247-000000001
>> >>> >9-0000': 1
>> >>> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_
>>>>>>ta
>> >>>>g
>> >>>>_b
>> >>> >26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
>> >>> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc
>>>>>>_t
>> >>>>a
>> >>>>g_
>> >>> >6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
>> >>> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received
>> >>> >status update acknowledgement c7701021-7eac-4711-96ac-093871462e44
>>for
>> >>> >task
>> >>>1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48
>> >>> >f50c4a of framework 201103282247-0000000019-0000


Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by Vinod Kone <vi...@gmail.com>.
Great. Thanks for getting this done. Definitely makes writing CHANGELOG
easier.

Curiously, I see that 0.10.0 is mentioned under "unreleased versions", even
though we released it?


On Mon, May 6, 2013 at 3:39 PM, Mattmann, Chris A (398J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> OK Gav hooked me up so we now have versions for 0.10.0, 0.11.0 and 0.12.0.
>
> I'll continue my pathology for 0.10.0 and 0.11.0 to get a decent change
> log going (based on dates and SVN tags), but it's going to take a week
> or so to get this in order.
>
> Will report back and update the CHANGELOG when I'm done. In the meanwhile
> when you guys are creating new issues please set the appropriate Fix
> Version.
>
> Thanks all!
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: <Mattmann>, jpluser <ch...@jpl.nasa.gov>
> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
> Date: Monday, May 6, 2013 3:30 PM
> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
> Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>
> >Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
> >infra@ too for my admin access to JIRA..
> >
> >
> >-----Original Message-----
> >From: Benjamin Mahler <be...@gmail.com>
> >Reply-To: "mesos-dev@incubator.apache.org"
> ><me...@incubator.apache.org>
> >Date: Monday, May 6, 2013 3:26 PM
> >To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
> >Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
> >
> >>Yeah, I had tried doing it for this ticket but the versions don't yet exist
> >>in JIRA =/
> >>
> >>Added benh to this email.
> >>
> >>
> >>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
> >>chris.a.mattmann@jpl.nasa.gov> wrote:
> >>
> >>> Guys, FYI JIRA issues like this below should have a Fix version.
> >>>
> >>> Ben H, or someone with permissions, can you please create versions
> >>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
> >>> grouping). Then, once they are created, we can set the Fix version
> >>> and start generating some boss-hog change logs around here.
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
> >>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
> >>> Date: Monday, May 6, 2013 3:20 PM
> >>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
> >>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
> >>>
> >>> >Benjamin Mahler created MESOS-463:
> >>> >-------------------------------------
> >>> >
> >>> >             Summary: Detector ZNode creation failure.
> >>> >                 Key: MESOS-463
> >>> >                 URL: https://issues.apache.org/jira/browse/MESOS-463
> >>> >             Project: Mesos
> >>> >          Issue Type: Bug
> >>> >            Reporter: Benjamin Mahler
> >>> >
> >>> >
> >>> >The following failure message occurred in a test cluster at Twitter:
> >>> >
> >>> >    // We fail all non-OK return codes except ZNODEEXISTS (since that
> >>> >    // means the path we were trying to create exists) and ZNOAUTH
> >>> >    // (since it's possible that the ACLs on 'dirname(url.path)' don't
> >>> >    // allow us to create a child znode but we are allowed to create
> >>> >    // children of 'url.path' itself, which will be determined below
> >>> >    // if we are contending). Note that it's also possible we got back
> >>> >    // a ZNONODE because we could not create one of the intermediate
> >>> >    // znodes (in which case we'll abort in the 'else' below since
> >>> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
> >>> >    // also can put a watch on the children of 'url.path'.
> >>> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
> >>> >      LOG(FATAL) << "Failed to create '" << url.path
> >>> >                 << "' in ZooKeeper: " << zk->message(code);
> >>> >    }
> >>> >
> >>> >It's interesting that there was a delay before the slave crashed:
> >>> >
> >>> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create '/home/mesos/test/master' in ZooKeeper: invalid zhandle state
> >>> >*** Check failure stack trace: ***
> >>> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag_c6f62383-28d1-42c2-aa65-15f9ec42db57
> >>> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task 1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d for framework 201103282247-0000000019-0000
> >>> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc' of framework '201103282247-0000000019-0000': 1
> >>> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
> >>> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
> >>> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840' of framework '201103282247-0000000019-0000': 1
> >>> >I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring unexpected status update acknowledgment 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
> >>> >I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >>> >I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
> >>> >I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >>> >W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0' of framework '201103282247-0000000019-0000': 1
> >>> >I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file
> >>>
> >>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
> >>>>/
> >>>>20
> >>> >1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-me
> >>>
> >>>>ta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-4
> >>>>1
> >>>>2b
> >>> >-8c86-799354704486'
> >>> >I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
> >>> >2_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
> >>> >I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:09.794438 54948 slave.cpp:2031] Executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000 has exited with status '0'
> >>> >I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000
> >>> >    @     0x7f050768dbcd  google::LogMessage::Fail()
> >>> >I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >    @     0x7f0507693837  google::LogMessage::SendToLog()
> >>> >I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.591079 54948 slave.cpp:819] Launching task 1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a for framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:11.635795 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272/runs/c562a33e-6870-47d8-ae53-8faec70ad328'
> >>> >W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
> >>> >W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
> >>> >W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
> >>> >I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
> >>> >I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-44a9-ae10-25914c460689'
> >>> >I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task '1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a' for executor thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a of framework '201103282247-0000000019-0000
> >>> >I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement for status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to executor(1)@10.34.124.132:55886
> >>> >I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000
> >>> >    @     0x7f050768f47c  google::LogMessage::Flush()
> >>> >I0506 22:00:21.377487 54935 gc.cpp:134] Deleting /var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272
> >>> >I0506 22:00:21.377707 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272'
> >>> >    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
> >>> >    @     0x7f050741063d  mesos::internal::ZooKeeperMasterDetectorProcess::connected()
> >>> >    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
> >>> >    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
> >>> >    @     0x7f050758d99a  process::ProcessManager::resume()
> >>> >    @     0x7f050758e9af  process::schedule()
> >>> >    @     0x7f0506d2773d  start_thread
> >>> >I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
> >>> >I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
> >>> >I0506 22:00:21.519721 54935 gc.cpp:134] Deleting /var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_slave_14-0-a67ae3d4-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
> >>> >I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
> >>> >I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3 for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task 1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f8091f0 for framework 201103282247-0000000019-0000
> >>> >I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 (./thermos_executor) in /var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262fff-5bc0-412b-8c86-799354704486
> >>> >I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
> >>> >I0506 22:00:22.244297 54946 status_update_manager.cpp:480] Cleaning up status update stream for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing cgroup controls for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000 with resources cpus=0.25; mem=128
> >>> >I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated 'cpu.shares' to 256 for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated 'memory.limit_in_bytes' to 134217728 for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for '/slave(1)/stats.json'
> >>> >I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started listening for OOM events for executor thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked executor at = 20045
> >>> >    @     0x7f050570bf6d  clone
> >>> >Fetching resources into '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486'
> >>> >Fetching resource '/usr/local/bin/thermos_executor'
> >>> >Copying resource from '/usr/local/bin/thermos_executor' to .
> >>> >/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted (core dumped) /usr/local/sbin/mesos-slave --port=5051 --resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}" --master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR} ${CGROUPS_ISOLATION} "$@"
> >>> >Slave Exit Status: 134
> >>> >I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups" isolator
> >>> >I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26 19:49:25 by bmahler
> >>> >I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave

Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
No probs, done!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Benjamin Mahler <bm...@twitter.com>
Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Date: Monday, May 6, 2013 3:47 PM
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Cc: Ben Hindman <be...@twitter.com>
Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

>Cool, can you also create a 0.13.0?
>
>
>On Mon, May 6, 2013 at 3:39 PM, Mattmann, Chris A (398J) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> OK Gav hooked me up so we now have versions for 0.10.0, 0.11.0 and 0.12.0.
>>
>> I'll continue my pathology for 0.10.0 and 0.11.0 to get a decent change
>> log going (based on dates and SVN tags), but it's going to take a week
>> or so to get this in order.
>>
>> Will report back and update the CHANGELOG when I'm done. In the meantime,
>> when you guys are creating new issues, please set the appropriate Fix
>> Version.
>>
>> Thanks all!
>>
>> Cheers,
>> Chris
>>
>> -----Original Message-----
>> From: <Mattmann>, jpluser <ch...@jpl.nasa.gov>
>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>> Date: Monday, May 6, 2013 3:30 PM
>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
>> Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>>
>> >Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
>> >infra@ too for my admin access to JIRA..
>> >
>> >-----Original Message-----
>> >From: Benjamin Mahler <be...@gmail.com>
>> >Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>> >Date: Monday, May 6, 2013 3:26 PM
>> >To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
>> >Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>> >
>> >>Yeah, I had tried doing it for this ticket but the versions don't yet exist
>> >>in JIRA =/
>> >>
>> >>Added benh to this email.
>> >>
>> >>
>> >>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
>> >>chris.a.mattmann@jpl.nasa.gov> wrote:
>> >>
>> >>> Guys, FYI JIRA issues like this below should have a Fix version.
>> >>>
>> >>> Ben H, or someone with permissions, can you please create versions
>> >>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
>> >>> grouping). Then, once they are created, we can set the Fix version
>> >>> and start generating some boss-hog change logs around here.
>> >>>
>> >>> Cheers,
>> >>> Chris
>> >>>
>> >>> -----Original Message-----
>> >>> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
>> >>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>> >>> Date: Monday, May 6, 2013 3:20 PM
>> >>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>> >>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>> >>>
>> >>> >Benjamin Mahler created MESOS-463:
>> >>> >-------------------------------------
>> >>> >
>> >>> >             Summary: Detector ZNode creation failure.
>> >>> >                 Key: MESOS-463
>> >>> >                 URL: https://issues.apache.org/jira/browse/MESOS-463
>> >>> >             Project: Mesos
>> >>> >          Issue Type: Bug
>> >>> >            Reporter: Benjamin Mahler
>> >>> >
>> >>> >
>> >>> >The following failure message occurred in a test cluster at Twitter:
>> >>> >
>> >>> >    // We fail all non-OK return codes except ZNODEEXISTS (since that
>> >>> >    // means the path we were trying to create exists) and ZNOAUTH
>> >>> >    // (since it's possible that the ACLs on 'dirname(url.path)' don't
>> >>> >    // allow us to create a child znode but we are allowed to create
>> >>> >    // children of 'url.path' itself, which will be determined below
>> >>> >    // if we are contending). Note that it's also possible we got back
>> >>> >    // a ZNONODE because we could not create one of the intermediate
>> >>> >    // znodes (in which case we'll abort in the 'else' below since
>> >>> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
>> >>> >    // also can put a watch on the children of 'url.path'.
>> >>> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
>> >>> >      LOG(FATAL) << "Failed to create '" << url.path
>> >>> >                 << "' in ZooKeeper: " << zk->message(code);
>> >>> >    }
>> >>> >
>> >>> >It's interesting that there was a delay before the slave crashed:
>> >>> >
>> >>> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create '/home/mesos/test/master' in ZooKeeper: invalid zhandle state
>> >>> >*** Check failure stack trace: ***
>> >>> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag_c6f62383-28d1-42c2-aa65-15f9ec42db57
>> >>> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
>> >>> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
>> >>> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task 1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d for framework 201103282247-0000000019-0000
>> >>> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc' of framework '201103282247-0000000019-0000': 1
>> >>> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
>> >>> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
>> >>> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
>> >>> >W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840' of framework '201103282247-0000000019-0000': 1
>> >>> >I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
>> >>> >W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring unexpected status update acknowledgment 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
>> >>> >I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000
>> >>> >W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
>> >>> >I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>> >>> >I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
>> >>> >I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>> >>> >W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0' of framework '201103282247-0000000019-0000': 1
>> >>> >I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000 terminated with status 0
>> >>> >I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
>> >>> >I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486'
>> >>> >I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>> >>> >I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
>> >>> >I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:09.794438 54948 slave.cpp:2031] Executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000 has exited with status '0'
>> >>> >I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
>> >>> >I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000
>> >>> >    @     0x7f050768dbcd  google::LogMessage::Fail()
>> >>> >I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
>> >>> >    @     0x7f0507693837  google::LogMessage::SendToLog()
>> >>> >I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:10.591079 54948 slave.cpp:819] Launching task 1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a for framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>> >>> >I0506 22:00:11.635795 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272/runs/c562a33e-6870-47d8-ae53-8faec70ad328'
>> >>> >W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
>> >>> >W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
>> >>> >W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze
>>cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab558801
>> >>> >9_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
>> >>> >I0506 22:00:21.141474 54946 status_update_manager.cpp:335]
>>Forwarding
>> >>> >status update TASK_LOST (UUID:
>>e7e02c18-abee-4781-9e62-70e0475b9fa3)
>> >>>for
>> >>> >task 
>>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
>> >>> >0f9dd7c22 of framework 201103282247-0000000019-0000 to
>> >>> >master@10.34.128.115:5050
>> >>> >I0506 22:00:21.143030 54948 paths.hpp:302] Created executor
>>directory
>> >>>
>> 
>>>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/framewor
>>>>>>ks
>> >>>>/
>> >>>>20
>> >>> >1103282247-0000000019-0000/executors/thermos-1367877549319-mesos-me
>> >>>
>> 
>>>>>>ta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45
>>>>>>a-
>> >>>>4
>> >>>>4a
>> >>> >9-ae10-25914c460689'
>> >>> >I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task
>> >>>
>> 
>>>>>>'1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c05
>>>>>>62
>> >>>>d
>> >>>>3a
>> >>> >' for executor
>> >>> >thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac
>> >>> >7e-d365c0562d3a of framework '201103282247-0000000019-0000
>> >>> >I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor
>> >>>
>> 
>>>>>>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-d
>>>>>>db
>> >>>>2
>> >>>>75
>> >>> >599233 of framework 201103282247-0000000019-0000 terminated with
>> >>>status 0
>> >>> >I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement
>> >>>for
>> >>> >status update TASK_LOST (UUID:
>>e7e02c18-abee-4781-9e62-70e0475b9fa3)
>> >>>for
>> >>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
>> >>> >400f9dd7c22 of framework 201103282247-0000000019-0000 to
>> >>> >executor(1)@10.34.124.132:55886
>> >>> >I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing
>>executor
>> >>>
>> 
>>>>>>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-d
>>>>>>db
>> >>>>2
>> >>>>75
>> >>> >599233 of framework 201103282247-0000000019-0000
>> >>> >    @     0x7f050768f47c  google::LogMessage::Flush()
>> >>> >I0506 22:00:21.377487 54935 gc.cpp:134] Deleting
>> >>>
>> 
>>>>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/framework
>>>>>>s/
>> >>>>2
>> >>>>01
>> >>>
>> 
>>>>>>103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c
>>>>>>3c
>> >>>>2
>> >>>>27
>> >>> >2
>> >>> >I0506 22:00:21.377707 54935 gc.cpp:143] Deleted
>> >>>
>> 
>>>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/framewor
>>>>>>ks
>> >>>>/
>> >>>>20
>> >>>
>> 
>>>>>>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4
>>>>>>c3
>> >>>>c
>> >>>>22
>> >>> >72'
>> >>> >    @     0x7f050768f6e6
>>google::LogMessageFatal::~LogMessageFatal()
>> >>> >    @     0x7f050741063d
>> >>> >mesos::internal::ZooKeeperMasterDetectorProcess::connected()
>> >>> >    @     0x7f05074111d8
>>std::tr1::_Function_handler<>::_M_invoke()
>> >>> >    @     0x7f0507413c84
>>std::tr1::_Function_handler<>::_M_invoke()
>> >>> >    @     0x7f050758d99a  process::ProcessManager::resume()
>> >>> >    @     0x7f050758e9af  process::schedule()
>> >>> >    @     0x7f0506d2773d  start_thread
>> >>> >I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze
>> >>>cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
>> >>> >d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
>> >>> >I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze
>> >>>cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
>> >>> >019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
>> >>> >I0506 22:00:21.519721 54935 gc.cpp:134] Deleting
>> >>>
>> 
>>>>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/framework
>>>>>>s/
>> >>>>2
>> >>>>01
>> >>>
>> 
>>>>>>103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_
>>>>>>sl
>> >>>>a
>> >>>>ve
>> >>> >_14-0-a67ae3d4
>> >>> 
>>>-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
>> >>> >I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze
>>cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> >877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c
>> >>> >4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
>> >>> >I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received
>> >>> >status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3
>>for
>> >>> >task
>> >>>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9d
>> >>> >d7c22 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task
>> >>>
>> 
>>>>>>1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f80
>>>>>>91
>> >>>>f
>> >>>>0
>> >>> >for framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> 
>>>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
>> >>> >_tag_720a3ca1-ea57-421f-9263-d47e347b866b
>> >>> >I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching
>> >>>
>> 
>>>>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-5
>>>>>>94
>> >>>>8
>> >>>>95
>> >>> >d16c16 (./thermos_executor) in
>> >>>/var/lib/mesos/slaves/201304262233-1937777
>> >>>
>> 
>>>>>>162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/t
>>>>>>he
>> >>>>r
>> >>>>mo
>> >>>
>> 
>>>>>>s-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d
>>>>>>16
>> >>>>c
>> >>>>16
>> >>> >/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=
>> >>> >0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup
>> >>>
>> 
>>>>>>mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877
>>>>>>53
>> >>>>8
>> >>>>34
>> >>> 
>>>7-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262
>> >>> >fff-5bc0-412b-8c86-799354704486
>> >>> >I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> 
>>>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
>> >>> >tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>> >>> >I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze
>>cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>> >877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb27559923
>> >>> >3_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
>> >>> >I0506 22:00:22.244297 54946 status_update_manager.cpp:480]
>>Cleaning up
>> >>> >status update stream for task
>> >>>
>> 
>>>>>>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7
>>>>>>c2
>> >>>>2
>> >>> >of framework 201103282247-0000000019-
>> >>> >0000
>> >>> >I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
>>>>>>_t
>> >>> >ag_720a3ca1-ea57-421f-9263-d47e347b866b
>> >>> >I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
>>>>>>ta
>> >>> >g_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>> >>> >I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing
>>cgroup
>> >>> >controls for executor
>> >>>
>> 
>>>>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-5
>>>>>>94
>> >>>>8
>> >>>>95
>> >>> >d16c16 of framework 201103282247-0000000019-0
>> >>> >000 with resources cpus=0.25; mem=128
>> >>> >I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated
>> >>> >'cpu.shares' to 256 for executor
>> >>>
>> 
>>>>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-5
>>>>>>94
>> >>>>8
>> >>>>95
>> >>> >d16c16 of framework 201103282247-000000001
>> >>> >9-0000
>> >>> >I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated
>> >>> >'memory.limit_in_bytes' to 134217728 for executor
>> >>>
>> 
>>>>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-5
>>>>>>94
>> >>>>8
>> >>>>95
>> >>> >d16c16 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for
>> >>> >'/slave(1)/stats.json'
>> >>> >I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_
>>>>>>ta
>> >>>>g
>> >>>>_e
>> >>> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>> >>> >I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed
>> >>>
>> 
>>>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos
>>>>>>-1
>> >>>>3
>> >>>>67
>> >>>
>> 
>>>>>>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_
>>>>>>ta
>> >>>>g
>> >>>>_e
>> >>> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>> >>> >I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started
>> >>>listening
>> >>> >for OOM events for executor
>> >>>
>> 
>>>>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-5
>>>>>>94
>> >>>>8
>> >>>>95
>> >>> >d16c16 of framework 201103282247-0000000019-0000
>> >>> >I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked
>>executor
>> >>>at
>> >>> >= 20045
>> >>> >    @     0x7f050570bf6d  clone
>> >>> >Fetching resources into
>> >>>
>> 
>>>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/framewor
>>>>>>ks
>> >>>>/
>> >>>>20
>> >>>
>> 
>>>>>>1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta
>>>>>>_s
>> >>>>l
>> >>>>av
>> >>>
>> 
>>>>>>e_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8
>>>>>>c8
>> >>>>6
>> >>>>-7
>> >>> >99354704486'
>> >>> >Fetching resource '/usr/local/bin/thermos_executor'
>> >>> >Copying resource from '/usr/local/bin/thermos_executor' to .
>> >>> >/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted
>> >>> >(core dumped) /usr/local/sbin/mesos-slave --port=5051
>> >>> >--resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}"
>> >>> >--master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR}
>> >>> >${CGROUPS_ISOLATION} "$@"
>> >>> >Slave Exit Status: 134
>> >>> >I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups"
>>isolator
>> >>> >I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26
>>19:49:25
>> >>>by
>> >>> >bmahler
>> >>> >I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave
>> >>> >
>> >>> >--
>> >>> >This message is automatically generated by JIRA.
>> >>> >If you think it was sent incorrectly, please contact your JIRA
>> >>> >administrators
>> >>> >For more information on JIRA, see:
>> >>>http://www.atlassian.com/software/jira
>> >>>
>> >>>
>> >
>>
>>


Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by Benjamin Mahler <bm...@twitter.com>.
Cool, can you also create a 0.13.0?


On Mon, May 6, 2013 at 3:39 PM, Mattmann, Chris A (398J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> OK Gav hooked me up so we now have versions for 0.10.0, 0.11.0 and 0.12.0.
>
> I'll continue my pathology for 0.10.0 and 0.11.0 to get a decent change
> log going (based on dates and SVN tags), but it's going to take a week
> or so to get this in order.
>
> Will report back and update the CHANGELOG when I'm done. In the meanwhile
> when you guys are creating new issues please set the appropriate Fix
> Version.
>
> Thanks all!
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: <Mattmann>, jpluser <ch...@jpl.nasa.gov>
> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
> Date: Monday, May 6, 2013 3:30 PM
> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben
> Hindman <be...@twitter.com>
> Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>
> >Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
> >infra@ too for my admin access to JIRA..
> >
> >-----Original Message-----
> >From: Benjamin Mahler <be...@gmail.com>
> >Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
> >Date: Monday, May 6, 2013 3:26 PM
> >To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben Hindman <be...@twitter.com>
> >Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
> >
> >>Yeah, I had tried doing it for this ticket but the versions don't yet exist
> >>in JIRA =/
> >>
> >>Added benh to this email.
> >>
> >>
> >>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
> >>chris.a.mattmann@jpl.nasa.gov> wrote:
> >>
> >>> Guys, FYI JIRA issues like this below should have a Fix version.
> >>>
> >>> Ben H, or someone with permissions, can you please create versions
> >>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
> >>> grouping). Then, once they are created, we can set the Fix version
> >>> and start generating some boss-hog change logs around here.
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> -----Original Message-----
> >>> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
> >>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
> >>> Date: Monday, May 6, 2013 3:20 PM
> >>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
> >>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
> >>>
> >>> >Benjamin Mahler created MESOS-463:
> >>> >-------------------------------------
> >>> >
> >>> >             Summary: Detector ZNode creation failure.
> >>> >                 Key: MESOS-463
> >>> >                 URL: https://issues.apache.org/jira/browse/MESOS-463
> >>> >             Project: Mesos
> >>> >          Issue Type: Bug
> >>> >            Reporter: Benjamin Mahler
> >>> >
> >>> >
> >>> >The following failure message occurred in a test cluster at Twitter:
> >>> >
> >>> >    // We fail all non-OK return codes except ZNODEEXISTS (since that
> >>> >    // means the path we were trying to create exists) and ZNOAUTH
> >>> >    // (since it's possible that the ACLs on 'dirname(url.path)' don't
> >>> >    // allow us to create a child znode but we are allowed to create
> >>> >    // children of 'url.path' itself, which will be determined below
> >>> >    // if we are contending). Note that it's also possible we got back
> >>> >    // a ZNONODE because we could not create one of the intermediate
> >>> >    // znodes (in which case we'll abort in the 'else' below since
> >>> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
> >>> >    // also can put a watch on the children of 'url.path'.
> >>> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
> >>> >      LOG(FATAL) << "Failed to create '" << url.path
> >>> >                 << "' in ZooKeeper: " << zk->message(code);
> >>> >    }
> >>> >
> >>> >It's interesting that there was a delay before the slave crashed:
> >>> >
> >>> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create '/home/mesos/test/master' in ZooKeeper: invalid zhandle state
> >>> >*** Check failure stack trace: ***
> >>> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag_c6f62383-28d1-42c2-aa65-15f9ec42db57
> >>> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task 1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d for framework 201103282247-0000000019-0000
> >>> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc' of framework '201103282247-0000000019-0000': 1
> >>> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
> >>> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
> >>> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840' of framework '201103282247-0000000019-0000': 1
> >>> >I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring unexpected status update acknowledgment 146b821e-ed93-4059-8971-b3c73a7d02ed for task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82c86f6 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c of framework 201103282247-0000000019-0000
> >>> >W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
> >>> >I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >>> >I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
> >>> >I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >>> >W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0' of framework '201103282247-0000000019-0000': 1
> >>> >I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-799354704486'
> >>> >I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b26ba555-a6bd-4448-a5b4-439beb442820
> >>> >I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:09.794438 54948 slave.cpp:2031] Executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000 has exited with status '0'
> >>> >I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor 'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d558f9bb2' of framework 201103282247-0000000019-0000
> >>> >    @     0x7f050768dbcd  google::LogMessage::Fail()
> >>> >I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >    @     0x7f0507693837  google::LogMessage::SendToLog()
> >>> >I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c4a of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.591079 54948 slave.cpp:819] Launching task 1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a for framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:11.635795 54935 gc.cpp:143] Deleted '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c2272/runs/c562a33e-6870-47d8-ae53-8faec70ad328'
> >>> >W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
> >>> >W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
> >>> >W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
> >>> >I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to master@10.34.128.115:5050
> >>> >I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory '/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-44a9-ae10-25914c460689'
> >>> >I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task '1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a' for executor thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a of framework '201103282247-0000000019-0000
> >>> >I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000 terminated with status 0
> >>> >I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement for status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 to executor(1)@10.34.124.132:55886
> >>> >I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275599233 of framework 201103282247-0000000019-0000
> >>> >    @     0x7f050768f47c  google::LogMessage::Flush()
> >>> >I0506 22:00:21.377487 54935 gc.cpp:134] Deleting
> >>>
> >>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/
> >>>>2
> >>>>01
> >>>
> >>>>103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c
> >>>>2
> >>>>27
> >>> >2
> >>> >I0506 22:00:21.377707 54935 gc.cpp:143] Deleted
> >>>
> >>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
> >>>>/
> >>>>20
> >>>
> >>>>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3
> >>>>c
> >>>>22
> >>> >72'
> >>> >    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
> >>> >    @     0x7f050741063d
> >>> >mesos::internal::ZooKeeperMasterDetectorProcess::connected()
> >>> >    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
> >>> >    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
> >>> >    @     0x7f050758d99a  process::ProcessManager::resume()
> >>> >    @     0x7f050758e9af  process::schedule()
> >>> >    @     0x7f0506d2773d  start_thread
> >>> >I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze
> >>>cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
> >>> >d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
> >>> >I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze
> >>>cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
> >>> >019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
> >>> >I0506 22:00:21.519721 54935 gc.cpp:134] Deleting
> >>>
> >>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/
> >>>>2
> >>>>01
> >>>
> >>>>103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_sl
> >>>>a
> >>>>ve
> >>> >_14-0-a67ae3d4
> >>> >-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
> >>> >I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c
> >>> >4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
> >>> >I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received
> >>> >status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3 for
> >>> >task
> >>>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9d
> >>> >d7c22 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task
> >>>
> >>>>1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f8091
> >>>>f
> >>>>0
> >>> >for framework 201103282247-0000000019-0000
> >>> >I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
> >>> >_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching
> >>>
> >>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594
> >>>>8
> >>>>95
> >>> >d16c16 (./thermos_executor) in
> >>>/var/lib/mesos/slaves/201304262233-1937777
> >>>
> >>>>162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/the
> >>>>r
> >>>>mo
> >>>
> >>>>s-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16
> >>>>c
> >>>>16
> >>> >/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=
> >>> >0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup
> >>>
> >>>>mesos/framework_201103282247-0000000019-0000_executor_thermos-136787753
> >>>>8
> >>>>34
> >>> >7-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262
> >>> >fff-5bc0-412b-8c86-799354704486
> >>> >I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
> >>> >tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>> >877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb27559923
> >>> >3_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
> >>> >I0506 22:00:22.244297 54946 status_update_manager.cpp:480] Cleaning up
> >>> >status update stream for task
> >>>
> >>>>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
> >>>>2
> >>> >of framework 201103282247-0000000019-
> >>> >0000
> >>> >I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>>
> >>>>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_t
> >>> >ag_720a3ca1-ea57-421f-9263-d47e347b866b
> >>> >I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>>
> >>>>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_ta
> >>> >g_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >>> >I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing cgroup
> >>> >controls for executor
> >>>
> >>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594
> >>>>8
> >>>>95
> >>> >d16c16 of framework 201103282247-0000000019-0
> >>> >000 with resources cpus=0.25; mem=128
> >>> >I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated
> >>> >'cpu.shares' to 256 for executor
> >>>
> >>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594
> >>>>8
> >>>>95
> >>> >d16c16 of framework 201103282247-000000001
> >>> >9-0000
> >>> >I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated
> >>> >'memory.limit_in_bytes' to 134217728 for executor
> >>>
> >>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594
> >>>>8
> >>>>95
> >>> >d16c16 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for
> >>> >'/slave(1)/stats.json'
> >>> >I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>>
> >>>>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_ta
> >>>>g
> >>>>_e
> >>> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed
> >>>
> >>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
> >>>>3
> >>>>67
> >>>
> >>>>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_ta
> >>>>g
> >>>>_e
> >>> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >>> >I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started
> >>>listening
> >>> >for OOM events for executor
> >>>
> >>>>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594
> >>>>8
> >>>>95
> >>> >d16c16 of framework 201103282247-0000000019-0000
> >>> >I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked executor
> >>>at
> >>> >= 20045
> >>> >    @     0x7f050570bf6d  clone
> >>> >Fetching resources into
> >>>
> >>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
> >>>>/
> >>>>20
> >>>
> >>>>1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_s
> >>>>l
> >>>>av
> >>>
> >>>>e_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c8
> >>>>6
> >>>>-7
> >>> >99354704486'
> >>> >Fetching resource '/usr/local/bin/thermos_executor'
> >>> >Copying resource from '/usr/local/bin/thermos_executor' to .
> >>> >/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted
> >>> >(core dumped) /usr/local/sbin/mesos-slave --port=5051
> >>> >--resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}"
> >>> >--master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR}
> >>> >${CGROUPS_ISOLATION} "$@"
> >>> >Slave Exit Status: 134
> >>> >I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups" isolator
> >>> >I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26 19:49:25
> >>>by
> >>> >bmahler
> >>> >I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave
> >>>
> >>>
> >
>
>

Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
OK, Gav hooked me up, so we now have versions for 0.10.0, 0.11.0 and 0.12.0.

I'll continue my pathology for 0.10.0 and 0.11.0 to get a decent change
log going (based on dates and SVN tags), but it's going to take a week
or so to get this in order.

Will report back and update the CHANGELOG when I'm done. In the meantime,
when you guys are creating new issues, please set the appropriate Fix
Version.

Thanks all!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: <Mattmann>, jpluser <ch...@jpl.nasa.gov>
Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Date: Monday, May 6, 2013 3:30 PM
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben
Hindman <be...@twitter.com>
Subject: Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

>Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
>infra@ too for my admin access to JIRA..
>
>-----Original Message-----
>From: Benjamin Mahler <be...@gmail.com>
>Reply-To: "mesos-dev@incubator.apache.org"
><me...@incubator.apache.org>
>Date: Monday, May 6, 2013 3:26 PM
>To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben
>Hindman <be...@twitter.com>
>Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation
>failure.
>
>>Yeah, I had tried doing it for this ticket but the versions don't yet
>>exist in JIRA =/
>>
>>Added benh to this email.
>>
>>
>>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
>>chris.a.mattmann@jpl.nasa.gov> wrote:
>>
>>> Guys, FYI JIRA issues like this below should have a Fix version.
>>>
>>> Ben H, or someone with permissions, can you please create versions
>>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
>>> grouping). Then, once they are created, we can set the Fix version
>>> and start generating some boss-hog change logs around here.
>>>
>>> Cheers,
>>> Chris
>>>
>>> -----Original Message-----
>>> From: "Benjamin Mahler (JIRA)" <ji...@apache.org>
>>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>>> Date: Monday, May 6, 2013 3:20 PM
>>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>>>
>>> >Benjamin Mahler created MESOS-463:
>>> >-------------------------------------
>>> >
>>> >             Summary: Detector ZNode creation failure.
>>> >                 Key: MESOS-463
>>> >                 URL: https://issues.apache.org/jira/browse/MESOS-463
>>> >             Project: Mesos
>>> >          Issue Type: Bug
>>> >            Reporter: Benjamin Mahler
>>> >
>>> >
>>> >The following failure message occurred in a test cluster at Twitter:
>>> >
>>> >    // We fail all non-OK return codes except ZNODEEXISTS (since that
>>> >    // means the path we were trying to create exists) and ZNOAUTH
>>> >    // (since it's possible that the ACLs on 'dirname(url.path)' don't
>>> >    // allow us to create a child znode but we are allowed to create
>>> >    // children of 'url.path' itself, which will be determined below
>>> >    // if we are contending). Note that it's also possible we got back
>>> >    // a ZNONODE because we could not create one of the intermediate
>>> >    // znodes (in which case we'll abort in the 'else' below since
>>> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
>>> >    // also can put a watch on the children of 'url.path'.
>>> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
>>> >      LOG(FATAL) << "Failed to create '" << url.path
>>> >                 << "' in ZooKeeper: " << zk->message(code);
>>> >    }
>>> >
>>> >It's interesting that there was a delay before the slave crashed:
>>> >
>>> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create '/home/mesos/test/master' in ZooKeeper: invalid zhandle state
>>> >*** Check failure stack trace: ***
>>> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag_c6f62383-28d1-42c2-aa65-15f9ec42db57
>>> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
>>> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22 of framework 201103282247-0000000019-0000
>>> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task 1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d for framework 201103282247-0000000019-0000
>>> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource usage for executor 'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc' of framework '201103282247-0000000019-0000': 1
>>> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_ta
>>>>g
>>>>_b
>>> >26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
>>> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_t
>>>>a
>>>>g_
>>> >6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
>>> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received
>>> >status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for
>>> >task 
>>>1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48
>>> >f50c4a of framework 201103282247-0000000019-0000
>>> >W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect
>>>resource
>>> >usage for executor
>>> 
>>>>'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-666
>>>>3
>>>>ae
>>> >e94840' of framework '201103282247-0000000019-
>>> >0000': 1
>>> >I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received
>>> >status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for
>>> >task 
>>>1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767ac
>>> >b0d90 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received
>>> >status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for
>>> >task 
>>>1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82
>>> >c86f6 of framework 201103282247-0000000019-0000
>>> >W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring
>>> >unexpected status update acknowledgment
>>> >146b821e-ed93-4059-8971-b3c73a7d02ed for task
>>> >1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5
>>> >fa-d754a82c86f6 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor
>>> 
>>>>thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c
>>>>3
>>>>8f
>>> >cec5e1c of framework 201103282247-0000000019-0000 terminated with
>>>status 0
>>> >I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor
>>> 
>>>>thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c
>>>>3
>>>>8f
>>> >cec5e1c of framework 201103282247-0000000019-0000
>>> >W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending
>>> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3)
>>>for
>>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
>>> >f9dd7c22 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding
>>> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3)
>>>for
>>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
>>> >0f9dd7c22 of framework 201103282247-0000000019-0000 to
>>> >master@10.34.128.115:5050
>>> >I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc
>>> >_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>>> >I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_
>>> >tag_b26ba555-a6bd-4448-a5b4-439beb442820
>>> >I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_t
>>> >ag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>>> >W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect
>>>resource
>>> >usage for executor
>>> 
>>>>'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-37
>>>>6
>>>>ab
>>> >b0df8d0' of framework '201103282247-0000000019
>>> >-0000': 1
>>> >I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor
>>> 
>>>>thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2a
>>>>a
>>>>b5
>>> >588019 of framework 201103282247-0000000019-0000 terminated with
>>>status 0
>>> >I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e
>>> >1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
>>> >I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file
>>> 
>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
>>>>/
>>>>20
>>> >1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-me
>>> 
>>>>ta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-4
>>>>1
>>>>2b
>>> >-8c86-799354704486'
>>> >I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
>>> >2_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>>> >I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_ta
>>> >g_b26ba555-a6bd-4448-a5b4-439beb442820
>>> >I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor
>>> 
>>>>thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2a
>>>>a
>>>>b5
>>> >588019 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:09.794438 54948 slave.cpp:2031] Executor
>>> 
>>>>'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e6
>>>>9
>>>>d5
>>> >58f9bb2' of framework 201103282247-0000000019-0000 has exited with
>>>status
>>> >'0'
>>> >I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor
>>> 
>>>>thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-447
>>>>6
>>>>7a
>>> >cb0d90 of framework 201103282247-0000000019-0000 terminated with
>>>status 0
>>> >I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor
>>> 
>>>>thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-447
>>>>6
>>>>7a
>>> >cb0d90 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor
>>> 
>>>>'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e6
>>>>9
>>>>d5
>>> >58f9bb2' of framework 201103282247-0000000019-0000
>>> >    @     0x7f050768dbcd  google::LogMessage::Fail()
>>> >I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor
>>> 
>>>>thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f
>>>>8
>>>>a4
>>> >8f50c4a of framework 201103282247-0000000019-0000 terminated with
>>>status 0
>>> >    @     0x7f0507693837  google::LogMessage::SendToLog()
>>> >I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor
>>> 
>>>>thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f
>>>>8
>>>>a4
>>> >8f50c4a of framework 201103282247-0000000019-0000
>>> >I0506 22:00:10.591079 54948 slave.cpp:819] Launching task
>>> 
>>>>1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d
>>>>3
>>>>a
>>> >for framework 201103282247-0000000019-0000
>>> >I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d9
>>> >0_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>>> >I0506 22:00:11.635795 54935 gc.cpp:143] Deleted
>>> 
>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
>>>>/
>>>>20
>>> 
>>>>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3
>>>>c
>>>>22
>>> >72/runs/c562a33
>>> >e-6870-47d8-ae53-8faec70ad328'
>>> >W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_ta
>>>>g
>>>>_2
>>> >407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
>>> >W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> 
>>>>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_t
>>>>a
>>>>g_
>>> >720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
>>> >W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending
>>> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3)
>>>for
>>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
>>> >f9dd7c22 of framework 201103282247-0000000019-0000
>>> >I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab558801
>>> >9_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
>>> >I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding
>>> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3)
>>>for
>>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
>>> >0f9dd7c22 of framework 201103282247-0000000019-0000 to
>>> >master@10.34.128.115:5050
>>> >I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory
>>> 
>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
>>>>/
>>>>20
>>> >1103282247-0000000019-0000/executors/thermos-1367877549319-mesos-me
>>> 
>>>>ta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-
>>>>4
>>>>4a
>>> >9-ae10-25914c460689'
>>> >I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task
>>> 
>>>>'1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562
>>>>d
>>>>3a
>>> >' for executor
>>> >thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac
>>> >7e-d365c0562d3a of framework '201103282247-0000000019-0000
>>> >I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor
>>> 
>>>>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb
>>>>2
>>>>75
>>> >599233 of framework 201103282247-0000000019-0000 terminated with
>>>status 0
>>> >I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement
>>>for
>>> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3)
>>>for
>>> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
>>> >400f9dd7c22 of framework 201103282247-0000000019-0000 to
>>> >executor(1)@10.34.124.132:55886
>>> >I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor
>>> 
>>>>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb
>>>>2
>>>>75
>>> >599233 of framework 201103282247-0000000019-0000
>>> >    @     0x7f050768f47c  google::LogMessage::Flush()
>>> >I0506 22:00:21.377487 54935 gc.cpp:134] Deleting
>>> 
>>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/
>>>>2
>>>>01
>>> 
>>>>103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c
>>>>2
>>>>27
>>> >2
>>> >I0506 22:00:21.377707 54935 gc.cpp:143] Deleted
>>> 
>>>>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks
>>>>/
>>>>20
>>> 
>>>>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3
>>>>c
>>>>22
>>> >72'
>>> >    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
>>> >    @     0x7f050741063d
>>> >mesos::internal::ZooKeeperMasterDetectorProcess::connected()
>>> >    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
>>> >    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
>>> >    @     0x7f050758d99a  process::ProcessManager::resume()
>>> >    @     0x7f050758e9af  process::schedule()
>>> >    @     0x7f0506d2773d  start_thread
>>> >I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze
>>>cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
>>> >d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
>>> >I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze
>>>cgroup
>>> 
>>>>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1
>>>>3
>>>>67
>>> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
>>> >019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
>>> >I0506 22:00:21.519721 54935 gc.cpp:134] Deleting
>>> 
>>>>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/
>>>>2
>>>>01
>>> 
>>>>103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_sl
>>>>a
>>>>ve
>>> >_14-0-a67ae3d4
>>> >-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b


Re: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
Thanks Ben M -- we'll get it sorted! Thanks for trying. I'll poke
infra@ too for my admin access to JIRA.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Benjamin Mahler <be...@gmail.com>
Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Date: Monday, May 6, 2013 3:26 PM
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>, Ben
Hindman <be...@twitter.com>
Subject: Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation
failure.

>Yeah, I had tried doing it for this ticket but the versions don't yet exist
>in JIRA =/
>
>Added benh to this email.
>
>
>On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Guys, FYI JIRA issues like this below should have a Fix version.
>>
>> Ben H, or someone with permissions, can you please create versions
>> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
>> grouping). Then, once they are created, we can set the Fix version
>> and start generating some boss-hog change logs around here.
>>
>> Cheers,
>> Chris
>>
>> -----Original Message-----
>> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
>> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
>> Date: Monday, May 6, 2013 3:20 PM
>> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
>> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>>


Re: FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by Benjamin Mahler <be...@gmail.com>.
Yeah, I had tried doing it for this ticket but the versions don't yet exist
in JIRA =/

Added benh to this email.


On Mon, May 6, 2013 at 3:22 PM, Mattmann, Chris A (398J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Guys, FYI JIRA issues like this below should have a Fix version.
>
> Ben H, or someone with permissions, can you please create versions
> 0.11.0, 0.10.0, and 0.12.0 (which I'd assume MESOS-463 to be in that
> grouping). Then, once they are created, we can set the Fix version
> and start generating some boss-hog change logs around here.
>
> Cheers,
> Chris
>
> -----Original Message-----
> From: "Benjamin Mahler   (JIRA)" <ji...@apache.org>
> Reply-To: "mesos-dev@incubator.apache.org" <mesos-dev@incubator.apache.org>
> Date: Monday, May 6, 2013 3:20 PM
> To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
> Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.
>
> >Benjamin Mahler created MESOS-463:
> >-------------------------------------
> >
> >             Summary: Detector ZNode creation failure.
> >                 Key: MESOS-463
> >                 URL: https://issues.apache.org/jira/browse/MESOS-463
> >             Project: Mesos
> >          Issue Type: Bug
> >            Reporter: Benjamin Mahler
> >
> >
> >The following failure message occurred in a test cluster at Twitter:
> >
> >    // We fail all non-OK return codes except ZNODEEXISTS (since that
> >    // means the path we were trying to create exists) and ZNOAUTH
> >    // (since it's possible that the ACLs on 'dirname(url.path)' don't
> >    // allow us to create a child znode but we are allowed to create
> >    // children of 'url.path' itself, which will be determined below
> >    // if we are contending). Note that it's also possible we got back
> >    // a ZNONODE because we could not create one of the intermediate
> >    // znodes (in which case we'll abort in the 'else' below since
> >    // ZNONODE is non-retryable). TODO(benh): Need to check that we
> >    // also can put a watch on the children of 'url.path'.
> >    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
> >      LOG(FATAL) << "Failed to create '" << url.path
> >                 << "' in ZooKeeper: " << zk->message(code);
> >    }
> >
> >It's interesting that there was a delay before the slave crashed:
> >
> >F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create
> >'/home/mesos/test/master' in ZooKeeper: invalid zhandle state
> >*** Check failure stack trace: ***
> >I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag
> >_c6f62383-28d1-42c2-aa65-15f9ec42db57
> >I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9
> >dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
> >I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9
> >dd7c22 of framework 201103282247-0000000019-0000
> >I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task
> >1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d
> >for framework 201103282247-0000000019-0000
> >W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource
> >usage for executor
> >'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39
> >564e5bfc' of framework '201103282247-000000001
> >9-0000': 1
> >W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b
> >26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
> >W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_
> >6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
> >I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received
> >status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for
> >task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48
> >f50c4a of framework 201103282247-0000000019-0000
> >W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource
> >usage for executor
> >'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663ae
> >e94840' of framework '201103282247-0000000019-
> >0000': 1
> >I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received
> >status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for
> >task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767ac
> >b0d90 of framework 201103282247-0000000019-0000
> >I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received
> >status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for
> >task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82
> >c86f6 of framework 201103282247-0000000019-0000
> >W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring
> >unexpected status update acknowledgment
> >146b821e-ed93-4059-8971-b3c73a7d02ed for task
> >1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5
> >fa-d754a82c86f6 of framework 201103282247-0000000019-0000
> >I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38f
> >cec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
> >I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38f
> >cec5e1c of framework 201103282247-0000000019-0000
> >W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending
> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
> >f9dd7c22 of framework 201103282247-0000000019-0000
> >I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding
> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
> >0f9dd7c22 of framework 201103282247-0000000019-0000 to
> >master@10.34.128.115:5050
> >I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc
> >_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_
> >tag_b26ba555-a6bd-4448-a5b4-439beb442820
> >I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_t
> >ag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
> >W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource
> >usage for executor
> >'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376ab
> >b0df8d0' of framework '201103282247-0000000019
> >-0000': 1
> >I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5
> >588019 of framework 201103282247-0000000019-0000 terminated with status 0
> >I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e
> >1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file
> >'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
> >1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-me
> >ta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b
> >-8c86-799354704486'
> >I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
> >2_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_ta
> >g_b26ba555-a6bd-4448-a5b4-439beb442820
> >I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5
> >588019 of framework 201103282247-0000000019-0000
> >I0506 22:00:09.794438 54948 slave.cpp:2031] Executor
> >'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d5
> >58f9bb2' of framework 201103282247-0000000019-0000 has exited with status
> >'0'
> >I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767a
> >cb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
> >I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767a
> >cb0d90 of framework 201103282247-0000000019-0000
> >I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor
> >'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d5
> >58f9bb2' of framework 201103282247-0000000019-0000
> >    @     0x7f050768dbcd  google::LogMessage::Fail()
> >I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a4
> >8f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
> >    @     0x7f0507693837  google::LogMessage::SendToLog()
> >I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a4
> >8f50c4a of framework 201103282247-0000000019-0000
> >I0506 22:00:10.591079 54948 slave.cpp:819] Launching task
> >1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a
> >for framework 201103282247-0000000019-0000
> >I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d9
> >0_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >I0506 22:00:11.635795 54935 gc.cpp:143] Deleted
> >'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
> >1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c22
> >72/runs/c562a33
> >e-6870-47d8-ae53-8faec70ad328'
> >W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2
> >407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
> >W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_
> >720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
> >W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending
> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
> >f9dd7c22 of framework 201103282247-0000000019-0000
> >I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab558801
> >9_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
> >I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding
> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
> >0f9dd7c22 of framework 201103282247-0000000019-0000 to
> >master@10.34.128.115:5050
> >I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory
> >'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
> >1103282247-0000000019-0000/executors/thermos-1367877549319-mesos-me
> >ta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-44a
> >9-ae10-25914c460689'
> >I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task
> >'1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a
> >' for executor
> >thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac
> >7e-d365c0562d3a of framework '201103282247-0000000019-0000
> >I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor
> >thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275
> >599233 of framework 201103282247-0000000019-0000 terminated with status 0
> >I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement for
> >status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
> >400f9dd7c22 of framework 201103282247-0000000019-0000 to
> >executor(1)@10.34.124.132:55886
> >I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor
> >thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275
> >599233 of framework 201103282247-0000000019-0000
> >    @     0x7f050768f47c  google::LogMessage::Flush()
> >I0506 22:00:21.377487 54935 gc.cpp:134] Deleting
> >/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201
> >103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c227
> >2
> >I0506 22:00:21.377707 54935 gc.cpp:143] Deleted
> >'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
> >1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c22
> >72'
> >    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
> >    @     0x7f050741063d
> >mesos::internal::ZooKeeperMasterDetectorProcess::connected()
> >    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
> >    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
> >    @     0x7f050758d99a  process::ProcessManager::resume()
> >    @     0x7f050758e9af  process::schedule()
> >    @     0x7f0506d2773d  start_thread
> >I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
> >d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
> >I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
> >019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
> >I0506 22:00:21.519721 54935 gc.cpp:134] Deleting
> >/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201
> >103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_slave
> >_14-0-a67ae3d4
> >-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
> >I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c
> >4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
> >I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received
> >status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3 for
> >task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9d
> >d7c22 of framework 201103282247-0000000019-0000
> >I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task
> >1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f8091f0
> >for framework 201103282247-0000000019-0000
> >I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
> >_tag_720a3ca1-ea57-421f-9263-d47e347b866b
> >I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching
> >thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
> >d16c16 (./thermos_executor) in /var/lib/mesos/slaves/201304262233-1937777
> >162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermo
> >s-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16
> >/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=
> >0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup
> >mesos/framework_201103282247-0000000019-0000_executor_thermos-136787753834
> >7-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262
> >fff-5bc0-412b-8c86-799354704486
> >I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
> >tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb27559923
> >3_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
> >I0506 22:00:22.244297 54946 status_update_manager.cpp:480] Cleaning up
> >status update stream for task
> >1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22
> >of framework 201103282247-0000000019-
> >0000
> >I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_t
> >ag_720a3ca1-ea57-421f-9263-d47e347b866b
> >I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_ta
> >g_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
> >I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing cgroup
> >controls for executor
> >thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
> >d16c16 of framework 201103282247-0000000019-0
> >000 with resources cpus=0.25; mem=128
> >I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated
> >'cpu.shares' to 256 for executor
> >thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
> >d16c16 of framework 201103282247-000000001
> >9-0000
> >I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated
> >'memory.limit_in_bytes' to 134217728 for executor
> >thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
> >d16c16 of framework 201103282247-0000000019-0000
> >I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for
> >'/slave(1)/stats.json'
> >I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e
> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed
> >/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
> >877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e
> >0d3ee4d-e89c-4892-a46d-f5a85854b3a0
> >I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started listening
> >for OOM events for executor
> >thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
> >d16c16 of framework 201103282247-0000000019-0000
> >I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked executor at
> >= 20045
> >    @     0x7f050570bf6d  clone
> >Fetching resources into
> >'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
> >1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slav
> >e_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-7
> >99354704486'
> >Fetching resource '/usr/local/bin/thermos_executor'
> >Copying resource from '/usr/local/bin/thermos_executor' to .
> >/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted
> >(core dumped) /usr/local/sbin/mesos-slave --port=5051
> >--resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}"
> >--master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR}
> >${CGROUPS_ISOLATION} "$@"
> >Slave Exit Status: 134
> >I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups" isolator
> >I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26 19:49:25 by
> >bmahler
> >I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave
> >
> >--
> >This message is automatically generated by JIRA.
> >If you think it was sent incorrectly, please contact your JIRA
> >administrators
> >For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>

FW: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
Guys, FYI: JIRA issues like the one below should have a Fix version.

Ben H, or someone with permissions, can you please create versions
0.11.0, 0.10.0, and 0.12.0 (I'd assume MESOS-463 falls into that
grouping)? Then, once they are created, we can set the Fix version
and start generating some boss-hog change logs around here.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: "Benjamin Mahler (JIRA)" <ji...@apache.org>
Reply-To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Date: Monday, May 6, 2013 3:20 PM
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Subject: [jira] [Created] (MESOS-463) Detector ZNode creation failure.

>Benjamin Mahler created MESOS-463:
>-------------------------------------
>
>             Summary: Detector ZNode creation failure.
>                 Key: MESOS-463
>                 URL: https://issues.apache.org/jira/browse/MESOS-463
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Mahler
>
>
>The following failure message occurred in a test cluster at Twitter:
>
>    // We fail all non-OK return codes except ZNODEEXISTS (since that
>    // means the path we were trying to create exists) and ZNOAUTH
>    // (since it's possible that the ACLs on 'dirname(url.path)' don't
>    // allow us to create a child znode but we are allowed to create
>    // children of 'url.path' itself, which will be determined below
>    // if we are contending). Note that it's also possible we got back
>    // a ZNONODE because we could not create one of the intermediate
>    // znodes (in which case we'll abort in the 'else' below since
>    // ZNONODE is non-retryable). TODO(benh): Need to check that we
>    // also can put a watch on the children of 'url.path'.
>    if (code != ZOK && code != ZNODEEXISTS && code != ZNOAUTH) {
>      LOG(FATAL) << "Failed to create '" << url.path
>                 << "' in ZooKeeper: " << zk->message(code);
>    }
>
>It's interesting that there was a delay before the slave crashed:
>
>F0506 21:59:46.769142 54944 detector.cpp:315] Failed to create
>'/home/mesos/test/master' in ZooKeeper: invalid zhandle state
>*** Check failure stack trace: ***
>I0506 21:59:47.398509 54940 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663aee94840_tag
>_c6f62383-28d1-42c2-aa65-15f9ec42db57
>I0506 21:59:46.538936 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9
>dd7c22 of framework 201103282247-0000000019-0000 terminated with status 0
>I0506 22:00:01.329169 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9
>dd7c22 of framework 201103282247-0000000019-0000
>I0506 21:59:47.576560 54948 slave.cpp:721] Got assigned task
>1367877572534-mesos-meta_slave_2-4-3b3755d9-9435-4480-9068-d851234cb91d
>for framework 201103282247-0000000019-0000
>W0506 21:59:47.576581 54941 monitor.cpp:167] Failed to collect resource
>usage for executor
>'thermos-1367877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39
>564e5bfc' of framework '201103282247-000000001
>9-0000': 1
>W0506 21:59:51.089640 54939 cgroups.cpp:1261] Unable to freeze
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_tag_b
>26ba555-a6bd-4448-a5b4-439beb442820 within 51 attempts
>W0506 21:59:51.090683 54943 cgroups.cpp:1261] Unable to freeze
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_tag_
>6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4 within 51 attempts
>I0506 21:59:47.518194 54946 status_update_manager.cpp:359] Received
>status update acknowledgement c7701021-7eac-4711-96ac-093871462e44 for
>task 1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48
>f50c4a of framework 201103282247-0000000019-0000
>W0506 22:00:01.439838 54941 monitor.cpp:167] Failed to collect resource
>usage for executor
>'thermos-1367877344969-mesos-meta_slave_4-7-045d344f-1737-4901-9720-6663ae
>e94840' of framework '201103282247-0000000019-
>0000': 1
>I0506 22:00:01.468596 54946 status_update_manager.cpp:359] Received
>status update acknowledgement 76ed08a3-1f8a-4eb2-8d2a-28de822ee370 for
>task 1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767ac
>b0d90 of framework 201103282247-0000000019-0000
>I0506 22:00:01.580569 54946 status_update_manager.cpp:359] Received
>status update acknowledgement 146b821e-ed93-4059-8971-b3c73a7d02ed for
>task 1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5fa-d754a82
>c86f6 of framework 201103282247-0000000019-0000
>W0506 22:00:01.580670 54946 status_update_manager.cpp:382] Ignoring
>unexpected status update acknowledgment
>146b821e-ed93-4059-8971-b3c73a7d02ed for task
>1367877517296-mesos-meta_slave_22-4-e2101290-0a8e-4de1-a5
>fa-d754a82c86f6 of framework 201103282247-0000000019-0000
>I0506 22:00:01.888521 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38f
>cec5e1c of framework 201103282247-0000000019-0000 terminated with status 0
>I0506 22:00:06.876126 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38f
>cec5e1c of framework 201103282247-0000000019-0000
>W0506 22:00:01.917793 54946 status_update_manager.cpp:432] Resending
>status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
>f9dd7c22 of framework 201103282247-0000000019-0000
>I0506 22:00:08.930923 54946 status_update_manager.cpp:335] Forwarding
>status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
>0f9dd7c22 of framework 201103282247-0000000019-0000 to
>master@10.34.128.115:5050
>I0506 22:00:02.215196 54949 cgroups.cpp:1190] Trying to thaw cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc
>_tag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>I0506 22:00:02.243545 54945 cgroups.cpp:1190] Trying to thaw cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_
>tag_b26ba555-a6bd-4448-a5b4-439beb442820
>I0506 22:00:09.717344 54949 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877382194-mesos-meta_slave_14-19-cb386113-baa0-4474-a8de-7e39564e5bfc_t
>ag_6ce11d7c-ab7c-4e25-9d9d-7f37fd04c3f4
>W0506 22:00:01.917774 54941 monitor.cpp:167] Failed to collect resource
>usage for executor
>'thermos-1367877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376ab
>b0df8d0' of framework '201103282247-0000000019
>-0000': 1
>I0506 22:00:06.931675 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5
>588019 of framework 201103282247-0000000019-0000 terminated with status 0
>I0506 22:00:06.960703 54934 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e
>1c_tag_720a3ca1-ea57-421f-9263-d47e347b866b
>I0506 22:00:02.157475 54948 slave.cpp:514] Successfully attached file
>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
>1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-me
>ta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b
>-8c86-799354704486'
>I0506 22:00:02.243666 54937 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c2
>2_tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>I0506 22:00:09.717412 54945 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877355794-mesos-meta_slave_41-2-6f44eb1d-9d77-4201-a3dd-376abb0df8d0_ta
>g_b26ba555-a6bd-4448-a5b4-439beb442820
>I0506 22:00:09.766968 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5
>588019 of framework 201103282247-0000000019-0000
>I0506 22:00:09.794438 54948 slave.cpp:2031] Executor
>'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d5
>58f9bb2' of framework 201103282247-0000000019-0000 has exited with status
>'0'
>I0506 22:00:10.080941 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767a
>cb0d90 of framework 201103282247-0000000019-0000 terminated with status 0
>I0506 22:00:10.081022 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767a
>cb0d90 of framework 201103282247-0000000019-0000
>I0506 22:00:10.138010 54948 slave.cpp:2166] Cleaning up executor
>'thermos-1367877283998-mesos-meta_slave_25-5-b56d573d-e21c-446d-be54-e69d5
>58f9bb2' of framework 201103282247-0000000019-0000
>    @     0x7f050768dbcd  google::LogMessage::Fail()
>I0506 22:00:10.421581 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a4
>8f50c4a of framework 201103282247-0000000019-0000 terminated with status 0
>    @     0x7f0507693837  google::LogMessage::SendToLog()
>I0506 22:00:21.140630 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a4
>8f50c4a of framework 201103282247-0000000019-0000
>I0506 22:00:10.591079 54948 slave.cpp:819] Launching task
>1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a
>for framework 201103282247-0000000019-0000
>I0506 22:00:10.703546 54934 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d9
>0_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>I0506 22:00:11.635795 54935 gc.cpp:143] Deleted
>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c22
>72/runs/c562a33
>e-6870-47d8-ae53-8faec70ad328'
>W0506 22:00:15.583375 54942 cgroups.cpp:1261] Unable to freeze
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_tag_2
>407f4f8-2ad7-4415-abdd-016d5ab8e20d within 51 attempts
>W0506 22:00:15.596472 54941 cgroups.cpp:1261] Unable to freeze
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_tag_
>720a3ca1-ea57-421f-9263-d47e347b866b within 51 attempts
>W0506 22:00:19.717893 54946 status_update_manager.cpp:432] Resending
>status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400
>f9dd7c22 of framework 201103282247-0000000019-0000
>I0506 22:00:10.453220 54949 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab558801
>9_tag_9f6b6105-0a40-4e8f-8044-19a09b186984
>I0506 22:00:21.141474 54946 status_update_manager.cpp:335] Forwarding
>status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f40
>0f9dd7c22 of framework 201103282247-0000000019-0000 to
>master@10.34.128.115:5050
>I0506 22:00:21.143030 54948 paths.hpp:302] Created executor directory
>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
>1103282247-0000000019-0000/executors/thermos-1367877549319-mesos-me
>ta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a/runs/bd162f3e-c45a-44a
>9-ae10-25914c460689'
>I0506 22:00:21.256687 54948 slave.cpp:930] Queuing task
>'1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac7e-d365c0562d3a
>' for executor 
>thermos-1367877549319-mesos-meta_slave_42-19-269da040-c052-42e0-ac
>7e-d365c0562d3a of framework '201103282247-0000000019-0000
>I0506 22:00:21.256707 54936 cgroups_isolator.cpp:804] Executor
>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275
>599233 of framework 201103282247-0000000019-0000 terminated with status 0
>I0506 22:00:21.256798 54948 slave.cpp:1789] Sending acknowledgement for
>status update TASK_LOST (UUID: e7e02c18-abee-4781-9e62-70e0475b9fa3) for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f
>400f9dd7c22 of framework 201103282247-0000000019-0000 to
>executor(1)@10.34.124.132:55886
>I0506 22:00:21.256815 54936 cgroups_isolator.cpp:620] Killing executor
>thermos-1367877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb275
>599233 of framework 201103282247-0000000019-0000
>    @     0x7f050768f47c  google::LogMessage::Flush()
>I0506 22:00:21.377487 54935 gc.cpp:134] Deleting
>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201
>103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c227
>2
>I0506 22:00:21.377707 54935 gc.cpp:143] Deleted
>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
>1103282247-0000000019-0000/executors/gc-d8f1a18a-8527-4452-99d7-a2da4c3c22
>72'
>    @     0x7f050768f6e6  google::LogMessageFatal::~LogMessageFatal()
>    @     0x7f050741063d  mesos::internal::ZooKeeperMasterDetectorProcess::connected()
>    @     0x7f05074111d8  std::tr1::_Function_handler<>::_M_invoke()
>    @     0x7f0507413c84  std::tr1::_Function_handler<>::_M_invoke()
>    @     0x7f050758d99a  process::ProcessManager::resume()
>    @     0x7f050758e9af  process::schedule()
>    @     0x7f0506d2773d  start_thread
>I0506 22:00:21.487545 54940 cgroups.cpp:1214] Successfully froze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0
>d90_tag_e0d3ee4d-e89c-4892-a46d-f5a85854b3a0 after 3 attempts
>I0506 22:00:21.487633 54942 cgroups.cpp:1214] Successfully froze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877427667-mesos-meta_slave_37-7-7cf097a4-c9da-4972-83b9-e2aab5588
>019_tag_9f6b6105-0a40-4e8f-8044-19a09b186984 after 3 attempts
>I0506 22:00:21.519721 54935 gc.cpp:134] Deleting
>/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/201
>103282247-0000000019-0000/executors/thermos-1367685279171-mesos-meta_slave
>_14-0-a67ae3d4
>-3e67-4a3a-ba85-ec66ca255db8/runs/76e96368-7692-4d9b-a717-e27fe7873f8b
>I0506 22:00:21.519790 54941 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877461237-mesos-meta_slave_33-21-e1474932-1a9b-4adf-acbf-9f8a48f50c
>4a_tag_53fc2cf1-8f88-4913-9e87-d865dbff4cdf
>I0506 22:00:21.577153 54946 status_update_manager.cpp:359] Received
>status update acknowledgement e7e02c18-abee-4781-9e62-70e0475b9fa3 for
>task 1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9d
>d7c22 of framework 201103282247-0000000019-0000
>I0506 22:00:21.577270 54948 slave.cpp:721] Got assigned task
>1367877588667-mesos-meta_slave_33-39-79414744-8178-48e5-98df-eec82f8091f0
>for framework 201103282247-0000000019-0000
>I0506 22:00:21.674876 54938 cgroups.cpp:1190] Trying to thaw cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c
>_tag_720a3ca1-ea57-421f-9263-d47e347b866b
>I0506 22:00:21.828469 54936 cgroups_isolator.cpp:520] Launching
>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
>d16c16 (./thermos_executor) in /var/lib/mesos/slaves/201304262233-1937777
>162-5050-9099-380/frameworks/201103282247-0000000019-0000/executors/thermo
>s-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16
>/runs/ed262fff-5bc0-412b-8c86-799354704486 with resources cpus=
>0.25; mem=128 for framework 201103282247-0000000019-0000 in cgroup
>mesos/framework_201103282247-0000000019-0000_executor_thermos-136787753834
>7-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895d16c16_tag_ed262
>fff-5bc0-412b-8c86-799354704486
>I0506 22:00:21.828743 54949 cgroups.cpp:1190] Trying to thaw cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_
>tag_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>I0506 22:00:21.828740 54937 cgroups.cpp:1175] Trying to freeze cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877408255-mesos-meta_slave_0-24-f4400b8c-4ba0-46db-b878-ddb27559923
>3_tag_4b84157b-e9c6-43e8-84f9-9ad1467411a7
>I0506 22:00:22.244297 54946 status_update_manager.cpp:480] Cleaning up
>status update stream for task
>1367877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22
>of framework 201103282247-0000000019-
>0000
>I0506 22:00:22.244469 54938 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877395272-mesos-meta_slave_32-42-b55e4f5b-6ee1-4785-bd49-0c38fcec5e1c_t
>ag_720a3ca1-ea57-421f-9263-d47e347b866b
>I0506 22:00:22.244581 54949 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877474973-mesos-meta_slave_9-28-45a3d2d8-004c-4724-9590-f400f9dd7c22_ta
>g_2407f4f8-2ad7-4415-abdd-016d5ab8e20d
>I0506 22:00:22.246649 54936 cgroups_isolator.cpp:655] Changing cgroup
>controls for executor
>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
>d16c16 of framework 201103282247-0000000019-0
>000 with resources cpus=0.25; mem=128
>I0506 22:00:22.247011 54936 cgroups_isolator.cpp:839] Updated
>'cpu.shares' to 256 for executor
>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
>d16c16 of framework 201103282247-000000001
>9-0000
>I0506 22:00:22.247328 54936 cgroups_isolator.cpp:977] Updated
>'memory.limit_in_bytes' to 134217728 for executor
>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
>d16c16 of framework 201103282247-0000000019-0000
>I0506 22:00:22.622454 54948 http.cpp:279] HTTP request for
>'/slave(1)/stats.json'
>I0506 22:00:23.052145 54947 cgroups.cpp:1190] Trying to thaw cgroup
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e
>0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>I0506 22:00:23.052280 54947 cgroups.cpp:1298] Successfully thawed
>/cgroup/mesos/framework_201103282247-0000000019-0000_executor_thermos-1367
>877449740-mesos-meta_slave_4-34-b1914d5b-c474-4f01-b16d-44767acb0d90_tag_e
>0d3ee4d-e89c-4892-a46d-f5a85854b3a0
>I0506 22:00:23.283198 54936 cgroups_isolator.cpp:1003] Started listening
>for OOM events for executor
>thermos-1367877538347-mesos-meta_slave_25-5-51502091-a395-43d9-9d56-594895
>d16c16 of framework 201103282247-0000000019-0000
>I0506 22:00:23.296195 54936 cgroups_isolator.cpp:550] Forked executor at
>= 20045
>    @     0x7f050570bf6d  clone
>Fetching resources into
>'/var/lib/mesos/slaves/201304262233-1937777162-5050-9099-380/frameworks/20
>1103282247-0000000019-0000/executors/thermos-1367877538347-mesos-meta_slav
>e_25-5-51502091-a395-43d9-9d56-594895d16c16/runs/ed262fff-5bc0-412b-8c86-7
>99354704486'
>Fetching resource '/usr/local/bin/thermos_executor'
>Copying resource from '/usr/local/bin/thermos_executor' to .
>/usr/local/bin/mesos-slave.sh: line 101: 54931 Aborted
>(core dumped) /usr/local/sbin/mesos-slave --port=5051
>--resources="${MESOS_RESOURCES}" --attributes="${MESOS_ATTRIBUTES}"
>--master="${master_zoo_url}" --log_dir="${log_dir}" ${WORK_DIR}
>${CGROUPS_ISOLATION} "$@"
>Slave Exit Status: 134
>I0506 22:01:17.601867 20405 main.cpp:124] Creating "cgroups" isolator
>I0506 22:01:17.627717 20405 main.cpp:132] Build: 2013-04-26 19:49:25 by
>bmahler
>I0506 22:01:17.627739 20405 main.cpp:133] Starting Mesos slave
>