Posted to dev@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2013/04/03 02:17:18 UTC
[jira] [Commented] (MESOS-424) CgroupsIsolatorTest.BalloonFramework runs forever
[ https://issues.apache.org/jira/browse/MESOS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620434#comment-13620434 ]
Vinod Kone commented on MESOS-424:
----------------------------------
Hmm, this is surprising!
It looks like the cgroups freezer is being invoked multiple times for the same cgroup, though from looking at the code I can't see how that is possible. The cgroups isolator should call cgroups::destroy() only once per cgroup. It is also interesting that the cgroup could not be frozen despite multiple attempts.
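To check the "invoked multiple times" observation against a slave log, something like the following rough sketch could tally freezer events per cgroup. It is not part of Mesos; the function name count_freezer_events and the SAMPLE list are made up for illustration, and the two regular expressions simply assume the message wording seen in the log quoted in this issue.

```python
import re
from collections import Counter

# Patterns assume the wording of the cgroups.cpp messages in the quoted log.
FREEZE_RE = re.compile(r"Trying to freeze cgroup (\S+)")
GIVE_UP_RE = re.compile(r"Unable to freeze (\S+) within (\d+) attempts")


def count_freezer_events(lines):
    """Return (started, gave_up): per-cgroup counts of freeze starts and give-ups."""
    started = Counter()
    gave_up = Counter()
    for line in lines:
        match = FREEZE_RE.search(line)
        if match:
            started[match.group(1)] += 1
            continue
        match = GIVE_UP_RE.search(line)
        if match:
            gave_up[match.group(1)] += 1
    return started, gave_up


# Four lines copied from the log in the issue description (illustrative subset).
SAMPLE = [
    "I0402 15:20:28.901882 1805 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e",
    "W0402 15:20:34.065656 1805 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts",
    "I0402 15:20:39.216334 1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e",
    "W0402 15:20:44.377604 1808 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts",
]

if __name__ == "__main__":
    started, gave_up = count_freezer_events(SAMPLE)
    for cgroup, count in started.items():
        print(cgroup, "freeze starts:", count, "give-ups:", gave_up[cgroup])
```

A count above one for a single cgroup means the freezer was started more than once for it. Run over the whole log quoted below, this should report five freeze starts and five give-ups for the one executor cgroup before the log is cut off.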
> CgroupsIsolatorTest.BalloonFramework runs forever
> -------------------------------------------------
>
> Key: MESOS-424
> URL: https://issues.apache.org/jira/browse/MESOS-424
> Project: Mesos
> Issue Type: Bug
> Reporter: Thomas Marshall
>
> On Ubuntu 12.04 Server, running as root:
> bin/mesos-tests.sh --gtest_filter=*Balloon* --verbose
> Source directory: /root/mesos
> Build directory: /root/mesos/build
> Note: Google Test filter = *Balloon*-
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from CgroupsIsolatorTest
> [ RUN ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> Using temporary directory '/tmp/CgroupsIsolatorTest_ROOT_CGROUPS_BalloonFramework_1JMuXO'
> Launched master at 1770
> I0402 15:20:23.570971 1770 main.cpp:116] Build: 2013-04-02 14:41:50 by root
> I0402 15:20:23.571444 1770 main.cpp:117] Starting Mesos master
> I0402 15:20:23.572792 1788 master.cpp:309] Master started on 127.0.1.1:5432
> I0402 15:20:23.573097 1788 master.cpp:324] Master ID: 201304021520-16842879-5432-1770
> W0402 15:20:23.574090 1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:23.577419 1788 master.cpp:603] Elected as master!
> Launched slave at 1790
> I0402 15:20:25.570708 1790 main.cpp:124] Creating "cgroups" isolator
> I0402 15:20:25.571761 1790 main.cpp:132] Build: 2013-04-02 14:41:50 by root
> I0402 15:20:25.571790 1790 main.cpp:133] Starting Mesos slave
> I0402 15:20:25.574848 1808 slave.cpp:203] Slave started on 1)@127.0.1.1:51739
> I0402 15:20:25.574906 1808 slave.cpp:204] Slave resources: cpus=1; mem=96; ports=[31000-32000]; disk=7572
> I0402 15:20:25.575526 1805 cgroups_isolator.cpp:236] Using /cgroup as cgroups hierarchy root
> I0402 15:20:25.577657 1807 slave.cpp:453] New master detected at master@127.0.0.1:5432
> I0402 15:20:25.577888 1807 status_update_manager.cpp:132] New master detected at master@127.0.0.1:5432
> I0402 15:20:25.586076 1805 cgroups_isolator.cpp:690] Recovering isolator
> I0402 15:20:25.586915 1808 slave.cpp:377] Finished recovery
> I0402 15:20:25.588171 1787 master.cpp:968] Attempting to register slave on ubuntu at slave(1)@127.0.1.1:51739
> I0402 15:20:25.588276 1787 master.cpp:1224] Master now considering a slave at ubuntu:51739 as active
> I0402 15:20:25.589035 1787 master.cpp:1862] Adding slave 201304021520-16842879-5432-1770-0 at ubuntu with cpus=1; mem=96; ports=[31000-32000]; disk=7572
> I0402 15:20:25.589582 1787 hierarchical_allocator_process.hpp:395] Added slave 201304021520-16842879-5432-1770-0 (ubuntu) with cpus=1; mem=96; ports=[31000-32000]; disk=7572 (and cpus=1; mem=96; ports=[31000-32000]; disk=7572 available)
> I0402 15:20:25.589867 1807 slave.cpp:487] Registered with master; given slave ID 201304021520-16842879-5432-1770-0
> I0402 15:20:27.567234 1786 master.cpp:646] Registering framework 201304021520-16842879-5432-1770-0000 at scheduler(1)@127.0.1.1:54177
> I0402 15:20:27.567627 1786 hierarchical_allocator_process.hpp:268] Added framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.568018 1786 master.hpp:309] Adding offer with resources cpus=1; mem=96; ports=[31000-32000]; disk=7572 on slave 201304021520-16842879-5432-1770-0
> Registered
> I0402 15:20:27.568243 1786 master.cpp:1327] Sending 1 offers to framework 201304021520-16842879-5432-1770-0000
> Resource offers received
> Starting the task
> I0402 15:20:27.569226 1788 master.cpp:1534] Processing reply for offer 201304021520-16842879-5432-1770-0 on slave 201304021520-16842879-5432-1770-0 (ubuntu) for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.569449 1788 master.hpp:289] Adding task with resources mem=32 on slave 201304021520-16842879-5432-1770-0
> I0402 15:20:27.569537 1788 master.cpp:1651] Launching task 1 of framework 201304021520-16842879-5432-1770-0000 with resources mem=32 on slave 201304021520-16842879-5432-1770-0 (ubuntu)
> I0402 15:20:27.569792 1788 master.hpp:318] Removing offer with resources cpus=1; mem=96; ports=[31000-32000]; disk=7572 on slave 201304021520-16842879-5432-1770-0
> I0402 15:20:27.569903 1785 hierarchical_allocator_process.hpp:497] Framework 201304021520-16842879-5432-1770-0000 filtered slave 201304021520-16842879-5432-1770-0 for 5.00secs
> I0402 15:20:27.570047 1805 slave.cpp:587] Got assigned task 1 for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.572463 1805 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.573072 1805 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.573310 1806 cgroups_isolator.cpp:488] Launching default (/root/mesos/build/src/balloon-executor) in /tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e with resources mem=64 for framework 201304021520-16842879-5432-1770-0000 in cgroup mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:27.573943 1806 cgroups_isolator.cpp:631] Changing cgroup controls for executor default of framework 201304021520-16842879-5432-1770-0000 with resources mem=64
> I0402 15:20:27.574291 1806 cgroups_isolator.cpp:898] Updated 'memory.limit_in_bytes' to 67108864 for executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.574923 1806 cgroups_isolator.cpp:924] Started listening for OOM events for executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.575889 1806 cgroups_isolator.cpp:517] Forked executor at = 1829
> Fetching resources into '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.641137 1808 slave.cpp:1046] Got registration for executor 'default' of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.641315 1808 slave.cpp:1121] Flushing queued tasks for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.641386 1808 cgroups_isolator.cpp:631] Changing cgroup controls for executor default of framework 201304021520-16842879-5432-1770-0000 with resources mem=96
> I0402 15:20:27.641913 1808 cgroups_isolator.cpp:898] Updated 'memory.limit_in_bytes' to 100663296 for executor default of framework 201304021520-16842879-5432-1770-0000
> W0402 15:20:28.575875 1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:28.897797 1807 cgroups_isolator.cpp:944] OOM notifier is triggered for executor default of framework 201304021520-16842879-5432-1770-0000 with uuid a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:28.897902 1807 cgroups_isolator.cpp:989] OOM detected for executor default of framework 201304021520-16842879-5432-1770-0000 with uuid a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:28.899562 1807 cgroups_isolator.cpp:1030] Memory limit exceeded: Requested: 96MB Used: 96MB
> MEMORY STATISTICS:
> cache 0
> rss 100663296
> mapped_file 0
> swap 2424832
> pgpgin 25936
> pgpgout 1360
> pgfault 31673
> pgmajfault 1
> inactive_anon 0
> active_anon 0
> inactive_file 0
> active_file 0
> unevictable 100663296
> hierarchical_memory_limit 100663296
> hierarchical_memsw_limit 9223372036854775807
> total_cache 0
> total_rss 100663296
> total_mapped_file 0
> total_swap 2424832
> total_pgpgin 25936
> total_pgpgout 1360
> total_pgfault 31673
> total_pgmajfault 1
> total_inactive_anon 0
> total_active_anon 0
> total_inactive_file 0
> total_active_file 0
> total_unevictable 100663296
> I0402 15:20:28.899739 1807 cgroups_isolator.cpp:596] Killing executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:28.901882 1805 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:32.578037 1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:33.578172 1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:34.065656 1805 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:34.067944 1805 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:34.068300 1805 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:34.582098 1805 cgroups_isolator.cpp:766] Executor default of framework 201304021520-16842879-5432-1770-0000 terminated with status 9
> W0402 15:20:37.579793 1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:38.580425 1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:39.216334 1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:42.580739 1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:43.582556 1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:44.377604 1808 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:44.379775 1805 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:44.379935 1805 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:47.581902 1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:48.584782 1786 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:49.528096 1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:52.583258 1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:53.586912 1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:54.691306 1808 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:54.693431 1808 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:54.693737 1808 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:57.584837 1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:58.588543 1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:59.842075 1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:02.586467 1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:03.590638 1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:21:05.003955 1806 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:21:05.006346 1807 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:21:05.006577 1807 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:07.588361 1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:08.592641 1786 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:21:10.155868 1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:12.590788 1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:13.594530 1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:21:15.316937 1807 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:21:15.319368 1808 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:21:15.319533 1808 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:17.591588 1805 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> ...
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira