Posted to dev@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2013/04/03 02:17:18 UTC

[jira] [Commented] (MESOS-424) CgroupsIsolatorTest.BalloonFramework runs forever

    [ https://issues.apache.org/jira/browse/MESOS-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620434#comment-13620434 ] 

Vinod Kone commented on MESOS-424:
----------------------------------

Hmm, this is surprising!

Looks like the cgroups freezer is being invoked multiple times for the same cgroup, though I can't see (from looking at the code) how that is possible. The cgroups isolator should be calling cgroups::destroy() only once per cgroup. It is also interesting that the cgroup could not be frozen despite multiple attempts.
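
One way to check how often the freezer is invoked per cgroup is to count the "Trying to freeze cgroup" and "Unable to freeze" lines in the slave log. Below is a minimal sketch in Python, assuming the message wording shown in the quoted output; the function name count_freezer_events is illustrative and not part of Mesos.

```python
import re
from collections import Counter

# Message wording is copied from the quoted log output; it is an assumption
# that other Mesos versions print the same text.
FREEZE_RE = re.compile(r"Trying to freeze cgroup (\S+)")
FAILED_RE = re.compile(r"Unable to freeze (\S+) within \d+ attempts")


def count_freezer_events(lines):
    """Return (freeze attempts, failed freezes) as Counters keyed by cgroup path."""
    attempts = Counter()
    failures = Counter()
    for line in lines:
        match = FREEZE_RE.search(line)
        if match:
            attempts[match.group(1)] += 1
            continue
        match = FAILED_RE.search(line)
        if match:
            failures[match.group(1)] += 1
    return attempts, failures
```

Passing it any iterable of log lines (an open file object works) gives per-cgroup counts. An attempt count above 1 for a single cgroup would be consistent with the freezer being invoked more than once for it.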
                
> CgroupsIsolatorTest.BalloonFramework runs forever
> -------------------------------------------------
>
>                 Key: MESOS-424
>                 URL: https://issues.apache.org/jira/browse/MESOS-424
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Thomas Marshall
>
> On Ubuntu 12.04 Server, running as root:
> bin/mesos-tests.sh --gtest_filter=*Balloon* --verbose
> Source directory: /root/mesos
> Build directory: /root/mesos/build
> Note: Google Test filter = *Balloon*-
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from CgroupsIsolatorTest
> [ RUN      ] CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework
> Using temporary directory '/tmp/CgroupsIsolatorTest_ROOT_CGROUPS_BalloonFramework_1JMuXO'
> Launched master at 1770
> I0402 15:20:23.570971  1770 main.cpp:116] Build: 2013-04-02 14:41:50 by root
> I0402 15:20:23.571444  1770 main.cpp:117] Starting Mesos master
> I0402 15:20:23.572792  1788 master.cpp:309] Master started on 127.0.1.1:5432
> I0402 15:20:23.573097  1788 master.cpp:324] Master ID: 201304021520-16842879-5432-1770
> W0402 15:20:23.574090  1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:23.577419  1788 master.cpp:603] Elected as master!
> Launched slave at 1790
> I0402 15:20:25.570708  1790 main.cpp:124] Creating "cgroups" isolator
> I0402 15:20:25.571761  1790 main.cpp:132] Build: 2013-04-02 14:41:50 by root
> I0402 15:20:25.571790  1790 main.cpp:133] Starting Mesos slave
> I0402 15:20:25.574848  1808 slave.cpp:203] Slave started on 1)@127.0.1.1:51739
> I0402 15:20:25.574906  1808 slave.cpp:204] Slave resources: cpus=1; mem=96; ports=[31000-32000]; disk=7572
> I0402 15:20:25.575526  1805 cgroups_isolator.cpp:236] Using /cgroup as cgroups hierarchy root
> I0402 15:20:25.577657  1807 slave.cpp:453] New master detected at master@127.0.0.1:5432
> I0402 15:20:25.577888  1807 status_update_manager.cpp:132] New master detected at master@127.0.0.1:5432
> I0402 15:20:25.586076  1805 cgroups_isolator.cpp:690] Recovering isolator
> I0402 15:20:25.586915  1808 slave.cpp:377] Finished recovery
> I0402 15:20:25.588171  1787 master.cpp:968] Attempting to register slave on ubuntu at slave(1)@127.0.1.1:51739
> I0402 15:20:25.588276  1787 master.cpp:1224] Master now considering a slave at ubuntu:51739 as active
> I0402 15:20:25.589035  1787 master.cpp:1862] Adding slave 201304021520-16842879-5432-1770-0 at ubuntu with cpus=1; mem=96; ports=[31000-32000]; disk=7572
> I0402 15:20:25.589582  1787 hierarchical_allocator_process.hpp:395] Added slave 201304021520-16842879-5432-1770-0 (ubuntu) with cpus=1; mem=96; ports=[31000-32000]; disk=7572 (and cpus=1; mem=96; ports=[31000-32000]; disk=7572 available)
> I0402 15:20:25.589867  1807 slave.cpp:487] Registered with master; given slave ID 201304021520-16842879-5432-1770-0
> I0402 15:20:27.567234  1786 master.cpp:646] Registering framework 201304021520-16842879-5432-1770-0000 at scheduler(1)@127.0.1.1:54177
> I0402 15:20:27.567627  1786 hierarchical_allocator_process.hpp:268] Added framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.568018  1786 master.hpp:309] Adding offer with resources cpus=1; mem=96; ports=[31000-32000]; disk=7572 on slave 201304021520-16842879-5432-1770-0
> Registered
> I0402 15:20:27.568243  1786 master.cpp:1327] Sending 1 offers to framework 201304021520-16842879-5432-1770-0000
> Resource offers received
> Starting the task
> I0402 15:20:27.569226  1788 master.cpp:1534] Processing reply for offer 201304021520-16842879-5432-1770-0 on slave 201304021520-16842879-5432-1770-0 (ubuntu) for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.569449  1788 master.hpp:289] Adding task with resources mem=32 on slave 201304021520-16842879-5432-1770-0
> I0402 15:20:27.569537  1788 master.cpp:1651] Launching task 1 of framework 201304021520-16842879-5432-1770-0000 with resources mem=32 on slave 201304021520-16842879-5432-1770-0 (ubuntu)
> I0402 15:20:27.569792  1788 master.hpp:318] Removing offer with resources cpus=1; mem=96; ports=[31000-32000]; disk=7572 on slave 201304021520-16842879-5432-1770-0
> I0402 15:20:27.569903  1785 hierarchical_allocator_process.hpp:497] Framework 201304021520-16842879-5432-1770-0000 filtered slave 201304021520-16842879-5432-1770-0 for 5.00secs
> I0402 15:20:27.570047  1805 slave.cpp:587] Got assigned task 1 for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.572463  1805 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.573072  1805 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.573310  1806 cgroups_isolator.cpp:488] Launching default (/root/mesos/build/src/balloon-executor) in /tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e with resources mem=64 for framework 201304021520-16842879-5432-1770-0000 in cgroup mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:27.573943  1806 cgroups_isolator.cpp:631] Changing cgroup controls for executor default of framework 201304021520-16842879-5432-1770-0000 with resources mem=64
> I0402 15:20:27.574291  1806 cgroups_isolator.cpp:898] Updated 'memory.limit_in_bytes' to 67108864 for executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.574923  1806 cgroups_isolator.cpp:924] Started listening for OOM events for executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.575889  1806 cgroups_isolator.cpp:517] Forked executor at = 1829
> Fetching resources into '/tmp/mesos/slaves/201304021520-16842879-5432-1770-0/frameworks/201304021520-16842879-5432-1770-0000/executors/default/runs/a6115dfa-8195-4cf4-b044-f9b7e7531e9e'
> I0402 15:20:27.641137  1808 slave.cpp:1046] Got registration for executor 'default' of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.641315  1808 slave.cpp:1121] Flushing queued tasks for framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:27.641386  1808 cgroups_isolator.cpp:631] Changing cgroup controls for executor default of framework 201304021520-16842879-5432-1770-0000 with resources mem=96
> I0402 15:20:27.641913  1808 cgroups_isolator.cpp:898] Updated 'memory.limit_in_bytes' to 100663296 for executor default of framework 201304021520-16842879-5432-1770-0000
> W0402 15:20:28.575875  1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:28.897797  1807 cgroups_isolator.cpp:944] OOM notifier is triggered for executor default of framework 201304021520-16842879-5432-1770-0000 with uuid a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:28.897902  1807 cgroups_isolator.cpp:989] OOM detected for executor default of framework 201304021520-16842879-5432-1770-0000 with uuid a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:28.899562  1807 cgroups_isolator.cpp:1030] Memory limit exceeded: Requested: 96MB Used: 96MB
> MEMORY STATISTICS: 
> cache 0
> rss 100663296
> mapped_file 0
> swap 2424832
> pgpgin 25936
> pgpgout 1360
> pgfault 31673
> pgmajfault 1
> inactive_anon 0
> active_anon 0
> inactive_file 0
> active_file 0
> unevictable 100663296
> hierarchical_memory_limit 100663296
> hierarchical_memsw_limit 9223372036854775807
> total_cache 0
> total_rss 100663296
> total_mapped_file 0
> total_swap 2424832
> total_pgpgin 25936
> total_pgpgout 1360
> total_pgfault 31673
> total_pgmajfault 1
> total_inactive_anon 0
> total_active_anon 0
> total_inactive_file 0
> total_active_file 0
> total_unevictable 100663296
> I0402 15:20:28.899739  1807 cgroups_isolator.cpp:596] Killing executor default of framework 201304021520-16842879-5432-1770-0000
> I0402 15:20:28.901882  1805 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:32.578037  1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:33.578172  1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:34.065656  1805 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:34.067944  1805 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:34.068300  1805 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:34.582098  1805 cgroups_isolator.cpp:766] Executor default of framework 201304021520-16842879-5432-1770-0000 terminated with status 9
> W0402 15:20:37.579793  1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:38.580425  1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:39.216334  1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:42.580739  1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:43.582556  1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:44.377604  1808 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:44.379775  1805 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:44.379935  1805 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:47.581902  1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:48.584782  1786 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:49.528096  1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:52.583258  1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:53.586912  1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:20:54.691306  1808 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:20:54.693431  1808 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:20:54.693737  1808 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:20:57.584837  1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:20:58.588543  1788 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:20:59.842075  1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:02.586467  1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:03.590638  1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:21:05.003955  1806 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:21:05.006346  1807 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:21:05.006577  1807 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:07.588361  1807 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:08.592641  1786 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0402 15:21:10.155868  1807 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:12.590788  1806 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> W0402 15:21:13.594530  1787 master.cpp:81] No whitelist given. Advertising offers for all slaves
> W0402 15:21:15.316937  1807 cgroups.cpp:1261] Unable to freeze /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e within 51 attempts
> I0402 15:21:15.319368  1808 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> I0402 15:21:15.319533  1808 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201304021520-16842879-5432-1770-0000_executor_default_tag_a6115dfa-8195-4cf4-b044-f9b7e7531e9e
> W0402 15:21:17.591588  1805 monitor.cpp:212] Failed to collect resource usage for executor 'default' of framework '201304021520-16842879-5432-1770-0000': 1
> ...
