Posted to issues@mesos.apache.org by "longfei (JIRA)" <ji...@apache.org> on 2019/01/18 08:19:00 UTC

[jira] [Created] (MESOS-9528) MemoryPressureMesosTest failed because of OOM.

longfei created MESOS-9528:
------------------------------

             Summary: MemoryPressureMesosTest failed because of OOM.
                 Key: MESOS-9528
                 URL: https://issues.apache.org/jira/browse/MESOS-9528
             Project: Mesos
          Issue Type: Bug
            Reporter: longfei


I found that MemoryPressureMesosTest.ROOT_CGROUPS_Statistics and MemoryPressureMesosTest.ROOT_CGROUPS_SlaveRecovery fail because of OOM when I run make check.

The log is as follows:


I0118 16:01:00.918741 185574 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: a0a8dc75-c016-4f4a-9c78-c042c642b7a8) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.093305 185557 memory.cpp:515] OOM detected for container 8353191d-5d91-4780-b096-9d9aa28b0723
I0118 16:01:01.093466 185557 memory.cpp:555] Memory limit exceeded: Requested: 288MB Maximum Used: 288MB

MEMORY STATISTICS:
cache 291041280
rss 10948608
rss_huge 0
mapped_file 0
writeback 0
swap 0
pgpgin 77127
pgpgout 3399
pgfault 11626
pgmajfault 0
inactive_anon 290988032
active_anon 10829824
inactive_file 0
active_file 0
unevictable 0
hierarchical_memory_limit 301989888
hierarchical_memsw_limit 18446744073709551615
total_cache 291041280
total_rss 10948608
total_rss_huge 0
total_mapped_file 0
total_writeback 0
total_swap 0
total_pgpgin 77127
total_pgpgout 3399
total_pgfault 11626
total_pgmajfault 0
total_inactive_anon 290988032
total_active_anon 10829824
total_inactive_file 0
total_active_file 0
total_unevictable 0
dd: error writing './temp': Cannot allocate memory
278+0 records in
277+0 records out
291041280 bytes (291 MB) copied, 0.122189 s, 2.4 GB/s
I0118 16:01:01.093724 185584 containerizer.cpp:2995] Container 8353191d-5d91-4780-b096-9d9aa28b0723 has reached its limit for resource [{"name":"mem","scalar":{"value":288.0},"type":"SCALAR"}] and will be terminated
I0118 16:01:01.093787 185584 containerizer.cpp:2469] Destroying container 8353191d-5d91-4780-b096-9d9aa28b0723 in RUNNING state
I0118 16:01:01.093801 185584 containerizer.cpp:3136] Transitioning the state of container 8353191d-5d91-4780-b096-9d9aa28b0723 from RUNNING to DESTROYING
I0118 16:01:01.093897 185571 linux_launcher.cpp:576] Asked to destroy container 8353191d-5d91-4780-b096-9d9aa28b0723
I0118 16:01:01.093941 185571 linux_launcher.cpp:618] Destroying cgroup '/sys/fs/cgroup/freezer/mesos_test_4c169006-c6fb-4486-9a6e-5a3d0e9777e6/8353191d-5d91-4780-b096-9d9aa28b0723'
I0118 16:01:01.094094 185559 cgroups.cpp:2854] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_4c169006-c6fb-4486-9a6e-5a3d0e9777e6/8353191d-5d91-4780-b096-9d9aa28b0723
I0118 16:01:01.094204 185578 cgroups.cpp:1242] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_4c169006-c6fb-4486-9a6e-5a3d0e9777e6/8353191d-5d91-4780-b096-9d9aa28b0723 after 77056ns
I0118 16:01:01.094406 185556 cgroups.cpp:2872] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_4c169006-c6fb-4486-9a6e-5a3d0e9777e6/8353191d-5d91-4780-b096-9d9aa28b0723
I0118 16:01:01.094564 185575 cgroups.cpp:1271] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos_test_4c169006-c6fb-4486-9a6e-5a3d0e9777e6/8353191d-5d91-4780-b096-9d9aa28b0723 after 128us
I0118 16:01:01.096833 185564 slave.cpp:5988] Got exited event for executor(1)@10.10.23.200:18282
I0118 16:01:01.105190 185568 containerizer.cpp:2975] Container 8353191d-5d91-4780-b096-9d9aa28b0723 has exited
I0118 16:01:01.106215 185594 slave.cpp:6384] Executor '52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f' of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 terminated with signal Killed
I0118 16:01:01.107043 185594 slave.cpp:5316] Handling status update TASK_FAILED (Status UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 from @0.0.0.0:0
E0118 16:01:01.107215 185560 slave.cpp:5647] Failed to update resources for container 8353191d-5d91-4780-b096-9d9aa28b0723 of executor '52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f' running task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f on status update for terminal task, destroying container: Container not found
W0118 16:01:01.107260 185587 composing.cpp:609] Attempted to destroy unknown container 8353191d-5d91-4780-b096-9d9aa28b0723
I0118 16:01:01.107281 185555 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.107362 185569 slave.cpp:5808] Forwarding the update TASK_FAILED (Status UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 to master@10.10.23.200:26211
I0118 16:01:01.107483 185563 master.cpp:8496] Status update TASK_FAILED (Status UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 from agent 842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0 at slave(1)@10.10.23.200:26211 (n10-023-200.byted.org)
I0118 16:01:01.107515 185563 master.cpp:8553] Forwarding status update TASK_FAILED (Status UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.107573 185563 master.cpp:11190] Updating the state of task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 (latest state: TASK_FAILED, status update state: TASK_FAILED)
I0118 16:01:01.107702 185563 master.cpp:6319] Processing ACKNOWLEDGE call for status 23265b36-faf6-4709-9f67-b9fd817c90ba for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 (default) at scheduler-e042c538-0c59-415d-a3c4-5adf7d47dd29@10.10.23.200:26211 on agent 842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0
I0118 16:01:01.107728 185563 master.cpp:11288] Removing task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f with resources cpus(allocated: *):1; mem(allocated: *):256; disk(allocated: *):1024 of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 on agent 842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0 at slave(1)@10.10.23.200:26211 (n10-023-200.byted.org)
I0118 16:01:01.107863 185556 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: 23265b36-faf6-4709-9f67-b9fd817c90ba) for task 52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.108021 185575 slave.cpp:6482] Cleaning up executor '52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f' of framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000 at executor(1)@10.10.23.200:18282
I0118 16:01:01.108121 185579 gc.cpp:95] Scheduling '/tmp/MemoryPressureMesosTest_ROOT_CGROUPS_Statistics_i83ugR/slaves/842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0/frameworks/842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000/executors/52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f/runs/8353191d-5d91-4780-b096-9d9aa28b0723' for gc 6.99999874884444days in the future
I0118 16:01:01.108178 185579 gc.cpp:95] Scheduling '/tmp/MemoryPressureMesosTest_ROOT_CGROUPS_Statistics_i83ugR/slaves/842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0/frameworks/842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000/executors/52b5ebc8-26b0-4439-ac3c-a7bb4dc5330f' for gc 6.99999874812444days in the future
I0118 16:01:01.108189 185575 slave.cpp:6611] Cleaning up framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.108223 185564 task_status_update_manager.cpp:289] Closing task status update streams for framework 842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000
I0118 16:01:01.108242 185559 gc.cpp:95] Scheduling '/tmp/MemoryPressureMesosTest_ROOT_CGROUPS_Statistics_i83ugR/slaves/842f64ee-274e-4eb3-9787-ce8ee1000ffe-S0/frameworks/842f64ee-274e-4eb3-9787-ce8ee1000ffe-0000' for gc 6.99999874740741days in the future
../../src/tests/containerizer/memory_pressure_tests.cpp:156: Failure
(usage).failure(): Unknown container 8353191d-5d91-4780-b096-9d9aa28b0723

 

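Almost all of the memory appears to have been consumed by cache (cache is 291041280 bytes against a hierarchical_memory_limit of 301989888 bytes, i.e. 288MB), and I cannot tell why. The test passes if I change the offer's memory from 256MB to 512MB.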
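
To make the accounting in the MEMORY STATISTICS block easier to check, here is a minimal Python sketch that parses a few of the counters copied from the log and compares cache plus rss against the limit. The helper names parse_memory_stat and summarize are made up for this illustration and are not part of Mesos; the input format is assumed to be the "key value" layout of a cgroup v1 memory.stat file.

```python
# Counters copied verbatim from the MEMORY STATISTICS block in the log.
MEMORY_STAT = """\
cache 291041280
rss 10948608
swap 0
inactive_anon 290988032
active_anon 10829824
inactive_file 0
active_file 0
hierarchical_memory_limit 301989888
"""

MIB = 1024 * 1024


def parse_memory_stat(text):
    """Parse 'key value' lines (assumed cgroup v1 memory.stat layout)."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split()
        stats[key] = int(value)
    return stats


def summarize(stats):
    """Compare the charged memory (cache + rss) with the cgroup limit."""
    charged = stats["cache"] + stats["rss"]
    limit = stats["hierarchical_memory_limit"]
    return {
        "charged_bytes": charged,
        "limit_mib": limit / MIB,
        "headroom_bytes": limit - charged,
        "cache_fraction": stats["cache"] / limit,
        # Pages on the file LRU lists; regular page cache normally sits here.
        "file_lru_bytes": stats["inactive_file"] + stats["active_file"],
    }


if __name__ == "__main__":
    for name, value in summarize(parse_memory_stat(MEMORY_STAT)).items():
        print(name, value)
```

With the figures from this log, cache plus rss adds up to exactly the 288MB limit, and the file LRU lists are empty while inactive_anon is almost equal to cache. In cgroup v1, tmpfs and shared-memory pages are counted as cache but kept on the anonymous LRU lists, so one possible reading is that ./temp was written to a tmpfs-backed directory. The log does not confirm this, so treat it as a hypothesis to check rather than a diagnosis.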


