Posted to issues@openwhisk.apache.org by GitBox <gi...@apache.org> on 2018/02/02 23:31:04 UTC

[GitHub] eweiter opened a new issue #3246: ActionLimitsTests memory test failing

URL: https://github.com/apache/incubator-openwhisk/issues/3246

The main OpenWhisk build has been failing for more than the past 10 builds.
The failing test is
"limits should be aborted when exceeding its memory limits"
The test tries to verify that an action is aborted when it hits a set memory limit, but the abort is not happening, so the test fails when it checks that the response contains an error.

A discussion occurred on the whisk-debug channel. Here is a transcript of what was discussed as the solution to this issue.
```
Markus Thömmes [1:40 PM]
I'd think we need to delay execution of the action a bit so the OOM killer can do its job

Markus Thömmes [1:42 PM]
that action should always fail
but I don't know how the memory limit is enforced. Are cgroups "synchronous"? As in: Should it be impossible to even allocate the memory?

Sven Lange-Last [1:48 PM]
cgroups check for physical memory. once the physical memory sum of all processes in a container exceeds the limit, the oom killer selects the largest memory consuming process in the container and kills it.

https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
2.5 Reclaim

Each cgroup maintains a per cgroup LRU which has the same structure as
global VM. When a cgroup goes over its limit, we first try
to reclaim memory from the cgroup so as to make space for the new
pages that the cgroup has touched. If the reclaim is unsuccessful,
an OOM routine is invoked to select and kill the bulkiest task in the
cgroup.

Markus Thömmes [1:51 PM]
that is: it is async?

Sven Lange-Last [1:51 PM]
it is async.

Markus Thömmes [1:52 PM]
@rabbah ^^ our nodejs code might exit (and nuke memory) too fast, so we could delay the action's execution by 2-3 seconds to give the killer time to operate. That was my idea

Sven Lange-Last [1:53 PM]
in particular, we need to make sure that the node.js code keeps pages in physical memory. otherwise, the memory handler may evict pages.

=> writing some random values to each of the pages and reading these values back from all pages every 100 ms will keep all pages active.

otherwise: if the test is run on a system that has memory pressure, allocated pages may be paged out and won't be accounted as physical memory any more.
```
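
The two suggestions in the transcript (delay the action for a few seconds, and keep every allocated page active) could be combined roughly as follows. This is only a sketch, not the test action from the repository: the helper names `allocateTouched` and `touchPages`, the parameters `megabytes`, `holdMs` and `intervalMs`, and the 4096-byte page size are all assumptions made for illustration. The 100 ms interval and the roughly 3 second delay come from the discussion above.

```javascript
// Hypothetical sketch of a memory-hungry test action. All names, parameters
// and defaults below are assumptions; they are not taken from the test suite.
const PAGE_SIZE = 4096; // assumed page size

// Allocate `bytes` and write a random value into every page, so that each
// page is actually backed by physical memory rather than only reserved.
function allocateTouched(bytes, pageSize = PAGE_SIZE) {
  const buf = Buffer.allocUnsafe(bytes);
  for (let off = 0; off < bytes; off += pageSize) {
    buf[off] = Math.floor(Math.random() * 256);
  }
  return buf;
}

// Read one byte from every page to keep the pages active.
// Returns the number of pages visited.
function touchPages(buf, pageSize = PAGE_SIZE) {
  let pages = 0;
  let sink = 0;
  for (let off = 0; off < buf.length; off += pageSize) {
    sink ^= buf[off]; // the read itself is what matters, not the value
    pages++;
  }
  return pages;
}

// Action entry point: allocate more than the limit, then hold the memory
// for `holdMs` while re-reading every page each `intervalMs`, giving the
// asynchronous OOM killer time to act before the action returns.
function main(params = {}) {
  const megabytes = params.megabytes || 256;
  const holdMs = params.holdMs || 3000;
  const intervalMs = params.intervalMs || 100;
  const buf = allocateTouched(megabytes * 1024 * 1024);
  return new Promise((resolve) => {
    const timer = setInterval(() => touchPages(buf), intervalMs);
    setTimeout(() => {
      clearInterval(timer);
      // Only reached if the container was NOT killed.
      resolve({ msg: 'not killed', pages: touchPages(buf) });
    }, holdMs);
  });
}
```

If the limit is enforced, the container should be killed during the hold period and the promise never resolves, so a test could treat a successful `not killed` result as the failure case.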
