Posted to issues@arrow.apache.org by "Chengxin Ma (Jira)" <ji...@apache.org> on 2020/05/19 15:55:00 UTC
[jira] [Created] (ARROW-8861) Memory not released until Plasma process is killed
Chengxin Ma created ARROW-8861:
----------------------------------
Summary: Memory not released until Plasma process is killed
Key: ARROW-8861
URL: https://issues.apache.org/jira/browse/ARROW-8861
Project: Apache Arrow
Issue Type: Bug
Components: C++ - Plasma
Affects Versions: 0.16.0
Environment: Singularity container (Ubuntu 18.04)
Reporter: Chengxin Ma
Invoking the {{Delete(const ObjectID& object_id)}} method of a Plasma client does not appear to free the memory used by the object.
To reproduce:
1. use {{htop}} (or other similar tools) to monitor memory usage;
2. start the Plasma Object Store with {{plasma_store -m 1000000000 -s /tmp/plasma}};
3. use {{put.py}} to put an object into Plasma;
4. compile and run {{delete.cc}} ({{g++ delete.cc `pkg-config --cflags --libs arrow plasma` --std=c++11 -o delete}});
5. kill the {{plasma_store}} process.
Memory usage drops at Step 5, rather than Step 4.
How can the memory be freed while keeping the Plasma Object Store running?
{{put.py}}:
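{code:python}
from pyarrow import plasma

if __name__ == "__main__":
    client = plasma.connect("/tmp/plasma")
    # 20-byte object ID; delete.cc refers to the same ID.
    object_id = plasma.ObjectID(20 * b"a")
    object_size = 500000000
    # Allocate the object in the store and get a writable view of it.
    buffer = memoryview(client.create(object_id, object_size))
    # Write every byte of the buffer.
    for i in range(object_size):
        buffer[i] = i % 128
    client.seal(object_id)
    client.disconnect()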
{code}
{{delete.cc}}:
{code:cpp}
#include "arrow/util/logging.h"
#include <plasma/client.h>

using namespace plasma;

int main(int argc, char **argv)
{
    PlasmaClient client;
    ARROW_CHECK_OK(client.Connect("/tmp/plasma"));
    // Same 20-byte ID that put.py used.
    ObjectID object_id = ObjectID::from_binary("aaaaaaaaaaaaaaaaaaaa");
    // Check the returned Status instead of discarding it.
    ARROW_CHECK_OK(client.Delete(object_id));
    ARROW_CHECK_OK(client.Disconnect());
}
{code}
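For Step 1, a numeric reading may be easier to compare across steps than the {{htop}} display. The following is a minimal sketch, assuming a Linux host where {{/proc/<pid>/status}} exposes the {{VmRSS}}, {{RssAnon}}, {{RssFile}} and {{RssShmem}} fields (kernel 4.5 or later). The script name {{rss.py}} and the helper names are hypothetical. Whether the store's object memory is reported under {{RssShmem}} is an assumption and has not been verified here.

```python
import re
import sys


def parse_rss_kib(status_text):
    """Return {field: KiB} for the resident-memory lines of a /proc/<pid>/status dump."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(VmRSS|RssAnon|RssFile|RssShmem):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_rss_kib(pid):
    """Read and parse /proc/<pid>/status for the given process ID (Linux only)."""
    with open("/proc/{}/status".format(pid)) as f:
        return parse_rss_kib(f.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python rss.py <pid of plasma_store>
    print(read_rss_kib(int(sys.argv[1])))
```

Running it against the {{plasma_store}} PID after Step 3, after Step 4 and just before Step 5 would give three readings to compare.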