Posted to dev@arrow.apache.org by "Micah Kornfield (JIRA)" <ji...@apache.org> on 2016/08/19 02:54:20 UTC

[jira] [Commented] (ARROW-263) Design an initial IPC mechanism for Arrow Vectors

    [ https://issues.apache.org/jira/browse/ARROW-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427516#comment-15427516 ] 

Micah Kornfield commented on ARROW-263:
---------------------------------------

TL;DR:  Current rough proposal/thoughts:
1.  Use memory-mapped files (clients of the library should use a directory backed by an in-memory file system, e.g. tmpfs; for debugging it could be useful to use a traditional file system).
2.  An initial version of IPC will focus on a one-way producer->consumer channel.  UDF execution can be handled by creating two channels (one in each direction between the processes).
3.  It is still an open question, but doing IPC purely through shared memory seems like a more difficult and more complex approach than building on a traditional RPC framework (e.g. Thrift/gRPC).


A more in-depth analysis based on my research:

On POSIX systems there are two core APIs to consider on the C++ implementation side:
1.  Shared memory APIs [1]:  These create shared memory objects.  Shared memory objects are named and persist after a process terminates (but not after a system restart).  The APIs for manipulating shared memory return a file descriptor that can then be memory-mapped.
2.  MMAP APIs [2]:  These take a file descriptor (either a shared memory descriptor or a traditional file descriptor) and map the contents of the object into the process's address space (an anonymous MMAP option also exists, but it is only useful for sharing memory between forked processes that don't call execve, which doesn't fit our use-case).
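To make the two APIs concrete, here is a minimal sketch in Python (which wraps both POSIX facilities; Python 3.8+, POSIX-only, and all buffer contents are illustrative):

```python
# Sketch: POSIX shared-memory object vs. memory-mapped regular file.
# multiprocessing.shared_memory uses shm_open/mmap under the hood on POSIX.
import mmap
import os
import tempfile
from multiprocessing import shared_memory

# 1. Named shared-memory object (shm_open): persists until unlinked.
shm = shared_memory.SharedMemory(create=True, size=4096)
shm.buf[:5] = b"arrow"
data_shm = bytes(shm.buf[:5])
shm.close()
shm.unlink()  # without this, the named object outlives the process

# 2. Memory-mapped regular file (mmap on an ordinary descriptor);
#    placing the file on tmpfs gives RAM-backed storage.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)
with mmap.mmap(fd, 4096) as m:
    m[:5] = b"arrow"
    data_file = bytes(m[:5])
os.close(fd)
os.unlink(path)
```

Note that both paths end with the same thing: a byte range mapped into the process's address space, which is what makes the two approaches comparable.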

 "Shared memory" vs traditional file system:
1.  From what I can tell (experiments needed), memory mapping a file created on tmpfs should have performance identical to mapping a shared memory object.
2.  Both approaches suffer from the fact that if all processes crash, garbage will be left over that consumes resources* (if the file is stored on a memory-backed file system, it will at least get collected on a system reset).
3.  Java doesn't support shared memory natively, but it does support memory-mapping files.  Exactly when other processes see changes to a mapped file is undefined, but based on various articles it seems that, at least on Linux, mapping the file read/write should be suitable for IPC (without having to call sync repeatedly).
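A minimal sketch of point 3, with Python standing in for the Java/C++ processes (assumes a POSIX system with fork; in practice the file would live on tmpfs, e.g. under /dev/shm): one process writes through a read/write mapping of a shared file and the other observes the write with no explicit sync call.

```python
# Sketch: a file-backed read/write mmap visible to two processes.
import mmap
import os
import struct
import tempfile

fd, path = tempfile.mkstemp()  # on tmpfs in practice for RAM-speed access
os.ftruncate(fd, 8)

pid = os.fork()
if pid == 0:
    # Child (producer): write a value through its own mapping and exit.
    with mmap.mmap(fd, 8) as m:
        m[:8] = struct.pack("<q", 42)
    os._exit(0)

# Parent (consumer): wait for the producer, then read through a fresh mapping.
os.waitpid(pid, 0)
with mmap.mmap(fd, 8) as m:
    value = struct.unpack("<q", m[:8])[0]
os.close(fd)
os.unlink(path)
```

In a real channel the consumer would not wait for the producer to exit; some signalling mechanism (a socket, FIFO, or flag word in the mapping) would indicate when data is ready.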

*In C/C++ it is theoretically possible to open a file (obtaining a file descriptor), unlink the file, and then send the file descriptor over a Unix domain socket to another process.  Once all processes using the file descriptor exit, all resources should get cleaned up.  Java doesn't support Unix domain sockets in its core library, and this solution is less likely to be easily portable to non-POSIX operating systems.
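The unlink-then-pass-the-descriptor trick reads roughly like this (a sketch; socket.send_fds/recv_fds require Python 3.9+ and wrap the SCM_RIGHTS machinery the footnote refers to):

```python
# Sketch: open a file, unlink it immediately, and hand the still-open
# descriptor to another process over a Unix domain socket (SCM_RIGHTS).
# The kernel reclaims the storage once the last descriptor is closed.
import os
import socket
import tempfile

fd, path = tempfile.mkstemp()
os.unlink(path)          # the name is gone; the open fd keeps the inode alive
os.write(fd, b"arrow")

parent_sock, child_sock = socket.socketpair()
pid = os.fork()
if pid == 0:
    # Child: receive the descriptor, read through it, echo back to verify.
    _, fds, _, _ = socket.recv_fds(child_sock, 16, 1)
    os.lseek(fds[0], 0, os.SEEK_SET)
    child_sock.sendall(os.read(fds[0], 5))
    os._exit(0)

socket.send_fds(parent_sock, [b"x"], [fd])  # at least one data byte must accompany the fd
echoed = parent_sock.recv(16)
os.waitpid(pid, 0)
os.close(fd)
parent_sock.close()
child_sock.close()
```

Here fork is used only for brevity; the same descriptor passing works between unrelated processes connected by a named Unix domain socket, which is the scenario the footnote describes.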

[1] http://man7.org/linux/man-pages/man3/shm_open.3.html
[2] http://man7.org/linux/man-pages/man2/mmap.2.html

> Design an initial IPC mechanism for Arrow Vectors
> -------------------------------------------------
>
>                 Key: ARROW-263
>                 URL: https://issues.apache.org/jira/browse/ARROW-263
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Micah Kornfield
>            Assignee: Micah Kornfield
>
> Prior discussion on this topic [1].
> Use-cases:
> 1.  User defined function (UDF) execution:  One process wants to execute a user defined function written in another language (e.g. Java executing a function defined in Python; this involves creating Arrow Arrays in Java, sending them to Python, and receiving a new set of Arrow Arrays produced in Python back in the Java process).
> 2.  If a storage system and a query engine are running on the same host, we might want to use IPC instead of RPC (e.g. Apache Drill querying Apache Kudu).
> Assumptions:
> 1.  The IPC mechanism should be usable from the core set of supported languages (Java, Python, C) on POSIX and ideally Windows systems.  Ideally, we would not need to add dependencies on additional libraries beyond each language's standard libraries.
> We want to leverage shared memory for Arrays to avoid doubling RAM requirements by duplicating the same Array in different memory locations.
> 2.  Under some circumstances shared memory might be more efficient than FIFOs or sockets (in other scenarios it won't be; see the thread below).
> 3. Security is not a concern for V1, we assume all processes running are “trusted”.
> Requirements:
> 1.  Resource management:
>     a.  Both processes need a way of allocating memory for Arrow Arrays so that data can be passed from one process to another.
>     b.  There must be a mechanism to clean up unused Arrow Arrays to limit resource usage while avoiding race conditions when processing arrays.
> 2.  Schema negotiation:  Before sending data, both processes need to agree on the schema each one will produce.
> Out of scope requirements:
> 1.  IPC channel metadata discovery is out of scope of this document.  Discovery can be provided by passing appropriate command line arguments, configuration files or other mechanisms like RPC (in which case RPC channel discovery is still an issue).
> [1] http://mail-archives.apache.org/mod_mbox/arrow-dev/201603.mbox/%3C8D5F7E3237B3ED47B84CF187BB17B666148E7322@SHSMSX103.ccr.corp.intel.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)