Posted to user@arrow.apache.org by comic fans <co...@gmail.com> on 2020/11/13 05:43:08 UTC

Does the Rust API support memory-mapped read/write?

Hello everyone, I'd like to use Rust to read/write Feather format
files through memory-mapped files, but the Rust API only accepts a Reader
as input (arrow::ipc::reader::FileReader), and I didn't find any
memory map API or crate usage in arrow. Does that mean the native Rust
API currently doesn't support memory-mapped read/write? (I can do this
with the Arrow C++ API, but I'd like to do it through the native Rust API.)

Re: Does the Rust API support memory-mapped read/write?

Posted by Andrew Lamb <al...@influxdata.com>.
I think one way to think about mmap is outsourcing caching decisions to the
OS kernel (aka letting it decide how to manage the available memory and
move data between slower storage and memory pages). Often the kernel does
quite well at this task, but sometimes, especially under load, you might
want more direct control over what data is in memory and when.

In InfluxDB IOx, https://github.com/influxdata/influxdb_iox, we are not
planning to use mmap and instead plan to directly control and manage
buffers ourselves. This will likely involve a tradeoff: more complexity
(more cache management logic) in exchange for more control over memory
usage.
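
To make "control and manage buffers ourselves" a bit more concrete, here is
a minimal std-only sketch of a byte-budgeted cache with least-recently-used
eviction. It is not IOx code: ChunkCache, get_or_load, and the chunk ids are
hypothetical names chosen for illustration, and a real system would likely
need concurrency and a cheaper recency structure.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical cache of file chunks with an explicit byte budget.
struct ChunkCache {
    budget: usize,                 // maximum bytes we intend to keep resident
    used: usize,                   // bytes currently held
    chunks: HashMap<u64, Vec<u8>>, // chunk id -> chunk bytes
    order: VecDeque<u64>,          // least recently used id at the front
}

impl ChunkCache {
    fn new(budget: usize) -> Self {
        ChunkCache { budget, used: 0, chunks: HashMap::new(), order: VecDeque::new() }
    }

    /// Returns the chunk, calling `load` only on a miss. Before inserting,
    /// least recently used chunks are evicted until the new chunk fits.
    fn get_or_load<F: FnOnce() -> Vec<u8>>(&mut self, id: u64, load: F) -> &[u8] {
        if self.chunks.contains_key(&id) {
            // Hit: mark this chunk as most recently used.
            self.order.retain(|&k| k != id);
            self.order.push_back(id);
        } else {
            let data = load();
            while self.used + data.len() > self.budget {
                match self.order.pop_front() {
                    Some(old) => {
                        if let Some(evicted) = self.chunks.remove(&old) {
                            self.used -= evicted.len();
                        }
                    }
                    // Nothing left to evict: a chunk larger than the budget
                    // is kept anyway in this sketch.
                    None => break,
                }
            }
            self.used += data.len();
            self.chunks.insert(id, data);
            self.order.push_back(id);
        }
        &self.chunks[&id]
    }

    fn contains(&self, id: u64) -> bool {
        self.chunks.contains_key(&id)
    }
}
```

With mmap, the eviction decision in the while loop is made by the kernel
instead; owning it is where both the extra complexity and the extra control
come from.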

Also, for what it is worth, I think adding a memmap API to the Rust / Arrow
implementation is an interesting idea that could definitely be
incorporated. Thank you for explaining your use case.

Andrew

On Sun, Nov 15, 2020 at 10:13 PM comic fans <co...@gmail.com> wrote:

> Yes, I mean the Rust API doesn't support using mmap to read/write the
> Feather format.
>
> I'm trying to write a simple Arrow-based time series storage (based on
> the Feather format). I've already used the Arrow C++ memmap API to
> implement a zero-copy/low-latency prototype (the idea is similar to
> https://questdb.io/docs/concept/storage-model/), and now I want to
> implement more components in Rust. So I also want the Rust components to
> read through a memory-mapped API instead of parsing the file again and
> again. If Rust doesn't support this, I need to pass pointers around,
> which is more complicated.
>
> And yes, memmap behavior is hard to model in the Rust ownership model,
> but I hope a Rust API exists for these R/W actions (even an unsafe one).
> In my opinion, memmap R/W ability is the most important feature of Arrow.
> BTW, I've heard Influx is also experimenting with Arrow and Rust. How
> can it handle big data effectively if the Rust API doesn't support
> memory-mapped read/write?

Re: Does the Rust API support memory-mapped read/write?

Posted by comic fans <co...@gmail.com>.
Yes, I mean the Rust API doesn't support using mmap to read/write the Feather format.

I'm trying to write a simple Arrow-based time series storage (based on
the Feather format). I've already used the Arrow C++ memmap API to
implement a zero-copy/low-latency prototype (the idea is similar to
https://questdb.io/docs/concept/storage-model/), and now I want to
implement more components in Rust. So I also want the Rust components to
read through a memory-mapped API instead of parsing the file again and
again. If Rust doesn't support this, I need to pass pointers around,
which is more complicated.
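
As a rough illustration of what "zero-copy" means here, the sketch below
contrasts viewing bytes in place with decoding them into a new allocation.
It uses only the standard library; view_i64 and copy_i64 are hypothetical
helpers, not Arrow APIs, and the input slice stands in for the bytes a
read-only mapping would expose. It assumes the bytes hold native-endian
i64 values (Arrow data is little-endian by default, so this matches on
little-endian hosts only).

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: view `bytes` as i64 values without copying.
/// Returns None if the bytes are misaligned or not a whole number of i64s.
fn view_i64(bytes: &[u8]) -> Option<&[i64]> {
    let aligned = bytes.as_ptr() as usize % align_of::<i64>() == 0;
    if !aligned || bytes.len() % size_of::<i64>() != 0 {
        return None;
    }
    // SAFETY: the pointer is aligned for i64, the length covers only whole
    // i64 values inside `bytes`, every bit pattern is a valid i64, and the
    // returned slice borrows from `bytes`, so it cannot outlive it.
    Some(unsafe {
        std::slice::from_raw_parts(
            bytes.as_ptr() as *const i64,
            bytes.len() / size_of::<i64>(),
        )
    })
}

/// The copying alternative: decode every value into a freshly allocated Vec.
fn copy_i64(bytes: &[u8]) -> Vec<i64> {
    bytes
        .chunks_exact(size_of::<i64>())
        .map(|chunk| {
            let mut raw = [0u8; 8];
            raw.copy_from_slice(chunk);
            i64::from_ne_bytes(raw)
        })
        .collect()
}
```

The view costs nothing per read but only works when alignment holds, which
is one reason the Arrow IPC format pads buffers to 8-byte boundaries. Over a
real mapping, the safety argument would also depend on no other process
modifying the file while the slice is alive.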

And yes, memmap behavior is hard to model in the Rust ownership model,
but I hope a Rust API exists for these R/W actions (even an unsafe one).
In my opinion, memmap R/W ability is the most important feature of Arrow.
BTW, I've heard Influx is also experimenting with Arrow and Rust. How
can it handle big data effectively if the Rust API doesn't support
memory-mapped read/write?

On Sun, Nov 15, 2020 at 8:22 PM Andrew Lamb <al...@influxdata.com> wrote:
>
> I think your conclusion that the Rust API doesn't support using mmap'd files as a way to read/write Arrow files is correct.
>
> In general, I suspect using mmap in Rust is a bit dicey (aka unsafe), as the normal Rust rules of ownership are hard to apply to chunks of memory that can (potentially) be modified by different processes etc. It is probably fine in most read-only cases.
>
> I am curious about your desire to use mmap for reading IPC streams (to understand better whether we should be looking into mmap support). As I understand it, the IPC interface is designed for streaming reads/writes, and thus the easy random access of mmap seems less important. Are you concerned about Reader performance?
>
> Andrew

Re: Does the Rust API support memory-mapped read/write?

Posted by Andrew Lamb <al...@influxdata.com>.
I think your conclusion that the Rust API doesn't support using mmap'd
files as a way to read/write Arrow files is correct.
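
One partial workaround, sketched under assumptions: a reader-based API that
is generic over Read + Seek (as arrow::ipc::reader::FileReader appears to
be) can be given bytes that are already in memory by wrapping them in
std::io::Cursor. The std-only example below uses a byte slice as a stand-in
for a mapped file and a hypothetical helper, has_arrow_magic, that checks
the "ARROW1" magic at both ends of an Arrow IPC file. Note that going
through Read still copies bytes out of the slice, so this is not zero-copy.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Magic bytes that open and close an Arrow IPC file.
const MAGIC: &[u8; 6] = b"ARROW1";

/// Hypothetical helper: true if the source starts and ends with the magic.
/// Works for any seekable reader, e.g. a File or an in-memory Cursor.
fn has_arrow_magic<R: Read + Seek>(reader: &mut R) -> io::Result<bool> {
    let mut buf = [0u8; 6];

    // Leading magic at offset 0.
    reader.seek(SeekFrom::Start(0))?;
    reader.read_exact(&mut buf)?;
    if &buf != MAGIC {
        return Ok(false);
    }

    // Trailing magic in the last six bytes: this is the random access step.
    reader.seek(SeekFrom::End(-(MAGIC.len() as i64)))?;
    reader.read_exact(&mut buf)?;
    Ok(&buf == MAGIC)
}

/// Wraps in-memory bytes (a stand-in for a mapped file) in a Cursor so they
/// satisfy Read + Seek.
fn check_bytes(bytes: &[u8]) -> io::Result<bool> {
    has_arrow_magic(&mut Cursor::new(bytes))
}
```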

In general, I suspect using mmap in Rust is a bit dicey (aka unsafe), as
the normal Rust rules of ownership are hard to apply to chunks of memory
that can (potentially) be modified by different processes etc. It is
probably fine in most read-only cases.

I am curious about your desire to use mmap for reading IPC streams (to
understand better whether we should be looking into mmap support). As I
understand it, the IPC interface is designed for streaming reads/writes,
and thus the easy random access of mmap seems less important. Are you
concerned about Reader performance?

Andrew

On Fri, Nov 13, 2020 at 12:43 AM comic fans <co...@gmail.com> wrote:

> Hello everyone, I'd like to use Rust to read/write Feather format
> files through memory-mapped files, but the Rust API only accepts a Reader
> as input (arrow::ipc::reader::FileReader), and I didn't find any
> memory map API or crate usage in arrow. Does that mean the native Rust
> API currently doesn't support memory-mapped read/write? (I can do this
> with the Arrow C++ API, but I'd like to do it through the native Rust API.)
>