Posted to dev@arrow.apache.org by Zhenyuan Zhao <zz...@gmail.com> on 2018/09/07 05:26:29 UTC

[JAVA] Supporting zero copy arrow-vector

Hello Team,

I'm working on using Arrow as an intermediate format for transferring columnar
data from server to client. In this case, the client only needs to read
from the format, so I would like to avoid any unnecessary copy of the data.
Looking into Arrow, while arrow-format/flatbuffers does support zero copy, the
current arrow-vector Java implementation does not. I was trying to hack zero
copy for read-only scenarios, but saw two main blockers:

   1. ArrowBuf is the only buffer implementation, used exclusively across
      ArrowReader/ArrowRecordBatch/Vectors. It's final, which means there isn't
      a way for me to override its logic in order to wrap some existing buffer.
      It's absolutely necessary to use ArrowBuf for write scenarios due to
      buffer allocation, but for read, I was hoping a vector could just serve
      as a view on top of an existing memory buffer (like a Java ByteBuffer or
      a Netty ByteBuf). That seems safe for the read-only case.

   2. As a result of #1 described above, the only layer which seems reusable
      is arrow-format. I would then have to implement what is effectively a
      read-only copy of arrow-vector that references an existing buffer.
      Putting aside the effort of doing that, it would make it hard to keep up
      with future changes/fixes made to arrow-vector.

Wondering if you guys have put any thought into such read-only scenarios.
Any suggestions on how I can approach this myself?

Thanks
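
To make the "vector as a view on top of an existing buffer" idea concrete, here is a minimal sketch in plain java.nio, not the arrow-vector API. The class name ReadOnlyIntView is hypothetical. It assumes the Arrow column layout for a nullable int32 column: a validity bitmap with one bit per value (least-significant bit first) and little-endian 4-byte values.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical read-only view of a nullable int32 column over existing buffers. */
final class ReadOnlyIntView {
    private final ByteBuffer validity; // 1 bit per value, least-significant bit first
    private final ByteBuffer values;   // 4 bytes per value, little-endian
    private final int valueCount;

    ReadOnlyIntView(ByteBuffer validity, ByteBuffer values, int valueCount) {
        // slice() and asReadOnlyBuffer() share the underlying memory; nothing is copied.
        this.validity = validity.slice().asReadOnlyBuffer();
        this.values = values.slice().asReadOnlyBuffer().order(ByteOrder.LITTLE_ENDIAN);
        this.valueCount = valueCount;
    }

    int getValueCount() {
        return valueCount;
    }

    boolean isNull(int index) {
        checkIndex(index);
        // A cleared validity bit marks a null slot.
        return ((validity.get(index >> 3) >> (index & 7)) & 1) == 0;
    }

    int get(int index) {
        if (isNull(index)) {
            throw new IllegalStateException("value at index " + index + " is null");
        }
        return values.getInt(index * Integer.BYTES);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index + ", valueCount " + valueCount);
        }
    }

    public static void main(String[] args) {
        ByteBuffer values = ByteBuffer.allocateDirect(12).order(ByteOrder.LITTLE_ENDIAN);
        values.putInt(0, 10).putInt(4, 20).putInt(8, 30);
        ByteBuffer validity = ByteBuffer.allocateDirect(1);
        validity.put(0, (byte) 0b101); // slots 0 and 2 are valid, slot 1 is null

        ReadOnlyIntView view = new ReadOnlyIntView(validity, values, 3);
        System.out.println(view.get(0) + " " + view.isNull(1) + " " + view.get(2));
    }
}
```

Because the view only slices the incoming buffers, later changes to the source memory are visible through it, which is the property a read-only client would rely on.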

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Jacques Nadeau <ja...@apache.org>.

On Sat, Sep 8, 2018 at 8:00 AM Zhenyuan Zhao <zz...@gmail.com> wrote:

> After digging into it a little deeper, I have more questions:
>
> First, a vector takes an allocator. Zero copy means we should not do any
> additional allocation, which implies that a dummy allocator with (at most) the
> capability of allocating a zero-length (getEmpty) ArrowBuf is sufficient.
>
I don't see the downside to using the existing allocator as opposed to
creating a dummy. If it doesn't allocate anything, what's the problem?


> However, there are places in vector that require more allocation:
>
>
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java#L511
>
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BitVectorHelper.java#L180
>
> Vector will allocate in the case of all null or non null. It does seem like
> an optimization that can be done, but why does it reallocate without
> checking whether the validity buffer is really empty? Take the fixed-width
> vector as an example: it in fact does check that the buffer count is two,
> and for my simple test case, I saw the validity buffer is still being sent
> in the non-null case.
>

Agree that ideally we should only allocate if the source was not provided.
Seems like that could be improved.
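
A minimal sketch of "only allocate if the source was not provided", written against plain java.nio rather than the actual arrow-vector code. The class and method names (ValidityBuffers, validityFor) are hypothetical, and the sketch assumes that a missing validity buffer means every value is non-null.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: reuse the source validity bitmap when there is one. */
final class ValidityBuffers {
    private ValidityBuffers() {}

    /**
     * Returns a validity bitmap covering valueCount values. If the source supplied
     * enough bytes, a zero-copy slice of it is returned; otherwise an all-valid
     * bitmap is allocated (assumption: absent bitmap means no nulls).
     */
    static ByteBuffer validityFor(ByteBuffer source, int valueCount) {
        int bytesNeeded = (valueCount + 7) / 8;
        if (source != null && source.remaining() >= bytesNeeded) {
            return source.slice(); // shares memory with the source, no allocation of data
        }
        ByteBuffer allocated = ByteBuffer.allocateDirect(bytesNeeded);
        for (int i = 0; i < bytesNeeded; i++) {
            allocated.put(i, (byte) 0xFF); // every bit set: all values valid
        }
        return allocated;
    }
}
```

The allocation branch is only reached when the sender omitted the bitmap, so the common case where the validity buffer is sent stays allocation-free.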


> Second, arrow made a decision to only support off-heap buffer. Why? Doesn't
> affect my use case, but sounds like this can be more flexible.
>

Perf and GC.

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Zhenyuan Zhao <zz...@gmail.com>.

After digging into it a little deeper, I have more questions:

First, a vector takes an allocator. Zero copy means we should not do any
additional allocation, which implies that a dummy allocator with (at most) the
capability of allocating a zero-length (getEmpty) ArrowBuf is sufficient.
However, there are places in vector that require more allocation:

https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java#L511
https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BitVectorHelper.java#L180

Vector will allocate in the case of all null or non null. It does seem like
an optimization that can be done, but why does it reallocate without
checking whether the validity buffer is really empty? Take the fixed-width
vector as an example: it in fact does check that the buffer count is two,
and for my simple test case, I saw the validity buffer is still being sent
in the non-null case.

Second, arrow made a decision to only support off-heap buffer. Why? Doesn't
affect my use case, but sounds like this can be more flexible.

Unsafe supports two sets of APIs:
getXXX(long address)
getXXX(Object obj, long offset)
So it should work with both direct (off-heap) and heap buffers:

initializing:
if (buffer.isDirect()) {
  // direct buffer: no base object, offset is the absolute native address
  this.ref = null;
  this.offset = getAddressOfDirect(buffer);
} else if (buffer.hasArray()) {
  // heap buffer: base object is the backing array
  this.ref = buffer.array();
  this.offset = Unsafe.ARRAY_BYTE_BASE_OFFSET + buffer.arrayOffset();
}
this.offset += buffer.position();

then call:
theUnsafe.getXXX(this.ref, this.offset)

It likely has one more if branch compared to getXXX(address).
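
The fragment above can be fleshed out into a runnable sketch. Everything named here (UnsafeBufferReader, the reflective lookup of Buffer.address standing in for getAddressOfDirect) is an assumption for illustration, not Arrow code. Reads use the platform's native byte order, and recent JDKs deprecate the memory-access methods of sun.misc.Unsafe, so treat this as a demonstration of the (ref, offset) addressing idea only.

```java
import java.lang.reflect.Field;
import java.nio.Buffer;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

/** Hypothetical reader that handles heap and direct buffers through one code path. */
final class UnsafeBufferReader {
    private static final Unsafe UNSAFE;
    private static final long ADDRESS_FIELD_OFFSET;

    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
            // java.nio.Buffer keeps the native address of a direct buffer in this field.
            ADDRESS_FIELD_OFFSET =
                UNSAFE.objectFieldOffset(Buffer.class.getDeclaredField("address"));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final Object ref;  // backing byte[] for heap buffers, null for direct buffers
    private final long offset; // offset into the array, or absolute native address

    UnsafeBufferReader(ByteBuffer buffer) {
        if (buffer.isDirect()) {
            this.ref = null;
            this.offset = UNSAFE.getLong(buffer, ADDRESS_FIELD_OFFSET) + buffer.position();
        } else if (buffer.hasArray()) {
            this.ref = buffer.array();
            this.offset = Unsafe.ARRAY_BYTE_BASE_OFFSET + buffer.arrayOffset() + buffer.position();
        } else {
            throw new IllegalArgumentException("buffer is neither direct nor array-backed");
        }
    }

    // With ref == null, Unsafe treats the offset as an absolute address.
    byte getByte(long index) {
        return UNSAFE.getByte(ref, offset + index);
    }

    // Reads in native byte order and performs no bounds checking.
    int getInt(long index) {
        return UNSAFE.getInt(ref, offset + index);
    }
}
```

The branch on isDirect() runs once in the constructor, so each read costs the same single Unsafe call for both buffer kinds.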

Have a nice weekend!

On Fri, Sep 7, 2018 at 1:33 PM Zhenyuan Zhao <zz...@gmail.com> wrote:

> Thanks. That's crystal clear for me now.
>

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Zhenyuan Zhao <zz...@gmail.com>.

Thanks. That's crystal clear for me now.

On Fri, Sep 7, 2018 at 1:16 PM Jacques Nadeau <ja...@apache.org> wrote:

> I opened a jira to describe what I think needs to be done here. Check it
> out:
>
> https://issues.apache.org/jira/browse/ARROW-3191
>
>

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Jacques Nadeau <ja...@apache.org>.

I opened a jira to describe what I think needs to be done here. Check it
out:

https://issues.apache.org/jira/browse/ARROW-3191


On Fri, Sep 7, 2018 at 10:47 AM Wes McKinney <we...@gmail.com> wrote:

> Seems like you should be able to construct an UnpooledUnsafeDirectByteBuf
> from a MappedByteBuffer, and then wrap that with UnsafeDirectLittleEndian
> to get zero-copy access to a memory map. Does that sound right?
>
>
> https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/UnpooledUnsafeDirectByteBuf.java

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Wes McKinney <we...@gmail.com>.

Seems like you should be able to construct an UnpooledUnsafeDirectByteBuf
from a MappedByteBuffer, and then wrap that with UnsafeDirectLittleEndian
to get zero-copy access to a memory map. Does that sound right?

https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/UnpooledUnsafeDirectByteBuf.java
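
For reference, the memory-map half of this suggestion can be sketched with the JDK alone. The class and method names below (MappedColumnDemo, writeLittleEndianInts, mapReadOnly) are hypothetical. Wrapping the resulting MappedByteBuffer in a Netty buffer (for example with Unpooled.wrappedBuffer) and then in UnsafeDirectLittleEndian is not shown and remains an assumption to verify against the Arrow sources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: read column data through a read-only memory map. */
final class MappedColumnDemo {
    private MappedColumnDemo() {}

    /** Writes the values as little-endian int32s to a temporary file. */
    static Path writeLittleEndianInts(int... values) {
        ByteBuffer data =
            ByteBuffer.allocate(values.length * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            data.putInt(v);
        }
        try {
            Path file = Files.createTempFile("column", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, data.array());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps the whole file read-only; file contents are not copied onto the Java heap. */
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeLittleEndianInts(1, 2, 3);
        MappedByteBuffer mapped = mapReadOnly(file);
        mapped.order(ByteOrder.LITTLE_ENDIAN);
        System.out.println(mapped.isDirect() + " " + mapped.getInt(8)); // prints "true 3"
    }
}
```

A MappedByteBuffer is a direct buffer, which is why it is a plausible input for the Netty unsafe direct buffer classes mentioned above.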
On Fri, Sep 7, 2018 at 12:46 PM Zhenyuan Zhao <zz...@gmail.com> wrote:
>
> Interesting, so basically I can still use the public constructor
>
> public ArrowBuf(AtomicInteger refCnt, BufferLedger ledger,
> UnsafeDirectLittleEndian byteBuf, BufferManager manager,
> ArrowByteBufAllocator alloc, int offset, int length, boolean isEmpty)
>
> Then, instead of overriding ArrowBuf itself, I override
> BufferLedger/UnsafeDirectLittleEndian/BufferManager to make it reference an
> existing buffer. That is a much more plausible option, as it will reuse the
> Vectors. All I need is to implement my own deserializer. Did I get you right?
>
> Thanks
>

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Zhenyuan Zhao <zz...@gmail.com>.

Interesting, so basically I can still use the public constructor

public ArrowBuf(AtomicInteger refCnt, BufferLedger ledger,
UnsafeDirectLittleEndian byteBuf, BufferManager manager,
ArrowByteBufAllocator alloc, int offset, int length, boolean isEmpty)

Then, instead of overriding ArrowBuf itself, I override
BufferLedger/UnsafeDirectLittleEndian/BufferManager to make it reference an
existing buffer. That is a much more plausible option, as it will reuse the
Vectors. All I need is to implement my own deserializer. Did I get you right?

Thanks

On Fri, Sep 7, 2018 at 7:09 AM Jacques Nadeau <ja...@apache.org> wrote:

> ArrowBuf is final on purpose, both to ensure a single implementation and
> for performance reasons. ArrowBuf is primarily a memory address and a
> length, and wants zero indirection when reading or writing that memory.
>
> It does, however, wrap several types of substructures as long as they have
> that property. For example, an ArrowBuf almost always currently wraps a
> Netty UnsafeDirectLittleEndian object. At that level you could propose a
> way to wrap more types of memory addresses+lengths.
>

Re: [JAVA] Supporting zero copy arrow-vector

Posted by Jacques Nadeau <ja...@apache.org>.

ArrowBuf is final on purpose, both to ensure a single implementation and
for performance reasons. ArrowBuf is primarily a memory address and a
length, and wants zero indirection when reading or writing that memory.

It does, however, wrap several types of substructures as long as they have
that property. For example, an ArrowBuf almost always currently wraps a
Netty UnsafeDirectLittleEndian object. At that level you could propose a
way to wrap more types of memory addresses+lengths.
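
As an illustration of "a memory address and a length" with no indirection, here is a small hypothetical sketch. AddressLengthBuf is not ArrowBuf and omits reference counting, ledgers and allocators; it only shows why a final class holding two longs can read and write with one bounds check and one Unsafe call. It relies on sun.misc.Unsafe, whose memory-access methods are deprecated on recent JDKs.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical sketch: a final buffer that is just a native address plus a length. */
final class AddressLengthBuf {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long address; // start of the native memory region
    private final long length;  // size of the region in bytes

    AddressLengthBuf(long address, long length) {
        this.address = address;
        this.length = length;
    }

    /** Allocates native memory for the demo; a real wrapper could take any address. */
    static AddressLengthBuf allocate(long length) {
        return new AddressLengthBuf(UNSAFE.allocateMemory(length), length);
    }

    int getInt(long index) {
        checkBounds(index, Integer.BYTES);
        return UNSAFE.getInt(address + index); // direct read, native byte order
    }

    void setInt(long index, int value) {
        checkBounds(index, Integer.BYTES);
        UNSAFE.putInt(address + index, value);
    }

    /** Frees memory obtained from allocate(); must not be called for foreign addresses. */
    void free() {
        UNSAFE.freeMemory(address);
    }

    private void checkBounds(long index, int size) {
        if (index < 0 || index > length - size) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }
}
```

Because the class is final and its accessors are small, the JIT can inline them without a virtual dispatch, which is one reading of the "single impl and performance" argument.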

On Thu, Sep 6, 2018, 10:26 PM Zhenyuan Zhao <zz...@gmail.com> wrote:

> Hello Team,
>
> I'm working on using Arrow as an intermediate format for transferring columnar
> data from server to client. In this case, the client only needs to read
> from the format, so I would like to avoid any unnecessary copy of the data.
> Looking into Arrow, while arrow-format/flatbuffers does support zero copy, the
> current arrow-vector Java implementation does not. I was trying to hack zero
> copy for read-only scenarios, but saw two main blockers:
>
>    1. ArrowBuf is the only buffer implementation, used exclusively across
>       ArrowReader/ArrowRecordBatch/Vectors. It's final, which means there isn't
>       a way for me to override its logic in order to wrap some existing buffer.
>       It's absolutely necessary to use ArrowBuf for write scenarios due to
>       buffer allocation, but for read, I was hoping a vector could just serve
>       as a view on top of an existing memory buffer (like a Java ByteBuffer or
>       a Netty ByteBuf). That seems safe for the read-only case.
>
>    2. As a result of #1 described above, the only layer which seems reusable
>       is arrow-format. I would then have to implement what is effectively a
>       read-only copy of arrow-vector that references an existing buffer.
>       Putting aside the effort of doing that, it would make it hard to keep up
>       with future changes/fixes made to arrow-vector.
>
> Wondering if you guys have put any thought into such read-only scenarios.
> Any suggestions on how I can approach this myself?
>
> Thanks
>