Posted to dev@arrow.apache.org by Li Jin <ic...@gmail.com> on 2017/01/03 22:11:39 UTC
Best way to create/set validity bitmap buffer in Java
Hi,
I am working on a function that creates an Arrow record batch
from a Spark dataset. What is the best way to create the validity
bitmap buffer for a field?
I found BitVector, which has a mutator
<https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java#L425>
that seems to do what I want. I am wondering whether I should:
(1) create a BitVector
(2) set the value in the BitVector using the mutator
(3) pass the ArrowBuf from the BitVector
<https://github.com/apache/arrow/blob/45ed7e7a36fb2a69de468c41132b6b3bbd270c92/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java#L116>
to the record batch?
Thanks,
Li
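For readers unfamiliar with what a validity bitmap contains, here is a minimal sketch in plain Java with no Arrow dependency. It assumes the layout described in the Arrow format documentation: one bit per value, least-significant bit first within each byte, with a set bit meaning "valid". The class and method names (ValidityBitmap, pack, isValid) are hypothetical and are not part of the Arrow API.

```java
public class ValidityBitmap {

    /** Packs one validity flag per value into bytes, least-significant bit first. */
    public static byte[] pack(boolean[] valid) {
        // One bit per value, rounded up to a whole number of bytes.
        byte[] bitmap = new byte[(valid.length + 7) / 8];
        for (int i = 0; i < valid.length; i++) {
            if (valid[i]) {
                bitmap[i / 8] |= (byte) (1 << (i % 8));
            }
        }
        return bitmap;
    }

    /** Reads the validity flag of the value at the given index. */
    public static boolean isValid(byte[] bitmap, int index) {
        return (bitmap[index / 8] & (1 << (index % 8))) != 0;
    }

    public static void main(String[] args) {
        // Nine values need two bytes; values 1 and 4..7 are null.
        boolean[] valid = {true, false, true, true, false, false, false, false, true};
        byte[] bitmap = pack(valid);
        for (int i = 0; i < valid.length; i++) {
            System.out.println(i + " -> " + isValid(bitmap, i));
        }
    }
}
```

BitVector does this packing internally; the sketch only shows the bytes that the buffer passed to the record batch is expected to hold.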
Re: Best way to create/set validity bitmap buffer in Java
Posted by Julien Le Dem <ju...@dremio.com>.
Hi Li,
The missing piece is that you are supposed to call setValueCount() on the
mutator when you are done writing to it.
That would be step 2.5 in your original email.
Julien
On Wed, Jan 4, 2017 at 8:52 AM, Wes McKinney <we...@gmail.com> wrote:
> You can use the approach in the test suite:
>
> https://github.com/apache/arrow/blob/a2ead2f646baad78de01fcb1b90f710fa1eae70b/java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java#L289
>
> If the Java folks have some other recommendations, I'll let them chime in.
>
> - Wes
>
> On Tue, Jan 3, 2017 at 6:42 PM, Li Jin <ic...@gmail.com> wrote:
> > To answer myself: the above doesn't work because mutator doesn't update
> > writerIndex (only ArrowBuf.writeXXX does).
> >
> > Still wondering if there is a good way of setting validity map in Java?
> >
> > Li
> >
> > On Tue, Jan 3, 2017 at 5:11 PM, Li Jin <ic...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I am working on a function where I want to create an arrow record batch
> >> from Spark dataset. I am curious what's the best way to create validity
> >> bitmap buffer for a field?
> >>
> >> I found BitVector which has a mutator
> >> <https://github.com/apache/arrow/blob/master/java/vector/
> src/main/java/org/apache/arrow/vector/BitVector.java#L425> that
> >> seems to do the thing I want, I am wondering should I:
> >>
> >> (1) create a BitVector
> >> (2) set the value in the BitVector using the mutator
> >> (3) pass the ArrowBuf from the BitVector
> >> <https://github.com/apache/arrow/blob/45ed7e7a36fb2a69de468c41132b6b
> 3bbd270c92/java/vector/src/main/java/org/apache/arrow/
> vector/BaseDataValueVector.java#L116> to
> >> the record batch?
> >>
> >> Thanks,
> >> Li
> >>
>
--
Julien
Re: Best way to create/set validity bitmap buffer in Java
Posted by Wes McKinney <we...@gmail.com>.
You can use the approach in the test suite:
https://github.com/apache/arrow/blob/a2ead2f646baad78de01fcb1b90f710fa1eae70b/java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java#L289
If the Java folks have some other recommendations, I'll let them chime in.
- Wes
On Tue, Jan 3, 2017 at 6:42 PM, Li Jin <ic...@gmail.com> wrote:
> To answer myself: the above doesn't work because mutator doesn't update
> writerIndex (only ArrowBuf.writeXXX does).
>
> Still wondering if there is a good way of setting validity map in Java?
>
> Li
>
Re: Best way to create/set validity bitmap buffer in Java
Posted by Li Jin <ic...@gmail.com>.
To answer myself: the above doesn't work because the mutator doesn't update
writerIndex (only ArrowBuf.writeXXX does).
I am still wondering whether there is a good way of setting the validity bitmap in Java.
Li
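The writerIndex behaviour described above can be illustrated by analogy with java.nio.ByteBuffer. This is only a sketch: ByteBuffer stands in for ArrowBuf, its position stands in for writerIndex, and the finish() helper is a hypothetical stand-in for a finalizing call such as the setValueCount() mentioned elsewhere in this thread. None of it is Arrow code.

```java
import java.nio.ByteBuffer;

public class WriterIndexSketch {

    /** Absolute writes store the bytes but leave the position (the "writer index") at 0. */
    public static ByteBuffer writeAbsolute(byte[] bitmap) {
        ByteBuffer buf = ByteBuffer.allocate(bitmap.length);
        for (int i = 0; i < bitmap.length; i++) {
            buf.put(i, bitmap[i]); // indexed put: does not advance the position
        }
        return buf;
    }

    /** Hypothetical finalizing step: publish how many bytes the written values occupy. */
    public static void finish(ByteBuffer buf, int valueCount) {
        buf.position((valueCount + 7) / 8); // one bit per value, rounded up to bytes
    }

    public static void main(String[] args) {
        ByteBuffer buf = writeAbsolute(new byte[] {13, 1});
        System.out.println("before finish: " + buf.position()); // prints "before finish: 0"
        finish(buf, 9);
        System.out.println("after finish: " + buf.position());  // prints "after finish: 2"
    }
}
```

A consumer that trusts the writer index to decide how many bytes to read would see an empty buffer before the finalizing step, even though the data is already in place.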