Posted to dev@arrow.apache.org by Li Jin <ic...@gmail.com> on 2017/01/03 22:11:39 UTC

Best way to create/set validity bitmap buffer in Java

Hi,

I am working on a function that creates an Arrow record batch from a
Spark Dataset. What is the best way to create the validity bitmap
buffer for a field?

I found BitVector, which has a mutator
<https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BitVector.java#L425>
that seems to do what I want. Should I:

(1) create a BitVector
(2) set the value in the BitVector using the mutator
(3) pass the ArrowBuf from the BitVector
<https://github.com/apache/arrow/blob/45ed7e7a36fb2a69de468c41132b6b3bbd270c92/java/vector/src/main/java/org/apache/arrow/vector/BaseDataValueVector.java#L116>
to the record batch?

Thanks,
Li
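
For readers unfamiliar with the layout in question: an Arrow validity bitmap
stores one bit per value, numbered from the least significant bit of each
byte, with 1 meaning "valid" and 0 meaning "null". The sketch below shows that
layout over a plain byte[] using only the JDK. The class ValidityBitmapSketch
and its methods are hypothetical names made up for illustration; they are not
part of the Arrow Java API, and the sketch ignores the buffer padding that
Arrow applies to real buffers.

```java
import java.util.Arrays;

/** Hypothetical helper (not an Arrow class): a validity bitmap over a byte[]. */
public final class ValidityBitmapSketch {

    private ValidityBitmapSketch() {}

    /** Number of bytes needed to hold one bit per value (no padding). */
    public static int bytesFor(int valueCount) {
        return (valueCount + 7) / 8;
    }

    /** Builds a bitmap in which bit i is 1 when values[i] is non-null. */
    public static byte[] fromValues(Object[] values) {
        byte[] bitmap = new byte[bytesFor(values.length)]; // all zero = all null
        for (int i = 0; i < values.length; i++) {
            if (values[i] != null) {
                // LSB numbering: value i lives in byte i / 8, bit i % 8.
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    /** Returns true when the bit for the given value index is set. */
    public static boolean isValid(byte[] bitmap, int index) {
        return (bitmap[index >> 3] & (1 << (index & 7))) != 0;
    }

    public static void main(String[] args) {
        Object[] column = {1, null, 3, 4, null, null, 7, 8, 9};
        byte[] bitmap = fromValues(column);
        // Byte 0 has bits 0, 2, 3, 6, 7 set (0xCD); byte 1 has bit 0 set.
        System.out.println(Arrays.toString(bitmap)); // prints [-51, 1]
    }
}
```

In the approach described above, BitVector's mutator would do this bit
bookkeeping itself; the sketch only illustrates what ends up in the buffer.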

Re: Best way to create/set validity bitmap buffer in Java

Posted by Julien Le Dem <ju...@dremio.com>.
Hi Li,
The missing piece is that you are supposed to call setValueCount() on the
mutator when you are done writing to it.
That would be step 2.5 in your original email.
Julien


On Wed, Jan 4, 2017 at 8:52 AM, Wes McKinney <we...@gmail.com> wrote:

> You can use the approach in the test suite:
>
> https://github.com/apache/arrow/blob/a2ead2f646baad78de01fcb1b90f710fa1eae70b/java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java#L289
>
> If the Java folks have some other recommendations, I'll let them chime in.
>
> - Wes
>
> On Tue, Jan 3, 2017 at 6:42 PM, Li Jin <ic...@gmail.com> wrote:
> > To answer myself: the above doesn't work because mutator doesn't update
> > writerIndex (only ArrowBuf.writeXXX does).
> >
> > Still wondering if there is a good way of setting validity map in Java?
> >
> > Li



-- 
Julien

Re: Best way to create/set validity bitmap buffer in Java

Posted by Wes McKinney <we...@gmail.com>.
You can use the approach in the test suite:

https://github.com/apache/arrow/blob/a2ead2f646baad78de01fcb1b90f710fa1eae70b/java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java#L289

If the Java folks have some other recommendations, I'll let them chime in.

- Wes

Re: Best way to create/set validity bitmap buffer in Java

Posted by Li Jin <ic...@gmail.com>.
To answer myself: the above doesn't work because the mutator doesn't update
the writerIndex (only ArrowBuf.writeXXX does).

I'm still wondering whether there is a good way to set the validity bitmap
in Java.

Li
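
The distinction described here, between index-based setters that leave the
write cursor alone and write-style calls that advance it, can be illustrated
with the JDK's java.nio.ByteBuffer. This is only an analogy: ByteBuffer stands
in for ArrowBuf, its position plays the role of writerIndex, and the class
WriterIndexAnalogy is a hypothetical name invented for this sketch. The last
method shows the cursor being moved explicitly once writing is done, which is
roughly the role Julien's reply assigns to setValueCount().

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration using java.nio.ByteBuffer, not ArrowBuf. */
public final class WriterIndexAnalogy {

    private WriterIndexAnalogy() {}

    /** Absolute puts (like setXXX): the data is written, the cursor stays put. */
    public static int positionAfterAbsolutePuts(int n) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        for (int i = 0; i < n; i++) {
            buf.put(i, (byte) 0xFF);
        }
        return buf.position();
    }

    /** Relative puts (like writeXXX): each call advances the cursor. */
    public static int positionAfterRelativePuts(int n) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        for (int i = 0; i < n; i++) {
            buf.put((byte) 0xFF);
        }
        return buf.position();
    }

    /** Absolute puts followed by an explicit "I am done" step that moves the cursor. */
    public static int positionAfterExplicitFinish(int n) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        for (int i = 0; i < n; i++) {
            buf.put(i, (byte) 0xFF);
        }
        buf.position(n); // explicit finalization of how much was written
        return buf.position();
    }

    public static void main(String[] args) {
        System.out.println(positionAfterAbsolutePuts(4));   // prints 0
        System.out.println(positionAfterRelativePuts(4));   // prints 4
        System.out.println(positionAfterExplicitFinish(4)); // prints 4
    }
}
```

A consumer that trusts the cursor would see the first buffer as empty even
though its bytes were written, which matches the symptom reported above.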