Posted to dev@arrow.apache.org by Atul Dambalkar <at...@xoriant.com> on 2018/04/10 21:23:53 UTC

Correct way to set NULL values in VarCharVector (Java API)?

Hi,

What is the best way to handle NULL string values coming from a relational database? I am trying to set string values through the Java API's VarCharVector. Like a few other Arrow vectors (TimeStampVector, TimeMilliVector), VarCharVector doesn't appear to have a way to set a NULL value as one of its elements. Can someone advise on the correct mechanism for storing NULL values in this case?

Regards,
-Atul


RE: Correct way to set NULL values in VarCharVector (Java API)?

Posted by Atul Dambalkar <at...@xoriant.com>.
Hi Sid, Emilio,

It was a mistake on my part. I was not setting the holder.start and holder.end values inside the NullableVarCharHolder, which was causing the issue. It works now.

Regards,
-Atul
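
For readers following along, here is a minimal sketch of why the missing holder.start and holder.end values matter. It does not use the Arrow library: ToyHolder, read and holderFor are hypothetical stand-ins that only assume a holder describes its value as the byte range [start, end) of a buffer, with both fields defaulting to 0. Under that assumption, a holder with isSet = 1 but an unset range describes zero bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HolderRangeSketch {
    /** Hypothetical stand-in for a nullable holder: a buffer plus the [start, end) range to read. */
    public static final class ToyHolder {
        public int isSet;      // 0 means NULL, 1 means a value is present
        public int start;      // first byte of the value inside buffer (defaults to 0)
        public int end;        // one past the last byte of the value (defaults to 0)
        public byte[] buffer;  // backing bytes
    }

    /** Copies the bytes a holder describes, or returns null when the holder is marked NULL. */
    public static byte[] read(ToyHolder holder) {
        if (holder.isSet == 0) {
            return null;
        }
        return Arrays.copyOfRange(holder.buffer, holder.start, holder.end);
    }

    /** Builds a holder for a string; setRange controls whether start and end are filled in. */
    public static ToyHolder holderFor(String value, boolean setRange) {
        ToyHolder holder = new ToyHolder();
        holder.isSet = 1;
        holder.buffer = value.getBytes(StandardCharsets.UTF_8);
        if (setRange) {
            holder.start = 0;
            holder.end = holder.buffer.length;
        }
        return holder;
    }

    public static void main(String[] args) {
        byte[] missingRange = read(holderFor("some text string", false));
        byte[] withRange = read(holderFor("some text string", true));
        System.out.println(missingRange.length);                            // prints 0
        System.out.println(new String(withRange, StandardCharsets.UTF_8));  // prints some text string
    }
}
```

In this toy model the buffer itself is filled correctly in both cases; only the range decides how many bytes the reader sees.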

-----Original Message-----
From: Atul Dambalkar 
Sent: Wednesday, April 11, 2018 5:18 PM
To: dev@arrow.apache.org
Subject: RE: Correct way to set NULL values in VarCharVector (Java API)?

Hi Sid, Emilio,

I need some more help. Here is how I am using the NullableVarCharHolder:

----------------------
        String value = "some text string";
        NullableVarCharHolder holder = new NullableVarCharHolder();
        holder.isSet = 1;
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        holder.buffer = varcharVector.getAllocator().buffer(bytes.length);
        holder.buffer.setBytes(0, bytes, 0, bytes.length);
        varcharVector.setIndexDefined(index);
        varcharVector.setSafe(index, holder);
        varcharVector.setValueCount(index + 1);
-------------------------

When I try to read the byte[] back from the VarCharVector with varcharVector.get(index), it returns a null array. If I read the holder.buffer value before putting it in the VarCharVector, I get the correct byte[], but after I set it on the vector, I get null. Is this the correct usage of the API?

-Atul



-----Original Message-----
From: Siddharth Teotia [mailto:siddharth@dremio.com]
Sent: Wednesday, April 11, 2018 10:27 AM
To: dev@arrow.apache.org
Subject: Re: Correct way to set NULL values in VarCharVector (Java API)?

Another option is to use the set() API that allows you to indicate whether the value is NULL or not using an isSet parameter (0 for NULL, 1 otherwise). This is similar to the holder-based APIs, where you need to indicate in holder.isSet whether the value is NULL or not.

https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1095

Thanks,
Siddharth

On Wed, Apr 11, 2018 at 6:14 AM, Emilio Lahr-Vivaz <el...@ccri.com>
wrote:

> Hi Atul,
>
> You should be able to use the overloaded 'set' method that takes a
> NullableVarCharHolder:
>
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/VarCharVector.java#L237
>
> Thanks,
>
> Emilio
>
>
> On 04/10/2018 05:23 PM, Atul Dambalkar wrote:
>
>> Hi,
>>
>> What is the best way to handle NULL string values coming from a
>> relational database? I am trying to set string values through the
>> Java API's VarCharVector. Like a few other Arrow vectors
>> (TimeStampVector, TimeMilliVector), VarCharVector doesn't appear to
>> have a way to set a NULL value as one of its elements. Can someone
>> advise on the correct mechanism for storing NULL values in this case?
>>
>> Regards,
>> -Atul
>>
>>
>>
>

RE: Correct way to set NULL values in VarCharVector (Java API)?

Posted by Atul Dambalkar <at...@xoriant.com>.
Thanks, Sid and Emilio. I think this approach can be extended to pretty much all SQL types and their corresponding Arrow data types.

-Atul
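
As a rough illustration of that point, the same "value slot plus is-set flag" idea can be sketched for a fixed-width type such as a nullable SQL INTEGER. This is a toy model and not Arrow code: ToyNullableIntColumn and its methods are hypothetical names, and the only assumption is that each slot carries one validity bit alongside its value.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class ToyNullableIntColumn {
    private final int[] values;                     // one fixed-width slot per row
    private final BitSet validity = new BitSet();   // bit i set => row i holds a value

    /** Loads nullable integers, e.g. values read from a nullable SQL INTEGER column. */
    public ToyNullableIntColumn(List<Integer> source) {
        values = new int[source.size()];
        for (int i = 0; i < source.size(); i++) {
            Integer value = source.get(i);
            if (value != null) {
                values[i] = value;
                validity.set(i);
            }
            // A null entry leaves the validity bit clear; the slot content is ignored.
        }
    }

    public boolean isNull(int index) {
        return !validity.get(index);
    }

    /** Returns the boxed value, or null when the row is NULL. */
    public Integer get(int index) {
        if (isNull(index)) {
            return null;
        }
        return values[index];
    }

    public static void main(String[] args) {
        ToyNullableIntColumn column = new ToyNullableIntColumn(Arrays.asList(10, null, 30));
        System.out.println(column.isNull(1));  // prints true
        System.out.println(column.get(2));     // prints 30
    }
}
```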



Re: Correct way to set NULL values in VarCharVector (Java API)?

Posted by Siddharth Teotia <si...@dremio.com>.
Another option is to use the set() API that allows you to indicate whether
the value is NULL or not using an isSet parameter (0 for NULL, 1
otherwise). This is similar to the holder-based APIs, where you need to
indicate in holder.isSet whether the value is NULL or not.

https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1095

Thanks,
Siddharth
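
The isSet convention described above (0 for NULL, 1 otherwise) can be sketched with a small stand-alone model. This is not the Arrow implementation: ToyVarCharColumn, append and fromStrings are hypothetical names, and the model is append-only for simplicity. It assumes only that a variable-width column keeps a validity bit per slot plus offsets into a shared byte area, and that a NULL slot contributes no bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

public class ToyVarCharColumn {
    private final BitSet validity = new BitSet();  // bit i set => slot i holds a value
    private final int[] offsets;                   // slot i spans data[offsets[i], offsets[i + 1])
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private int count = 0;                         // number of slots appended so far

    public ToyVarCharColumn(int capacity) {
        offsets = new int[capacity + 1];
    }

    /** Appends one slot; isSet == 0 records a NULL and ignores the byte range. */
    public void append(int isSet, int start, int end, byte[] buffer) {
        if (isSet != 0) {
            validity.set(count);
            data.write(buffer, start, end - start);
        }
        offsets[count + 1] = data.size();
        count++;
    }

    public boolean isNull(int index) {
        return !validity.get(index);
    }

    /** Returns a copy of the slot's bytes, or null when the slot is NULL. */
    public byte[] get(int index) {
        if (isNull(index)) {
            return null;
        }
        return Arrays.copyOfRange(data.toByteArray(), offsets[index], offsets[index + 1]);
    }

    /** Loads nullable strings, e.g. values read from a nullable SQL VARCHAR column. */
    public static ToyVarCharColumn fromStrings(String... values) {
        ToyVarCharColumn column = new ToyVarCharColumn(values.length);
        for (String value : values) {
            if (value == null) {
                column.append(0, 0, 0, null);
            } else {
                byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
                column.append(1, 0, bytes.length, bytes);
            }
        }
        return column;
    }

    public static void main(String[] args) {
        ToyVarCharColumn column = fromStrings("alpha", null, "gamma");
        System.out.println(column.isNull(1));                                   // prints true
        System.out.println(new String(column.get(2), StandardCharsets.UTF_8));  // prints gamma
    }
}
```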


Re: Correct way to set NULL values in VarCharVector (Java API)?

Posted by Emilio Lahr-Vivaz <el...@ccri.com>.
Hi Atul,

You should be able to use the overloaded 'set' method that takes a 
NullableVarCharHolder:

https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/VarCharVector.java#L237

Thanks,

Emilio
