Posted to user@arrow.apache.org by Vibhatha Abeykoon <vi...@gmail.com> on 2020/11/18 19:19:09 UTC

PyArrow Compute is_null Usage

Hello,

I am looking into the is_null compute function and observed the following.

import pyarrow as pa
arw_ar = pa.array([1, 2, 3, 4, None])
arw_ar_1 = pa.array([[1, 2, 3, 4, None], [11, 12, 13, None, 15]])
arw_ar_2 = pa.array([[1, 2, 3, 4], [1, 2, 3, 4]])
arw_ar_3 = pa.array([[None, None, None, None, None], [11, 12, 13, None, 15]])

# Case 1 with random None value in a 1D array

print(arw_ar.is_null())

# [
# false,
# false,
# false,
# false,
# true
# ]

# Case 2 with random None value in a 2-D array

print(arw_ar_1.is_null())

# [
# false,
# false
# ]

# Case 3 without random None value in a 2-D array
print(arw_ar_2.is_null())

# [
# false,
# false
# ]

# Case 4 with None value in a 2-D array
print(arw_ar_3.is_null())
# [
# false,
# false
# ]

Is this the expected behavior?


With Regards,
Vibhatha Abeykoon,

Re: PyArrow Compute is_null Usage

Posted by Vibhatha Abeykoon <vi...@gmail.com>.
I understand your point. Even if a list contains null elements, the list
itself is considered not null. I assumed this function would behave like
Pandas isnull.

import pandas as pd
pdf = pd.DataFrame([[1, 2, 3, 4, None], [11, 12, 13, None, 15]])
pdf.isnull()

In this case, for instance, the check is done element-wise.


With Regards,
Vibhatha Abeykoon


On Wed, Nov 18, 2020 at 2:28 PM Micah Kornfield <em...@gmail.com>
wrote:

> It looks right.  Could you clarify why you think it might not be expected
> behavior?  The arrays that are being constructed are of two different
> types.  One is an integer array (first example).  The rest are List<Int>.
> None of the lists in the examples are null (but they do contain null
> elements).

Re: PyArrow Compute is_null Usage

Posted by Micah Kornfield <em...@gmail.com>.
It looks right.  Could you clarify why you think it might not be expected
behavior?  The arrays that are being constructed are of two different
types.  One is an integer array (first example).  The rest are List<Int>.
None of the lists in the examples are null (but they do contain null
elements).
