Posted to jira@arrow.apache.org by "Sergey Mozharov (Jira)" <ji...@apache.org> on 2021/04/30 12:30:00 UTC

[jira] [Created] (ARROW-12609) TypeError when accessing length of a ListScalar with list-like data type

Sergey Mozharov created ARROW-12609:
---------------------------------------

             Summary: TypeError when accessing length of a ListScalar with list-like data type
                 Key: ARROW-12609
                 URL: https://issues.apache.org/jira/browse/ARROW-12609
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 4.0.0, 3.0.0
         Environment: python=3.9.2
pyarrow=4.0.0 (3.0.0 has the same behavior)
            Reporter: Sergey Mozharov


For list-like data types, the scalar corresponding to a missing value has a '__len__' attribute, but calling len() on it raises a TypeError:

```python
import pyarrow as pa

# A list of structs, with the second list entry missing
data_type = pa.list_(pa.struct([
    ('a', pa.int64()),
    ('b', pa.bool_())
]))
data = [[{'a': 1, 'b': False}, {'a': 2, 'b': True}], None]

arr = pa.array(data, type=data_type)
missing_scalar = arr[1]  # <pyarrow.ListScalar: None>
assert hasattr(missing_scalar, '__len__')
assert len(missing_scalar) == 0  # --> TypeError: object of type 'NoneType' has no len()
```

Expected behavior: length is expected to be 0.
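
A possible workaround on the caller's side, sketched here under assumptions: `safe_len` is a hypothetical helper name, not part of pyarrow, and it relies only on the `is_valid` property that pyarrow scalars expose. It checks validity first so that len() is never called on a null scalar.

```python
def safe_len(scalar):
    """Return 0 for a null scalar, otherwise defer to len().

    Hypothetical helper, not part of pyarrow. It assumes `scalar`
    exposes an `is_valid` attribute, as pyarrow scalars do.
    """
    # A missing value reports is_valid == False; treat its length as 0
    # instead of calling len(), which is what raises the TypeError.
    if not scalar.is_valid:
        return 0
    return len(scalar)
```

With the array from the example above, `safe_len(arr[1])` returns 0 without raising, and `safe_len(arr[0])` falls through to the regular len() call.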

This issue causes several pandas unit tests to fail when building an ExtensionArray backed by an Arrow array with this data type.

This behavior is also inconsistent with a similar example where the data type is a struct:

```python
import pyarrow as pa

# Same struct type as above, but without the enclosing list
data_type = pa.struct([
    ('a', pa.int64()),
    ('b', pa.bool_())
])
data = [{'a': 1, 'b': False}, None]

arr = pa.array(data, type=data_type)
missing_scalar = arr[1]  # <pyarrow.StructScalar: None>
assert hasattr(missing_scalar, '__len__')
assert len(missing_scalar) == 0  # Ok
```

In this second example the TypeError is not raised.


