Posted to jira@arrow.apache.org by "&res (Jira)" <ji...@apache.org> on 2021/05/07 07:46:00 UTC

[jira] [Created] (ARROW-12677) Add a mask argument to pyarrow.StructArray.from_arrays

&res created ARROW-12677:
----------------------------

             Summary: Add a mask argument to pyarrow.StructArray.from_arrays
                 Key: ARROW-12677
                 URL: https://issues.apache.org/jira/browse/ARROW-12677
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: &res


The Python API for creating a StructArray from a list of arrays does not allow passing a missing-value mask.

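At the moment, the only way to create a StructArray with missing values is to call `pyarrow.array` with a list of tuples. `pyarrow.StructArray.from_arrays` can mark nulls in the child arrays, but not at the struct level, as the second call below shows.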

{code:python}
>>> pyarrow.array(
    [
        None,
        (1, "foo"),
    ],
    type=pyarrow.struct(
        [pyarrow.field('col1', pyarrow.int64()), pyarrow.field("col2", pyarrow.string())]
    )
)
-- is_valid:
  [
    false,
    true
  ]
-- child 0 type: int64
  [
    0,
    1
  ]
-- child 1 type: string
  [
    "",
    "foo"
  ]
>>> pyarrow.StructArray.from_arrays(
    [
        [None, 1],
        [None, "foo"]
    ],
    fields=[pyarrow.field('col1', pyarrow.int64()), pyarrow.field("col2", pyarrow.string())]
)
-- is_valid: all not null
-- child 0 type: int64
  [
    null,
    1
  ]
-- child 1 type: string
  [
    null,
    "foo"
  ]
{code}

The C++ API already accepts a validity bitmap when building a StructArray, so it should be easy to expose in Python.


See [this Stack Overflow question|https://stackoverflow.com/questions/67417110/].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)