Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/03/27 12:54:41 UTC

[GitHub] [arrow] AlenkaF commented on issue #34739: [Python] Cannot append nullable string columns to table

AlenkaF commented on issue #34739:
URL: https://github.com/apache/arrow/issues/34739#issuecomment-1485064943

   The example fails because a `NullArray` is created (in [table.add_column](https://github.com/apache/arrow/blob/main/python/pyarrow/table.pxi#L4562-L4565)) when the only element in the column being appended is `None`. A `NullArray` has type `pa.null()`, not `pa.string()`, so we get an `ArrowInvalid` error:
   
   ```python
   >>> pa.chunked_array([["x"]])
   <pyarrow.lib.ChunkedArray object at 0x11672b600>
   [
     [
       "x"
     ]
   ]
   >>> pa.chunked_array([["x"]]).chunk(0)
   <pyarrow.lib.StringArray object at 0x11671ae60>
   [
     "x"
   ]
   >>> pa.chunked_array([[None]])
   <pyarrow.lib.ChunkedArray object at 0x11672b740>
   [
   1 nulls
   ]
   >>> pa.chunked_array([[None]]).chunk(0)
   <pyarrow.lib.NullArray object at 0x11671ae60>
   1 nulls
   ```
   
   This does not happen when the column has more than one row and at least one of its elements is not missing:
   
   ```python
   >>> pa.chunked_array([[None, "x"]]).chunk(0)
   <pyarrow.lib.StringArray object at 0x11671af80>
   [
     null,
     "x"
   ]
   ```
   
   Appending a nullable string column that contains at least one non-null value therefore works, and the resulting field stays nullable:
   
   ```python
   import pyarrow as pa
   table = pa.Table.from_pylist([{"a": None}, {"a": "first"}], pa.schema([pa.field("a", pa.string(), nullable=True)]))
   table = table.append_column(pa.field("b", pa.string(), nullable=True), [["x", "y"]])
   table = table.append_column(pa.field("n", pa.string(), nullable=True), [[None, "second"]])
   table
   # pyarrow.Table
   # a: string
   # b: string
   # n: string
   # ----
   # a: [[null,"first"]]
   # b: [["x","y"]]
   # n: [[null,"second"]]
   table.schema.field("n")
   # pyarrow.Field<n: string>
   table.schema.field("n").nullable
   # True
   ```

