Posted to commits@arrow.apache.org by jo...@apache.org on 2023/06/28 16:11:34 UTC
[arrow] branch main updated: GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)
This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new e5de6a59f4 GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)
e5de6a59f4 is described below
commit e5de6a59f410a3255cc84138b44fc5802b627afc
Author: Igor Izvekov <iz...@gmail.com>
AuthorDate: Wed Jun 28 19:11:27 2023 +0300
GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)
### Rationale for this change
Converting a flat `float16` column from `pyarrow` to `pandas` works without error:
```
>>> import numpy as np
>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({"a": [np.float16(1), np.float16(2), np.float16(3)]})
>>> table = pa.Table.from_pandas(df)
>>> df_new = table.to_pandas()
```
But when the column holds lists of `float16` values, the same conversion fails:
```
>>> df = pd.DataFrame({"a": [[np.float16(1)], [np.float16(2)], [np.float16(3)]]})
>>> table = pa.Table.from_pandas(df)
>>> df_new = table.to_pandas()
pyarrow.lib.ArrowNotImplementedError: Not implemented type for Arrow list to pandas: halffloat
```
### What changes are included in this PR?
`Type::HALF_FLOAT` is added to the types accepted by `ListTypeSupported` in `arrow_to_pandas.cc`.
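The one-line change in the diff extends an allow-list of list value types. The idea can be sketched in plain Python as follows. This is a simplified, hypothetical model and not pyarrow code: the names `SUPPORTED_BEFORE`, `SUPPORTED_AFTER`, `list_type_supported`, and `convert_list` are invented for illustration, and only the type names visible in the diff hunk are listed.

```python
# Hypothetical, simplified model of an allow-list check; not pyarrow code.
# Only the type names visible in the diff hunk are included here.
SUPPORTED_BEFORE = frozenset(
    {"INT32", "INT64", "UINT64", "FLOAT", "DOUBLE", "DECIMAL128"}
)
SUPPORTED_AFTER = SUPPORTED_BEFORE | {"HALF_FLOAT"}


def list_type_supported(type_id, supported):
    """Return True when lists with this value type may be converted."""
    return type_id in supported


def convert_list(type_id, supported):
    """Mimic the error path: unsupported value types raise NotImplementedError."""
    if not list_type_supported(type_id, supported):
        raise NotImplementedError(f"Not implemented list value type: {type_id}")
    return f"converted list<{type_id}>"


print(list_type_supported("HALF_FLOAT", SUPPORTED_BEFORE))
print(list_type_supported("HALF_FLOAT", SUPPORTED_AFTER))
```

With the extended set, `convert_list("HALF_FLOAT", SUPPORTED_AFTER)` returns normally, while the same call with `SUPPORTED_BEFORE` raises.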
### Are these changes tested?
Yes, `test_to_pandas_float16_list` is added to `python/pyarrow/tests/test_array.py`.
### Are there any user-facing changes?
Yes, Arrow lists of `halffloat` values can now be converted to pandas.
* Closes: #36168
Authored-by: izveigor <iz...@gmail.com>
Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
python/pyarrow/src/arrow/python/arrow_to_pandas.cc | 1 +
python/pyarrow/tests/test_array.py | 10 ++++++++++
2 files changed, 11 insertions(+)
diff --git a/python/pyarrow/src/arrow/python/arrow_to_pandas.cc b/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
index f6b7ca9d54..2cd6f5c26d 100644
--- a/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
+++ b/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
@@ -165,6 +165,7 @@ static inline bool ListTypeSupported(const DataType& type) {
     case Type::INT32:
     case Type::INT64:
     case Type::UINT64:
+    case Type::HALF_FLOAT:
     case Type::FLOAT:
     case Type::DOUBLE:
     case Type::DECIMAL128:
diff --git a/python/pyarrow/tests/test_array.py b/python/pyarrow/tests/test_array.py
index 65f69a9c0f..77da6a3ebd 100644
--- a/python/pyarrow/tests/test_array.py
+++ b/python/pyarrow/tests/test_array.py
@@ -3347,6 +3347,16 @@ def test_to_pandas_timezone():
     assert s.dt.tz is not None
 
 
+@pytest.mark.pandas
+def test_to_pandas_float16_list():
+    # https://github.com/apache/arrow/issues/36168
+    expected = [[np.float16(1)], [np.float16(2)], [np.float16(3)]]
+    arr = pa.array(expected)
+    result = arr.to_pandas()
+    assert result[0].dtype == "float16"
+    assert result.tolist() == expected
+
+
 def test_array_sort():
     arr = pa.array([5, 7, 35], type=pa.int64())
     sorted_arr = arr.sort("descending")