Posted to commits@arrow.apache.org by jo...@apache.org on 2023/06/28 16:11:34 UTC

[arrow] branch main updated: GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)

This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new e5de6a59f4 GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)
e5de6a59f4 is described below

commit e5de6a59f410a3255cc84138b44fc5802b627afc
Author: Igor Izvekov <iz...@gmail.com>
AuthorDate: Wed Jun 28 19:11:27 2023 +0300

    GH-36168: [C++][Python] Support halffloat for Arrow list to pandas (#35944)
    
    
    
    ### Rationale for this change
    Converting a `float16` column from `pyarrow` to `pandas` raises no error:
    ```
    >>> df = pd.DataFrame({"a": [np.float16(1), np.float16(2), np.float16(3)]})
    >>> table = pa.Table.from_pandas(df)
    >>> df_new = table.to_pandas()
    ```
    But the same conversion fails when the column holds lists of `float16` values:
    ```
    >>> df = pd.DataFrame({"a": [[np.float16(1)], [np.float16(2)], [np.float16(3)]]})
    >>> table = pa.Table.from_pandas(df)
    >>> df_new = table.to_pandas()
    pyarrow.lib.ArrowNotImplementedError: Not implemented type for Arrow list to pandas: halffloat
    ```
    
    ### What changes are included in this PR?
    `Type::HALF_FLOAT` is added to the `switch` cases in `ListTypeSupported` in `arrow_to_pandas.cc`.
    
    ### Are these changes tested?
    Yes, by `test_to_pandas_float16_list` in `python/pyarrow/tests/test_array.py`.
    
    ### Are there any user-facing changes?
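    Yes. Converting an Arrow list of `halffloat` values to pandas no longer raises `ArrowNotImplementedError`.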
    
    * Closes: #36168
    
    Authored-by: izveigor <iz...@gmail.com>
    Signed-off-by: Joris Van den Bossche <jo...@gmail.com>
---
 python/pyarrow/src/arrow/python/arrow_to_pandas.cc |  1 +
 python/pyarrow/tests/test_array.py                 | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/python/pyarrow/src/arrow/python/arrow_to_pandas.cc b/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
index f6b7ca9d54..2cd6f5c26d 100644
--- a/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
+++ b/python/pyarrow/src/arrow/python/arrow_to_pandas.cc
@@ -165,6 +165,7 @@ static inline bool ListTypeSupported(const DataType& type) {
     case Type::INT32:
     case Type::INT64:
     case Type::UINT64:
+    case Type::HALF_FLOAT:
     case Type::FLOAT:
     case Type::DOUBLE:
     case Type::DECIMAL128:
diff --git a/python/pyarrow/tests/test_array.py b/python/pyarrow/tests/test_array.py
index 65f69a9c0f..77da6a3ebd 100644
--- a/python/pyarrow/tests/test_array.py
+++ b/python/pyarrow/tests/test_array.py
@@ -3347,6 +3347,16 @@ def test_to_pandas_timezone():
     assert s.dt.tz is not None
 
 
+@pytest.mark.pandas
+def test_to_pandas_float16_list():
+    # https://github.com/apache/arrow/issues/36168
+    expected = [[np.float16(1)], [np.float16(2)], [np.float16(3)]]
+    arr = pa.array(expected)
+    result = arr.to_pandas()
+    assert result[0].dtype == "float16"
+    assert result.tolist() == expected
+
+
 def test_array_sort():
     arr = pa.array([5, 7, 35], type=pa.int64())
     sorted_arr = arr.sort("descending")