Posted to dev@arrow.apache.org by "Rok Mihevc (Jira)" <ji...@apache.org> on 2019/08/22 20:09:00 UTC

[jira] [Created] (ARROW-6327) [Python] Conversion of pandas.SparseArray columns in pandas.DataFrames to pyarrow.Table and back

Rok Mihevc created ARROW-6327:
---------------------------------

             Summary: [Python] Conversion of pandas.SparseArray columns in pandas.DataFrames to pyarrow.Table and back
                 Key: ARROW-6327
                 URL: https://issues.apache.org/jira/browse/ARROW-6327
             Project: Apache Arrow
          Issue Type: New Feature
          Components: Python
            Reporter: Rok Mihevc


We would like to convert sparse columns from pandas to Arrow and back:

{code:python}
import numpy as np
import pandas
import pyarrow

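arr = pandas.Series([1, 2, 3])
# pandas.arrays.SparseArray is the current spelling of pandas.SparseArray
sparr = pandas.arrays.SparseArray(np.array([1, 0, 0], dtype='int64'))
df = pandas.DataFrame({'sparr': sparr, 'arr': arr})

# Desired round trip: the sparse column should survive conversion both ways
table = pyarrow.table(df)
# Every element should compare equal after the round trip
df == table.to_pandas()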
{code}

I assume `pandas.SparseArray` is a 1D sparse COO Tensor that would map to `pyarrow.SparseTensorCOO`.
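
To make the assumed mapping concrete, here is a minimal sketch of how a sparse column could be split into COO-style pieces (coordinates, values, shape) and rebuilt. It uses only pandas and NumPy. The helpers `sparse_array_to_coo` and `coo_to_sparse_array` are hypothetical names for illustration and are not part of pandas or pyarrow. The sketch uses `pandas.arrays.SparseArray`, the current spelling of `pandas.SparseArray`.

```python
import numpy as np
import pandas


def sparse_array_to_coo(sparr):
    """Split a pandas SparseArray into COO-style pieces (hypothetical helper)."""
    # Positions of the explicitly stored entries, as an (nnz, 1) coordinate matrix.
    index = sparr.sp_index.to_int_index()
    coords = index.indices.astype('int64').reshape(-1, 1)
    # Values stored at those positions.
    data = np.asarray(sparr.sp_values)
    return coords, data, (len(sparr),), sparr.fill_value


def coo_to_sparse_array(coords, data, shape, fill_value):
    """Rebuild a pandas SparseArray from the pieces above (hypothetical helper)."""
    # Start from an array holding only the fill value, then scatter the stored values.
    dense = np.full(shape, fill_value, dtype=data.dtype)
    dense[coords[:, 0]] = data
    return pandas.arrays.SparseArray(dense, fill_value=fill_value)


sparr = pandas.arrays.SparseArray(np.array([1, 0, 0], dtype='int64'))
coords, data, shape, fill_value = sparse_array_to_coo(sparr)
restored = coo_to_sparse_array(coords, data, shape, fill_value)
```

Note that pandas keeps a `fill_value` alongside the stored entries. If the Arrow sparse tensor treats unstored entries as zero, which is my understanding but an assumption here, a column with a non-zero `fill_value` would need that value carried separately, for example in metadata.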



--
This message was sent by Atlassian Jira
(v8.3.2#803003)