Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/06/04 17:06:00 UTC

[jira] [Created] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

Antoine Pitrou created ARROW-12976:
--------------------------------------

             Summary: [Python] Arrow-to-Python conversion is slow
                 Key: ARROW-12976
                 URL: https://issues.apache.org/jira/browse/ARROW-12976
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Antoine Pitrou


It seems that we are almost 100x slower than NumPy when converting the exact same data to a Python list ({{Array.to_pylist()}} versus {{ndarray.tolist()}}).

With integers:
{code:python}
>>> import numpy as np
>>> import pyarrow as pa
>>> arr = np.arange(0, 1000, dtype=np.int64)
>>> %timeit arr.tolist()
9.68 µs ± 9.53 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> parr = pa.array(arr)
>>> %timeit parr.to_pylist()
846 µs ± 4.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
{code}

With floats:
{code:python}
>>> arr = np.arange(0, 1000, dtype=np.float64)
>>> %timeit arr.tolist()
10.3 µs ± 289 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> parr = pa.array(arr)
>>> %timeit parr.to_pylist()
878 µs ± 2.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
{code}
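
To put the two measurements side by side, here is a small sketch that only does arithmetic on the timings reported above. It does not run any benchmark. The names {{TIMINGS_US}}, {{slowdown}} and {{ns_per_element}} are made up for this illustration, and the per-element figures assume the cost is spread evenly over the 1000 elements.

```python
# Timings copied from the %timeit output above, in microseconds per call,
# measured on 1000-element arrays.
TIMINGS_US = {
    "int64": {"numpy_tolist": 9.68, "arrow_to_pylist": 846.0},
    "float64": {"numpy_tolist": 10.3, "arrow_to_pylist": 878.0},
}
N_ELEMENTS = 1000


def slowdown(timing):
    """Return how many times slower to_pylist() is than tolist()."""
    return timing["arrow_to_pylist"] / timing["numpy_tolist"]


def ns_per_element(time_us, n=N_ELEMENTS):
    """Convert a per-call time in microseconds to nanoseconds per element."""
    return time_us * 1000.0 / n


for dtype, timing in TIMINGS_US.items():
    print(
        f"{dtype}: {slowdown(timing):.1f}x slower, "
        f"{ns_per_element(timing['arrow_to_pylist']):.0f} ns/element (Arrow) vs "
        f"{ns_per_element(timing['numpy_tolist']):.2f} ns/element (NumPy)"
    )
```

On these numbers the ratio is about 87x for integers and about 85x for floats, i.e. several hundred nanoseconds per element on the Arrow side against roughly 10 ns per element for NumPy.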
