Posted to jira@arrow.apache.org by "Devavret Makkar (Jira)" <ji...@apache.org> on 2020/06/24 02:36:00 UTC

[jira] [Created] (ARROW-9215) pyarrow parquet writer converts uint32 columns to int64

Devavret Makkar created ARROW-9215:
--------------------------------------

             Summary: pyarrow parquet writer converts uint32 columns to int64
                 Key: ARROW-9215
                 URL: https://issues.apache.org/jira/browse/ARROW-9215
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Devavret Makkar


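The pyarrow Parquet writer changes uint32 columns to int64: a table written with a uint32 column comes back with that column typed as int64 when the file is read. The other unsigned integer types are not affected; uint8, uint16, and uint64 columns retain their type.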
{code:python}
In [1]: import pandas as pd

In [2]: import pyarrow as pa

In [3]: import pyarrow.parquet as pq

In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})

In [6]: padf = pa.Table.from_pandas(df)

In [7]: padf
Out[7]: 
pyarrow.Table
a: uint32

In [8]: pq.write_table(padf, 'pa.parquet')

In [9]: pq.read_table('pa.parquet')
Out[9]: 
pyarrow.Table
a: int64
{code}


