Posted to jira@arrow.apache.org by "Devavret Makkar (Jira)" <ji...@apache.org> on 2020/06/24 02:36:00 UTC
[jira] [Created] (ARROW-9215) pyarrow parquet writer converts uint32 columns to int64
Devavret Makkar created ARROW-9215:
--------------------------------------
Summary: pyarrow parquet writer converts uint32 columns to int64
Key: ARROW-9215
URL: https://issues.apache.org/jira/browse/ARROW-9215
Project: Apache Arrow
Issue Type: Bug
Reporter: Devavret Makkar
The pyarrow Parquet writer converts uint32 columns to int64: a uint32 column written with pq.write_table is read back as int64. The other unsigned integer types are not affected; uint8, uint16, and uint64 columns retain their type.
{code:python}
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: import pyarrow.parquet as pq
In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})
In [6]: padf = pa.Table.from_pandas(df)
In [7]: padf
Out[7]:
pyarrow.Table
a: uint32
In [8]: pq.write_table(padf, 'pa.parquet')
In [9]: pq.read_table('pa.parquet')
Out[9]:
pyarrow.Table
a: int64
{code}