Posted to dev@arrow.apache.org by "Mark Waddle (Jira)" <ji...@apache.org> on 2020/05/27 17:43:00 UTC

[jira] [Created] (ARROW-8967) [Python] [Parquet] Table.to_pandas() fails to convert valid TIMESTAMP_MILLIS to pandas timestamp

Mark Waddle created ARROW-8967:
----------------------------------

             Summary: [Python] [Parquet] Table.to_pandas() fails to convert valid TIMESTAMP_MILLIS to pandas timestamp
                 Key: ARROW-8967
                 URL: https://issues.apache.org/jira/browse/ARROW-8967
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.17.0
            Reporter: Mark Waddle


Reading a Parquet file with a valid TIMESTAMP_MILLIS value of -61552915200000 (0019-06-20) and calling Table.to_pandas() results in the following error:
{noformat}
File "pyarrow/array.pxi", line 587, in pyarrow.lib._PandasConvertible.to_pandas
  File "pyarrow/table.pxi", line 1640, in pyarrow.lib.Table._to_pandas
  File "/Users/mark/.local/share/virtualenvs/parquetpy-BNIqCtDj/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 766, in table_to_blockmanager
    blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
  File "/Users/mark/.local/share/virtualenvs/parquetpy-BNIqCtDj/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 1102, in _table_to_blocks
    list(extension_columns.keys()))
  File "pyarrow/table.pxi", line 1107, in pyarrow.lib.table_to_blocks
  File "pyarrow/error.pxi", line 85, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Casting from timestamp[ms] to timestamp[ns] would result in out of bounds timestamp: -61552915200000
{noformat}
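
The arithmetic behind the error can be checked without pyarrow. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not taken from the Arrow code base. It assumes that the cast to timestamp[ns] multiplies the millisecond count by 1,000,000 and that the result must fit in a signed 64-bit integer, which is consistent with the "out of bounds" message above.

```python
from datetime import datetime, timedelta

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NS_PER_MS = 1_000_000

# The TIMESTAMP_MILLIS value from the report (milliseconds since the Unix epoch).
value_ms = -61552915200000

# Interpreted in the proleptic Gregorian calendar, the value is a valid date.
as_datetime = datetime(1970, 1, 1) + timedelta(milliseconds=value_ms)
print(as_datetime.date())  # 0019-06-20

# Assumption: a timestamp[ms] -> timestamp[ns] cast scales the integer by 1e6.
value_ns = value_ms * NS_PER_MS
fits_in_int64 = INT64_MIN <= value_ns <= INT64_MAX
print(fits_in_int64)  # False

# Smallest millisecond count whose nanosecond equivalent still fits in int64.
min_ms_representable = -(2**63 // NS_PER_MS)
print(value_ms < min_ms_representable)  # True
```

Under these assumptions the value is valid at millisecond resolution but lies far below the range that a 64-bit nanosecond timestamp can represent, so the conversion cannot succeed without either keeping a coarser unit or falling back to another representation.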

As it stands, there is no way to read this file into pandas.

I would like to be able to choose the timestamp unit when reading, much like you can when writing.


