Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/17 04:34:14 UTC

[GitHub] [arrow-rs] anliakho2 opened a new issue #1459: Timestamps with time unit of MICROS or MILLIS are read incorrectly

anliakho2 opened a new issue #1459:
URL: https://github.com/apache/arrow-rs/issues/1459


   **Describe the bug**
   If a Parquet file is written with timestamps whose time unit is not `ns`, reading it produces incorrect dates, whereas pandas reads the same dates correctly.
   
   **To Reproduce**
   Generate parquet file as follows:
   `
   import pandas as pd
   import numpy as np
   
   np.random.seed(0)
   # create an array of 5 timestamps starting at '2020-01-01', one per hour
   rng = pd.date_range('2020-01-01', periods=5, freq='H')
   df = pd.DataFrame({ 'Date': rng, 'Val': np.random.randn(len(rng)) })
   df.to_parquet('data/myfile.parquet', coerce_timestamps='ms', allow_truncated_timestamps=True)
   `
   
   **Expected behavior**
   Data is not corrupted and dates are read back correctly.
   
   **Additional context**
   _


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] jmahenriques commented on issue #1459: Timestamps with time unit of MICROS or MILLIS are read incorrectly

Posted by GitBox <gi...@apache.org>.
jmahenriques commented on issue #1459:
URL: https://github.com/apache/arrow-rs/issues/1459#issuecomment-1082977600


   We are experiencing the same issue (using DataFusion): the Parquet metadata states microseconds, but the unit is inferred as nanoseconds, which completely distorts the timestamps.
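
To illustrate the kind of distortion described here, the following is a minimal sketch using only the Python standard library. It does not reproduce the arrow-rs or DataFusion code path; the helper `interpret` and the sample value are assumptions made for illustration. It shows what happens when a raw integer stored as milliseconds since the Unix epoch is interpreted as nanoseconds instead.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Number of ticks per second for each time unit.
TICKS_PER_SECOND = {"ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def interpret(raw: int, unit: str) -> datetime:
    """Hypothetical helper: treat `raw` as a count of `unit` since the Unix epoch."""
    per_second = TICKS_PER_SECOND[unit]
    seconds, remainder = divmod(raw, per_second)
    # Convert the sub-second remainder to microseconds (datetime's resolution).
    micros = remainder * 1_000_000 // per_second
    return EPOCH + timedelta(seconds=seconds, microseconds=micros)


# 2020-01-01T00:00:00Z stored as milliseconds since the epoch.
raw_ms = 1_577_836_800_000

correct = interpret(raw_ms, "ms")  # unit taken from the file metadata
wrong = interpret(raw_ms, "ns")    # same integer, unit wrongly assumed to be ns

print(correct.isoformat())  # 2020-01-01T00:00:00+00:00
print(wrong.isoformat())    # 1970-01-01T00:26:17.836800+00:00
```

Under this assumption, a value from 2020 collapses to a point less than half an hour after the epoch, which matches the symptom of dates being read back as early 1970.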

