Posted to issues@orc.apache.org by "liuxiaoyu (Jira)" <ji...@apache.org> on 2022/10/12 12:34:00 UTC

[jira] [Updated] (ORC-1287) C++ read timestamp value is different from java read when using csv-import tool convert CSV to ORC files

     [ https://issues.apache.org/jira/browse/ORC-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liuxiaoyu updated ORC-1287:
---------------------------
    Summary: C++ read timestamp value is different from java read when using csv-import tool convert CSV to ORC files  (was: C++ read timestamp value is different from java read when using csv-import tool converter CSV to ORC files)

> C++ read timestamp value is different from java read when using csv-import tool convert CSV to ORC files
> --------------------------------------------------------------------------------------------------------
>
>                 Key: ORC-1287
>                 URL: https://issues.apache.org/jira/browse/ORC-1287
>             Project: ORC
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 1.7.3
>         Environment: centos7
>            Reporter: liuxiaoyu
>            Priority: Major
>         Attachments: test.csv, test.orc
>
>
> I have a CSV file that I convert to an ORC file with the C++ csv-import tool.
> The ORC version is v1.7.3.
>  
> Commands (convert the CSV, then read the result with the Java and C++ tools)
> ```
> csv-import struct<a:timestamp> ./test.csv ./test.orc
> java -jar orc-tools-1.7.3-uber.jar data test.orc
> orc-contents ./test.orc
> ```
>   
>  
> CSV File
> ```
> 0001-01-01 00:00:00.000000
> 0001-10-19 10:23:54.123456
> 0099-10-19 10:23:54.123456
> 1900-10-19 10:23:54.123456
> 1969-12-31 23:59:59.001
> 1969-12-31 23:59:59.999999
> 1970-01-01 00:00:00.000
> 1970-01-01 00:00:00.001
> 1970-01-01 23:59:59.999999
> ```
> C++ reader output (orc-contents)
> ```
> {"a": "1-01-01 00:00:00.0"}
> {"a": "1-10-19 10:23:54.123456"}
> {"a": "99-10-19 10:23:54.123456"}
> {"a": "1900-10-19 10:23:54.123456"}
> {"a": "1970-01-01 00:00:00.001"}
> {"a": "1970-01-01 00:00:00.999999"}
> {"a": "1970-01-01 00:00:00.0"}
> {"a": "1970-01-01 00:00:00.001"}
> {"a": "1970-01-01 23:59:59.999999"}
> ```
> Java reader output (orc-tools data)
> ```
> {"a":"0001-01-03 08:00:00.0"}
> {"a":"0001-10-21 18:23:54.123456"}
> {"a":"0099-10-21 18:23:54.123456"}
> {"a":"1900-10-19 18:29:37.123456"}
> {"a":"1970-01-01 08:00:00.001"}
> {"a":"1970-01-01 08:00:00.999999"}
> {"a":"1970-01-01 08:00:00.0"}
> {"a":"1970-01-01 08:00:00.001"}
> {"a":"1970-01-02 07:59:59.999999"}
> ```
> For example, `0001-01-01 00:00:00.000000` is displayed differently by the Java and C++ readers.
>   
> I also tried the ORC main branch and got the same results.
>   
> This looks similar to ORC-1055:
> https://issues.apache.org/jira/browse/ORC-1055
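
To make the size of each discrepancy easier to see, here is a minimal Python sketch (not part of the original report) that subtracts each CSV input value from what the two readers printed. The helper names `parse_ts` and `deltas` are hypothetical, and the values are copied from the listings above. The subtraction is plain arithmetic on the printed wall-clock strings in Python's proleptic Gregorian calendar, with no timezone handling.

```python
import re
from datetime import datetime, timedelta

def parse_ts(text):
    """Parse 'Y-MM-DD HH:MM:SS[.fraction]'; the year may be unpadded (e.g. '1-01-01')."""
    m = re.fullmatch(r"(\d+)-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)(?:\.(\d+))?", text)
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    micros = int((m.group(7) or "0").ljust(6, "0")[:6])  # pad fraction to microseconds
    return datetime(year, month, day, hour, minute, second, micros)

# Values copied from the report: CSV input, C++ output, Java output.
csv_rows = [
    "0001-01-01 00:00:00.000000", "0001-10-19 10:23:54.123456",
    "0099-10-19 10:23:54.123456", "1900-10-19 10:23:54.123456",
    "1969-12-31 23:59:59.001", "1969-12-31 23:59:59.999999",
    "1970-01-01 00:00:00.000", "1970-01-01 00:00:00.001",
    "1970-01-01 23:59:59.999999",
]
cpp_rows = [
    "1-01-01 00:00:00.0", "1-10-19 10:23:54.123456",
    "99-10-19 10:23:54.123456", "1900-10-19 10:23:54.123456",
    "1970-01-01 00:00:00.001", "1970-01-01 00:00:00.999999",
    "1970-01-01 00:00:00.0", "1970-01-01 00:00:00.001",
    "1970-01-01 23:59:59.999999",
]
java_rows = [
    "0001-01-03 08:00:00.0", "0001-10-21 18:23:54.123456",
    "0099-10-21 18:23:54.123456", "1900-10-19 18:29:37.123456",
    "1970-01-01 08:00:00.001", "1970-01-01 08:00:00.999999",
    "1970-01-01 08:00:00.0", "1970-01-01 08:00:00.001",
    "1970-01-02 07:59:59.999999",
]

def deltas(read_rows):
    """Difference (reader output minus CSV input) for each row."""
    return [parse_ts(r) - parse_ts(c) for r, c in zip(read_rows, csv_rows)]

cpp_delta = deltas(cpp_rows)
java_delta = deltas(java_rows)

for src, dc, dj in zip(csv_rows, cpp_delta, java_delta):
    flag = "" if dc == timedelta(0) else "  <-- C++ differs from CSV"
    print(f"{src:<27} cpp: {dc}   java: {dj}{flag}")
```

On the reported values, the C++ output matches the CSV input except for the two 1969 rows, which come back one second later. The Java output is ahead of the input by 8:00:00 for the 1970 rows, 8:00:01 for the two 1969 rows, 8:05:43 for the 1900 row, and 2 days 8:00:00 for the rows in years 0001 and 0099. A constant 8-hour shift would be consistent with the Java tool rendering in a UTC+8 local zone, but the report does not state the machine's timezone, so that and the causes of the other offsets remain assumptions.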



--
This message was sent by Atlassian Jira
(v8.20.10#820010)