Posted to dev@orc.apache.org by "Varun Raval (Jira)" <ji...@apache.org> on 2021/11/29 21:12:00 UTC

[jira] [Created] (ORC-1054) Unable to compare data (generated using CSV to ORC converter) on timestamp column

Varun Raval created ORC-1054:
--------------------------------

             Summary: Unable to compare data (generated using CSV to ORC converter) on timestamp column
                 Key: ORC-1054
                 URL: https://issues.apache.org/jira/browse/ORC-1054
             Project: ORC
          Issue Type: Bug
          Components: C++, Java
            Reporter: Varun Raval


I have a CSV file with timestamp columns. I convert the CSV file to an ORC file using the CSV to ORC converter and place the ORC file in a Hive table backed by ORC files. I am not able to filter the data on the timestamp column from Apache Hive beeline. If the timestamp column is used in the where clause of a select query, the matching rows are not retrieved.

For example, table csvtest has a single column (t) of timestamp datatype. It has a row '2021-11-10 01:02:15'. The query "select * from csvtest where t > '2021-11-10 00:00:00'" does not return any result. The query "select * from csvtest" returns the correct row.

However, the same query "select * from csvtest where t > '2021-11-10 00:00:00'" works with Spark SQL and rows are retrieved correctly.
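
For reference, a minimal sketch in plain Python of the input and of the result the filter is expected to produce. It does not touch ORC or Hive at all; the file name csvtest.csv and the helper names write_sample_csv and rows_after are illustrative only and are not part of the ORC tools.

```python
import csv
from datetime import datetime

# Timestamp layout used by the sample row in this report.
FMT = "%Y-%m-%d %H:%M:%S"


def write_sample_csv(path, rows):
    """Write one timestamp string per line (single column, no header)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for ts in rows:
            writer.writerow([ts])


def rows_after(path, cutoff):
    """Return the timestamp strings in the CSV that are strictly later than cutoff."""
    cutoff_dt = datetime.strptime(cutoff, FMT)
    with open(path, newline="") as f:
        return [
            row[0]
            for row in csv.reader(f)
            if row and datetime.strptime(row[0], FMT) > cutoff_dt
        ]


if __name__ == "__main__":
    # Hypothetical file name; this would be the input handed to the converter.
    write_sample_csv("csvtest.csv", ["2021-11-10 01:02:15"])
    # Same predicate as: select * from csvtest where t > '2021-11-10 00:00:00'
    print(rows_after("csvtest.csv", "2021-11-10 00:00:00"))
    # prints ['2021-11-10 01:02:15']
```

This is the result Spark SQL returns for the query and the one I expected from Hive.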

Is this an issue with how the ORC file is created, or is it a Hive configuration issue?

I have tested it on the master branch, and the results are the same for both the C++ and Java CSV to ORC converters.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)