You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/07/10 14:27:00 UTC

[jira] [Commented] (DRILL-4060) CTAS to csv or json files gives incorrect time values

    [ https://issues.apache.org/jira/browse/DRILL-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080402#comment-16080402 ] 

Vitalii Diravka commented on DRILL-4060:
----------------------------------------

It looks like when DRILL-5332 and DRILL-5334 is solved, this issue will go away.

The workaround is not to use the cast to date, time, timestamp functions in CTAS.

{code}
0: jdbc:drill:zk=local> ALTER SESSION SET `store.format`='json';
+-------+------------------------+
|  ok   |        summary         |
+-------+------------------------+
| true  | store.format updated.  |
+-------+------------------------+
1 row selected (0.095 seconds)

0: jdbc:drill:zk=local> select timeofday() from sys.version;
+----------------------------------------------+
|                    EXPR$0                    |
+----------------------------------------------+
| 2017-07-10 14:19:39.715 America/Los_Angeles  |
+----------------------------------------------+
1 row selected (0.133 seconds)

0: jdbc:drill:zk=local> select columns[4] from dfs.`/home/vitalii/resources/allData.csv` limit 1;
+--------------------------+
|          EXPR$0          |
+--------------------------+
| 2014-06-02 00:28:02.418  |
+--------------------------+
1 row selected (0.162 seconds)

0: jdbc:drill:zk=local> create table `test_table` as select columns[4] from dfs.`/home/vitalii/resources/allData.csv` limit 1;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 1                          |
+-----------+----------------------------+
1 row selected (0.148 seconds)

0: jdbc:drill:zk=local> select * from `test_table`;
+--------------------------+
|          EXPR$0          |
+--------------------------+
| 2014-06-02 00:28:02.418  |
+--------------------------+
1 row selected (0.099 seconds)
{code}

> CTAS to csv or json files gives incorrect time values
> -----------------------------------------------------
>
>                 Key: DRILL-4060
>                 URL: https://issues.apache.org/jira/browse/DRILL-4060
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Writer
>            Reporter: Krystal
>            Priority: Critical
>
> I have a csv file with the following data:
> select columns[8], columns[10] from `interval_data.csv`;
> +-----------+----------------------+
> |  EXPR$0   |        EXPR$1        |
> +-----------+----------------------+
> | 01:35:56  | 2015-01-24 07:27:05  |
> | 12:22:07  | 2014-05-25 03:41:54  |
> | 00:01:28  | 2014-07-30 08:03:11  |
> | 00:00:01  | 2014-09-15 02:33:11  |
> | 06:59:59  | 2014-06-17 13:04:09  |
> | 23:38:16  | 2015-02-01 02:02:37  |
> | 15:00:00  | 2014-08-16 13:11:12  |
> | 05:55:36  | 2014-04-13 15:06:36  |
> | 10:48:36  | 2013-03-23 01:24:20  |
> +-----------+----------------------+
> I created a json file using CTAS:
> alter session set `store.format` = 'json';
> create table `test1.json` as select cast(columns[8] as time) c_time,cast(columns[10] as timestamp) c_timestamp from `interval_data.csv`;
> select c_time, c_timestamp from `test1.json`;
> +---------------+--------------------------+
> |    c_time     |       c_timestamp        |
> +---------------+--------------------------+
> | 09:35:56.000  | 2015-01-24 15:27:05.000  |
> | 20:22:07.000  | 2014-05-25 10:41:54.000  |
> | 08:01:28.000  | 2014-07-30 15:03:11.000  |
> | 08:00:01.000  | 2014-09-15 09:33:11.000  |
> | 14:59:59.000  | 2014-06-17 20:04:09.000  |
> | 07:38:16.000  | 2015-02-01 10:02:37.000  |
> | 23:00:00.000  | 2014-08-16 20:11:12.000  |
> | 13:55:36.000  | 2014-04-13 22:06:36.000  |
> | 18:48:36.000  | 2013-03-23 08:24:20.000  |
> +---------------+--------------------------+
> Notice that the times have 8 hours added to the original values.
> I got the same result when create another CSV file using CTAS on the same data.
> For CTAS as parquet, however, the resulting data is the same as the orginal data. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)