You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/07/10 14:27:00 UTC
[jira] [Commented] (DRILL-4060) CTAS to csv or json files gives
incorrect time values
[ https://issues.apache.org/jira/browse/DRILL-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080402#comment-16080402 ]
Vitalii Diravka commented on DRILL-4060:
----------------------------------------
It looks like when DRILL-5332 and DRILL-5334 is solved, this issue will go away.
The workaround is not to use the cast to date, time, timestamp functions in CTAS.
{code}
0: jdbc:drill:zk=local> ALTER SESSION SET `store.format`='json';
+-------+------------------------+
| ok | summary |
+-------+------------------------+
| true | store.format updated. |
+-------+------------------------+
1 row selected (0.095 seconds)
0: jdbc:drill:zk=local> select timeofday() from sys.version;
+----------------------------------------------+
| EXPR$0 |
+----------------------------------------------+
| 2017-07-10 14:19:39.715 America/Los_Angeles |
+----------------------------------------------+
1 row selected (0.133 seconds)
0: jdbc:drill:zk=local> select columns[4] from dfs.`/home/vitalii/resources/allData.csv` limit 1;
+--------------------------+
| EXPR$0 |
+--------------------------+
| 2014-06-02 00:28:02.418 |
+--------------------------+
1 row selected (0.162 seconds)
0: jdbc:drill:zk=local> create table `test_table` as select columns[4] from dfs.`/home/vitalii/resources/allData.csv` limit 1;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 1 |
+-----------+----------------------------+
1 row selected (0.148 seconds)
0: jdbc:drill:zk=local> select * from `test_table`;
+--------------------------+
| EXPR$0 |
+--------------------------+
| 2014-06-02 00:28:02.418 |
+--------------------------+
1 row selected (0.099 seconds)
{code}
> CTAS to csv or json files gives incorrect time values
> -----------------------------------------------------
>
> Key: DRILL-4060
> URL: https://issues.apache.org/jira/browse/DRILL-4060
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Writer
> Reporter: Krystal
> Priority: Critical
>
> I have a csv file with the following data:
> select columns[8], columns[10] from `interval_data.csv`;
> +-----------+----------------------+
> | EXPR$0 | EXPR$1 |
> +-----------+----------------------+
> | 01:35:56 | 2015-01-24 07:27:05 |
> | 12:22:07 | 2014-05-25 03:41:54 |
> | 00:01:28 | 2014-07-30 08:03:11 |
> | 00:00:01 | 2014-09-15 02:33:11 |
> | 06:59:59 | 2014-06-17 13:04:09 |
> | 23:38:16 | 2015-02-01 02:02:37 |
> | 15:00:00 | 2014-08-16 13:11:12 |
> | 05:55:36 | 2014-04-13 15:06:36 |
> | 10:48:36 | 2013-03-23 01:24:20 |
> +-----------+----------------------+
> I created a json file using CTAS:
> alter session set `store.format` = 'json';
> create table `test1.json` as select cast(columns[8] as time) c_time,cast(columns[10] as timestamp) c_timestamp from `interval_data.csv`;
> select c_time, c_timestamp from `test1.json`;
> +---------------+--------------------------+
> | c_time | c_timestamp |
> +---------------+--------------------------+
> | 09:35:56.000 | 2015-01-24 15:27:05.000 |
> | 20:22:07.000 | 2014-05-25 10:41:54.000 |
> | 08:01:28.000 | 2014-07-30 15:03:11.000 |
> | 08:00:01.000 | 2014-09-15 09:33:11.000 |
> | 14:59:59.000 | 2014-06-17 20:04:09.000 |
> | 07:38:16.000 | 2015-02-01 10:02:37.000 |
> | 23:00:00.000 | 2014-08-16 20:11:12.000 |
> | 13:55:36.000 | 2014-04-13 22:06:36.000 |
> | 18:48:36.000 | 2013-03-23 08:24:20.000 |
> +---------------+--------------------------+
> Notice that the times have 8 hours added to the original values.
> I got the same result when create another CSV file using CTAS on the same data.
> For CTAS as parquet, however, the resulting data is the same as the orginal data.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)