Posted to notifications@iotdb.apache.org by "Jialin Qiao (Jira)" <ji...@apache.org> on 2021/09/13 01:28:00 UTC

[jira] [Reopened] (IOTDB-842) Better Export/Import-CSV Tool

     [ https://issues.apache.org/jira/browse/IOTDB-842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jialin Qiao reopened IOTDB-842:
-------------------------------

> Better Export/Import-CSV Tool
> -----------------------------
>
>                 Key: IOTDB-842
>                 URL: https://issues.apache.org/jira/browse/IOTDB-842
>             Project: Apache IoTDB
>          Issue Type: Task
>          Components: Tools/Others
>            Reporter: Xiangdong Huang
>            Assignee: Xuan Ronaldo
>            Priority: Major
>              Labels: pull-request-available
>
> Hi, our import-csv tool is currently implemented with JDBC and accepts only one fixed format,
> e.g.,
> {code:java}
> Time,root.sg.d1.s1,root.sg.d1.s2,root.sg.d2.s1,root.sg.d2.s2,root.sg.d2.s3
> 2020-08-18T10:22:31.603+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:35.631+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:41.093+08:00,null,null,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,null,null,1,2.0,true
> {code}
> Requirement 1:
> Since we support three output formats (align all series, which is the default; align by device; and without alignment), the import-csv tool should accept the same three formats:
> a. 
> {code:java}
> Time,root.sg.d1.s1,root.sg.d1.s2,root.sg.d2.s1,root.sg.d2.s2,root.sg.d2.s3
> 2020-08-18T10:22:31.603+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:35.631+08:00,1,2.0,null,null,null
> 2020-08-18T10:22:41.093+08:00,null,null,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,null,null,1,2.0,true
> {code}
> b. 
> {code:java}
> Time,Device,s1,s2,s3
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,2.0,null
> 2020-08-18T10:22:35.631+08:00,root.sg.d1,1,2.0,null
> 2020-08-18T10:22:41.093+08:00,root.sg.d2,1,2.0,null
> 2020-08-18T10:22:52.603+08:00,root.sg.d2,1,2.0,true
> {code}
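To make the difference between formats a and b concrete, here is a minimal sketch of a reader that accepts either layout and emits the same flat records. The function name `read_points` and the rule "a second header column named Device means format b" are assumptions made for illustration, not the tool's actual behavior. Values are left as raw strings.

```python
import csv
import io


def read_points(text):
    """Yield (time, device, measurement, value) tuples from either layout.

    Assumption: a header whose second column is "Device" marks format b
    (align by device); any other header is treated as format a.
    """
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    by_device = len(header) > 1 and header[1] == "Device"
    for row in rows:
        time = row[0]
        if by_device:
            # Format b: Time,Device,s1,s2,...
            device = row[1]
            for name, value in zip(header[2:], row[2:]):
                yield time, device, name, value
        else:
            # Format a: Time,<full.path.to.series>,...
            for path, value in zip(header[1:], row[1:]):
                device, _, name = path.rpartition(".")
                yield time, device, name, value
```

Both layouts then feed the same downstream insert logic, which is the main reason to normalize them early.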
> c.
> (This format is odd; I would prefer not to support it.)
> Requirement 2:
> Different users may use different time formats in the first column,
> so we should support multiple time formats, e.g., by letting users define the pattern used to parse their timestamps, such as yyyy-MM-ddHH:mm:ss.SSS.
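A rough sketch of what a user-defined time format could look like, written in Python for brevity. Note that it uses Python `strptime` codes (`%Y-%m-%d ...`) rather than the Java-style pattern quoted above, and that the name `parse_time` and the choice to treat timestamps without an offset as UTC are assumptions for the example only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_time(text, fmt=None):
    """Return epoch milliseconds for a timestamp string.

    With no fmt, the text is parsed as ISO 8601; otherwise fmt is a
    user-supplied strptime pattern.
    """
    if fmt is None:
        dt = datetime.fromisoformat(text)
    else:
        dt = datetime.strptime(text, fmt)
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are taken as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division avoids floating-point rounding of the milliseconds.
    return (dt - EPOCH) // timedelta(milliseconds=1)
```

For example, `parse_time("2020-08-18 02:22:31.603", "%Y-%m-%d %H:%M:%S.%f")` and `parse_time("2020-08-18T10:22:31.603+08:00")` refer to the same instant.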
> Requirement 3:
> Support NULL as well as an empty or whitespace-only field to represent a null data point. For example, the following 3 lines are equivalent:
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,null,null
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,,
> 2020-08-18T10:22:31.603+08:00,root.sg.d1,1,    ,
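The equivalence of those three lines can be sketched as a small normalization step. The helper names `normalize_cell` and `parse_line` are hypothetical, and treating `null` case-insensitively is an assumption based on the issue mentioning both "NULL" and "null".

```python
import csv


def normalize_cell(cell):
    """Map 'null' (any case), empty, and whitespace-only cells to None."""
    stripped = cell.strip()
    if stripped == "" or stripped.lower() == "null":
        return None
    return stripped


def parse_line(line):
    """Split one CSV line and normalize every cell."""
    return [normalize_cell(cell) for cell in next(csv.reader([line]))]
```

With this in place, all three example rows produce the same list, with `None` in the last two positions.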
> Requirement 4:
> Support declaring the storage group name once rather than repeating it on every line:
> e.g., for format b, we can tell the tool that the storage group is `root.sg`, and then each row looks like:
> 2020-08-18T10:22:35.631+08:00,d1,1,2.0,null
> Another option is to add a new column called storage_group to each row.
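As an illustration of the first option, the storage group could be passed to the tool once and prepended to each device name while reading rows. The function `full_device_path` and its `storage_group` parameter are hypothetical names; how the real tool would receive this setting is not specified here.

```python
def full_device_path(device, storage_group=None):
    """Prefix the device with the storage group when one is supplied.

    Assumption: when storage_group is given, every device in the file is
    written relative to it, so the prefix is always added.
    """
    if storage_group is None:
        return device
    return f"{storage_group}.{device}"
```

So the row above, read with `storage_group="root.sg"`, would be written to device `root.sg.d1`, while files that already contain full paths are read without the setting.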
> For unit tests (UT):
> 1. all data types should be covered;
> 2. incorrect CSV formats should be covered.
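One kind of incorrect CSV that such tests might exercise is a row whose column count does not match the header. The check below is only a sketch of that single case; the name `find_malformed_rows` and the error messages are invented for the example, and it assumes the first header column is always `Time`, as in the samples above.

```python
import csv
import io


def find_malformed_rows(text):
    """Return (line_number, message) pairs for structurally invalid input."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows, None)
    if not header or header[0] != "Time":
        return [(1, "header must start with a Time column")]
    problems = []
    for row in rows:
        if len(row) != len(header):
            # rows.line_num is the number of source lines read so far.
            problems.append(
                (rows.line_num, f"expected {len(header)} columns, got {len(row)}")
            )
    return problems
```

A unit test can then feed in a well-formed file and expect an empty list, and feed in a truncated row and expect its line number to be reported.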



--
This message was sent by Atlassian Jira
(v8.3.4#803005)