Posted to dev@iotdb.apache.org by jincheng sun <su...@gmail.com> on 2020/03/12 02:30:50 UTC

Re: [DISCUSS] Add Flink Connector Support for TsFile

Hi folks,

I would like to bring up the VOTE thread, as there have been no objections
on either the mailing list or the design doc for a long time.

Best,
Jincheng



jincheng sun <su...@gmail.com> wrote on Fri, Feb 28, 2020 at 11:01 PM:

> Hi folks,
>
> TsFile is a columnar storage file format in Apache IoTDB. It is designed
> for time series data, supports efficient compression and querying, and is
> easy to integrate into big data processing frameworks.
>
> Apache Flink is a framework and distributed processing engine for stateful
> computations over unbounded and bounded data streams, and it is becoming
> more and more popular in IoT scenarios. So it would be great to integrate
> IoTDB and Flink.
>
> I would like to introduce the TsFile Flink Connector, which allows Flink
> to read and write TsFile. The details can be found in the design doc [1].
>
> Any feedback and comments in the design doc are welcome.
>
> Best,
> Jincheng
>
> [1]
> https://docs.google.com/document/d/1h0_aGboUCqlke-CdSKy1yNCsHTBKhw3JHWlwLBtbmGU/edit?usp=sharing
>
>

Re: [DISCUSS] Add Flink Connector Support for TsFile

Posted by Dawei Liu <at...@163.com>.
Hi,

Sorry, I forgot to reply to you.

1. FlinkRowRecordParse.parse().  [page 10]

 int pos = paths.indexOf(new Path(fieldName));

Can we guarantee that the field types and names share the same index order, so that the index does not have to be looked up again for every row of data?

2. Read Sequence Diagram [page 7]
When you use the QueryDataSet, please note that hasNext() must be called before calling next().

3. I don't see RowBatch being used, but I think you've already thought about it.
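
To make points 1 and 2 concrete, here is a minimal sketch of what I mean. It does not use the real IoTDB or Flink classes: Row, resolveIndexes, parse and readAll are hypothetical names, paths are plain strings instead of Path objects, and a java.util.Iterator stands in for QueryDataSet. The idea is only that the name-to-index mapping is computed once up front, and that the read loop always checks hasNext() before calling next().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class FieldIndexSketch {

    /** Hypothetical stand-in for a query result row: one value per selected path. */
    static final class Row {
        final long timestamp;
        final Object[] values;

        Row(long timestamp, Object... values) {
            this.timestamp = timestamp;
            this.values = values;
        }
    }

    /**
     * Resolves every output field to its position in the selected paths once,
     * so that per-row parsing no longer needs an indexOf() call.
     * A field that is not among the paths maps to -1, as indexOf() would.
     */
    static int[] resolveIndexes(List<String> paths, List<String> fieldNames) {
        Map<String, Integer> positionByPath = new HashMap<>();
        for (int i = 0; i < paths.size(); i++) {
            // Keep the first position, matching indexOf() semantics.
            positionByPath.putIfAbsent(paths.get(i), i);
        }
        int[] indexes = new int[fieldNames.size()];
        for (int i = 0; i < indexes.length; i++) {
            Integer pos = positionByPath.get(fieldNames.get(i));
            indexes[i] = (pos == null) ? -1 : pos;
        }
        return indexes;
    }

    /** Per-row conversion: array lookups only, no search by name. */
    static Object[] parse(Row row, int[] indexes) {
        Object[] out = new Object[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            out[i] = (indexes[i] < 0) ? null : row.values[indexes[i]];
        }
        return out;
    }

    /** Drains the data set, always checking hasNext() before next(). */
    static List<Object[]> readAll(Iterator<Row> dataSet, int[] indexes) {
        List<Object[]> result = new ArrayList<>();
        while (dataSet.hasNext()) {
            result.add(parse(dataSet.next(), indexes));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("root.d1.s1", "root.d1.s2", "root.d1.s3");
        List<String> fields = Arrays.asList("root.d1.s3", "root.d1.s1");
        int[] indexes = resolveIndexes(paths, fields); // computed once

        List<Row> rows = Arrays.asList(new Row(1L, 10, 20, 30), new Row(2L, 11, 21, 31));
        for (Object[] parsed : readAll(rows.iterator(), indexes)) {
            System.out.println(Arrays.toString(parsed)); // [30, 10] then [31, 11]
        }
    }
}
```

Whether the mapping can really be fixed up front depends on the paths and field names staying unchanged for the lifetime of the input format, which is an assumption here.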




Thanks
---
Dawei Liu



> On Mar 12, 2020, at 5:00 PM, Jialin Qiao <qj...@mails.tsinghua.edu.cn> wrote:
> 
> Hi,
> 
> Sorry, I also missed this email.
> 
> I read the design doc and it looks good!
> 
> Just one question: in the "Write Sequence Diagram", the TSRecordConverter and TSRecordCollector issue the write to TsFileWriter.
> But in the "Batch Write Sequence Diagram", the RowBatchOutputFormat issues the write command to TsFileWriter.
> 
> What's the difference between them?
> 
> Intuitively, the Converter is a tool and the Collector is a data buffer.
> So, should the write/flush commands be issued by either a Writer or the OutputFormat instead?
> 
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 


Re: [DISCUSS] Add Flink Connector Support for TsFile

Posted by Jialin Qiao <qj...@mails.tsinghua.edu.cn>.
Hi,

Sorry, I also missed this email.

I read the design doc and it looks good!

Just one question: in the "Write Sequence Diagram", the TSRecordConverter and TSRecordCollector issue the write to TsFileWriter.
But in the "Batch Write Sequence Diagram", the RowBatchOutputFormat issues the write command to TsFileWriter.

What's the difference between them?

Intuitively, the Converter is a tool and the Collector is a data buffer.
So, should the write/flush commands be issued by either a Writer or the OutputFormat instead?
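
To show the split of responsibilities I have in mind, here is a rough sketch. It is not taken from the design doc and does not use the real TsFileWriter or Flink OutputFormat APIs: RecordingWriter and SketchOutputFormat are hypothetical stand-ins, and records are plain strings. The converter is a stateless function, the collector is just a buffer, and only the output format talks to the writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class WriteOwnershipSketch {

    /** Hypothetical stand-in for the file writer; it only records the calls it receives. */
    static final class RecordingWriter {
        final List<String> log = new ArrayList<>();

        void write(String record) {
            log.add("write:" + record);
        }

        void flush() {
            log.add("flush");
        }
    }

    /** Hypothetical output format: the only component that issues write/flush. */
    static final class SketchOutputFormat<T> {
        private final Function<T, String> converter;            // stateless tool
        private final List<String> buffer = new ArrayList<>();  // collector: a plain buffer
        private final RecordingWriter writer;
        private final int batchSize;

        SketchOutputFormat(Function<T, String> converter, RecordingWriter writer, int batchSize) {
            this.converter = converter;
            this.writer = writer;
            this.batchSize = batchSize;
        }

        /** Converts and buffers one row; drains the buffer once it is full. */
        void writeRecord(T row) {
            buffer.add(converter.apply(row));
            if (buffer.size() >= batchSize) {
                drain();
            }
        }

        /** Writes out whatever is still buffered. */
        void close() {
            drain();
        }

        private void drain() {
            if (buffer.isEmpty()) {
                return;
            }
            for (String record : buffer) {
                writer.write(record);
            }
            writer.flush();
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        RecordingWriter writer = new RecordingWriter();
        SketchOutputFormat<String> format =
                new SketchOutputFormat<>(String::toUpperCase, writer, 2);
        format.writeRecord("a");
        format.writeRecord("b");
        format.writeRecord("c");
        format.close();
        System.out.println(writer.log); // [write:A, write:B, flush, write:C, flush]
    }
}
```

With this layout the converter and the buffer never need a reference to the writer, so the row-based and batch-based paths could share the same write/flush logic.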

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院
