Posted to dev@carbondata.apache.org by 爱在西元前 <37...@qq.com> on 2019/10/29 06:59:51 UTC

[DISCUSSION] Support write Flink streaming data to Carbon

The write process is:

Write Flink streaming data to the local file system of the Flink task node, using Flink StreamingFileSink and the Carbon SDK;

Copy the local carbon data files to the carbon data store system, such as HDFS or S3;

Generate and write a segment file to ${tablePath}/load_details;

Run the "alter table ${tableName} collect segments" command on the server to compact the segment files in ${tablePath}/load_details, move the compacted segment file to ${tablePath}/Metadata/Segments/, and finally update the table status file.
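The proposed flow can be sketched purely illustratively, with plain local file operations standing in for the Flink sink, the Carbon SDK, and HDFS/S3; none of the function names below are real CarbonData or Flink APIs:

```python
# Illustrative sketch of the proposed write flow. This does NOT use real
# CarbonData/Flink APIs: a local staging directory stands in for the Flink
# task node's disk, and a local "table path" stands in for HDFS/S3.
import json
import shutil
from pathlib import Path

def write_local(staging_dir: Path, data: bytes) -> Path:
    """Step 1 (simulated): the Flink sink writes a carbon data file locally."""
    local_file = staging_dir / "part-0-0.carbondata"
    local_file.write_bytes(data)
    return local_file

def copy_to_store(local_file: Path, table_path: Path) -> Path:
    """Step 2 (simulated): copy the local carbon data file to the store."""
    target = table_path / "Fact" / "Part0" / local_file.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(local_file, target)
    return target

def write_segment_file(table_path: Path, segment_id: int, files: list) -> Path:
    """Step 3 (simulated): record the new files under ${tablePath}/load_details."""
    load_details = table_path / "load_details"
    load_details.mkdir(parents=True, exist_ok=True)
    segment_file = load_details / f"{segment_id}.segment"
    segment_file.write_text(json.dumps({"files": [str(f) for f in files]}))
    return segment_file
```

Step 4 (the "collect segments" command) is the only server-side step; everything above it can run on the Flink task nodes without touching the table status file.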

I have raised a JIRA: https://issues.apache.org/jira/browse/CARBONDATA-3557

Your opinions and suggestions are welcome.

Re: [DISCUSSION] Support write Flink streaming data to Carbon

Posted by Raghunandan S <ca...@gmail.com>.
+1

On Thu, 31 Oct, 2019, 9:13 AM Jacky Li, <ja...@apache.org> wrote:

> +1 for this feature. In my opinion, flink-carbon is a good fit for near
> real-time analytics.
>
> One doubt is that in your design, the Collect Segment command and
> Compaction command are two separate commands, right?
>
> The Collect Segment command modifies the metadata files (tablestatus file
> and segment file), while the Compaction command merges small data files and
> builds indexes.
>
> Is my understanding correct?
>
> Regards,
> Jacky
>
> On 2019/10/29 06:59:51, "爱在西元前" <37...@qq.com> wrote:
> > The write process is:
> >
> > Write Flink streaming data to the local file system of the Flink task
> node, using Flink StreamingFileSink and the Carbon SDK;
> >
> > Copy the local carbon data files to the carbon data store system, such as
> HDFS or S3;
> >
> > Generate and write a segment file to ${tablePath}/load_details;
> >
> > Run the "alter table ${tableName} collect segments" command on the server
> to compact the segment files in ${tablePath}/load_details, move the
> compacted segment file to ${tablePath}/Metadata/Segments/, and finally
> update the table status file.
> >
> > I have raised a JIRA: https://issues.apache.org/jira/browse/CARBONDATA-3557
> >
> > Your opinions and suggestions are welcome.
>

Re: [DISCUSSION] Support write Flink streaming data to Carbon

Posted by Jacky Li <ja...@apache.org>.
+1 for this feature. In my opinion, flink-carbon is a good fit for near real-time analytics.

One doubt is that in your design, the Collect Segment command and Compaction command are two separate commands, right?

The Collect Segment command modifies the metadata files (tablestatus file and segment file), while the Compaction command merges small data files and builds indexes.

Is my understanding correct?
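If that understanding is right, the Collect Segment step would be a pure metadata operation. A purely illustrative sketch, again using plain local file operations and hypothetical names in place of CarbonData internals:

```python
# Illustrative sketch of a metadata-only "collect segments" step. This is NOT
# the real CarbonData implementation: it only merges the pending segment files
# under ${tablePath}/load_details into one segment file under
# ${tablePath}/Metadata/Segments/ and rewrites the table status file; no data
# file is read or merged (that would be Compaction's job).
import json
from pathlib import Path

def collect_segments(table_path: Path, segment_id: str) -> Path:
    load_details = table_path / "load_details"
    # Merge every pending segment file's file list; consume the pending files.
    files = []
    for seg in sorted(load_details.glob("*.segment")):
        files.extend(json.loads(seg.read_text())["files"])
        seg.unlink()
    # Write the compacted segment file under Metadata/Segments/.
    segments_dir = table_path / "Metadata" / "Segments"
    segments_dir.mkdir(parents=True, exist_ok=True)
    compacted = segments_dir / f"{segment_id}.segment"
    compacted.write_text(json.dumps({"files": files}))
    # Finally, update the table status file to make the segment visible.
    status_file = table_path / "Metadata" / "tablestatus"
    status_file.write_text(
        json.dumps([{"segment": segment_id, "status": "Success"}]))
    return compacted
```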

Regards,
Jacky

On 2019/10/29 06:59:51, "爱在西元前" <37...@qq.com> wrote: 
> The write process is:
> 
> Write Flink streaming data to the local file system of the Flink task node, using Flink StreamingFileSink and the Carbon SDK;
> 
> Copy the local carbon data files to the carbon data store system, such as HDFS or S3;
> 
> Generate and write a segment file to ${tablePath}/load_details;
> 
> Run the "alter table ${tableName} collect segments" command on the server to compact the segment files in ${tablePath}/load_details, move the compacted segment file to ${tablePath}/Metadata/Segments/, and finally update the table status file.
> 
> I have raised a JIRA: https://issues.apache.org/jira/browse/CARBONDATA-3557
> 
> Your opinions and suggestions are welcome.