Posted to dev@flink.apache.org by Chen Qin <qi...@gmail.com> on 2020/11/19 06:30:39 UTC
Hive Streaming write compaction
Hi there,
We are testing writing from Kafka to a Hive table in Parquet format.
Currently, users have to accept lots of small files in minute-level
folders in order to get the latency benefits. I recall that folks at
FF2020 Global mentioned implementing compaction logic at checkpointing
time. I wonder how that is going? We would love to collaborate on this topic.
Chen
Pinterest
Re: Hive Streaming write compaction
Posted by Kurt Young <yk...@gmail.com>.
We just added this feature in 1.12 [1][2]. It would be great if you could
download the 1.12 RC, test it out, and give us some feedback.
In case you wonder why I linked two JIRAs, it's because the FileSystem
and Hive connectors share the same options and the same implementation.
[1] https://issues.apache.org/jira/browse/FLINK-19875
[2] https://issues.apache.org/jira/browse/FLINK-19886
Best,
Kurt
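As a rough illustration of the shared options mentioned above, here is a minimal sketch that renders a Flink SQL DDL for a filesystem sink with compaction switched on. The option keys `auto-compaction` and `compaction.file-size` are the ones documented for the 1.12 filesystem connector; the table name, columns, path, and the 128MB value are hypothetical, and the helper `with_clause` is only for illustration. For a Hive table the same keys would presumably go into the table properties instead.

```python
# Sketch only: builds a DDL string, it does not talk to a Flink cluster.
# Table name, columns, path and the 128MB value are hypothetical.

def with_clause(options):
    """Render a dict as the body of a Flink SQL WITH (...) clause."""
    return ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())

sink_options = {
    "connector": "filesystem",
    "path": "hdfs:///tmp/events",      # hypothetical output location
    "format": "parquet",
    "auto-compaction": "true",         # merge small files within a checkpoint
    "compaction.file-size": "128MB",   # hypothetical target size of compacted files
}

ddl = (
    "CREATE TABLE events_sink (\n"
    "  user_id STRING,\n"
    "  payload STRING,\n"
    "  dt STRING,\n"
    "  hm STRING\n"
    ") PARTITIONED BY (dt, hm) WITH (\n"
    f"{with_clause(sink_options)}\n"
    ")"
)

print(ddl)
```

The resulting string could then be passed to a SQL client or a table environment. Compacted files only become visible after the checkpoint completes, so the checkpoint interval still bounds the latency.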
Re: Hive Streaming write compaction
Posted by Jingsong Li <ji...@gmail.com>.
Hi Chen,
File compaction for the Table Filesystem/Hive sink has been merged into
master; details are in [1]. It is included in Flink 1.12.
I hope you can give it a try and test it.
[1] https://issues.apache.org/jira/browse/FLINK-19345
Best,
Jingsong
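To make the idea of compacting small files at checkpoint time more concrete, here is a toy sketch of the planning step: given the files written since the last checkpoint, group them into batches that stay at or below a target size. This is not Flink's actual implementation (see the JIRA above for that); the function name `plan_compaction`, the file names, and the sizes are all hypothetical.

```python
# Toy sketch, not Flink code: greedy grouping of small files into
# compaction batches bounded by a target size.

def plan_compaction(file_sizes, target_size):
    """Group files (name -> size) into batches whose total size stays
    at or below target_size. A single oversized file gets its own batch."""
    groups, current, current_size = [], [], 0
    for name, size in sorted(file_sizes.items()):
        # Close the current batch if adding this file would overshoot.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups

# Hypothetical files written during one checkpoint interval (sizes in MB).
files = {"part-0": 40, "part-1": 50, "part-2": 30, "part-3": 90, "part-4": 10}
plan = plan_compaction(files, target_size=100)
print(plan)
```

Each batch would then be rewritten as one larger file before the partition is committed, which is how a sink can keep short checkpoint intervals without leaving many small files behind.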