Posted to user@flink.apache.org by Avi Levi <av...@bluevoyant.com> on 2018/12/03 10:45:40 UTC

Looking for example for bucketingSink / StreamingFileSink

Hi Guys,
I'm very new to Flink, so my apologies for the newbie questions :)
I am desperately looking for a good example of streaming to files
using BucketingSink / StreamingFileSink. Unfortunately, the examples in the
documentation do not even compile (at least not the Scala ones, see
https://issues.apache.org/jira/browse/FLINK-11053 )

I tried using BucketingSink together with StreamingFileSink (or just
StreamingFileSink), and finally tried to implement a writer, but with no
luck.
BucketingSink seems to be a perfect fit because I can set the batch
interval either by time or by size, which is exactly what I need.

This is my latest attempt (sample project)
<https://bitbucket.org/avilevi/kafka-flink-parquet/src/master/>, which
results in a lot of "pending" files.
*Any help would be appreciated*

*Thanks*
*Avi*

Re: Looking for example for bucketingSink / StreamingFileSink

Posted by miki haiat <mi...@gmail.com>.
Hi Avi,
I'm assuming the "pending" files are caused by checkpoints not completing
successfully [1].
Can you try changing the checkpoint interval to 1 minute as well?
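
To make the reasoning concrete, here is a toy model of the part-file lifecycle described in the BucketingSink Javadoc [1]: a file is in-progress while it is being written, becomes pending once it is rolled, and is only moved to its finished state after a checkpoint completes. This is a plain-Java sketch, not Flink code; the class `PendingFileModel` and all of its methods are made up for illustration. In a real job, checkpointing is switched on with `StreamExecutionEnvironment.enableCheckpointing(intervalMillis)`.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Flink code) of how a bucketing sink moves part files between states. */
class PendingFileModel {
    private String inProgress = null;                         // file currently being written, if any
    private final List<String> pending = new ArrayList<>();   // closed, waiting for a checkpoint
    private final List<String> finished = new ArrayList<>();  // finalized after a checkpoint
    private int partCounter = 0;

    /** Writing a record opens a new in-progress part file when none is open. */
    void write(String record) {
        // The record content is ignored; only the file state matters in this model.
        if (inProgress == null) {
            inProgress = "part-" + partCounter++;
        }
    }

    /** Rolling (size or time threshold reached) closes the file, which becomes pending. */
    void roll() {
        if (inProgress != null) {
            pending.add(inProgress);
            inProgress = null;
        }
    }

    /** Only a completed checkpoint promotes pending files to finished. */
    void checkpointCompleted() {
        finished.addAll(pending);
        pending.clear();
    }

    List<String> pending() { return new ArrayList<>(pending); }

    List<String> finished() { return new ArrayList<>(finished); }

    public static void main(String[] args) {
        PendingFileModel sink = new PendingFileModel();
        for (int i = 0; i < 3; i++) {
            sink.write("record-" + i);
            sink.roll();
        }
        // Without a completed checkpoint, every rolled file is still pending.
        System.out.println("pending=" + sink.pending() + " finished=" + sink.finished());
        sink.checkpointCompleted();
        // After the checkpoint completes, the same files are finished.
        System.out.println("pending=" + sink.pending() + " finished=" + sink.finished());
    }
}
```

In this model, files that are rolled while no checkpoint ever completes simply pile up in the pending list, which matches the symptom you describe.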


Thanks,
Miki



[1]
 https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L131
