Posted to user@flink.apache.org by "Raja.Aravapalli" <Ra...@target.com> on 2017/10/06 21:02:45 UTC

Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created

Hi,

I want to override the method “openNewPartFile” in the BucketingSink class so that it includes a timestamp in the part file name whenever it rolls a new part file.

Can someone share some thoughts on how I can do this?

Thanks a ton, in advance.


Regards,
Raja.

Re: Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created

Posted by Kostas Kloudas <k....@data-artisans.com>.
Hi Raja,

Since you know about the method, I assume you have looked at the source code of the sink.
I think that including the timestamp of the element in the file path is not as easy as overriding openNewPartFile.

The reason is that the filenames serve as identities for the associated state of the bucket. When the sink checks
part file names to transition files from pending to finished state, it looks for exact equality of the filename
rather than a “contains()” match.
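
To illustrate the point about exact matching, here is a toy model in plain Java. It is not the BucketingSink code: the class PendingFileLookup, its methods, and the file names are all made up for this example. It only shows why a file whose name gained a timestamp suffix would no longer be found by an equality-based lookup.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; this is NOT BucketingSink's actual implementation.
class PendingFileLookup {
    // Hypothetical stand-in for the part file names remembered in the bucket state.
    private final List<String> pendingFiles = new ArrayList<>();

    void markPending(String fileName) {
        pendingFiles.add(fileName);
    }

    // Exact-match lookup: true only if a remembered name equals fileName.
    boolean isPending(String fileName) {
        for (String remembered : pendingFiles) {
            if (remembered.equals(fileName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PendingFileLookup state = new PendingFileLookup();
        state.markPending("part-0-0");

        // Same name as remembered in state: found.
        System.out.println(state.isPending("part-0-0"));
        // Name with an extra timestamp suffix: not found, although it contains "part-0-0".
        System.out.println(state.isPending("part-0-0-1507323765000"));
    }
}
```

In this model, any name that differs from the one recorded in state is treated as unknown, which is the situation a renamed part file would end up in.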

One way to bypass this is to write each element's timestamp along with the element, so that when you inspect the content of the
file, you can see the timestamp of the first element. You will have to write more data, though.
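
A minimal sketch of this workaround in plain Java, without any Flink dependency. The class TimestampedRecords, the tab-separated line layout, and the sample values are assumptions made for illustration. In a real job the formatting would happen somewhere before the data reaches the part file, for example in a map step or in the sink's writer; that wiring is left out here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the names and the line format are assumptions.
class TimestampedRecords {
    // Prefix the element with its timestamp, separated by a tab.
    static String format(long timestampMillis, String element) {
        return timestampMillis + "\t" + element;
    }

    // Recover the timestamp from a line produced by format().
    static long timestampOf(String line) {
        return Long.parseLong(line.substring(0, line.indexOf('\t')));
    }

    // Read the timestamp of the first element in a part file (assumes a non-empty file).
    static long firstTimestamp(Path partFile) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(partFile)) {
            return timestampOf(reader.readLine());
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a part file written by the sink.
        Path partFile = Files.createTempFile("part-0-0", ".txt");
        List<String> lines = Arrays.asList(
                format(1507323765000L, "first element"),
                format(1507323766000L, "second element"));
        Files.write(partFile, lines);

        System.out.println(firstTimestamp(partFile));
        Files.delete(partFile);
    }
}
```

The part file name stays untouched, so the sink's state handling is not affected; the cost is the extra bytes written per element.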

Does this fit your needs?

Kostas
