Posted to user@flink.apache.org by Rinat <r....@cleverdata.ru> on 2018/10/11 12:10:16 UTC

Re: [BucketingSink] notify on moving into pending/ final state

Hi Piotr, during the migration to the latest Flink version, we’ve decided to try to contribute this functionality to the master branch.

The PR is available here: https://github.com/apache/flink/pull/6824
More details about hooking into the state changes in BucketingSink are available in https://issues.apache.org/jira/browse/FLINK-9592

Thx !

> On 14 Jun 2018, at 23:29, Rinat <r....@cleverdata.ru> wrote:
> 
> Hi Piotr, I’ve created an issue: https://issues.apache.org/jira/browse/FLINK-9592
> 
> The third proposal looks great. May I try to contribute it?
> 
>> On 14 Jun 2018, at 12:29, Piotr Nowojski <piotr@data-artisans.com> wrote:
>> 
>> Hi,
>> 
>> A couple of things:
>> 
>> 1. Please create a Jira ticket with this proposal, so we can move the discussion off the user mailing list.
>> 
>> I haven’t thought it through, so take my comments with a grain of salt, however:
>> 
>> 2. If we were to go with such a callback, I would prefer to have a single BucketStateChangeCallback with methods like `onInProgressToPending(…)`, `onPendingToFinal(…)`, `onPendingToCancelled(…)`, etc., as opposed to having one interface passed three or four times for different purposes.
>> 
>> 3. Another thing I had in mind is that BucketingSink could be rewritten to extend TwoPhaseCommitSinkFunction. In that case, with
>> 
>> public class BucketingSink2 extends TwoPhaseCommitSinkFunction<???>
>> 
>> users could add their own hooks by overriding the following methods:
>> 
>> BucketingSink2#beginTransaction, BucketingSink2#preCommit, BucketingSink2#commit, BucketingSink2#abort. For example:
>> 
>> public class MyBucketingSink extends BucketingSink2 {
>>   @Override
>>   protected void commit(??? txn) {
>>     super.commit(txn);
>>     // My hook for when a file moves from the pending to the committed state
>>   }
>> }
>> 
>> Alternatively, we could implement support for the aforementioned callbacks in TwoPhaseCommitSinkFunction and provide this feature to Kafka/Pravega/BucketingSink at once.
>> 
>> Piotrek
> 
> Sincerely yours,
> Rinat Sharipov
> Software Engineer at 1DMP CORE Team
> 
> email: r.sharipov@cleverdata.ru
> mobile: +7 (925) 416-37-26
> 
> CleverDATA
> make your data clever
> 
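
For illustration, the single-callback idea from point 2 in the quoted message could look roughly like the sketch below. It is plain Java with no Flink dependency. `BucketCallbackSketch`, `ToySink`, and `RecordingCallback` are hypothetical names invented for this example, and the `String path` parameter is an assumption, since the thread only names the callback and its methods. It makes no claim about what the PR actually implements.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: none of these types exist in Flink.
class BucketCallbackSketch {

    /** Hypothetical callback with one method per state transition. */
    interface BucketStateChangeCallback {
        void onInProgressToPending(String path);
        void onPendingToFinal(String path);
        void onPendingToCancelled(String path);
    }

    /** Toy stand-in for the sink: it tracks part-file names and fires the callback. */
    static class ToySink {
        private final BucketStateChangeCallback callback;
        private final List<String> inProgress = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        ToySink(BucketStateChangeCallback callback) {
            this.callback = callback;
        }

        void open(String path) {
            inProgress.add(path);
        }

        // in-progress -> pending, e.g. when a part file is rolled
        void roll() {
            for (String path : inProgress) {
                pending.add(path);
                callback.onInProgressToPending(path);
            }
            inProgress.clear();
        }

        // pending -> final, e.g. once a checkpoint has completed
        void commit() {
            for (String path : pending) {
                callback.onPendingToFinal(path);
            }
            pending.clear();
        }

        // pending -> cancelled, e.g. when pending files are discarded
        void cancel() {
            for (String path : pending) {
                callback.onPendingToCancelled(path);
            }
            pending.clear();
        }
    }

    /** Callback that records each transition so it can be inspected afterwards. */
    static class RecordingCallback implements BucketStateChangeCallback {
        final List<String> events = new ArrayList<>();

        @Override
        public void onInProgressToPending(String path) {
            events.add("pending:" + path);
        }

        @Override
        public void onPendingToFinal(String path) {
            events.add("final:" + path);
        }

        @Override
        public void onPendingToCancelled(String path) {
            events.add("cancelled:" + path);
        }
    }

    public static void main(String[] args) {
        RecordingCallback callback = new RecordingCallback();
        ToySink sink = new ToySink(callback);
        sink.open("part-0-0");
        sink.roll();
        sink.commit();
        sink.open("part-0-1");
        sink.roll();
        sink.cancel();
        // prints [pending:part-0-0, final:part-0-0, pending:part-0-1, cancelled:part-0-1]
        System.out.println(callback.events);
    }
}
```

With one interface, a user registers a single object and overrides only the transitions of interest, rather than passing the same interface type three or four times.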

Sincerely yours,
Rinat Sharipov
Software Engineer at 1DMP CORE Team

email: r.sharipov@cleverdata.ru
mobile: +7 (925) 416-37-26

CleverDATA
make your data clever


Re: [BucketingSink] notify on moving into pending/ final state

Posted by Kostas Kloudas <k....@data-artisans.com>.
Hi Rinat,

I have commented on your PR and on the JIRA.
Let me know what you think.

Cheers,
Kostas

> On Oct 11, 2018, at 4:45 PM, Dawid Wysakowicz <dw...@apache.org> wrote:
> 
> Hi Rinat,
> I haven't checked your PR, but we introduced a new connector in Flink 1.6 called StreamingFileSink that is supposed to replace BucketingSink in the long term. I think it might solve a few of your problems. Have you checked it by chance?
> 
> Best,
> Dawid


Re: [BucketingSink] notify on moving into pending/ final state

Posted by Dawid Wysakowicz <dw...@apache.org>.
Hi Rinat,
I haven't checked your PR, but we introduced a new connector in Flink 1.6
called StreamingFileSink that is supposed to replace BucketingSink in the
long term. I think it might solve a few of your problems. Have you checked
it by chance?

Best,
Dawid
