Posted to users@apex.apache.org by Guilherme Hott <gu...@gmail.com> on 2017/05/26 02:39:26 UTC

BoundedDedup or TimeBasedDedup

Hi everyone,

Messages arrive at my Kafka operator, and on my input port I have to
process them and emit a batch of transactions to a Dedup operator.
Should I use BoundedDedup or TimeBasedDedup?

Thanks

-- 
*Guilherme Hott*
*Software Engineer*
Skype: guilhermehott
@guilhermehott
https://www.linkedin.com/in/guilhermehott

Re: BoundedDedup or TimeBasedDedup

Posted by Guilherme Hott <gu...@gmail.com>.
Thank you, Bhupesh. I think this is the best thing to do.

On Thu, May 25, 2017 at 7:54 PM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi,
>
> If you are just de-duplicating based on a key and have a limited batch of
> transactions, then you should go with BoundedDedup.
>
> TimeBasedDedup is for cases where you want to dedup within a stream with
> expiry based on the time in your tuples.
>
> ~ Bhupesh
>
>
> _______________________________________________________
>
> Bhupesh Chawda
>
> E: bhupesh@datatorrent.com | Twitter: @bhupeshsc
>
> www.datatorrent.com  |  apex.apache.org
>
>
>



Re: BoundedDedup or TimeBasedDedup

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi,

If you are just de-duplicating based on a key and have a limited batch of
transactions, then you should go with BoundedDedup.
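
To make the bounded case concrete, here is a minimal sketch in plain Java of key-based de-duplication over a finite batch. It is not the Apex Malhar BoundedDedup API; the class name BoundedDedupSketch, the dedup method, and the Txn record are hypothetical names chosen for illustration. The assumption is that the whole batch fits in memory, so every key seen so far can be remembered without expiry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Illustrative only: this is NOT the Apex Malhar BoundedDedup operator API.
public class BoundedDedupSketch {

    // Hypothetical transaction type; "id" plays the role of the dedup key.
    public static final class Txn {
        final String id;
        final long amount;

        Txn(String id, long amount) {
            this.id = id;
            this.amount = amount;
        }

        @Override
        public String toString() {
            return id + ":" + amount;
        }
    }

    // Keeps the first tuple seen for each key, preserving arrival order.
    // All keys are retained for the whole batch, so memory grows with the
    // number of distinct keys; that is acceptable only for a limited batch.
    public static <T, K> List<T> dedup(List<T> batch, Function<T, K> keyOf) {
        Set<K> seen = new HashSet<>();
        List<T> unique = new ArrayList<>();
        for (T tuple : batch) {
            // Set.add returns false when the key was already present.
            if (seen.add(keyOf.apply(tuple))) {
                unique.add(tuple);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Txn> batch = Arrays.asList(
            new Txn("a", 10), new Txn("b", 20), new Txn("a", 10));
        // The second "a" transaction is dropped as a duplicate.
        System.out.println(dedup(batch, t -> t.id));
    }
}
```

Because nothing is ever evicted, this approach only suits input whose number of distinct keys is known to be limited.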

TimeBasedDedup is for cases where you want to dedup within a stream with
expiry based on the time in your tuples.
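
As a rough illustration of the time-based case, the following plain Java sketch remembers keys only for a fixed expiry window measured in the tuples' own event time. It is not the Apex Malhar TimeBasedDedup API; the class TimeBasedDedupSketch, the process method, the Decision enum, and the expiryMillis parameter are all hypothetical. The eviction policy shown (drop a key once its first-seen time falls more than expiryMillis behind the latest event time) is an assumption made for the example, not a description of how the operator is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: this is NOT the Apex Malhar TimeBasedDedup operator API.
public class TimeBasedDedupSketch {

    public enum Decision { UNIQUE, DUPLICATE, EXPIRED }

    private final long expiryMillis;
    // Maps each remembered key to the event time at which it was first seen.
    private final Map<String, Long> seen = new HashMap<>();
    private long latestEventTime = Long.MIN_VALUE;

    public TimeBasedDedupSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    // Classifies one tuple using the time carried in the tuple itself,
    // not the wall-clock time at which it is processed.
    public Decision process(String key, long eventTime) {
        if (eventTime > latestEventTime) {
            latestEventTime = eventTime;
            evictExpired();
        }
        // A tuple older than the window cannot be checked reliably, because
        // the keys from its time range may already have been evicted.
        if (latestEventTime - eventTime > expiryMillis) {
            return Decision.EXPIRED;
        }
        Long firstSeen = seen.putIfAbsent(key, eventTime);
        return firstSeen == null ? Decision.UNIQUE : Decision.DUPLICATE;
    }

    // Forgets keys whose first-seen time has fallen out of the window,
    // which keeps memory bounded on an unbounded stream.
    private void evictExpired() {
        seen.values().removeIf(t -> latestEventTime - t > expiryMillis);
    }

    public static void main(String[] args) {
        TimeBasedDedupSketch dedup = new TimeBasedDedupSketch(1000);
        System.out.println(dedup.process("a", 0));
        System.out.println(dedup.process("a", 500));
        System.out.println(dedup.process("a", 2000));
        System.out.println(dedup.process("b", 100));
    }
}
```

The trade-off compared with the bounded approach is that a key can be accepted again once it has aged out of the window, in exchange for state that does not grow without limit.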

~ Bhupesh


_______________________________________________________

Bhupesh Chawda

E: bhupesh@datatorrent.com | Twitter: @bhupeshsc

www.datatorrent.com  |  apex.apache.org


