Posted to users@kafka.apache.org by Men Lim <zu...@gmail.com> on 2021/09/29 23:09:26 UTC

mm2: size of target vs source topic

Hi all,
I'm setting up two new clusters today and using MM2 to replicate data from
clusterA --> clusterB.  I noticed that the target topic has the same number
of records as the source topic, but its size is roughly 6x smaller.

source topic: 6.975 MB
target topic: 1.136 MB

Both topics have the same number of records, and both clusters are using
gzip. Any idea why this is?

thanks,

Re: mm2: size of target vs source topic

Posted by Urbán Dániel <ur...@gmail.com>.
My guess is that MM2 decompressed the data, re-batched it, and then
compressed it again in bigger batches. Kafka applies compression per
record batch, so if the batches in the source topic were small, MM2 could
improve the compression ratio by producing bigger batches.
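
To illustrate the batching effect, here is a minimal sketch in Python. It
is not how MM2 or the Kafka producer is implemented: the records are
made-up JSON payloads, and compressed_size() is a hypothetical helper that
simply gzips each group of records on its own, ignoring Kafka's batch
headers and framing. It only shows that the same records take less space
when gzip sees them in larger groups.

```python
import gzip
import json


def make_records(n):
    # Hypothetical JSON payloads; real topic data will look different.
    return [
        json.dumps(
            {"id": i, "user": f"user-{i % 50}", "status": "ok", "region": "us-west-2"}
        ).encode("utf-8")
        for i in range(n)
    ]


def compressed_size(records, batch_size):
    # Gzip each group of `batch_size` records independently and sum the
    # compressed sizes, loosely mimicking per-batch compression.
    total = 0
    for start in range(0, len(records), batch_size):
        batch = b"".join(records[start:start + batch_size])
        total += len(gzip.compress(batch))
    return total


records = make_records(10_000)
small = compressed_size(records, batch_size=1)    # one record per batch
large = compressed_size(records, batch_size=500)  # 500 records per batch

print(f"batch_size=1:   {small} bytes")
print(f"batch_size=500: {large} bytes")
print(f"ratio: {small / large:.1f}x")
```

With tiny batches, gzip pays its fixed header and trailer cost for every
batch and has almost no repetition to exploit, so the total stays close to
(or above) the uncompressed size. The exact ratio depends entirely on the
data, so the numbers printed here will not match the 6.975 MB vs 1.136 MB
seen in the thread.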

On 2021.10.04 at 19:21, Men Lim wrote:
> bump


Re: mm2: size of target vs source topic

Posted by Men Lim <zu...@gmail.com>.
bump
