Posted to users@kafka.apache.org by David Watzke <wa...@avast.com> on 2015/01/22 13:38:40 UTC

clarification of the per-topic retention.bytes setting

Hi list,

please help me understand the per-topic retention.* setting (in Kafka
0.8.1.1) done by:

bin/kafka-topics.sh --zookeeper $ZK --alter --topic $TOPIC --config retention.bytes=VALUE

I understand from this:
http://search-hadoop.com/m/4TaT4E9f78/retention.bytes&subj=estimating+log+retention+bytes
that the log.retention.bytes limit applies to a single partition, so
basically each partition directory (in any of the directories listed in
log.dirs) can take up at most "log.retention.bytes" bytes.

But what happens if I increase this limit for a single topic? Does the
limit apply to each partition of that topic? If so, the overall topic size
limit would be

LIMIT = retention.bytes * TOPIC'S_PARTITION_COUNT

and it could take up to LIMIT * TOPIC'S_REPLICATION_FACTOR of disk space?

Or is this setting truly "per-topic", meaning that the retention.bytes
property directly sets the upper limit on the overall topic size?

Thanks in advance!

-- 
David Watzke


Re: clarification of the per-topic retention.bytes setting

Posted by Guozhang Wang <wa...@gmail.com>.
Hi David,

The "per-topic" configs simply override the global configs for that
specific topic. The retention.bytes config is still applied per partition,
to every partition of that topic.

So if you have two topics, each with two partitions and replication factor 1,
and retention.bytes set to A, then the total limit will be

2 (topics) * 2 (partitions per topic) * A (bytes)

And if you set a per-topic retention.bytes override of B for one of the
topics, the total limit changes to

2 (partitions per topic) * A (bytes) + 2 (partitions per topic) * B (bytes)
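
The arithmetic above can be sketched in a few lines of Python. This is only
an illustration, not anything Kafka ships: the helper name
topic_retention_bound and the sample values for A and B are made up, and the
replication_factor multiplier assumes, following the reasoning in the
question, that every replica keeps its own copy of the partition log.

```python
def topic_retention_bound(retention_bytes, partitions, replication_factor=1):
    """Approximate upper bound, in bytes, on disk used by one topic.

    Hypothetical helper: retention.bytes is treated as a per-partition
    limit, and each replica is assumed to hold a full copy of the log.
    """
    return retention_bytes * partitions * replication_factor


# Sample values chosen for illustration only.
A = 1 * 1024**3  # global retention.bytes: 1 GiB
B = 4 * 1024**3  # per-topic override:     4 GiB

# Two topics, two partitions each, replication factor 1, both using A.
both_default = 2 * topic_retention_bound(A, partitions=2)

# Same layout, but one topic overrides retention.bytes with B.
with_override = (topic_retention_bound(A, partitions=2)
                 + topic_retention_bound(B, partitions=2))

print(both_default)   # 2 * 2 * A
print(with_override)  # 2 * A + 2 * B
```

With a replication factor above 1, the same helper multiplies the bound
accordingly, e.g. topic_retention_bound(B, partitions=2, replication_factor=3).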

Guozhang
