Posted to users@kafka.apache.org by Andrew Jorgensen <aj...@twitter.com.INVALID> on 2015/04/06 22:10:37 UTC

New broker ignoring retention

I can't find any, but does anyone know of any bugs in 0.8.1.1 that would
cause new brokers added to an existing cluster to ignore the per-topic
configuration for retention?

I had an 8-node cluster with a topic that has per-topic retention set
like: Configs:retention.ms=5400000. I attempted to add 2 more brokers to
the cluster today and transfer 3 of the existing partitions over to the new
nodes. In general, the existing nodes stay around 60% disk usage but the
new brokers start at around 60% and then fall over the course of about two
hours down to 0%. It is unclear to me why the new brokers are ignoring the
log retention time and seemingly keeping the logs indefinitely. Both the
existing brokers and the new ones have the same server.properties file,
which has the default log.retention.hours=24.

To add the new brokers I ran the reassign-partition tool and just moved 3
partitions from some other nodes to the new nodes. The reassignment seemed
to complete successfully. There are 30 partitions and 10 brokers, so each
broker has 3 partitions.
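
For readers following along, here is a rough sketch of the kind of reassignment plan described above. The topic name "my-topic", the broker id 9, and the helper name reassignment_plan are all hypothetical and not taken from the post; the JSON layout (a "version" field plus a "partitions" list of topic/partition/replicas entries) is, to my understanding, the one the partition reassignment tool reads, but check it against your Kafka version before relying on it.

```python
import json


def reassignment_plan(topic, moves):
    """Build a reassignment plan as a plain dict.

    `moves` maps a partition number to the list of broker ids that
    should hold its replicas after the move.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
            for partition, replicas in sorted(moves.items())
        ],
    }


# Hypothetical example: move partitions 0-2 of "my-topic" onto broker 9.
plan = reassignment_plan("my-topic", {0: [9], 1: [9], 2: [9]})
plan_json = json.dumps(plan, indent=2)
print(plan_json)
```

The printed JSON would be saved to a file and handed to the reassignment tool; the exact command line is left out here because it is not given in the post.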

Re: New broker ignoring retention

Posted by Todd S <to...@borked.ca>.
FWIW, we've had good luck changing the mtime.  No problems found.

On Mon, Apr 6, 2015 at 4:37 PM, Todd Palino <tp...@gmail.com> wrote:
> I answered this in IRC, but the issue is that retention depends on the
> modification time of the log segments on disk. When you copy a partition
> from one broker to another, the mtime of the log segments on the new broker
> will be now. That means the retention clock starts over again. This means
> that your retention for those partitions will grow to 2 times what it
> should be, before dropping off to what you want.
>
> We deal with this a lot as well, which is part of why we keep a lot of
> headroom on our brokers when it comes to disk space. We've considered
> trying to change the mtime of the files manually after a move (we have a
> separate time-series database of offsets for every partition, so we can
> tell what the mtime of the file "should" be within 60 seconds), but we
> haven't done any experimentation with this as to whether or not it would
> actually work without problems.
>
> -Todd

Re: New broker ignoring retention

Posted by Todd Palino <tp...@gmail.com>.
I answered this in IRC, but the issue is that retention depends on the
modification time of the log segments on disk. When you copy a partition
from one broker to another, the mtime of the log segments on the new broker
will be the time of the copy, not the time the data was originally written.
That means the retention clock starts over again, so the retention for those
partitions will grow to as much as 2 times what it should be before dropping
off to what you want.
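
To make the effect concrete, here is a minimal sketch of an mtime-based retention check. It is not Kafka's actual code; the function name segment_expired and the 80-minute copy time are made up for illustration. Only the retention.ms value of 5400000 (90 minutes) comes from the original post.

```python
RETENTION_MS = 5_400_000  # retention.ms from the original post (90 minutes)


def segment_expired(mtime_s, now_s, retention_ms=RETENTION_MS):
    """Return True once the file's mtime is older than the retention window."""
    return (now_s - mtime_s) * 1000 > retention_ms


MINUTE = 60.0
written_at = 0.0           # when the data was originally written (seconds)
copied_at = 80 * MINUTE    # hypothetical: segment copied to a new broker 80 minutes later

# On the old broker the mtime is the write time, so the segment
# becomes eligible for deletion shortly after 90 minutes.
old_broker_at_91 = segment_expired(written_at, 91 * MINUTE)

# On the new broker the mtime is the copy time, so the same data
# is kept until roughly 80 + 90 = 170 minutes after it was written.
new_broker_at_91 = segment_expired(copied_at, 91 * MINUTE)
new_broker_at_171 = segment_expired(copied_at, 171 * MINUTE)

print(old_broker_at_91, new_broker_at_91, new_broker_at_171)
```

The closer the copy happens to the end of the original retention window, the closer the effective retention gets to 2 times the configured value.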

We deal with this a lot as well, which is part of why we keep a lot of
headroom on our brokers when it comes to disk space. We've considered
trying to change the mtime of the files manually after a move (we have a
separate time-series database of offsets for every partition, so we can
tell what the mtime of the file "should" be within 60 seconds), but we
haven't done any experimentation with this as to whether or not it would
actually work without problems.
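
A minimal sketch of what changing the mtime manually could look like, using Python's os.utime. The helper name restore_mtime is hypothetical, and the "correct" timestamp is assumed to come from some external record such as the offsets database mentioned above. Whether the broker needs to be stopped first, and how the broker reacts, is not covered in this thread, so treat this as an illustration of the file operation only.

```python
import os


def restore_mtime(path, mtime_s):
    """Set the file's modification time to mtime_s (epoch seconds).

    The access time is left as it was.
    """
    st = os.stat(path)
    os.utime(path, (st.st_atime, mtime_s))
```

For example, restore_mtime("/path/to/segment.log", original_write_time) would be called once per moved segment file, where both the path and original_write_time are placeholders.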

-Todd

