Posted to users@kafka.apache.org by Steven Wu <st...@netflix.com.INVALID> on 2014/06/03 01:00:33 UTC

log retention and rollover

This might be a bit unusual. We have a topic for which we only need to keep
the last 5 minutes of messages, so that replay from the beginning is fast.

Although the server default for retention.ms (log.retention.minutes) is
expressed in minutes, the server default for segment.ms (log.roll.hours) is
ONLY expressed in hours. If I understand cleanup correctly, it can only
delete segment files that have already been rolled over. If true, the
minimum retention period can effectively be one hour.
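
The concern above can be illustrated with a small model. This is only a sketch under the assumption stated in the post (time-based cleanup skips the segment that is still being written to); the Segment class and the deletable function are hypothetical names for illustration, not Kafka code.

```python
from dataclasses import dataclass
from typing import List

MINUTE = 60_000
HOUR = 60 * MINUTE


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment file."""
    first_write_ms: int   # timestamp of the oldest message in the segment
    last_write_ms: int    # timestamp of the newest message in the segment
    active: bool          # True while the segment is still being appended to


def deletable(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Segment]:
    """Segments a time-based cleanup pass could remove.

    Assumption: only rolled (non-active) segments are eligible, and a segment
    expires once its newest message is older than retention_ms.
    """
    return [
        s for s in segments
        if not s.active and now_ms - s.last_write_ms > retention_ms
    ]


# 30 minutes in, with a 1-hour roll interval: the only segment is still active,
# so a 5-minute retention setting cannot delete anything.
before_roll = [Segment(0, 30 * MINUTE, active=True)]
print(len(deletable(before_roll, 30 * MINUTE, 5 * MINUTE)))  # 0

# 66 minutes in: the first segment rolled at the 1-hour mark and its newest
# message is now 6 minutes old, so it finally becomes eligible.
after_roll = [
    Segment(0, HOUR, active=False),
    Segment(HOUR, 66 * MINUTE, active=True),
]
expired = deletable(after_roll, 66 * MINUTE, 5 * MINUTE)
print(len(expired))  # 1

# Age in minutes of the oldest message at the moment it becomes deletable.
print((66 * MINUTE - expired[0].first_write_ms) // MINUTE)  # 66
```

In this model the oldest message survives for roughly the roll interval plus the retention period, which is why a 5-minute retention target is not reachable with an hour-granularity roll setting.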

Is there any particular reason for the different time units for retention
and roll? Can we add "log.roll.minutes"?

retention.ms (default: 7 days; server default property: log.retention.minutes)
    This configuration controls the maximum time we will retain a log before
    we will discard old log segments to free up space if we are using the
    "delete" retention policy. This represents an SLA on how soon consumers
    must read their data.

segment.ms (default: 7 days; server default property: log.roll.hours)
    This configuration controls the period of time after which Kafka will
    force the log to roll even if the segment file isn't full to ensure that
    retention can delete or compact old data.

Thanks,
Steven

Re: log retention and rollover

Posted by Steven Wu <st...@netflix.com.INVALID>.
Created KAFKA-1480 <https://issues.apache.org/jira/browse/KAFKA-1480>.
Thanks!

It's also generally better to have consistent/matching time units for these
two configs.


On Mon, Jun 2, 2014 at 4:22 PM, Guozhang Wang <wa...@gmail.com> wrote:

> Steven,
>
> We initially based the rolling criterion on hours to avoid rolling the log
> too frequently and, in turn, producing segment files that are too small. For
> your case it may be reasonable to set the rolling criterion in minutes.
> Could you file a JIRA?
>
> Guozhang

Re: log retention and rollover

Posted by Guozhang Wang <wa...@gmail.com>.
Steven,

We initially based the rolling criterion on hours to avoid rolling the log
too frequently and, in turn, producing segment files that are too small. For
your case it may be reasonable to set the rolling criterion in minutes.
Could you file a JIRA?
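
As a rough illustration of the trade-off described above, the following sketch estimates how many segment files a partition would hold for a given retention period and roll interval. It assumes purely time-based rolling and ignores size-based rolling; segment_count is a hypothetical helper, not part of Kafka.

```python
import math

MINUTE = 60_000
HOUR = 60 * MINUTE
DAY = 24 * HOUR


def segment_count(retention_ms: int, roll_ms: int) -> int:
    """Rough estimate of segment files per partition.

    Assumption: the rolled segments needed to cover the retention window,
    plus the one active segment that is still being written.
    """
    return math.ceil(retention_ms / roll_ms) + 1


# 7 days of retention, rolling every hour versus every minute.
print(segment_count(7 * DAY, HOUR))    # 169
print(segment_count(7 * DAY, MINUTE))  # 10081

# The 5-minute retention case from this thread, rolling every minute.
print(segment_count(5 * MINUTE, MINUTE))  # 6
```

With a long retention period a minute-level roll multiplies the file count, while for a topic that keeps only a few minutes of data the count stays small.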

Guozhang





-- 
-- Guozhang