Posted to users@kafka.apache.org by Simon Cooper <si...@featurespace.co.uk> on 2018/06/11 10:36:47 UTC

Details of segment deletion

Hi,

I've been trying to work out the details of exactly when Kafka log segments get deleted due to the retention period, so it would be helpful if someone could clarify the behaviour:


  *   Is a segment only deleted when all messages in that segment have 'timed out', or are messages deleted within each segment?
  *   Does the server artificially limit the messages returned to clients to those within the retention period, even if they still exist in the segment file?
  *   Does the segment deletion happen when a new segment is created, or is it done as a separate operation by the log cleaner?

Thanks for the help!
Simon Cooper

RE: Details of segment deletion

Posted by Simon Cooper <si...@featurespace.co.uk>.
Thanks, that's answered all my questions!

Simon

Re: Details of segment deletion

Posted by Ted Yu <yu...@gmail.com>.
Minor clarification (since "new segment" appeared twice):

bq. before a new one is deleted.

The 'new one' (in the last sentence) would become old when another segment
is created.

Cheers


Re: Details of segment deletion

Posted by Gwen Shapira <gw...@confluent.io>.
See below:

On Mon, Jun 11, 2018 at 3:36 AM, Simon Cooper <
simon.cooper@featurespace.co.uk> wrote:

> Hi,
>
> I've been trying to work out the details of exactly when Kafka log segments
> get deleted due to the retention period, so it would be helpful if someone
> could clarify the behaviour:
>
>
>   *   Is a segment only deleted when all messages in that segment have
> 'timed out', or are messages deleted within each segment?
>

Kafka only deletes entire segments (except for compacted topics, which are
a different story)
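
To make the whole-segment rule concrete, here is a minimal Python sketch of a
time-based retention check. It is an illustration only, not Kafka's code: the
Segment class and the expired_segments function are hypothetical names, and the
sketch assumes that a segment becomes eligible once its newest message is older
than the retention period and that the check walks segments from oldest to
newest.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment (not a Kafka class)."""
    base_offset: int
    timestamps_ms: List[int]  # one timestamp per message, in offset order

    @property
    def largest_timestamp_ms(self) -> int:
        return max(self.timestamps_ms)


def expired_segments(segments: List[Segment], now_ms: int,
                     retention_ms: int) -> List[Segment]:
    """Return the closed segments that a retention pass would delete.

    Assumptions of this sketch: segments are ordered oldest first, the
    last one is the active segment and is never returned, and the scan
    stops at the first segment that is still within retention.
    """
    expired = []
    for seg in segments[:-1]:  # skip the active segment
        if now_ms - seg.largest_timestamp_ms > retention_ms:
            expired.append(seg)
        else:
            break
    return expired


segments = [
    Segment(base_offset=0, timestamps_ms=[1000, 2000]),
    Segment(base_offset=2, timestamps_ms=[3000, 6000]),
    Segment(base_offset=4, timestamps_ms=[9500]),  # active segment
]
expired = expired_segments(segments, now_ms=10000, retention_ms=5000)
expired_base_offsets = [seg.base_offset for seg in expired]
```

In this example only the first segment is selected. The message with timestamp
3000 is also past the retention period, but it stays because it shares a
segment with a newer message.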



>   *   Does the server artificially limit the messages returned to clients
> to those within the retention period, even if they still exist in the
> segment file?
>

Older messages can still be read if the segment hasn't been deleted yet. You
can check the "beginning of log" offset JMX metric to see the oldest offset
available to consumers on each partition.
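
As a rough sketch of what this means for consumers, the following Python
snippet models the segments that remain on disk and reports which offsets are
still readable even though they are older than the retention period. The
function names are hypothetical, and the sketch assumes the oldest available
offset equals the base offset of the oldest remaining segment (that is, nobody
has explicitly deleted records from the partition).

```python
from typing import List, Tuple

# Each remaining segment is (base_offset, [message timestamps in ms]),
# ordered oldest first. This is a hypothetical model, not a Kafka type.
SegmentData = Tuple[int, List[int]]


def log_start_offset(segments: List[SegmentData]) -> int:
    """Oldest offset a consumer could still fetch in this model."""
    return segments[0][0]


def readable_past_retention(segments: List[SegmentData], now_ms: int,
                            retention_ms: int) -> List[int]:
    """Offsets that are older than retention but still present on disk."""
    offsets = []
    for base_offset, timestamps in segments:
        for position, timestamp in enumerate(timestamps):
            if now_ms - timestamp > retention_ms:
                offsets.append(base_offset + position)
    return offsets


remaining = [
    (2, [3000, 6000]),  # closed segment, newest message still in retention
    (4, [9500]),        # active segment
]
start = log_start_offset(remaining)
stale_but_readable = readable_past_retention(
    remaining, now_ms=10000, retention_ms=5000)
```

Here the oldest available offset is 2, and the message at that offset can
still be fetched although its timestamp is outside the retention window.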


>   *   Does the segment deletion happen when a new segment is created, or
> is it done as a separate operation by the log cleaner?
>

It is a separate operation by the log cleaner, but note that the active
segment is never deleted, so sometimes you are waiting for a new segment to
get created before a new one is deleted.
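
The effect of never deleting the active segment can be sketched in Python as
follows. This is a simplified model with a hypothetical retention_pass
function, not Kafka's implementation. It assumes a periodic pass that removes
expired segments from the oldest end and always leaves the last (active)
segment alone. On a real broker, settings such as log.segment.bytes and
log.roll.ms influence when a new segment is rolled, and
log.retention.check.interval.ms influences how often the check runs.

```python
from typing import List, Tuple


def retention_pass(segments: List[List[int]], now_ms: int,
                   retention_ms: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (deleted, remaining) for one retention pass.

    Each segment is a list of message timestamps in ms, oldest segment
    first; the last segment is the active one and is never deleted.
    """
    deleted = []
    remaining = list(segments)
    while len(remaining) > 1 and now_ms - max(remaining[0]) > retention_ms:
        deleted.append(remaining.pop(0))
    return deleted, remaining


# A quiet partition with a single (active) segment: nothing can be deleted,
# even though every message in it is past the retention period.
segments = [[1000, 2000]]
deleted_before_roll, remaining_before_roll = retention_pass(
    segments, now_ms=10000, retention_ms=5000)

# A new message arrives and a new segment is rolled. The previous segment
# is now closed, so the next pass can delete it.
segments.append([10000])
deleted_after_roll, remaining_after_roll = retention_pass(
    segments, now_ms=10000, retention_ms=5000)
```

Before the roll nothing is deleted; after the roll the old segment goes away
and only the new active segment remains.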


>
> Thanks for the help!
> Simon Cooper
>



-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
<http://www.confluent.io/blog>