Posted to dev@kafka.apache.org by 东方甲乙 <25...@qq.com> on 2016/10/16 06:14:40 UTC

Re: [DISCUSS] KIP-68 Add a consumed log retention before log retention

Hi Renu,
    Sorry for the delay. Here are my comments:
1. Do you mean the per-topic configuration? We also support a per-topic consumed retention configuration.


2. A timeout for a consumer group's committed offsets has been supported since 0.9. The consumed retention only looks at the current committed offsets.
Log retention will not stop even after the consumer disappears.


3. You can set it to a very small value, for example 1 ms, so that the log is deleted right after it has been consumed.
Setting it to 0 currently throws an error.


4. The log retention timeout will not change; it depends only on the current time minus the last modified time of the log file.
When a new consumer arrives, the consumed retention process will find the new minimum committed offset.
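
A rough Python sketch of the rule described in points 3 and 4. It is an illustration under stated assumptions, not the KIP's implementation. The `Segment` class, `is_deletable`, and the parameters `consumed_retention_ms` and `force_retention_ms` are hypothetical names. It also assumes that a committed offset is the next offset to be read, so a segment counts as fully consumed when its last offset is below the minimum committed offset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    last_offset: int        # offset of the last message in the segment
    last_modified_ms: int   # last modified time of the segment file


def is_deletable(segment: Segment, now_ms: int,
                 min_committed_offset: Optional[int],
                 consumed_retention_ms: int,
                 force_retention_ms: int) -> bool:
    # Age is "now" minus the file's last modified time, as in point 4.
    age_ms = now_ms - segment.last_modified_ms

    # Force retention depends only on age, so it never stalls on consumers.
    if age_ms > force_retention_ms:
        return True

    # Without any committed offset there is nothing to compare against.
    if min_committed_offset is None:
        return False

    # Consumed retention: every group has moved past the segment
    # and the (shorter) consumed retention timeout has expired.
    fully_consumed = segment.last_offset < min_committed_offset
    return fully_consumed and age_ms > consumed_retention_ms
```

With `consumed_retention_ms=1`, a fully consumed segment becomes deletable almost immediately, which is the behaviour point 3 describes.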


Thanks,
David


------------------ Original Message ------------------
From: "Renu Tewari";<te...@gmail.com>;
Sent: Tuesday, October 11, 2016, 4:55 AM
To: "dev"<de...@kafka.apache.org>; 

Subject: Re: [DISCUSS] KIP-68 Add a consumed log retention before log retention



Hi David
  This is a very timely KIP given the number of use cases in the streams
processing pipeline that need consumed log retention management.

I just wanted to make sure that some questions Becket and Dong asked are
covered in the KIP.

1. How is the per-topic configuration set up so that the broker knows which
consumer groups are "subscribed" to this topic and will have their committed
offsets tracked? Can we have more details on how this will be tracked
dynamically as consumers come and go?
2. Is there a timeout to determine if a consumer group has stopped
committing offsets to topic partitions that it had earlier consumed? Or
will the consumed log retention track each known consumer group's
committed offset and stop all cleaning if a consumer disappears after
consuming? This relates to Dong's earlier question.
3. Can the log.retention value be set to 0 to indicate the log is set to be
cleaned to the min committed offset immediately after it has been consumed?

4. What guarantee is given on when the consumed log will eventually be
cleaned? If the log.retention timeout is enabled for a consumed offset and
a new consumer starts consuming from the beginning, is the min committed
offset value changed and the timer based on the log.retention timeout
restarted?

 This all relates to active and inactive consumers: if the set changes
dynamically, how does the consumed log retention actually make
progress?
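
The concern can be illustrated with a small Python sketch. It is only an illustration: `min_committed_offset`, the dictionary layout, and the group names are hypothetical and not Kafka APIs. It assumes the consumed retention tracks the minimum committed offset across every group that has committed to a partition, as discussed in this thread.

```python
from typing import Dict, Optional


def min_committed_offset(committed: Dict[str, int]) -> Optional[int]:
    """Smallest committed offset across groups, or None if no group has committed."""
    return min(committed.values()) if committed else None


# Two long-running groups on one partition (hypothetical names).
committed = {"billing": 500, "audit": 420}
before = min_committed_offset(committed)

# A new group starts from the beginning and commits a low offset,
# which pulls the minimum back down.
committed["debug-console"] = 10
after_join = min_committed_offset(committed)

# The new group never commits again while the others advance:
# the minimum stays pinned, so the consumed retention makes no progress.
committed["billing"] = 900
committed["audit"] = 880
stalled = min_committed_offset(committed)
```

In this sketch the minimum only moves forward again once the idle group's offset is advanced or removed, which is why a timeout for committed offsets matters to the question above.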

regards
renu


On Mon, Oct 10, 2016 at 1:05 AM, Dong Lin <li...@gmail.com> wrote:

> Hey David,
>
> Thanks for the reply. Please see my comments inline.
>
> On Mon, Oct 10, 2016 at 12:40 AM, Pengwei (L) <pe...@huawei.com>
> wrote:
>
> > Hi Dong
> >    Thanks for the questions:
> >
> > 1.  Currently we don't distinguish between inactive and active groups,
> > because in some cases an inactive group may become active again and
> > resume from its previous committed offset.
> >
> > So the consumed retention will not delete a log segment if some groups
> > consume but do not commit, but the log segment can still be deleted by
> > the force retention.
> >
>
> So in the example I provided, the consumed log retention will be
> effectively disabled, right? This seems to be a real problem in operation
> -- we don't want log retention to be unintentionally disabled simply
> because someone starts a tool to consume from that topic. Either this KIP
> should provide a way to handle this, or there should be a way for operators
> to be aware of such cases and be able to re-enable consumed log retention
> for the topic. What do you think?
>
>
>
> > 2.  These configs determine the expiry time for the consumed retention,
> > like the parameters of the force retention (log.retention.hours,
> > log.retention.minutes, log.retention.ms). For example, if users want to
> > keep the log for 3 days, then after 3 days Kafka will delete the log
> > segments that have been consumed by all the consumer groups. The log
> > retention thread needs these parameters.
> >
> It makes sense to have configs such as log.retention.ms -- it is used to
> make data available for up to a configured amount of time before it is
> deleted. My question is what the use-case is for keeping the log available
> for another e.g. 3 days after it has been consumed by all consumer groups.
> The purpose of this KIP is to allow the log to be deleted as soon as all
> interested consumer groups have consumed it. Can you provide a use-case for
> keeping the log available for a longer time after it has been consumed by
> all groups?
>
>
> >
> > Thanks,
> > David
> >
> >
> > > Hey David,
> > >
> > > Thanks for the KIP. Can you help with the following two questions:
> > >
> > > 1) If someone starts a consumer (e.g. kafka-console-consumer) to consume
> > > a topic for debug/validation purposes, a random consumer group may be
> > > created and an offset may be committed for this consumer group. If no
> > > offset commit is made for this consumer group in the future, will this
> > > effectively disable consumed log retention for this topic? In other
> > > words, how does this KIP distinguish active consumer groups from
> > > inactive ones?
> > >
> > > 2) Why do we need new configs such as
> > > log.retention.commitoffset.hours? Can we simply delete log segments if
> > > consumed log retention is enabled for this topic and all consumer
> > > groups have consumed messages in the log segment?
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > >
> > > On Sat, Oct 8, 2016 at 2:15 AM, Pengwei (L) <pe...@huawei.com>
> > > wrote:
> > >
> > > > Hi Becket,
> > > >
> > > >   Thanks for the feedback:
> > > > 1.  We use the simple consumer API to query the committed offsets, so
> > > > we don't need to specify the consumer group.
> > > > 2.  Every broker uses the simple consumer API (OffsetFetchKey) to query
> > > > the committed offsets in the log retention process. The client may or
> > > > may not commit offsets.
> > > > 3.  There is no need to distinguish follower brokers from leader
> > > > brokers; every broker can query.
> > > > 4.  We don't need to change the protocols; we mainly change the log
> > > > retention process in the log manager.
> > > >
> > > >   One issue is that querying the min offset takes O(partitions * groups)
> > > > time. An alternative is to build an internal topic that stores every
> > > > partition's min offset, which reduces the lookup to O(1).
> > > > I will update the wiki with more details.
> > > >
> > > > Thanks,
> > > > David
> > > >
> > > >
> > > > > Hi Pengwei,
> > > > >
> > > > > Thanks for the KIP proposal. It is a very useful KIP. At a high level,
> > > > > the proposed behavior looks reasonable to me.
> > > > >
> > > > > However, it seems that some of the details are not mentioned in the
> > > > > KIP. For example,
> > > > >
> > > > > 1. How will the expected consumer group be specified? Is it through a
> > > > > per-topic dynamic configuration?
> > > > > 2. How do the brokers detect the consumer offsets? Is it required for
> > > > > a consumer to commit offsets?
> > > > > 3. How do all the replicas know about the committed offsets? e.g. 1)
> > > > > non-coordinator brokers which do not have the committed offsets, 2)
> > > > > follower brokers which do not have consumers directly consuming from
> > > > > them.
> > > > > 4. Are there any other changes that need to be made (e.g. new
> > > > > protocols) in addition to the configuration change?
> > > > >
> > > > > It would be great if you can update the wiki to have more details.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Wed, Sep 7, 2016 at 2:26 AM, Pengwei (L) <pengwei.li@huawei.com>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >    I have made a KIP to enhance the log retention, details as follows:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-68+Add+a+consumed+log+retention+before+log+retention
> > > > > >    I am now starting a discussion thread for this KIP and look
> > > > > > forward to the feedback.
> > > > > >
> > > > > > Thanks,
> > > > > > David
> > > > > >
> > > > > >
> >
> >
>