You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Mayuresh Gharat <gh...@gmail.com> on 2015/07/25 03:34:44 UTC

Query about LogConfigs

I was thinking why we have 3 different configs for the same property (log
retention)  :

"log.retention.ms"
"log.retention.minutes"
"log.retention.hours"

Why don't we only use the Milliseconds?
There are other properties as well like log Jitter, LogRollTime which raise
the same question in my mind.

The reason I am asking this is that a user might misconfigure them like
setting the log.retention.ms = 1 and log.retention.hours = 1.

-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125

Re: Query about LogConfigs

Posted by Mayuresh Gharat <gh...@gmail.com>.
Cool. Yeah +1 on make it consistent.

Thanks,

Mayuresh

On Fri, Jul 24, 2015 at 7:29 PM, Grant Henke <gh...@cloudera.com> wrote:

> They take precedence in order of granularity (ms > minutes > hours). Below
> is the relevant snippet of code.
>
> I vote for keeping all time related configs in milliseconds and deprecating
> the others. It gives you the most control in a single config. The minutes
> and hours ones offer no new functionality that can't be achieved with
> the milliseconds one.
>
>  private def getLogRetentionTimeMillis(): Long = {
>     val millisInMinute = 60L * 1000L
>     val millisInHour = 60L * millisInMinute
>
>     if(props.containsKey("log.retention.ms")){
>        props.getIntInRange("log.retention.ms", (1, Int.MaxValue))
>     }
>     else if(props.containsKey("log.retention.minutes")){
>        millisInMinute * props.getIntInRange("log.retention.minutes", (1,
> Int.MaxValue))
>     }
>     else {
>        millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1,
> Int.MaxValue))
>     }
>   }
>
> On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat <
> gharatmayuresh15@gmail.com
> > wrote:
>
> > Hmm.
> > This creates confusion for people who might be new to kafka or have never
> > used those properties.
> > Does deprecating the older ones make sense here?
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira <gs...@cloudera.com>
> > wrote:
> >
> > > Agree. All I found on this was KAFKA-1325, which is more about
> > > inconsistency than real use-case.
> > >
> > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out
> > now
> > > :)
> > >
> > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar
> > > <aa...@linkedin.com.invalid> wrote:
> > > > Is there actually a use case where we need "log.retention.ms"? In
> most
> > > > cases, people would want to retain their logs for at least a few
> > minutes
> > > > I'd think.
> > > >
> > > > Aditya
> > > >
> > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gshapira@cloudera.com
> >
> > > wrote:
> > > >
> > > >> Backward compatibility, I think.
> > > >>
> > > >> At least the "ms" one is fairly new, and I think we left the others
> to
> > > >> avoid break configuration during upgrade.
> > > >>
> > > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> > > >> <gh...@gmail.com> wrote:
> > > >> > I was thinking why we have 3 different configs for the same
> property
> > > (log
> > > >> > retention)  :
> > > >> >
> > > >> > "log.retention.ms"
> > > >> > "log.retention.minutes"
> > > >> > "log.retention.hours"
> > > >> >
> > > >> > Why don't we only use the Milliseconds?
> > > >> > There are other properties as well like log Jitter, LogRollTime
> > which
> > > >> raise
> > > >> > the same question in my mind.
> > > >> >
> > > >> > The reason I am asking this is that a user might misconfigure them
> > > like
> > > >> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> > > >> >
> > > >> > --
> > > >> > -Regards,
> > > >> > Mayuresh R. Gharat
> > > >> > (862) 250-7125
> > > >>
> > >
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>
>
>
> --
> Grant Henke
> Solutions Consultant | Cloudera
> ghenke@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125

Re: Query about LogConfigs

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
This also came up recently when Geoff and I were discussing KAFKA-2257,
which needs to add a timeout option to a tool that KAFKA-2276 is
introducing. For that timeout, the question is whether we should strive for
consistency (use ms), the unit that is most likely convenient (use s), or
finest granularity (use ms).

In the case of retention, I like having at least minutes and hours as
options. Admittedly it's not ideal to have two settings for the same
config, but both units are reasonable and having both makes it convenient
for the user. If you only support, for example, the minutes unit, having to
mentally divide by 60 to do the conversion is annoying. If you only support
hours, you might not provide fine enough granularity for some use cases. I
think the only real candidate for removal would be log.retention.ms, but it
seems that was added for consistency.

I actually think the ideal solution is to make ConfigDef.Types contain a
DURATION type. It could either make it so that adding a setting foo of type
TIME allows you to specify foo.<unit> = duration where unit is
ms,seconds,minutes,hours, or just parse strings with the appropriate
suffixes (ms, s, m, hr) into int + TimeUnit. This gives users flexibility
to use whatever is most convenient and standardizes on supporting any
format for any duration config. In practice, the first version is backwards
compatible and probably feasible but annoying to add to the
ConfigDef/AbstractConfig classes, and the second option is not backwards
compatible.

-Ewen



On Fri, Jul 24, 2015 at 7:29 PM, Grant Henke <gh...@cloudera.com> wrote:

> They take precedence in order of granularity (ms > minutes > hours). Below
> is the relevant snippet of code.
>
> I vote for keeping all time related configs in milliseconds and deprecating
> the others. It gives you the most control in a single config. The minutes
> and hours ones offer no new functionality that can't be achieved with
> the milliseconds one.
>
>  private def getLogRetentionTimeMillis(): Long = {
>     val millisInMinute = 60L * 1000L
>     val millisInHour = 60L * millisInMinute
>
>     if(props.containsKey("log.retention.ms")){
>        props.getIntInRange("log.retention.ms", (1, Int.MaxValue))
>     }
>     else if(props.containsKey("log.retention.minutes")){
>        millisInMinute * props.getIntInRange("log.retention.minutes", (1,
> Int.MaxValue))
>     }
>     else {
>        millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1,
> Int.MaxValue))
>     }
>   }
>
> On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat <
> gharatmayuresh15@gmail.com
> > wrote:
>
> > Hmm.
> > This creates confusion for people who might be new to kafka or have never
> > used those properties.
> > Does deprecating the older ones make sense here?
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira <gs...@cloudera.com>
> > wrote:
> >
> > > Agree. All I found on this was KAFKA-1325, which is more about
> > > inconsistency than real use-case.
> > >
> > > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out
> > now
> > > :)
> > >
> > > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar
> > > <aa...@linkedin.com.invalid> wrote:
> > > > Is there actually a use case where we need "log.retention.ms"? In
> most
> > > > cases, people would want to retain their logs for at least a few
> > minutes
> > > > I'd think.
> > > >
> > > > Aditya
> > > >
> > > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gshapira@cloudera.com
> >
> > > wrote:
> > > >
> > > >> Backward compatibility, I think.
> > > >>
> > > >> At least the "ms" one is fairly new, and I think we left the others
> to
> > > >> avoid break configuration during upgrade.
> > > >>
> > > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> > > >> <gh...@gmail.com> wrote:
> > > >> > I was thinking why we have 3 different configs for the same
> property
> > > (log
> > > >> > retention)  :
> > > >> >
> > > >> > "log.retention.ms"
> > > >> > "log.retention.minutes"
> > > >> > "log.retention.hours"
> > > >> >
> > > >> > Why don't we only use the Milliseconds?
> > > >> > There are other properties as well like log Jitter, LogRollTime
> > which
> > > >> raise
> > > >> > the same question in my mind.
> > > >> >
> > > >> > The reason I am asking this is that a user might misconfigure them
> > > like
> > > >> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> > > >> >
> > > >> > --
> > > >> > -Regards,
> > > >> > Mayuresh R. Gharat
> > > >> > (862) 250-7125
> > > >>
> > >
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>
>
>
> --
> Grant Henke
> Solutions Consultant | Cloudera
> ghenke@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
Thanks,
Ewen

Re: Query about LogConfigs

Posted by Grant Henke <gh...@cloudera.com>.
They take precedence in order of granularity (ms > minutes > hours). Below
is the relevant snippet of code.

I vote for keeping all time related configs in milliseconds and deprecating
the others. It gives you the most control in a single config. The minutes
and hours ones offer no new functionality that can't be achieved with
the milliseconds one.

 private def getLogRetentionTimeMillis(): Long = {
    val millisInMinute = 60L * 1000L
    val millisInHour = 60L * millisInMinute

    if(props.containsKey("log.retention.ms")){
       props.getIntInRange("log.retention.ms", (1, Int.MaxValue))
    }
    else if(props.containsKey("log.retention.minutes")){
       millisInMinute * props.getIntInRange("log.retention.minutes", (1,
Int.MaxValue))
    }
    else {
       millisInHour * props.getIntInRange("log.retention.hours", 24*7, (1,
Int.MaxValue))
    }
  }

On Fri, Jul 24, 2015 at 9:18 PM, Mayuresh Gharat <gharatmayuresh15@gmail.com
> wrote:

> Hmm.
> This creates confusion for people who might be new to kafka or have never
> used those properties.
> Does deprecating the older ones make sense here?
>
> Thanks,
>
> Mayuresh
>
>
> On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira <gs...@cloudera.com>
> wrote:
>
> > Agree. All I found on this was KAFKA-1325, which is more about
> > inconsistency than real use-case.
> >
> > Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out
> now
> > :)
> >
> > On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar
> > <aa...@linkedin.com.invalid> wrote:
> > > Is there actually a use case where we need "log.retention.ms"? In most
> > > cases, people would want to retain their logs for at least a few
> minutes
> > > I'd think.
> > >
> > > Aditya
> > >
> > > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gs...@cloudera.com>
> > wrote:
> > >
> > >> Backward compatibility, I think.
> > >>
> > >> At least the "ms" one is fairly new, and I think we left the others to
> > >> avoid break configuration during upgrade.
> > >>
> > >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> > >> <gh...@gmail.com> wrote:
> > >> > I was thinking why we have 3 different configs for the same property
> > (log
> > >> > retention)  :
> > >> >
> > >> > "log.retention.ms"
> > >> > "log.retention.minutes"
> > >> > "log.retention.hours"
> > >> >
> > >> > Why don't we only use the Milliseconds?
> > >> > There are other properties as well like log Jitter, LogRollTime
> which
> > >> raise
> > >> > the same question in my mind.
> > >> >
> > >> > The reason I am asking this is that a user might misconfigure them
> > like
> > >> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> > >> >
> > >> > --
> > >> > -Regards,
> > >> > Mayuresh R. Gharat
> > >> > (862) 250-7125
> > >>
> >
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
Grant Henke
Solutions Consultant | Cloudera
ghenke@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke

Re: Query about LogConfigs

Posted by Mayuresh Gharat <gh...@gmail.com>.
Hmm.
This creates confusion for people who might be new to kafka or have never
used those properties.
Does deprecating the older ones make sense here?

Thanks,

Mayuresh


On Fri, Jul 24, 2015 at 7:15 PM, Gwen Shapira <gs...@cloudera.com> wrote:

> Agree. All I found on this was KAFKA-1325, which is more about
> inconsistency than real use-case.
>
> Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out now
> :)
>
> On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar
> <aa...@linkedin.com.invalid> wrote:
> > Is there actually a use case where we need "log.retention.ms"? In most
> > cases, people would want to retain their logs for at least a few minutes
> > I'd think.
> >
> > Aditya
> >
> > On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gs...@cloudera.com>
> wrote:
> >
> >> Backward compatibility, I think.
> >>
> >> At least the "ms" one is fairly new, and I think we left the others to
> >> avoid break configuration during upgrade.
> >>
> >> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> >> <gh...@gmail.com> wrote:
> >> > I was thinking why we have 3 different configs for the same property
> (log
> >> > retention)  :
> >> >
> >> > "log.retention.ms"
> >> > "log.retention.minutes"
> >> > "log.retention.hours"
> >> >
> >> > Why don't we only use the Milliseconds?
> >> > There are other properties as well like log Jitter, LogRollTime which
> >> raise
> >> > the same question in my mind.
> >> >
> >> > The reason I am asking this is that a user might misconfigure them
> like
> >> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> >> >
> >> > --
> >> > -Regards,
> >> > Mayuresh R. Gharat
> >> > (862) 250-7125
> >>
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125

Re: Query about LogConfigs

Posted by Gwen Shapira <gs...@cloudera.com>.
Agree. All I found on this was KAFKA-1325, which is more about
inconsistency than real use-case.

Anyway, I'd argue that if we added it in 0.8.2.0, we can't take it out now :)

On Fri, Jul 24, 2015 at 7:04 PM, Aditya Auradkar
<aa...@linkedin.com.invalid> wrote:
> Is there actually a use case where we need "log.retention.ms"? In most
> cases, people would want to retain their logs for at least a few minutes
> I'd think.
>
> Aditya
>
> On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gs...@cloudera.com> wrote:
>
>> Backward compatibility, I think.
>>
>> At least the "ms" one is fairly new, and I think we left the others to
>> avoid break configuration during upgrade.
>>
>> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
>> <gh...@gmail.com> wrote:
>> > I was thinking why we have 3 different configs for the same property (log
>> > retention)  :
>> >
>> > "log.retention.ms"
>> > "log.retention.minutes"
>> > "log.retention.hours"
>> >
>> > Why don't we only use the Milliseconds?
>> > There are other properties as well like log Jitter, LogRollTime which
>> raise
>> > the same question in my mind.
>> >
>> > The reason I am asking this is that a user might misconfigure them like
>> > setting the log.retention.ms = 1 and log.retention.hours = 1.
>> >
>> > --
>> > -Regards,
>> > Mayuresh R. Gharat
>> > (862) 250-7125
>>

Re: Query about LogConfigs

Posted by Aditya Auradkar <aa...@linkedin.com.INVALID>.
Is there actually a use case where we need "log.retention.ms"? In most
cases, people would want to retain their logs for at least a few minutes
I'd think.

Aditya

On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gs...@cloudera.com> wrote:

> Backward compatibility, I think.
>
> At least the "ms" one is fairly new, and I think we left the others to
> avoid break configuration during upgrade.
>
> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> <gh...@gmail.com> wrote:
> > I was thinking why we have 3 different configs for the same property (log
> > retention)  :
> >
> > "log.retention.ms"
> > "log.retention.minutes"
> > "log.retention.hours"
> >
> > Why don't we only use the Milliseconds?
> > There are other properties as well like log Jitter, LogRollTime which
> raise
> > the same question in my mind.
> >
> > The reason I am asking this is that a user might misconfigure them like
> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
>

Re: Query about LogConfigs

Posted by Mayuresh Gharat <gh...@gmail.com>.
Oh ok.
Is there a plan that we should deprecate the older ones. This is because as
I said some users might misconfigure it like by using both the
log.retention.ms" and "log.retention.minutes" by mistake.
We can probably set a warning that only one of this should be used.

Also looking at the KafkaConfig, it looks like the LogRetentionTimeHours is
a required config while the LogRetentionTimeMillis and LogRetentionTimeMinutes
are not.
Also the doc says this :

val LogRetentionTimeMillisDoc = "The number of milliseconds to keep a log
file before deleting it (in milliseconds)"
val LogRetentionTimeMinsDoc = "The number of minutes to keep a log file
before deleting it (in minutes), secondary to " +
LogRetentionTimeMillisProp + " property"
val LogRetentionTimeHoursDoc = "The number of hours to keep a log file
before deleting it (in hours), tertiary to " + LogRetentionTimeMillisProp +
" property"


So if my understanding is correct, LogRetentionTimeHours takes precedence
over all of the other. Am I right?

Thanks,

Mayuresh


On Fri, Jul 24, 2015 at 6:49 PM, Gwen Shapira <gs...@cloudera.com> wrote:

> Backward compatibility, I think.
>
> At least the "ms" one is fairly new, and I think we left the others to
> avoid break configuration during upgrade.
>
> On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
> <gh...@gmail.com> wrote:
> > I was thinking why we have 3 different configs for the same property (log
> > retention)  :
> >
> > "log.retention.ms"
> > "log.retention.minutes"
> > "log.retention.hours"
> >
> > Why don't we only use the Milliseconds?
> > There are other properties as well like log Jitter, LogRollTime which
> raise
> > the same question in my mind.
> >
> > The reason I am asking this is that a user might misconfigure them like
> > setting the log.retention.ms = 1 and log.retention.hours = 1.
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125

Re: Query about LogConfigs

Posted by Gwen Shapira <gs...@cloudera.com>.
Backward compatibility, I think.

At least the "ms" one is fairly new, and I think we left the others to
avoid break configuration during upgrade.

On Fri, Jul 24, 2015 at 6:34 PM, Mayuresh Gharat
<gh...@gmail.com> wrote:
> I was thinking why we have 3 different configs for the same property (log
> retention)  :
>
> "log.retention.ms"
> "log.retention.minutes"
> "log.retention.hours"
>
> Why don't we only use the Milliseconds?
> There are other properties as well like log Jitter, LogRollTime which raise
> the same question in my mind.
>
> The reason I am asking this is that a user might misconfigure them like
> setting the log.retention.ms = 1 and log.retention.hours = 1.
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125