Posted to dev@mahout.apache.org by Dan Filimon <da...@gmail.com> on 2013/01/02 13:05:35 UTC

Should algorithms log progress?

I'd like to output some intermediate information to check the progress
of the algorithms I'm working on with Ted, StreamingKMeans and
BallKMeans.

I can add a Logger as a member variable and log the progress there, or
I can just do printfs (which seems really dirty).

What's the best way of doing this? I'd prefer having the output, at
least when evaluating the quality of the code, but on the other hand
it'd be nice to disable logging when running in production.

I'm thinking of adding a Logger progressLogger and a boolean
enableLogging variable to the algorithm classes to control this.

But then, it'd also be nice to return the log as a stream when the
algorithm is done. So, there'd be a getLog() method.

Any advice?

Re: Should algorithms log progress?

Posted by Dan Filimon <da...@gmail.com>.

Yep, this is the way to go. :)
Thanks!

On Wed, Jan 2, 2013 at 5:14 PM, Ted Dunning <te...@gmail.com> wrote:
> Debug-level logging is the better option.
>
> We should also consider replacing our dependency on slf4j-jcl with
> something like slf4j-log4j or logback to make this kind of configuration
> easier.
>
> Our libraries should generally not impose a logging framework.  Our
> top-level may do so since it is an application.

Re: Should algorithms log progress?

Posted by Ted Dunning <te...@gmail.com>.

Debug-level logging is the better option.

We should also consider replacing our dependency on slf4j-jcl with
something like slf4j-log4j or logback to make this kind of configuration
easier.

Our libraries should generally not impose a logging framework.  Our
top-level may do so since it is an application.


On Wed, Jan 2, 2013 at 6:47 AM, Sean Owen <sr...@gmail.com> wrote:

> Ideally, make them debug-level log statements and then enable/disable
> them at runtime as needed. In practice that means configuring the JDK
> logger, since that's what SLF4J is using underneath (currently). I
> always have to fiddle a bit to figure out how to do it.
>
> Or: put in whatever you want for debugging and delete it / comment it
> out before committing.

Re: Should algorithms log progress?

Posted by Sean Owen <sr...@gmail.com>.

Ideally, make them debug-level log statements and then enable/disable
them at runtime as needed. In practice that means configuring the JDK
logger, since that's what SLF4J is using underneath (currently). I
always have to fiddle a bit to figure out how to do it.

Or: put in whatever you want for debugging and delete it / comment it
out before committing.
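
A minimal sketch of the debug-level approach, using the JDK logger
(java.util.logging) directly, since that is what SLF4J delegates to in
this setup. The class name, logger name, and the cluster() loop are
hypothetical stand-ins, not Mahout code. FINE is the JDK level that
corresponds to debug. The same call sites stay in the code, and only the
configured level decides whether anything is emitted:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class ProgressLoggingSketch {
    // Hypothetical logger name, chosen for illustration only.
    private static final Logger LOG = Logger.getLogger("sketch.StreamingKMeans");

    /** Collects published messages in memory so the effect of the level is visible. */
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Stand-in for an algorithm loop that reports progress at debug (FINE) level. */
    static void cluster(int numPoints) {
        for (int i = 1; i <= numPoints; i++) {
            if (i % 1000 == 0) {
                // No enableLogging flag: the logger drops this call when FINE is disabled.
                LOG.log(Level.FINE, "processed {0} points", i);
            }
        }
    }

    /** Runs the loop with the given logger level and returns how many messages got through. */
    static int countProgressMessages(Level level, int numPoints) {
        CollectingHandler handler = new CollectingHandler();
        LOG.setUseParentHandlers(false); // keep the demo output off the console
        LOG.addHandler(handler);
        LOG.setLevel(level);             // the runtime switch
        try {
            cluster(numPoints);
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.messages.size();
    }

    public static void main(String[] args) {
        System.out.println("INFO: " + countProgressMessages(Level.INFO, 5000));
        System.out.println("FINE: " + countProgressMessages(Level.FINE, 5000));
    }
}
```

In a real deployment the level would more likely come from a
logging.properties file than from a setLevel() call, but the effect on
the call sites is the same.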

On Wed, Jan 2, 2013 at 2:43 PM, Dan Filimon <da...@gmail.com> wrote:
> What I'm looking for is a way of collecting runtime stats without
> having two versions of the code (one that prints it out and one that
> doesn't).
> How could I do this without logging?

Re: Should algorithms log progress?

Posted by Dan Filimon <da...@gmail.com>.

What I'm looking for is a way of collecting runtime stats without
having two versions of the code (one that prints it out and one that
doesn't).
How could I do this without logging?

On Wed, Jan 2, 2013 at 4:40 PM, Ted Dunning <te...@gmail.com> wrote:
> The normal answer is that we use slf4j.  If you log at debug or info level,
> then your conditionals shouldn't be necessary.
>
> Returning the log as a stream is pretty unusual, but some high performance
> systems can't handle the overhead of even something like slf4j.  Typically,
> this is because these systems need to make the decision about logging level
> somewhat after the fact ... i.e. they need to log at a substantial level,
> but only store the results if something goes wrong.
>
> I don't think that streaming k-means is in this category.

Re: Should algorithms log progress?

Posted by Ted Dunning <te...@gmail.com>.

The normal answer is that we use slf4j.  If you log at debug or info level,
then your conditionals shouldn't be necessary.

Returning the log as a stream is pretty unusual, but some high performance
systems can't handle the overhead of even something like slf4j.  Typically,
this is because these systems need to make the decision about logging level
somewhat after the fact ... i.e. they need to log at a substantial level,
but only store the results if something goes wrong.

I don't think that streaming k-means is in this category.
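
For reference, the "log everything, keep it only if something goes
wrong" pattern described above can be sketched with the JDK's
java.util.logging.MemoryHandler, which buffers records in a ring buffer
and forwards them to a target handler only when a record at or above a
push level arrives. This only illustrates the deferred decision and does
not address the overhead concern. The class name, logger name, buffer
size, and messages are assumptions made for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.MemoryHandler;

class DeferredLoggingSketch {
    /** Final destination of the records; a file handler would play this role in practice. */
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Logs three debug-level steps and, if fail is true, one SEVERE record. */
    static List<String> run(boolean fail) {
        CollectingHandler sink = new CollectingHandler();
        // Keep up to 100 recent records; push them to the sink only on SEVERE or above.
        MemoryHandler buffer = new MemoryHandler(sink, 100, Level.SEVERE);
        buffer.setLevel(Level.ALL);

        Logger log = Logger.getLogger("sketch.deferred"); // hypothetical name
        log.setUseParentHandlers(false);
        log.setLevel(Level.FINE);
        log.addHandler(buffer);
        try {
            for (int i = 1; i <= 3; i++) {
                log.fine("step " + i);          // buffered, not yet stored
            }
            if (fail) {
                log.severe("something went wrong"); // triggers the push
            }
        } finally {
            log.removeHandler(buffer);
        }
        return sink.messages;
    }

    public static void main(String[] args) {
        System.out.println("no failure: " + run(false));
        System.out.println("failure:    " + run(true));
    }
}
```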

On Wed, Jan 2, 2013 at 4:05 AM, Dan Filimon <da...@gmail.com> wrote:

> I can add a Logger as a member variable and log the progress there, or
> I can just do printfs (which seems really dirty).
>
> What's the best way of doing this? I'd prefer having the output, at
> least when evaluating the quality of the code, but on the other hand
> it'd be nice to disable logging when running in production.
>
> I'm thinking of adding a Logger progressLogger and a boolean
> enableLogging variable to the algorithm classes to control this.
>
> But then, it'd also be nice to return the log as a stream when the
> algorithm is done. So, there'd be a getLog() method.
>