Posted to user@mahout.apache.org by Nabarun <se...@gmail.com> on 2011/06/10 13:54:28 UTC
Stochastic gradient algorithm related queries
I was going through the Mahout code and have a couple of queries about the
OnlineLogisticRegression algorithm (the stochastic gradient descent
implementation of logistic regression).
1. I saw in the CrossFoldLearner class that the log likelihood is computed as

    logLikelihood += (Math.log(score) - logLikelihood) / Math.min(records, windowSize);

My query is: can't we use the formula

    logLikelihood = sum over records of log(p) if y = 1, else log(1 - p)

where log(p) is computed online for each row?
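As a concrete illustration of the summed formula above (a sketch with made-up
data, not Mahout's code):

```java
// Sketch: summed log likelihood, adding log(p) when y = 1 and
// log(1 - p) when y = 0. The data values here are hypothetical.
public class SummedLogLikelihood {
    public static void main(String[] args) {
        double[] p = {0.9, 0.2, 0.8, 0.6}; // predicted P(y = 1) per record
        int[] y    = {1,   0,   1,   0};   // observed labels

        double logLikelihood = 0;
        for (int i = 0; i < p.length; i++) {
            logLikelihood += (y[i] == 1) ? Math.log(p[i]) : Math.log(1 - p[i]);
        }
        System.out.printf("logLikelihood = %.4f%n", logLikelihood);
    }
}
```

Note that this quantity grows (more negative) with every record, whereas a
windowed average stays on a per-record scale.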
2. The learning rate is calculated as

    currentLearningRate = mu0 * Math.pow(decayFactor, getStep())
        * Math.pow(getStep() + stepOffset, forgettingExponent);

Can we use

    learningRate(epoch) = initialLearningRate / (1 + epoch / annealingRate)

instead? This is an inverse learning rate, which is guaranteed to converge to a limit.

Reference: http://alias-i.com/lingpipe-3.9.3/docs/api/com/aliasi/stats/AnnealingSchedule.html
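For what it's worth, the two schedules can be compared numerically. The sketch
below (parameter values are hypothetical) shows that the inverse schedule falls
out of the more general form when decayFactor = 1, forgettingExponent = -1,
stepOffset = annealingRate, and mu0 is rescaled by annealingRate:

```java
// Sketch comparing the two annealing schedules; parameters are made up.
public class LearningRateSketch {
    public static void main(String[] args) {
        double mu0 = 0.1;           // initial learning rate
        double annealingRate = 100; // LingPipe-style annealing parameter

        for (int step = 0; step <= 300; step += 100) {
            // Inverse schedule: mu0 / (1 + step / annealingRate)
            double inverse = mu0 / (1 + step / annealingRate);

            // General form with decayFactor = 1, forgettingExponent = -1,
            // stepOffset = annealingRate, and mu0 rescaled by annealingRate:
            double general = (mu0 * annealingRate)
                * Math.pow(1.0, step)                 // decayFactor^step
                * Math.pow(step + annealingRate, -1); // (step+offset)^exponent
            System.out.printf("step=%d inverse=%.6f general=%.6f%n",
                step, inverse, general);
        }
    }
}
```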
Thanks
Nabarun
Re: Stochastic gradient algorithm related queries
Posted by Ted Dunning <te...@gmail.com>.
On Fri, Jun 10, 2011 at 1:54 PM, Nabarun <se...@gmail.com> wrote:
> I was going through the Mahout code and have a couple of queries about the
> OnlineLogisticRegression algorithm (the stochastic gradient descent
> implementation of logistic regression).
>
> 1. I saw in the CrossFoldLearner class that the log likelihood is computed as
>
>     logLikelihood += (Math.log(score) - logLikelihood) / Math.min(records, windowSize);
>
> My query is: can't we use the formula
>
>     logLikelihood = sum over records of log(p) if y = 1, else log(1 - p)
>
> where log(p) is computed online for each row?
How is that different from what is done?
The current code keeps only an exponential moving average. See
http://tdunning.blogspot.com/2011_05_01_archive.html
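To illustrate the difference in scale (a sketch with made-up scores and window
size, not the actual Mahout source), the windowed update keeps a moving average
of the per-record log likelihood while a plain sum keeps growing:

```java
// Sketch: contrast a straight summed log likelihood with the
// windowed moving-average update. Scores and windowSize are hypothetical.
public class LogLikelihoodSketch {
    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.95, 0.7, 0.85}; // p(correct label)
        int windowSize = 3;

        double sum = 0;       // straight sum: grows without bound
        double movingAvg = 0; // moving average: stays per-record scale
        int records = 0;
        for (double score : scores) {
            records++;
            sum += Math.log(score);
            // moving average over at most windowSize records
            movingAvg += (Math.log(score) - movingAvg)
                / Math.min(records, windowSize);
        }
        System.out.printf("sum = %.4f, movingAvg = %.4f%n", sum, movingAvg);
    }
}
```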
>
> 2. The learning rate is calculated as
>
>     currentLearningRate = mu0 * Math.pow(decayFactor, getStep())
>         * Math.pow(getStep() + stepOffset, forgettingExponent);
>
> Can we use
>
>     learningRate(epoch) = initialLearningRate / (1 + epoch / annealingRate)
Again, that looks a lot like a special case of what is here.
> instead? This is an inverse learning rate, which is guaranteed to converge to a limit.
This is a theoretical guarantee that is not always helpful in practice.