Posted to user@mahout.apache.org by David Kincaid <ki...@gmail.com> on 2012/11/28 20:19:19 UTC

Mahout SGD - is it really descent?

While trying to wrap my head around the Mahout code for SGD I noticed that
the update to the beta terms seems to be doing gradient ascent and not
descent. Could someone help me find the missing minus sign?

The line of code in question from AbstractOnlineLogisticRegression.java,
train() is:

        double newValue = beta.getQuick(i, j) + gradientBase * learningRate
* perTermLearningRate(j) * instance.get(j);

It looks to me like the update to beta is ascending the gradient (hence the
addition sign instead of minus). Could you help me understand where my
thinking is going wrong?

Thanks,

Dave
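
For context: the textbook SGD update for minimizing a loss is beta <- beta -
eta * gradient, so a plus sign in the update is only correct if the stored
value is already the negative gradient. A minimal sketch of the two
equivalent forms (hypothetical scalar variables, not Mahout's actual code):

        double eta = 0.1;               // learning rate
        double gradient = 0.3;          // dLoss/dBeta at the current beta
        double beta = 1.0;

        // explicit descent: step against the gradient
        double updated = beta - eta * gradient;

        // the same step when the stored value is already -dLoss/dBeta
        double negativeGradient = -gradient;
        double alsoUpdated = beta + eta * negativeGradient;  // == updated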

Re: Mahout SGD - is it really descent?

Posted by Ted Dunning <te...@gmail.com>.
+1

On Wed, Nov 28, 2012 at 12:56 PM, Jake Mannix <ja...@gmail.com> wrote:

> or maybe call the variable negativeGradient, instead?
>

Re: Mahout SGD - is it really descent?

Posted by Jake Mannix <ja...@gmail.com>.
or maybe call the variable negativeGradient, instead?


On Wed, Nov 28, 2012 at 12:50 PM, Ted Dunning <te...@gmail.com> wrote:

> Robert's analysis is correct.
>
> This would be worthy of a comment at the least.



-- 

  -jake

Re: Mahout SGD - is it really descent?

Posted by Ted Dunning <te...@gmail.com>.
Robert's analysis is correct.

This would be worthy of a comment at the least.

On Wed, Nov 28, 2012 at 11:53 AM, Lancaster, Robert (Orbitz) <
ROBERT.LANCASTER@orbitz.com> wrote:

> gradientBase is coming from:
> double gradientBase = gradient.get(i);
>
> Prior to that:
> Vector gradient = this.gradient.apply(groupKey, actual, instance, this);
>
> "this.gradient" is an instance of DefaultGradient (in the same project).
>  The last two lines of the apply function are:
> r.assign(v, Functions.MINUS);
> return r;
>
> This appears to be where the gradient values are negated.

Re: Mahout SGD - is it really descent?

Posted by David Kincaid <ki...@gmail.com>.
Thanks much, Robert. I see it now. I think I've got the basics of the
implementation down.

On Wed, Nov 28, 2012 at 1:53 PM, Lancaster, Robert (Orbitz) <
ROBERT.LANCASTER@orbitz.com> wrote:

> gradientBase is coming from:
> double gradientBase = gradient.get(i);
>
> Prior to that:
> Vector gradient = this.gradient.apply(groupKey, actual, instance, this);
>
> "this.gradient" is an instance of DefaultGradient (in the same project).
>  The last two lines of the apply function are:
> r.assign(v, Functions.MINUS);
> return r;
>
> This appears to be where the gradient values are negated.

RE: Mahout SGD - is it really descent?

Posted by "Lancaster, Robert (Orbitz)" <RO...@orbitz.com>.
gradientBase is coming from:
double gradientBase = gradient.get(i);

Prior to that: 
Vector gradient = this.gradient.apply(groupKey, actual, instance, this);

"this.gradient" is an instance of DefaultGradient (in the same project).  The last two lines of the apply function are:
r.assign(v, Functions.MINUS);
return r;

This appears to be where the gradient values are negated.
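
To spell out the sign flow as scalar arithmetic (a sketch only; it assumes,
as in DefaultGradient, that r starts as the target encoding with 1.0 at the
actual category and v is the predicted probability from classify()):

        double y = 1.0;             // target: what r holds before the MINUS
        double p = 0.7;             // prediction: v from classify(instance)
        double g = y - p;           // r.assign(v, Functions.MINUS): r = r - v
        double x = 2.0, eta = 0.1;  // feature value and learning rate
        double beta = 0.5;
        // the "+" in train() is therefore a descent step on the log loss:
        double newBeta = beta + eta * g * x;  // == beta - eta * (p - y) * x

Since (y - p) * x is the gradient of the log likelihood, adding it decreases
the negative log likelihood, so the update really is descent.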





Re: Mahout SGD - is it really descent?

Posted by David Kincaid <ki...@gmail.com>.
I thought it might be too, but it doesn't look like it to me. Of course, I
really have a hard time following vector and matrix math done in Java. Does
v.minus(r) mean v - r or r - v?

On Wed, Nov 28, 2012 at 1:28 PM, David Arthur <mu...@gmail.com> wrote:

> My completely unfounded guess would be the sign is built into gradientBase
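
For the record: in the Mahout math API, v.minus(r) returns a new vector
equal to v - r, while the in-place r.assign(v, Functions.MINUS) overwrites
r with r - v (Functions.MINUS is the elementwise (a, b) -> a - b). A minimal
check, assuming the standard org.apache.mahout.math classes:

        import org.apache.mahout.math.DenseVector;
        import org.apache.mahout.math.Vector;
        import org.apache.mahout.math.function.Functions;

        public class MinusCheck {
          public static void main(String[] args) {
            Vector v = new DenseVector(new double[] {3.0});
            Vector r = new DenseVector(new double[] {1.0});
            System.out.println(v.minus(r).get(0));  // v - r  ->  2.0
            r.assign(v, Functions.MINUS);           // in place: r = r - v
            System.out.println(r.get(0));           // r - v  -> -2.0
          }
        }

So with r as the target and v as the prediction, the returned "gradient" is
target - prediction, i.e. the negative of the loss gradient.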

Re: Mahout SGD - is it really descent?

Posted by David Arthur <mu...@gmail.com>.
My completely unfounded guess would be the sign is built into gradientBase
