Posted to user@mahout.apache.org by namit maheshwari <na...@gmail.com> on 2014/05/23 13:23:01 UTC

link() function in AbstractOnlineLogisticRegression class

Hello Everyone,

In Mahout's *AbstractOnlineLogisticRegression* class, the *public static
Vector link(Vector v)* function checks the *max* value against 40.

Could anyone please explain the significance of 40 in the context of logistic
regression?

Thanks
Namit

Re: link() function in AbstractOnlineLogisticRegression class

Posted by Ted Dunning <te...@gmail.com>.
It avoids numeric underflow or overflow.

With multinomial logistic regression, it is common that all of the values
are relatively far from zero.  To avoid numerical issues, a common offset
(here, the max value) is subtracted before exponentiating.  Subtracting the
offset scales every exponential and the normalizing factor by the same
exp(-offset), which cancels in the ratio, so the result is the same as desired.

To see why, try running the code without the offset.
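
The effect of the offset can be sketched in plain Java. This is a hypothetical illustration over a double[] of scores, not Mahout's actual link() code; the class and method names are made up for the example, and it assumes an ordinary softmax with no implicit reference category.

```java
import java.util.Arrays;

// Hypothetical illustration; this is not Mahout's link() implementation.
class SoftmaxOffsetSketch {

    // Naive version: exponentiates the raw scores, so large scores overflow.
    static double[] naiveSoftmax(double[] scores) {
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i]);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    // Offset version: subtracts the max score before exponentiating, so the
    // largest exponent is exp(0) = 1 and nothing can overflow.
    static double[] stableSoftmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] small = {1.0, 2.0, 3.0};
        double[] large = {1000.0, 999.0, 998.0};

        // For moderate scores the two versions agree up to round-off.
        System.out.println(Arrays.toString(naiveSoftmax(small)));
        System.out.println(Arrays.toString(stableSoftmax(small)));

        // For large scores exp() overflows to Infinity in the naive version,
        // and Infinity / Infinity yields NaN; the offset version stays finite.
        System.out.println(Arrays.toString(naiveSoftmax(large)));
        System.out.println(Arrays.toString(stableSoftmax(large)));
    }
}
```

Running main should show matching probabilities for the small scores and NaN entries from the naive version for the large ones, which is the failure the offset is there to prevent.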



On Sun, May 25, 2014 at 9:38 PM, namit maheshwari <
namitmaheshwari7@gmail.com> wrote:

> Hello Ted,
>
> Thanks for the reply. But I do not understand the point of *subtracting
> the max value from every array element*. Here they are using *softmax
> regression* instead of standard logistic regression. So how does
> subtracting the max value solve the problem?
>
> Thanks
> Namit

Re: link() function in AbstractOnlineLogisticRegression class

Posted by namit maheshwari <na...@gmail.com>.
Hello Ted,

Thanks for the reply. But I do not understand the point of *subtracting
the max value from every array element*. Here they are using *softmax
regression* instead of standard logistic regression. So how does
subtracting the max value solve the problem?

Thanks
Namit


On Fri, May 23, 2014 at 6:18 PM, Ted Dunning <te...@gmail.com> wrote:

> exp(40) > 10^17
>
> Thus, if x >= 1, for x + exp(-40) all significant bits of the exponential
> are lost and the result is identical to just saying x.  Likewise, for
> 1 + exp(40), the addition of 1 has no effect.
>
> The logistic function [1] is defined as f(x) = 1 / (1 + exp(-x)).  Thus, when
> using double precision floating point, f(x) = 1 where x >= 40, and f(x) is
> below 10^-17 (effectively 0) where x <= -40.
>
>
> [1] https://en.wikipedia.org/wiki/Logistic_function

Re: link() function in AbstractOnlineLogisticRegression class

Posted by Ted Dunning <te...@gmail.com>.
exp(40) > 10^17

Thus, if x >= 1, for x + exp(-40) all significant bits of the exponential
are lost and the result is identical to just saying x.  Likewise, for
1 + exp(40), the addition of 1 has no effect.

The logistic function [1] is defined as f(x) = 1 / (1 + exp(-x)).  Thus, when
using double precision floating point, f(x) = 1 where x >= 40, and f(x) is
below 10^-17 (effectively 0) where x <= -40.


[1] https://en.wikipedia.org/wiki/Logistic_function
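
These round-off claims are easy to check directly. The following is a minimal sketch in plain Java doubles; the class name and the logistic() helper are hypothetical and are not part of Mahout.

```java
// Hypothetical illustration of double-precision saturation around |x| = 40.
class LogisticSaturationSketch {

    // Logistic function f(x) = 1 / (1 + exp(-x)), written the direct way.
    static double logistic(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public static void main(String[] args) {
        // exp(40) exceeds 10^17.
        System.out.println(Math.exp(40) > 1e17);

        // exp(-40) is too small to change 1.0 when added to it.
        System.out.println(1.0 + Math.exp(-40) == 1.0);

        // 1 is too small to change exp(40) when added to it.
        System.out.println(Math.exp(40) + 1.0 == Math.exp(40));

        // At x = 40 the logistic function is exactly 1.0 in double precision.
        System.out.println(logistic(40) == 1.0);

        // At x = -40 it is a tiny positive number, zero for practical purposes.
        System.out.println(logistic(-40));
    }
}
```

The first four lines should each print true; the last prints a value far below any probability difference that matters in training.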


