Posted to user@mahout.apache.org by prabu palanisamy <pr...@serendio.com> on 2013/02/07 12:26:10 UTC

Regarding find p-value in logistic regression

Hi

In R, the logistic regression function "glm" reports a p-value for each
coefficient.

Mahout's "TrainLogistic" logistic regression function provides only the
coefficients of the variables.

Is there any way to get p-values in Mahout, the way R's glm provides them?

R's glm example input:

mylogit <- glm(IsAlert ~ P1 + P2, data=mysample, family="binomial")
summary(mylogit)

R's glm output:

Coefficients:
              Estimate     Std. Error   z value    Pr(>|z|) (p-values)
(Intercept)   9.597e-01    1.703e-01     5.634     1.76e-08 ***
P1           -1.531e-02    7.232e-04   -21.167     < 2e-16 ***
P2            7.353e-04    1.390e-03     0.529     0.597
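
For reference, the last two columns of this table follow from the first two: the z value is the estimate divided by its standard error, and Pr(>|z|) is the two-sided tail probability of a standard normal at that z. A minimal Python sketch that recomputes them from the printed numbers (the helper name `wald_p_value` is made up for illustration, and small differences from R's output are expected because the printed estimates are rounded):

```python
import math

def wald_p_value(estimate, std_error):
    """Return (z, two-sided p) for a Wald test of coefficient == 0."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Estimate and Std. Error as printed in the summary(mylogit) table.
rows = {
    "(Intercept)": (9.597e-01, 1.703e-01),
    "P1": (-1.531e-02, 7.232e-04),
    "P2": (7.353e-04, 1.390e-03),
}
for name, (est, se) in rows.items():
    z, p = wald_p_value(est, se)
    print(f"{name:12s} z = {z:8.3f}  p = {p:.3g}")
```

So the only ingredient missing on the Mahout side is a standard error for each coefficient.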

Mahout TrainLogistic example input:

$MAHOUT_HOME/bin/mahout trainlogistic --input mysample.csv \
--output ./model \
--target y --categories 2 \
--predictors P1 P2 --types numeric \
--features 2 --passes 100 --rate 50

Mahout output:

y ~ -0.157*Intercept Term + -0.678*P1 + -0.416*P2
Intercept Term -0.15655
P1 -0.67841
P2 -0.41587

But I also need the p-values.

Thanks in advance

Thanks and Regards
Prabu

Re: Regarding find p-value in logistic regression

Posted by Ted Dunning <te...@gmail.com>.
P-values are normally computed from second derivatives of the
log-likelihood (the Hessian), which give the standard errors.  With an
online, first-order algorithm like SGD, that can be difficult to do.

I will keep an eye out for some way to make this possible.
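
To make the second-derivative route concrete, here is a rough Python sketch, not anything Mahout provides. For logistic regression the Hessian of the negative log-likelihood is X'WX with weights p(1-p), so given final coefficients, one extra pass over the data yields it; inverting it gives approximate standard errors. The function names, toy data, and coefficient values are assumptions for illustration. The result is only meaningful if the coefficients are close to the unregularised maximum-likelihood fit, which an SGD run with few passes or a prior may not be.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def invert(matrix):
    """Gauss-Jordan inverse of a small, non-singular square matrix."""
    n = len(matrix)
    # Augment with the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def wald_table(rows, beta):
    """Return (std_error, z, p) per coefficient from the information matrix.

    rows: predictor vectors without the intercept.
    beta: [intercept, slope_1, ...], taken as given (e.g. from a trained model).
    """
    n = len(beta)
    info = [[0.0] * n for _ in range(n)]
    for x in rows:
        xs = [1.0] + list(x)
        p = sigmoid(sum(b * v for b, v in zip(beta, xs)))
        w = p * (1.0 - p)  # curvature of the log-loss in the linear score
        for i in range(n):
            for j in range(n):
                info[i][j] += w * xs[i] * xs[j]
    cov = invert(info)  # approximate covariance of the coefficients
    table = []
    for j in range(n):
        se = math.sqrt(cov[j][j])
        z = beta[j] / se
        table.append((se, z, math.erfc(abs(z) / math.sqrt(2.0))))
    return table

# Hypothetical toy data and coefficients, only to show the call.
toy_rows = [[0.5], [1.5], [2.0], [3.0], [3.5], [4.5]]
toy_beta = [-2.0, 0.8]
for se, z, p in wald_table(toy_rows, toy_beta):
    print(f"se = {se:.3f}  z = {z:.3f}  p = {p:.3g}")
```

The cost is one pass over the data plus a matrix inverse whose size is the number of coefficients, which is cheap for a handful of predictors but not for large hashed feature spaces.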

On Thu, Feb 7, 2013 at 1:53 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> I had the same question once, and I would like to hear the answer to this
> one again.
>
> As far as I understand, R derives this value analytically, whereas
> Mahout's regression is stochastic, so the computation done in R cannot be
> applied directly. But I wonder whether some approximate inference of the
> p-value for each regression coefficient could be applied. (Not that I
> fully understand the analysis of variance R does for these coefficients
> either.)

Re: Regarding find p-value in logistic regression

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I had the same question once, and I would like to hear the answer to this
one again.

As far as I understand, R derives this value analytically, whereas
Mahout's regression is stochastic, so the computation done in R cannot be
applied directly. But I wonder whether some approximate inference of the
p-value for each regression coefficient could be applied. (Not that I
fully understand the analysis of variance R does for these coefficients
either.)
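
One candidate for such an approximate inference is the bootstrap, which is a different technique from what R's glm does: retrain on resampled data many times and treat the spread of each coefficient across runs as its standard error. The Python sketch below uses a hand-written SGD loop as a stand-in for Mahout's trainer; the function names, toy data, and settings are all assumptions for illustration, and the normal approximation in the last step is itself a simplification.

```python
import math
import random

def sgd_logistic(rows, labels, passes=50, rate=0.1):
    """Plain SGD logistic regression; returns [intercept, slopes...]."""
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(passes):
        for x, y in zip(rows, labels):
            xs = [1.0] + list(x)
            score = sum(b * v for b, v in zip(beta, xs))
            score = max(-30.0, min(30.0, score))  # keep exp() in range
            p = 1.0 / (1.0 + math.exp(-score))
            # Gradient step on the log-likelihood of one example.
            for j, v in enumerate(xs):
                beta[j] += rate * (y - p) * v
    return beta

def bootstrap_p_values(rows, labels, resamples=200, seed=0):
    """Return (coefficient, bootstrap std_error, approx p) per coefficient."""
    rng = random.Random(seed)
    n = len(rows)
    fitted = sgd_logistic(rows, labels)
    draws = []
    for _ in range(resamples):
        # Resample the training set with replacement and retrain.
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(sgd_logistic([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    result = []
    for j, b in enumerate(fitted):
        col = [d[j] for d in draws]
        mean = sum(col) / len(col)
        se = math.sqrt(sum((c - mean) ** 2 for c in col) / (len(col) - 1))
        z = b / se
        result.append((b, se, math.erfc(abs(z) / math.sqrt(2.0))))
    return result

# Hypothetical toy data: one numeric predictor, noisy binary target.
rows = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0], [4.5], [5.0]]
labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
for b, se, p in bootstrap_p_values(rows, labels, resamples=50):
    print(f"coef = {b:.3f}  se = {se:.3f}  p = {p:.3g}")
```

This needs only a trainer that can be rerun, so it fits a first-order method, at the price of training the model once per resample.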

