Posted to user@spark.apache.org by evanzamir <za...@gmail.com> on 2016/09/06 19:49:45 UTC

I noticed LinearRegression sometimes produces negative R^2 values

Am I misinterpreting what r2() in the LinearRegression Model summary means?
By definition, R^2 should never be a negative number!



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/I-noticed-LinearRegression-sometimes-produces-negative-R-2-values-tp27667.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: I noticed LinearRegression sometimes produces negative R^2 values

Posted by Nick Pentreath <ni...@gmail.com>.
That does seem strange. Can you provide an example to reproduce?



On Tue, 6 Sep 2016 at 21:49 evanzamir <za...@gmail.com> wrote:

> Am I misinterpreting what r2() in the LinearRegression Model summary means?
> By definition, R^2 should never be a negative number!

Re: I noticed LinearRegression sometimes produces negative R^2 values

Posted by Evan Zamir <za...@gmail.com>.
Yes, it's on a hold-out segment from the data set being fitted.
On Wed, Sep 7, 2016 at 1:02 AM Sean Owen <so...@cloudera.com> wrote:

> Yes, should be.
> It's also not necessarily nonnegative if you evaluate R^2 on a
> different data set than you fit it to. Is that the case?
>
> On Tue, Sep 6, 2016 at 11:15 PM, Evan Zamir <za...@gmail.com> wrote:
> > I am using the default setting for setting fitIntercept, which *should* be
> > TRUE right?
> >
> > On Tue, Sep 6, 2016 at 1:38 PM Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Are you not fitting an intercept / regressing through the origin? with
> >> that constraint it's no longer true that R^2 is necessarily
> >> nonnegative. It basically means that the errors are even bigger than
> >> what you'd get by predicting the data's mean value as a constant
> >> model.
> >>
> >> On Tue, Sep 6, 2016 at 8:49 PM, evanzamir <za...@gmail.com> wrote:
> >> > Am I misinterpreting what r2() in the LinearRegression Model summary
> >> > means?
> >> > By definition, R^2 should never be a negative number!

Re: I noticed LinearRegression sometimes produces negative R^2 values

Posted by Sean Owen <so...@cloudera.com>.
Yes, should be.
It's also not necessarily nonnegative if you evaluate R^2 on a
different data set than you fit it to. Is that the case?
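
As a rough illustration of the hold-out case (plain Python, not Spark code;
the data and helper names are made up for this sketch), a line fitted with
an intercept on one data set can still score a negative R^2 on another:

```python
# Hypothetical helpers for illustration only; these are not Spark APIs.

def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up training data: a perfect upward trend.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]
# Made-up hold-out data where the trend reverses.
test_x, test_y = [5.0, 6.0, 7.0], [3.0, 2.0, 1.0]

slope, intercept = fit_with_intercept(train_x, train_y)
r2_train = r2(train_y, [slope * x + intercept for x in train_x])
r2_holdout = r2(test_y, [slope * x + intercept for x in test_x])

print("R^2 on training data:", r2_train)
print("R^2 on hold-out data:", r2_holdout)
```

On the training data the fit cannot do worse than predicting the mean, so
R^2 stays at or above zero there. On the hold-out data nothing guarantees
that, and the value goes negative whenever the model's squared error exceeds
that of the hold-out mean.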

On Tue, Sep 6, 2016 at 11:15 PM, Evan Zamir <za...@gmail.com> wrote:
> I am using the default setting for setting fitIntercept, which *should* be
> TRUE right?
>
> On Tue, Sep 6, 2016 at 1:38 PM Sean Owen <so...@cloudera.com> wrote:
>>
>> Are you not fitting an intercept / regressing through the origin? with
>> that constraint it's no longer true that R^2 is necessarily
>> nonnegative. It basically means that the errors are even bigger than
>> what you'd get by predicting the data's mean value as a constant
>> model.
>>
>> On Tue, Sep 6, 2016 at 8:49 PM, evanzamir <za...@gmail.com> wrote:
>> > Am I misinterpreting what r2() in the LinearRegression Model summary
>> > means?
>> > By definition, R^2 should never be a negative number!



Re: I noticed LinearRegression sometimes produces negative R^2 values

Posted by Evan Zamir <za...@gmail.com>.
I am using the default setting for *fitIntercept*, which *should*
be TRUE, right?

On Tue, Sep 6, 2016 at 1:38 PM Sean Owen <so...@cloudera.com> wrote:

> Are you not fitting an intercept / regressing through the origin? with
> that constraint it's no longer true that R^2 is necessarily
> nonnegative. It basically means that the errors are even bigger than
> what you'd get by predicting the data's mean value as a constant
> model.
>
> On Tue, Sep 6, 2016 at 8:49 PM, evanzamir <za...@gmail.com> wrote:
> > Am I misinterpreting what r2() in the LinearRegression Model summary means?
> > By definition, R^2 should never be a negative number!

Re: I noticed LinearRegression sometimes produces negative R^2 values

Posted by Sean Owen <so...@cloudera.com>.
Are you not fitting an intercept / regressing through the origin? With
that constraint it's no longer true that R^2 is necessarily
nonnegative. A negative value basically means that the errors are even
bigger than what you'd get by predicting the data's mean value as a
constant model.
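
To make that concrete, here is a minimal sketch in plain Python (not Spark
code; the data and helper names are invented for illustration). It assumes
the usual definition R^2 = 1 - SS_res / SS_tot, with SS_tot measured around
the mean of y, and compares a fit through the origin with a fit that
includes an intercept on the same data:

```python
# Hypothetical helpers for illustration only; these are not Spark APIs.

def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def fit_through_origin(xs, ys):
    """Least-squares slope for y = slope * x (no intercept term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_intercept(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up data: y falls as x rises, and the true intercept is far from zero.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 9.0, 8.0, 7.0, 6.0]

b = fit_through_origin(xs, ys)
r2_origin = r2(ys, [b * x for x in xs])

slope, intercept = fit_with_intercept(xs, ys)
r2_intercept = r2(ys, [slope * x + intercept for x in xs])

print("through origin:  slope =", b, " R^2 =", r2_origin)
print("with intercept:  slope =", slope, " intercept =", intercept,
      " R^2 =", r2_intercept)
```

Forced through the origin, the best line slopes upward while the data slope
downward, so its squared error is larger than that of the constant mean
prediction and R^2 comes out negative. With the intercept included, the same
data are fitted exactly.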

On Tue, Sep 6, 2016 at 8:49 PM, evanzamir <za...@gmail.com> wrote:
> Am I misinterpreting what r2() in the LinearRegression Model summary means?
> By definition, R^2 should never be a negative number!
