Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/07/17 01:45:04 UTC

[jira] [Commented] (SPARK-9120) Add multivariate regression (or prediction) interface

    [ https://issues.apache.org/jira/browse/SPARK-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630546#comment-14630546 ] 

Joseph K. Bradley commented on SPARK-9120:
------------------------------------------

This sounds reasonable.

One caveat, though: since adding those abstractions, I have wondered a bit about their generality. I feel they are mainly useful for helping developers write new algorithms and avoid some boilerplate code. For public abstractions, I think we should probably design some traits, but I have not had time to think about this deeply.

So I think we should do this lazily: if you have an algorithm to add, add it along with the interface it needs. As we add more algorithms, we can start thinking about creating an abstraction.

What do you think?

> Add multivariate regression (or prediction) interface
> -----------------------------------------------------
>
>                 Key: SPARK-9120
>                 URL: https://issues.apache.org/jira/browse/SPARK-9120
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.4.0
>            Reporter: Alexander Ulanov
>             Fix For: 1.4.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> org.apache.spark.ml.regression.RegressionModel supports prediction only for a single variable with a method "predict:Double" by extending the Predictor. There is a need for multivariate prediction, at least for regression. I propose to modify "RegressionModel" interface similarly to how it is done in "ClassificationModel", which supports multiclass classification. It has "predict:Double" and "predictRaw:Vector". Analogously, "RegressionModel" should have something like "predictMultivariate:Vector".
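
For illustration only, the shape of the proposal above might be sketched as follows. This is a minimal sketch under stated assumptions, not Spark code: every trait and class name here is hypothetical, plain Array[Double] stands in for Spark's Vector type so the example compiles without Spark, and having predict return the first output is an arbitrary choice made for the sketch. Only the method name "predictMultivariate" comes from the issue description.

```scala
// Hypothetical sketch: none of these types exist in Spark under these names.
// Array[Double] stands in for Spark's Vector so no Spark dependency is needed.

// Stand-in for the existing single-output contract ("predict:Double").
trait UnivariateRegressionSketch {
  def predict(features: Array[Double]): Double
}

// Proposed addition: one prediction per output variable.
trait MultivariateRegressionSketch extends UnivariateRegressionSketch {
  def numOutputs: Int
  def predictMultivariate(features: Array[Double]): Array[Double]

  // Keep the single-output method usable by returning the first output
  // (an assumption of this sketch, not something the issue specifies).
  override def predict(features: Array[Double]): Double =
    predictMultivariate(features)(0)
}

// Toy linear model: output(i) = weights(i) . features + intercepts(i)
final class LinearMultiOutputSketch(
    weights: Array[Array[Double]],
    intercepts: Array[Double]) extends MultivariateRegressionSketch {
  require(weights.length == intercepts.length, "one intercept per output row")

  val numOutputs: Int = weights.length

  def predictMultivariate(features: Array[Double]): Array[Double] =
    weights.zip(intercepts).map { case (row, b) =>
      require(row.length == features.length, "feature dimension mismatch")
      row.zip(features).map { case (w, x) => w * x }.sum + b
    }
}
```

A model built from two weight rows would return a two-element result from predictMultivariate, while predict would still return a single Double, mirroring how the description says "ClassificationModel" pairs "predict:Double" with "predictRaw:Vector".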
