Posted to issues@spark.apache.org by "zhengruifeng (JIRA)" <ji...@apache.org> on 2016/12/07 02:54:58 UTC

[jira] [Updated] (SPARK-18757) Models in Pyspark support column setters

     [ https://issues.apache.org/jira/browse/SPARK-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhengruifeng updated SPARK-18757:
---------------------------------
    Description: 
Recently, I found three places in which column setters are missing: KMeansModel, BisectingKMeansModel and OneVsRestModel.
These three models directly inherit `Model`, which doesn't have column setters, so I had to add the missing setters manually in [SPARK-18625] and [SPARK-18520].
For now, models in pyspark still don't support column setters at all.
I suggest that we keep the hierarchy of pyspark models in line with the one on the scala side:
For classification and regression algs, I'm making a trial in [SPARK-18379].
For clustering algs, I think we may first create abstract classes {{ClusteringModel}} and {{ProbabilisticClusteringModel}}, and make the clustering algs inherit them. Then, on the python side, we copy the hierarchy so that we don't need to add setters for each alg.
For feature algs, we can also use an abstract class {{FeatureModel}} on the scala side, and do the same thing.

What are your opinions? [~yanboliang] [~josephkb] [~sethah] [~srowen]
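To illustrate the idea, here is a minimal, self-contained Python sketch of the shared-setter pattern: one abstract base class owns the column setters, and each concrete model inherits them instead of redefining them. The class names ({{ClusteringModel}}, the {{_set}} helper, the {{Params}} stand-in) mirror pyspark.ml conventions but are assumptions for illustration, not existing API.

```python
# Self-contained sketch (not actual pyspark code) of the shared-setter
# idea: one abstract base class owns the column setters, so each
# concrete model inherits them instead of redefining them.

class Params:
    """Minimal stand-in for pyspark.ml.param.Params."""
    def __init__(self):
        self._paramMap = {}

    def _set(self, **kwargs):
        self._paramMap.update(kwargs)
        return self  # setters return self for chaining, as in Spark ML


class ClusteringModel(Params):
    """Hypothetical abstract base providing column setters
    for all clustering models."""
    def setFeaturesCol(self, value):
        return self._set(featuresCol=value)

    def setPredictionCol(self, value):
        return self._set(predictionCol=value)


class KMeansModel(ClusteringModel):
    """Concrete model: gets the column setters for free by inheritance."""
    pass


model = KMeansModel().setFeaturesCol("features").setPredictionCol("cluster")
print(model._paramMap)
# {'featuresCol': 'features', 'predictionCol': 'cluster'}
```

With this shape, only the base class would need setters, rather than each of KMeansModel, BisectingKMeansModel, etc. adding them manually.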

  was:
Recently, I found three places in which column setters are missing: KMeansModel, BisectingKMeansModel and BisectingKMeansModel.
These three models directly inherit `Model`, which doesn't have column setters, so I had to add the missing setters manually in [SPARK-18625] and [SPARK-18520].
For now, models in pyspark still don't support column setters at all.
I suggest that we keep the hierarchy of pyspark models in line with the one on the scala side:
For classification and regression algs, I'm making a trial in [SPARK-18379].
For clustering algs, I think we may first create abstract classes {{ClusteringModel}} and {{ProbabilisticClusteringModel}}, and make the clustering algs inherit them. Then, on the python side, we copy the hierarchy so that we don't need to add setters for each alg.
For feature algs, we can also use an abstract class {{FeatureModel}} on the scala side, and do the same thing.

What are your opinions? [~yanboliang] [~josephkb] [~sethah] [~srowen]


> Models in Pyspark support column setters
> ----------------------------------------
>
>                 Key: SPARK-18757
>                 URL: https://issues.apache.org/jira/browse/SPARK-18757
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML, PySpark
>            Reporter: zhengruifeng
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org