Posted to dev@spark.apache.org by Yu Ishikawa <yu...@gmail.com> on 2015/06/18 05:15:46 UTC

[mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

Hi all,

I think we should refactor some machine learning model classes in Python to
improve their maintainability.
By inheriting from the JavaModelWrapper class, a model can call the Scala API
directly, without going through PythonMLlibAPI.

In some cases, a machine learning model class in Python has complicated
member variables. That makes the import/export methods a little hard to
implement, and it is also troublesome to implement the same function in both
Scala and Python. I also think it is important to standardize how a model
class is created in Python.
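
A minimal sketch of the delegation pattern described above, runnable without
Spark. It is only loosely modeled on the idea of a wrapper base class:
FakeJavaModel, ModelWrapper, and KMeansModelSketch are hypothetical names
used for illustration, and a plain Python object stands in for the JVM-side
model that a real wrapper would reach through Py4J.

```python
class FakeJavaModel:
    """Hypothetical stand-in for the JVM-side model object."""

    def __init__(self, centers):
        self._centers = centers

    def clusterCenters(self):
        return list(self._centers)

    def k(self):
        return len(self._centers)


class ModelWrapper:
    """Simplified analogue of a wrapper base class: forwards calls by name."""

    def __init__(self, java_model):
        self._java_model = java_model

    def call(self, name, *args):
        # Look up the method on the wrapped object and invoke it.
        return getattr(self._java_model, name)(*args)


class KMeansModelSketch(ModelWrapper):
    """The Python side keeps no copy of the state; it only delegates."""

    @property
    def k(self):
        return self.call("k")

    @property
    def centers(self):
        return self.call("clusterCenters")
```

Because the subclass holds no model state of its own, nothing has to be
implemented twice, and save/load could be delegated in the same way.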

What do you think about that?

Thanks,
Yu



-----
-- Yu Ishikawa
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python-not-inheriting-JavaModelWrapper-tp12781.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

Posted by Yu Ishikawa <yu...@gmail.com>.
Hi Xiangrui,

I got it. I will try refactoring a model class that does not inherit from
JavaModelWrapper and show you the result.

Thanks,
Yu



-----
-- Yu Ishikawa
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python-not-inheriting-JavaModelWrapper-tp12781p12803.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.


Re: [mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

Posted by Xiangrui Meng <me...@gmail.com>.
Hi Yu,

Reducing the code complexity on the Python side is certainly what we
want to see :) We didn't call Java directly in Python models because
Java methods cannot be called inside RDD closures, which run on the
workers, e.g.,

rdd.map(lambda x: model.predict(x[1]))

But I agree that for model save/load the implementation should be
simplified. Could you submit a PR and see how much code we can save?
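
The closure limitation above can be illustrated without Spark. This is a
minimal sketch under stated assumptions: JvmBackedModel, PurePythonModel, and
can_ship_to_workers are hypothetical names, and a threading.Lock stands in
for a JVM handle only because it, too, cannot be pickled. A model used inside
rdd.map has to be serialized along with the closure, so a model holding such
a handle cannot travel to the workers.

```python
import pickle
import threading


class JvmBackedModel:
    """Stand-in for a model whose state includes a driver-side JVM handle."""

    def __init__(self, weights):
        # Assumption: the lock plays the role of the JVM handle. It is not
        # part of any Spark API; it is simply an unpicklable object.
        self._jvm_handle = threading.Lock()
        self._weights = weights

    def predict(self, x):
        return sum(w * v for w, v in zip(self._weights, x))


class PurePythonModel:
    """Model whose state is plain Python data."""

    def __init__(self, weights):
        self.weights = weights

    def predict(self, x):
        return sum(w * v for w, v in zip(self.weights, x))


def can_ship_to_workers(model):
    """Return True if every attribute of the model can be pickled."""
    try:
        pickle.dumps(vars(model))
        return True
    except (TypeError, pickle.PicklingError):
        return False
```

Here can_ship_to_workers returns True for PurePythonModel and False for
JvmBackedModel, which is why prediction stays in pure Python while save/load,
which runs only on the driver, can be delegated to the JVM.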

Thanks,
Xiangrui

On Wed, Jun 17, 2015 at 8:15 PM, Yu Ishikawa
<yu...@gmail.com> wrote:
> Hi all,
>
> I think we should refactor some machine learning model classes in Python to
> improve their maintainability.
> By inheriting from the JavaModelWrapper class, a model can call the Scala API
> directly, without going through PythonMLlibAPI.
>
> In some cases, a machine learning model class in Python has complicated
> member variables. That makes the import/export methods a little hard to
> implement, and it is also troublesome to implement the same function in both
> Scala and Python. I also think it is important to standardize how a model
> class is created in Python.
>
> What do you think about that?
>
> Thanks,
> Yu