Posted to dev@spark.apache.org by Peter Rudenko <pe...@gmail.com> on 2015/06/08 18:17:44 UTC

[ml] Why all model classes are final?

Hi, previously all the models in the ml package were package-private, so
when I needed to customize a model I subclassed it from within the
org.apache.spark.ml package in my own project. But the new models
(https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala#L46)
are final classes, so if I need to customize a line or so, I have to
redefine the whole class. Is there a reason for this? As a developer, I
understand the risks of using a Developer/Alpha API. That is why I use
Spark: it provides building blocks that I can easily customize and
combine for my needs.

Thanks,
Peter Rudenko

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [ml] Why all model classes are final?

Posted by Joseph Bradley <jo...@databricks.com>.
Hi Peter,

We've tried to be cautious about making APIs public without need, to allow
for changes needed in the future which we can't foresee now.  Marking
classes as final is part of that.  While marking things as Experimental or
DeveloperApi is a sort of warning, we've often felt that even changing
those Experimental/Developer APIs is dangerous since people can come to
rely on those APIs.

However, customization is a very valid use case, and I agree that the
classes should be opened up in the future.  I hope that, as the Pipelines
API graduates from alpha, more users will give feedback about them, and
that will give us enough confidence in the API stability to make the
classes non-final.

Joseph


Re: [ml] Why all model classes are final?

Posted by Erik Erlandson <ej...@redhat.com>.
I was able to work around this problem in several cases using the class 'enhancement' or 'extension' pattern to add some functionality to the decision tree model data structures.
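
For readers unfamiliar with the pattern, here is a minimal sketch of one common way to "extend" a final class in Scala: an implicit wrapper class. This is an assumption about what the pattern refers to, not the code actually used. TreeModel, RichTreeModel and nodesPerLevel are hypothetical names standing in for a final model class and an added helper; none of them are Spark APIs.

```scala
// Hypothetical stand-in for a final library class; not Spark's actual API.
final class TreeModel(val depth: Int, val numNodes: Int)

object TreeModelExtensions {
  // Implicit wrapper: adds methods to TreeModel without subclassing it.
  implicit class RichTreeModel(val model: TreeModel) extends AnyVal {
    // Hypothetical helper: average number of nodes per level of the tree.
    def nodesPerLevel: Double = model.numNodes.toDouble / (model.depth + 1)
  }
}

object Demo {
  import TreeModelExtensions._

  def main(args: Array[String]): Unit = {
    val m = new TreeModel(depth = 3, numNodes = 15)
    // nodesPerLevel is not a member of TreeModel; the compiler resolves it
    // through the implicit wrapper brought into scope by the import above.
    println(m.nodesPerLevel)
  }
}
```

Note that a wrapper like this can only use the public members of the wrapped class and cannot override existing behavior, so it helps when adding functionality but not when a line inside an existing method has to change.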


