Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/08/26 10:06:45 UTC

[jira] [Commented] (SPARK-7751) Add @Since annotation to stable and experimental methods in MLlib

    [ https://issues.apache.org/jira/browse/SPARK-7751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712674#comment-14712674 ] 

Sean Owen commented on SPARK-7751:
----------------------------------

Small point for discussion -- 54 child JIRAs have been created here. Now, I am actually a fan of using JIRA to track tasks in a granular way. But managing each of these JIRAs will generate 5-10 emails over its life. I am still trying to skim everything sent to issues@, and this adds a fair bit of noise to an already very busy list (a couple hundred emails a day).

You could say, well, don't read issues@, since most people don't. But I think that's a problem: few people see how many JIRAs get filed and then silently ignored, since the creation email is the only message such a JIRA generates, and I don't think people generally look at anything but their own tickets in JIRA.

I also wonder whether breaking out these changes might ultimately create more work, as different people have dipped in to resolve subsets of them in potentially different ways.

> Add @Since annotation to stable and experimental methods in MLlib
> -----------------------------------------------------------------
>
>                 Key: SPARK-7751
>                 URL: https://issues.apache.org/jira/browse/SPARK-7751
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Documentation, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Minor
>              Labels: starter
>
> The @Since annotation is useful for checking whether a feature exists in some version of Spark. This is an umbrella JIRA to track the progress. We want to have @Since annotations for both stable (those without any Experimental/DeveloperApi/AlphaComponent annotations) and experimental methods in MLlib:
> (Do NOT tag private or package private classes or methods, nor local variables and methods.)
> * an example PR for Scala: https://github.com/apache/spark/pull/8309
> We need to dig through the git commit history to figure out which Spark version first introduced a method. Take `NaiveBayes.setModelType` as an example. We can grep for `def setModelType` at different version tags.
> {code}
> meng@xm:~/src/spark
> $ git show v1.3.0:mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala | grep "def setModelType"
> meng@xm:~/src/spark
> $ git show v1.4.0:mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala | grep "def setModelType"
>   def setModelType(modelType: String): NaiveBayes = {
> {code}
> If there are better ways, please let us know.
> We cannot add all @Since annotations in a single PR, since that would be hard to review, so we made subtasks for each package, for example `org.apache.spark.classification`. Feel free to add more sub-tasks for Python and the `spark.ml` package.
> Plan:
> 1. In 1.5, we try to add @Since annotations to all stable/experimental methods under `spark.mllib`.
> 2. Starting from 1.6, we require @Since annotations in all new PRs.
> 3. In 1.6, we try to add @Since annotations to all stable/experimental methods under `spark.ml`, `pyspark.mllib`, and `pyspark.ml`.
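
The tag-by-tag check in the quoted description can be wrapped in a small shell function. This is only a sketch: `first_tag_with` is a hypothetical helper name, not part of Spark or git, and it assumes the tags are passed oldest first and that the file's path did not change between those versions.

```shell
#!/usr/bin/env bash
# first_tag_with PATTERN FILE TAG...
# Hypothetical helper: prints the first tag (in the order given) whose copy
# of FILE contains PATTERN as a fixed string. Returns 1 if no tag matches.
first_tag_with() {
  local pattern=$1 file=$2 tag
  shift 2
  for tag in "$@"; do
    # `git show TAG:FILE` prints the file as it was at that tag. Errors
    # (for example, the file did not exist yet) are silenced and simply
    # count as "no match" for that tag.
    if git show "$tag:$file" 2>/dev/null | grep -qF -- "$pattern"; then
      printf '%s\n' "$tag"
      return 0
    fi
  done
  return 1
}
```

Run from a Spark checkout, `first_tag_with "def setModelType" mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala v1.3.0 v1.4.0` should print `v1.4.0`, going by the grep output quoted above. A method that was renamed or moved to another file would need a manual look.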



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
