Posted to reviews@spark.apache.org by tgaloppo <gi...@git.apache.org> on 2015/02/05 18:28:46 UTC

[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

GitHub user tgaloppo opened a pull request:

    https://github.com/apache/spark/pull/4401

    [SPARK-5013] [MLlib] [WIP] Added documentation and sample data file for GaussianMixture

    Simple description and code samples (and sample data) for GaussianMixture


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tgaloppo/spark spark-5013

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4401.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4401
    
----
commit 3eb41fa1f304d817109d5f56349dfcca49119957
Author: Travis Galoppo <tj...@columbia.edu>
Date:   2015-02-05T17:25:06Z

    [SPARK-5013] Added documentation and sample data file for GaussianMixture

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73197096
  
    Did you follow the `docs/README.md`?




[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73254122
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26915/
    Test PASSed.




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73177607
  
      [Test build #26884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26884/consoleFull) for   PR 4401 at commit [`2368690`](https://github.com/apache/spark/commit/23686904dbea03d10264b00b9fffa797d21d43f0).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class GaussianMixtureExample `





[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73102200
  
      [Test build #26845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26845/consoleFull) for   PR 4401 at commit [`3eb41fa`](https://github.com/apache/spark/commit/3eb41fa1f304d817109d5f56349dfcca49119957).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class GaussianMixtureExample `





[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4401#discussion_r24216731
  
    --- Diff: docs/mllib-clustering.md ---
    @@ -168,6 +187,112 @@ print("Within Set Sum of Squared Error = " + str(WSSSE))
     
     </div>
     
    +#### GaussianMixture
    +
    +<div class="codetabs">
    +<div data-lang="scala" markdown="1">
    +In the following example, after loading and parsing data, we use a
    +[`GaussianMixture`](api/scala/index.html#org.apache.spark.mllib.clustering.GaussianMixture)
    +object to cluster the data into two clusters. The desired number of clusters is passed
    +to the algorithm. We then output the parameters of the mixture model.
    +
    +{% highlight scala %}
    +import org.apache.spark.mllib.clustering.GaussianMixture
    +import org.apache.spark.mllib.linalg.Vectors
    +
    +// Load and parse the data
    +val data = sc.textFile("data/mllib/gmm_data.txt")
    +val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble))).cache()
    +
    +// Cluster the data into two classes using GaussianMixture
    +val gmm = new GaussianMixture().setK(2).run(parsedData)
    +
    +// output parameters of max-likelihood model
    +for (i <- 0 until gmm.k) {
    +  println("weight=%f\nmu=%s\nsigma=\n%s\n" format 
    +    (gmm.weights(i), gmm.gaussians(i).mu, gmm.gaussians(i).sigma))
    +}
    +
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="java" markdown="1">
    +All of MLlib's methods use Java-friendly types, so you can import and call them there the same
    +way you do in Scala. The only caveat is that the methods take Scala RDD objects, while the
    +Spark Java API uses a separate `JavaRDD` class. You can convert a Java RDD to a Scala one by
    +calling `.rdd()` on your `JavaRDD` object. A self-contained application example
    +that is equivalent to the provided example in Scala is given below:
    +
    +{% highlight java %}
    +import org.apache.spark.api.java.*;
    +import org.apache.spark.api.java.function.Function;
    +import org.apache.spark.mllib.clustering.GaussianMixture;
    +import org.apache.spark.mllib.clustering.GaussianMixtureModel;
    +import org.apache.spark.mllib.linalg.Vector;
    +import org.apache.spark.mllib.linalg.Vectors;
    +import org.apache.spark.SparkConf;
    +
    +public class GaussianMixtureExample {
    +  public static void main(String[] args) {
    +    SparkConf conf = new SparkConf().setAppName("GaussianMixture Example");
    +    JavaSparkContext sc = new JavaSparkContext(conf);
    +
    +    // Load and parse data
    +    String path = "data/mllib/gmm_data.txt";
    +    JavaRDD<String> data = sc.textFile(path);
    +    JavaRDD<Vector> parsedData = data.map(
    +      new Function<String, Vector>() {
    +        public Vector call(String s) {
    +          String[] sarray = s.trim().split(" ");
    +          double[] values = new double[sarray.length];
    +          for (int i = 0; i < sarray.length; i++)
    +            values[i] = Double.parseDouble(sarray[i]);
    +          return Vectors.dense(values);
    +        }
    +      }
    +    );
    +    parsedData.cache();
    +
    +    // Cluster the data into two classes using GaussianMixture
    +    GaussianMixtureModel gmm = new GaussianMixture().setK(2).run(parsedData.rdd());
    +
    +    // Output the parameters of the mixture model
    +    for (int j = 0; j < gmm.k(); j++) {
    +        // println has no format-string overload; use printf
    +        System.out.printf("weight=%f\nmu=%s\nsigma=\n%s\n",
    +            gmm.weights()[j], gmm.gaussians()[j].mu(), gmm.gaussians()[j].sigma());
    +    }
    +  }
    +}
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="python" markdown="1">
    +In the following example, after loading and parsing data, we use a
    +[`GaussianMixture`](api/scala/index.html#org.apache.spark.mllib.clustering.GaussianMixture)
    +object to cluster the data into two clusters. The desired number of clusters is passed
    +to the algorithm. We then output the parameters of the mixture model.
    +
    +{% highlight python %}
    +from pyspark.mllib.clustering import GaussianMixture
    +from numpy import array
    +
    +# Load and parse the data
    +data = sc.textFile("data/mllib/gmm_data.txt")
    +parsedData = data.map(lambda line: array([float(x) for x in line.strip().split(' ')]))
    +
    +# Build the model (cluster the data)
    +gmm = new GaussianMixture.train(parsedData, 2)
    --- End diff --
    
    remove `new`
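    For context: Python has no `new` keyword, so the call becomes `gmm = GaussianMixture.train(parsedData, 2)`. The line-parsing step from the example can be sanity-checked without Spark; the `parse_point` helper name below is illustrative, not part of the guide:
    
    ```python
    # Illustrative sketch: the parsing lambda from the Python example, written as
    # a named function so it can be exercised without a SparkContext.
    def parse_point(line):
        # Split a whitespace-separated line of gmm_data.txt into floats.
        return [float(x) for x in line.strip().split(' ')]
    
    # With pyspark available, the corrected training call would then be:
    #   from pyspark.mllib.clustering import GaussianMixture
    #   gmm = GaussianMixture.train(parsedData, 2)   # no `new`
    print(parse_point("2.59470454e+00 2.12298217e+00"))
    ```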




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4401#discussion_r24216733
  
    --- Diff: docs/mllib-clustering.md ---
    @@ -168,6 +187,112 @@ print("Within Set Sum of Squared Error = " + str(WSSSE))
     
     </div>
     
    +#### GaussianMixture
    +
    +<div class="codetabs">
    +<div data-lang="scala" markdown="1">
    +In the following example, after loading and parsing data, we use a
    +[`GaussianMixture`](api/scala/index.html#org.apache.spark.mllib.clustering.GaussianMixture)
    +object to cluster the data into two clusters. The desired number of clusters is passed
    +to the algorithm. We then output the parameters of the mixture model.
    +
    +{% highlight scala %}
    +import org.apache.spark.mllib.clustering.GaussianMixture
    +import org.apache.spark.mllib.linalg.Vectors
    +
    +// Load and parse the data
    +val data = sc.textFile("data/mllib/gmm_data.txt")
    +val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble))).cache()
    +
    +// Cluster the data into two classes using GaussianMixture
    +val gmm = new GaussianMixture().setK(2).run(parsedData)
    +
    +// output parameters of max-likelihood model
    +for (i <- 0 until gmm.k) {
    +  println("weight=%f\nmu=%s\nsigma=\n%s\n" format 
    +    (gmm.weights(i), gmm.gaussians(i).mu, gmm.gaussians(i).sigma))
    +}
    +
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="java" markdown="1">
    +All of MLlib's methods use Java-friendly types, so you can import and call them there the same
    +way you do in Scala. The only caveat is that the methods take Scala RDD objects, while the
    +Spark Java API uses a separate `JavaRDD` class. You can convert a Java RDD to a Scala one by
    +calling `.rdd()` on your `JavaRDD` object. A self-contained application example
    +that is equivalent to the provided example in Scala is given below:
    +
    +{% highlight java %}
    +import org.apache.spark.api.java.*;
    +import org.apache.spark.api.java.function.Function;
    +import org.apache.spark.mllib.clustering.GaussianMixture;
    +import org.apache.spark.mllib.clustering.GaussianMixtureModel;
    +import org.apache.spark.mllib.linalg.Vector;
    +import org.apache.spark.mllib.linalg.Vectors;
    +import org.apache.spark.SparkConf;
    +
    +public class GaussianMixtureExample {
    +  public static void main(String[] args) {
    +    SparkConf conf = new SparkConf().setAppName("GaussianMixture Example");
    +    JavaSparkContext sc = new JavaSparkContext(conf);
    +
    +    // Load and parse data
    +    String path = "data/mllib/gmm_data.txt";
    +    JavaRDD<String> data = sc.textFile(path);
    +    JavaRDD<Vector> parsedData = data.map(
    +      new Function<String, Vector>() {
    +        public Vector call(String s) {
    +          String[] sarray = s.trim().split(" ");
    +          double[] values = new double[sarray.length];
    +          for (int i = 0; i < sarray.length; i++)
    +            values[i] = Double.parseDouble(sarray[i]);
    +          return Vectors.dense(values);
    +        }
    +      }
    +    );
    +    parsedData.cache();
    +
    +    // Cluster the data into two classes using GaussianMixture
    +    GaussianMixtureModel gmm = new GaussianMixture().setK(2).run(parsedData.rdd());
    +
    +    // Output the parameters of the mixture model
    +    for (int j = 0; j < gmm.k(); j++) {
    +        // println has no format-string overload; use printf
    +        System.out.printf("weight=%f\nmu=%s\nsigma=\n%s\n",
    +            gmm.weights()[j], gmm.gaussians()[j].mu(), gmm.gaussians()[j].sigma());
    +    }
    +  }
    +}
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="python" markdown="1">
    +In the following example, after loading and parsing data, we use a
    +[`GaussianMixture`](api/scala/index.html#org.apache.spark.mllib.clustering.GaussianMixture)
    --- End diff --
    
    This should be linked to the Python doc, e.g., `[NaiveBayes](api/python/pyspark.mllib.classification.NaiveBayes-class.html)`.




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73165600
  
    @tgaloppo Thanks for working on the user guide! I saw this is still marked as `WIP`. Are you going to add more content?




[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73242351
  
      [Test build #26915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26915/consoleFull) for   PR 4401 at commit [`c9ff9a5`](https://github.com/apache/spark/commit/c9ff9a5536a3141baa785b87a7fce05543b90c84).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73177613
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26884/
    Test PASSed.




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73171464
  
      [Test build #26884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26884/consoleFull) for   PR 4401 at commit [`2368690`](https://github.com/apache/spark/commit/23686904dbea03d10264b00b9fffa797d21d43f0).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by tgaloppo <gi...@git.apache.org>.
Github user tgaloppo commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73243150
  
    @mengxr I was able to build the docs (I had to do a clean build on my source tree for some reason).  I have checked all links (that I added, anyway).  I also updated mllib-guide.md to include Gaussian mixture (and power iteration, since it was also missing) in the list of clustering techniques.
    
    I've removed the WIP tag; I will not add any more content, unless requested.





[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/4401




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73089327
  
      [Test build #26845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26845/consoleFull) for   PR 4401 at commit [`3eb41fa`](https://github.com/apache/spark/commit/3eb41fa1f304d817109d5f56349dfcca49119957).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by tgaloppo <gi...@git.apache.org>.
Github user tgaloppo commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73171324
  
    @mengxr I have made the fixes you pointed out. I am having trouble building the API docs, so I cannot verify that the link to the Python GaussianMixture class resolves properly.




[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73287334
  
    LGTM. Merged into master and branch-1.3! Thanks for working on the user guide!




[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73102218
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26845/
    Test PASSed.




[GitHub] spark pull request: [SPARK-5013] [MLlib] Added documentation and s...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73254105
  
      [Test build #26915 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26915/consoleFull) for   PR 4401 at commit [`c9ff9a5`](https://github.com/apache/spark/commit/c9ff9a5536a3141baa785b87a7fce05543b90c84).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class GaussianMixtureExample `





[GitHub] spark pull request: [SPARK-5013] [MLlib] [WIP] Added documentation...

Posted by tgaloppo <gi...@git.apache.org>.
Github user tgaloppo commented on the pull request:

    https://github.com/apache/spark/pull/4401#issuecomment-73166676
  
    @mengxr I was considering adding a graphic of the test data with the recovered 2-D Gaussians... I'm not sure whether it would really be beneficial.

