Posted to reviews@spark.apache.org by wangmiao1981 <gi...@git.apache.org> on 2016/05/25 19:58:26 UTC

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

GitHub user wangmiao1981 opened a pull request:

    https://github.com/apache/spark/pull/13301

    [SPARK-15449][MLlib][Example]:Wrong Data Format - Documentation Issue

    ## What changes were proposed in this pull request?
    
    In the MLlib NaiveBayes example, the Scala and Python examples don't use libsvm data, but the Java example does.
    
    I changed the Scala and Python examples to use the same libsvm data as the Java example.
    
    ## How was this patch tested?
    
    Manual tests
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangmiao1981/spark example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13301.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13301
    
----
commit fa3656e2aab980c0413357699d3774faf8372b0e
Author: wm624@hotmail.com <wm...@hotmail.com>
Date:   2016-05-25T19:55:18Z

    change data source for mllib naivebayes example

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13301#discussion_r64717582
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/NaiveBayesExample.scala ---
    @@ -31,16 +30,11 @@ object NaiveBayesExample {
         val conf = new SparkConf().setAppName("NaiveBayesExample")
         val sc = new SparkContext(conf)
         // $example on$
    -    val data = sc.textFile("data/mllib/sample_naive_bayes_data.txt")
    -    val parsedData = data.map { line =>
    -      val parts = line.split(',')
    -      LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    -    }
    +    // Load and parse the data file.
    +    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
     
         // Split data into training (60%) and test (40%).
    -    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    -    val training = splits(0)
    -    val test = splits(1)
    +    val Array(training, test) = data.randomSplit(Array(0.6, 0.4))
    --- End diff --
    
    I see you removed the `seed` here. Removing it isn't required, but it probably makes sense as a simplification. In that case, perhaps also remove the `seed` arg from the Java & Python examples.
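To make the effect of the `seed` argument concrete, here is a minimal plain-Python sketch of a weighted random split. It is only a stand-in for `randomSplit`, not Spark's implementation: the helper name `random_split` and the use of the standard `random` module are assumptions made for illustration.

```python
import random


def random_split(items, weights, seed=None):
    """Assign each item to one split with probability proportional to its weight.

    Hypothetical helper for illustration; it is not Spark's randomSplit.
    """
    rng = random.Random(seed)  # seed=None draws from system entropy
    total = sum(weights)
    # Cumulative, normalized upper bounds, e.g. [0.6, 1.0] for weights [0.6, 0.4].
    bounds = []
    running = 0.0
    for w in weights:
        running += w / total
        bounds.append(running)
    splits = [[] for _ in weights]
    for item in items:
        u = rng.random()
        for i, bound in enumerate(bounds):
            # The last split also catches any floating-point rounding at the top end.
            if u < bound or i == len(bounds) - 1:
                splits[i].append(item)
                break
    return splits


data = list(range(100))
training, test = random_split(data, [0.6, 0.4], seed=11)
again, _ = random_split(data, [0.6, 0.4], seed=11)
print(len(training) + len(test))  # 100: every item lands in exactly one split
print(training == again)          # True: same seed, same split
```

Without a seed the split changes from run to run, which is usually fine for a documentation example. The split sizes are only approximately 60/40 either way.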




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by wangmiao1981 <gi...@git.apache.org>.
Github user wangmiao1981 commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221706864
  
    @srowen The ML examples use sample_libsvm_data.txt in all three languages.
    sample_naive_bayes_data.txt is not in libsvm format. Its format is shown below:
    
    0,1 0 0
    0,2 0 0
    0,3 0 0
    0,4 0 0
    1,0 1 0
    1,0 2 0
    1,0 3 0
    1,0 4 0
    2,0 0 1
    2,0 0 2
    2,0 0 3
    2,0 0 4
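
The difference between the two layouts can be sketched in plain Python. This is a minimal illustration, assuming the libsvm layout `label index:value ...` with one-based feature indices. The helper names (`parse_csv_style`, `parse_libsvm`, `to_libsvm`) are hypothetical and are not part of Spark.

```python
def parse_csv_style(line):
    """Parse 'label,f1 f2 f3', the layout of the rows shown above."""
    label, features = line.split(",")
    return float(label), [float(x) for x in features.split(" ")]


def parse_libsvm(line, num_features):
    """Parse 'label index:value ...' (one-based indices) into a dense list."""
    parts = line.split()
    dense = [0.0] * num_features
    for item in parts[1:]:
        index, value = item.split(":")
        dense[int(index) - 1] = float(value)
    return float(parts[0]), dense


def to_libsvm(label, features):
    """Render a dense row as a libsvm line, skipping zero entries."""
    pairs = [f"{i + 1}:{v:g}" for i, v in enumerate(features) if v != 0]
    return " ".join([f"{label:g}"] + pairs)


label, features = parse_csv_style("1,0 2 0")
line = to_libsvm(label, features)
print(line)                   # 1 2:2
print(parse_libsvm(line, 3))  # (1.0, [0.0, 2.0, 0.0])
```

The libsvm form stores only non-zero entries, so it cannot be read with the comma-and-space parser that the old Scala example used.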
    





[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221692582
  
    **[Test build #59291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59291/consoleFull)** for PR 13301 at commit [`fa3656e`](https://github.com/apache/spark/commit/fa3656e2aab980c0413357699d3774faf8372b0e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221699387
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221696827
  
    **[Test build #59292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59292/consoleFull)** for PR 13301 at commit [`5bf30dd`](https://github.com/apache/spark/commit/5bf30dd7e9049c4bb52daff1ff33ce06f2c47e08).




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221692715
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59291/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by wangmiao1981 <gi...@git.apache.org>.
Github user wangmiao1981 commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221722166
  
    @srowen I think we can delete it. Let me double check it and update this PR. Thanks!




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222221171
  
    **[Test build #59507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59507/consoleFull)** for PR 13301 at commit [`7d3b5b0`](https://github.com/apache/spark/commit/7d3b5b0b081b6a0741f296df073ae52128bb6d7a).




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by wangmiao1981 <gi...@git.apache.org>.
Github user wangmiao1981 commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221722860
  
    I grepped all Scala/Java/Python files and found no reference to the data file.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221699248
  
    **[Test build #59292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59292/consoleFull)** for PR 13301 at commit [`5bf30dd`](https://github.com/apache/spark/commit/5bf30dd7e9049c4bb52daff1ff33ce06f2c47e08).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221731591
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59300/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222224371
  
    LGTM pending tests




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13301




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221718026
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222230386
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59507/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222230238
  
    **[Test build #59507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59507/consoleFull)** for PR 13301 at commit [`7d3b5b0`](https://github.com/apache/spark/commit/7d3b5b0b081b6a0741f296df073ae52128bb6d7a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221731484
  
    **[Test build #59300 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59300/consoleFull)** for PR 13301 at commit [`2e15ec9`](https://github.com/apache/spark/commit/2e15ec9771014a6b708d153fa76cc840c4c552e5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by wangmiao1981 <gi...@git.apache.org>.
Github user wangmiao1981 commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222220897
  
    @MLnick Done. Removed the seed in the Python and Java examples.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221717914
  
    **[Test build #59297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59297/consoleFull)** for PR 13301 at commit [`16c6f6c`](https://github.com/apache/spark/commit/16c6f6c02c8d6a60db446efe02b81bfc1ae2b33f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by wangmiao1981 <gi...@git.apache.org>.
Github user wangmiao1981 commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222040536
  
    @MLnick I am traveling now. I will update it on Saturday. Thanks!




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222230384
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221715737
  
    **[Test build #59297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59297/consoleFull)** for PR 13301 at commit [`16c6f6c`](https://github.com/apache/spark/commit/16c6f6c02c8d6a60db446efe02b81bfc1ae2b33f).




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221705531
  
    Hm, I guess I don't see why this should use the libsvm sample? 2 of the 3 examples used the naive bayes sample, and it's a naive bayes example. Shouldn't Java be fixed?




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221731589
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221825221
  
    Minor comment otherwise LGTM




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-222284311
  
    Merged to master/2.0




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221720899
  
    OK, got it. I think `sample_naive_bayes_data.txt` is no longer used now, so can it be deleted?




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221699390
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59292/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221718028
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59297/
    Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221692709
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221690057
  
    **[Test build #59291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59291/consoleFull)** for PR 13301 at commit [`fa3656e`](https://github.com/apache/spark/commit/fa3656e2aab980c0413357699d3774faf8372b0e).




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13301#discussion_r64652199
  
    --- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/NaiveBayesExample.scala ---
    @@ -31,21 +32,17 @@ object NaiveBayesExample {
         val conf = new SparkConf().setAppName("NaiveBayesExample")
         val sc = new SparkContext(conf)
         // $example on$
    -    val data = sc.textFile("data/mllib/sample_naive_bayes_data.txt")
    -    val parsedData = data.map { line =>
    -      val parts = line.split(',')
    -      LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    -    }
    +    // Load and parse the data file.
    +    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
     
         // Split data into training (60%) and test (40%).
    -    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    -    val training = splits(0)
    -    val test = splits(1)
    +    val splits = data.randomSplit(Array(0.6, 0.4))
    +    val (trainingData, testData) = (splits(0), splits(1))
    --- End diff --
    
    Let's not change the variable names, since they were consistent.
    But hey, in Scala this can be `val Array(training, test) = data.randomSplit(...)`




[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13301#issuecomment-221723706
  
    **[Test build #59300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59300/consoleFull)** for PR 13301 at commit [`2e15ec9`](https://github.com/apache/spark/commit/2e15ec9771014a6b708d153fa76cc840c4c552e5).

