Posted to commits@spark.apache.org by pw...@apache.org on 2014/01/14 08:08:41 UTC

[8/8] git commit: Merge pull request #380 from mateiz/py-bayes

Merge pull request #380 from mateiz/py-bayes

Add Naive Bayes to Python MLlib, and some API fixes

- Added a Python wrapper for Naive Bayes
- Updated the Scala Naive Bayes to better match the style of our other
  algorithms and, in particular, to make it easier to call from Java
  (added a builder pattern, removed the default value in the train method)
- Updated Python MLlib functions to not require a SparkContext; we can
  get that from the RDD the user passes in
- Added a toString method in LabeledPoint
- Made the Python MLlib tests run as part of run-tests as well
  (previously they could only be run individually, one file at a time)
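
To illustrate the SparkContext change listed above, here is a minimal
sketch of the calling convention before and after. It is not the code
from this commit: StubContext, StubRDD, train_old_style and
train_new_style are hypothetical stand-ins so the example runs without
Spark. The one thing assumed from PySpark itself is that an RDD exposes
its SparkContext through a `context` attribute.

```python
# Hypothetical stand-ins: they only mimic the shape of a dataset that
# remembers which context created it. They are not PySpark classes.
class StubContext:
    def __init__(self, name):
        self.name = name


class StubRDD:
    def __init__(self, context, rows):
        # Assumption: mirrors the `context` attribute on a PySpark RDD.
        self.context = context
        self.rows = list(rows)


def train_old_style(sc, data):
    """Old calling shape: the caller passes the context explicitly."""
    return {"context": sc.name, "count": len(data.rows)}


def train_new_style(data):
    """New calling shape: the context is recovered from the dataset."""
    sc = data.context
    return train_old_style(sc, data)


# Two (label, features) rows, purely illustrative.
rdd = StubRDD(StubContext("local"), [(0.0, [1.0, 0.0]), (1.0, [0.0, 1.0])])
result = train_new_style(rdd)
print(result)
```

The second form removes a redundant argument and makes it impossible to
pass a context that does not match the dataset.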


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/fdaabdc6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/fdaabdc6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/fdaabdc6

Branch: refs/heads/master
Commit: fdaabdc67387524ffb84354f87985f48bd31cf60
Parents: 4a805af cc93c2a
Author: Patrick Wendell <pw...@gmail.com>
Authored: Mon Jan 13 23:08:26 2014 -0800
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Mon Jan 13 23:08:26 2014 -0800

----------------------------------------------------------------------
 docs/_config.yml                                |  2 +-
 docs/mllib-guide.md                             | 19 +++--
 docs/python-programming-guide.md                |  8 +-
 mllib/data/sample_naive_bayes_data.txt          |  6 ++
 .../spark/mllib/api/python/PythonMLLibAPI.scala | 17 +++++
 .../classification/LogisticRegression.scala     |  4 +-
 .../spark/mllib/classification/NaiveBayes.scala | 65 ++++++++++++++---
 .../apache/spark/mllib/classification/SVM.scala |  2 +
 .../spark/mllib/regression/LabeledPoint.scala   |  6 +-
 .../apache/spark/mllib/regression/Lasso.scala   |  4 +-
 .../mllib/regression/LinearRegression.scala     |  2 +
 .../mllib/regression/RidgeRegression.scala      |  4 +-
 .../classification/JavaNaiveBayesSuite.java     | 72 ++++++++++++++++++
 python/pyspark/mllib/_common.py                 |  2 +-
 python/pyspark/mllib/classification.py          | 77 +++++++++++++++++---
 python/pyspark/mllib/clustering.py              | 11 +--
 python/pyspark/mllib/recommendation.py          | 10 ++-
 python/pyspark/mllib/regression.py              | 35 +++++----
 python/pyspark/worker.py                        |  4 +
 python/run-tests                                |  5 ++
 20 files changed, 297 insertions(+), 58 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/fdaabdc6/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
----------------------------------------------------------------------