Posted to commits@systemml.apache.org by de...@apache.org on 2017/04/07 18:58:34 UTC

[30/50] [abbrv] incubator-systemml git commit: [SYSTEMML-1238] Updated the default parameters of mllearn to match those of scikit-learn.

[SYSTEMML-1238] Updated the default parameters of mllearn to match those of
scikit-learn.

- Also updated the test to compare our algorithm to scikit-learn.

Closes #398.


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/0fb74b94
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/0fb74b94
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/0fb74b94

Branch: refs/heads/gh-pages
Commit: 0fb74b94af9e244b5695745ac7b3651b485b812f
Parents: bb97a4b
Author: Niketan Pansare <np...@us.ibm.com>
Authored: Fri Feb 17 14:54:23 2017 -0800
Committer: Niketan Pansare <np...@us.ibm.com>
Committed: Fri Feb 17 14:59:49 2017 -0800

----------------------------------------------------------------------
 algorithms-regression.md  | 8 ++++----
 beginners-guide-python.md | 2 +-
 python-reference.md       | 6 +++---
 3 files changed, 8 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/algorithms-regression.md
----------------------------------------------------------------------
diff --git a/algorithms-regression.md b/algorithms-regression.md
index 992862e..80b38a3 100644
--- a/algorithms-regression.md
+++ b/algorithms-regression.md
@@ -83,8 +83,8 @@ efficient when the number of features $m$ is relatively small
 <div data-lang="Python" markdown="1">
 {% highlight python %}
 from systemml.mllearn import LinearRegression
-# C = 1/reg
-lr = LinearRegression(sqlCtx, fit_intercept=True, C=1.0, solver='direct-solve')
+# C = 1/reg (to disable regularization, use float("inf"))
+lr = LinearRegression(sqlCtx, fit_intercept=True, normalize=False, C=float("inf"), solver='direct-solve')
 # X_train, y_train and X_test can be NumPy matrices or Pandas DataFrame or SciPy Sparse Matrix
 y_test = lr.fit(X_train, y_train)
 # df_train is DataFrame that contains two columns: "features" (of type Vector) and "label". df_test is a DataFrame that contains the column "features"
@@ -125,8 +125,8 @@ y_test = lr.fit(df_train)
 <div data-lang="Python" markdown="1">
 {% highlight python %}
 from systemml.mllearn import LinearRegression
-# C = 1/reg
-lr = LinearRegression(sqlCtx, fit_intercept=True, max_iter=100, tol=0.000001, C=1.0, solver='newton-cg')
+# C = 1/reg (to disable regularization, use float("inf"))
+lr = LinearRegression(sqlCtx, fit_intercept=True, normalize=False, max_iter=100, tol=0.000001, C=float("inf"), solver='newton-cg')
 # X_train, y_train and X_test can be NumPy matrices or Pandas DataFrames or SciPy Sparse matrices
 y_test = lr.fit(X_train, y_train)
 # df_train is DataFrame that contains two columns: "features" (of type Vector) and "label". df_test is a DataFrame that contains the column "features"

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/beginners-guide-python.md
----------------------------------------------------------------------
diff --git a/beginners-guide-python.md b/beginners-guide-python.md
index 4d1b098..ffab09e 100644
--- a/beginners-guide-python.md
+++ b/beginners-guide-python.md
@@ -228,7 +228,7 @@ X_test = diabetes_X[-20:]
 y_train = diabetes.target[:-20]
 y_test = diabetes.target[-20:]
 # Create linear regression object
-regr = LinearRegression(sqlCtx, fit_intercept=True, C=1, solver='direct-solve')
+regr = LinearRegression(sqlCtx, fit_intercept=True, C=float("inf"), solver='direct-solve')
 # Train the model using the training sets
 regr.fit(X_train, y_train)
 y_predicted = regr.predict(X_test)

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/0fb74b94/python-reference.md
----------------------------------------------------------------------
diff --git a/python-reference.md b/python-reference.md
index 65dcb5c..8d38598 100644
--- a/python-reference.md
+++ b/python-reference.md
@@ -731,7 +731,7 @@ LogisticRegression score: 0.922222
 
 ### Reference documentation
 
- *class*`systemml.mllearn.estimators.LinearRegression`(*sqlCtx*, *fit\_intercept=True*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LinearRegression "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.LinearRegression`(*sqlCtx*, *fit\_intercept=True*, *normalize=False*, *max\_iter=100*, *tol=1e-06*, *C=float("inf")*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LinearRegression "Permalink to this definition")
 :   Bases: `systemml.mllearn.estimators.BaseSystemMLRegressor`{.xref .py
     .py-class .docutils .literal}
 
@@ -760,7 +760,7 @@ LogisticRegression score: 0.922222
         >>> # The mean square error
         >>> print("Residual sum of squares: %.2f" % np.mean((regr.predict(diabetes_X_test) - diabetes_y_test) ** 2))
 
- *class*`systemml.mllearn.estimators.LogisticRegression`(*sqlCtx*, *penalty='l2'*, *fit\_intercept=True*, *max\_iter=100*, *max\_inner\_iter=0*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LogisticRegression "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.LogisticRegression`(*sqlCtx*, *penalty='l2'*, *fit\_intercept=True*, *normalize=False*,  *max\_iter=100*, *max\_inner\_iter=0*, *tol=1e-06*, *C=1.0*, *solver='newton-cg'*, *transferUsingDF=False*)(#systemml.mllearn.estimators.LogisticRegression "Permalink to this definition")
 :   Bases: `systemml.mllearn.estimators.BaseSystemMLClassifier`{.xref
     .py .py-class .docutils .literal}
 
@@ -817,7 +817,7 @@ LogisticRegression score: 0.922222
         >>> prediction = model.transform(test)
         >>> prediction.show()
 
- *class*`systemml.mllearn.estimators.SVM`(*sqlCtx*, *fit\_intercept=True*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *is\_multi\_class=False*, *transferUsingDF=False*)(#systemml.mllearn.estimators.SVM "Permalink to this definition")
+ *class*`systemml.mllearn.estimators.SVM`(*sqlCtx*, *fit\_intercept=True*, *normalize=False*, *max\_iter=100*, *tol=1e-06*, *C=1.0*, *is\_multi\_class=False*, *transferUsingDF=False*)(#systemml.mllearn.estimators.SVM "Permalink to this definition")
 :   Bases: `systemml.mllearn.estimators.BaseSystemMLClassifier`{.xref
     .py .py-class .docutils .literal}
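
Note on the "C = 1/reg" comment in the hunks above: C is the inverse of the
regularization strength, so the new LinearRegression default C=float("inf")
corresponds to reg = 0, i.e. no regularization, while the old default C=1.0
corresponds to reg = 1. The sketch below only illustrates that arithmetic. The
helper name c_to_reg is hypothetical and is not part of systemml.mllearn; how
mllearn performs the conversion internally is not shown in this commit.

```python
def c_to_reg(C):
    """Map an inverse regularization strength C to reg = 1/C.

    Hypothetical helper for illustration only; not a systemml.mllearn API.
    """
    if C <= 0:
        raise ValueError("C must be positive")
    # In Python float arithmetic, 1.0 / float("inf") evaluates to 0.0,
    # which is why C=float("inf") disables regularization.
    return 1.0 / C


reg_old_default = c_to_reg(1.0)           # old default, C=1.0
reg_new_default = c_to_reg(float("inf"))  # new default, C=float("inf")
print(reg_old_default, reg_new_default)
```

Under this reading, moving the default from C=1.0 to C=float("inf") takes
LinearRegression from a regularized fit to an unregularized least-squares fit,
which is the behavior the commit message says it is matching in scikit-learn.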