Posted to dev@madlib.apache.org by fmcquillan99 <gi...@git.apache.org> on 2018/02/14 00:46:52 UTC
[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits
GitHub user fmcquillan99 opened a pull request:
https://github.com/apache/madlib/pull/235
update KNN, DT and RF docs to match recent commits
KNN
* describe weighted average in more detail
DT & RF
* correct some doc errors and omissions
* update example to show positive variable importance in RF
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/madlib/madlib master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/235.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #235
----
commit d15f9fcc1ea625514aeeb7418f52f3e5b80c532c
Author: Frank McQuillan <fm...@...>
Date: 2018-02-14T00:16:57Z
update KNN, DT and RF docs to match recent commits
----
---
[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits
Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/235#discussion_r168523757
--- Diff: src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in ---
@@ -208,13 +208,26 @@ forest_train(training_table_name,
<tr>
<th>dependent_var_levels</th>
- <td>itext. For classification, the distinct levels of the dependent variable.</td>
+ <td>text. For classification, the distinct levels of the dependent variable.</td>
</tr>
<tr>
<th>dependent_var_type</th>
<td>text. The type of dependent variable.</td>
</tr>
+
+ <tr>
+ <th>independent_var_types</th>
+ <td>text. A comma separated string for the types of independent variables.</td>
+ </tr>
+
+ <tr>
+ <th>null_proxy</th>
+ <td>text. Describes how NULLs are handled. If NULL is not
+ treated as a separate categorical variable, this will be blank.
--- End diff --
Again, `this will be NULL` is more appropriate.
---
[GitHub] madlib issue #235: update KNN, DT and RF docs to match recent commits
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/madlib/pull/235
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/madlib-pr-build/347/
---
[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/madlib/pull/235
---
[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits
Posted by fmcquillan99 <gi...@git.apache.org>.
Github user fmcquillan99 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/235#discussion_r168557191
--- Diff: src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in ---
@@ -208,13 +208,26 @@ forest_train(training_table_name,
<tr>
<th>dependent_var_levels</th>
- <td>itext. For classification, the distinct levels of the dependent variable.</td>
+ <td>text. For classification, the distinct levels of the dependent variable.</td>
</tr>
<tr>
<th>dependent_var_type</th>
<td>text. The type of dependent variable.</td>
</tr>
+
+ <tr>
+ <th>independent_var_types</th>
+ <td>text. A comma separated string for the types of independent variables.</td>
+ </tr>
+
+ <tr>
+ <th>null_proxy</th>
+ <td>text. Describes how NULLs are handled. If NULL is not
+ treated as a separate categorical variable, this will be blank.
--- End diff --
Thanks, I made the suggested changes.
---
[GitHub] madlib issue #235: update KNN, DT and RF docs to match recent commits
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/madlib/pull/235
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/madlib-pr-build/344/
---
[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits
Posted by iyerr3 <gi...@git.apache.org>.
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/235#discussion_r168523662
--- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
@@ -355,6 +355,19 @@ tree_train(
<th>independent_var_types</th>
<td>TEXT. A comma separated string for the types of independent variables.</td>
</tr>
+
+ <tr>
+ <th>n_folds</th>
+ <td>BIGINT. Number of cross-validation folds used.</td>
+ </tr>
+
+ <tr>
+ <th>null_proxy</th>
+ <td>TEXT. Describes how NULLs are handled. If NULL is not
+ treated as a separate categorical variable, this will be blank.
--- End diff --
I suggest replacing `this will be blank` with `this will be NULL`. psql displays NULL as blank by default, but that display setting can easily be changed.
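To illustrate the point about psql's display default, here is a minimal sketch of a psql session. It assumes a PostgreSQL connection through psql; the `(null)` marker string is an arbitrary choice, and the literal values stand in for a real summary-table column such as `null_proxy`.

```sql
-- By default psql prints NULL as an empty string, so a NULL and an
-- empty text value look identical in the output.
SELECT NULL::text AS null_value, ''::text AS empty_value;

-- Change how psql displays NULL; '(null)' is an arbitrary marker.
\pset null '(null)'

-- The same query now renders the two values differently.
SELECT NULL::text AS null_value, ''::text AS empty_value;

-- Independent of any display setting, IS NULL tells the two apart.
SELECT NULL::text IS NULL AS null_is_null,
       ''::text IS NULL AS empty_is_null;
```

The stored value is NULL regardless of how a client renders it, which is why "NULL" is the safer wording for the docs than "blank".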
---