Posted to commits@systemds.apache.org by ja...@apache.org on 2021/05/13 11:41:01 UTC
[systemds] branch master updated: [SYSTEMDS-2870][DOC] Builtin Decision Tree (#1272)
This is an automated email from the ASF dual-hosted git repository.
janardhan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git
The following commit(s) were added to refs/heads/master by this push:
new dadc154 [SYSTEMDS-2870][DOC] Builtin Decision Tree (#1272)
dadc154 is described below
commit dadc15453d0de8956a1ac3097c2d47ae1f0547d2
Author: Janardhan Pulivarthi <j1...@protonmail.com>
AuthorDate: Thu May 13 17:10:51 2021 +0530
[SYSTEMDS-2870][DOC] Builtin Decision Tree (#1272)
Reference commit: 44e42945
---
docs/site/builtins-reference.md | 47 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/docs/site/builtins-reference.md b/docs/site/builtins-reference.md
index bb8bb93..9ad2d72 100644
--- a/docs/site/builtins-reference.md
+++ b/docs/site/builtins-reference.md
@@ -33,6 +33,7 @@ limitations under the License.
* [`csplineDS`-Function](#csplineDS-function)
* [`cvlm`-Function](#cvlm-function)
* [`DBSCAN`-Function](#DBSCAN-function)
+ * [`decisionTree`-Function](#decisiontree-function)
* [`discoverFD`-Function](#discoverFD-function)
* [`dist`-Function](#dist-function)
* [`dmv`-Function](#dmv-function)
@@ -374,6 +375,52 @@ X = rand(rows=1780, cols=180, min=1, max=20)
dbscan(X = X, eps = 2.5, minPts = 360)
```
+## `decisionTree`-Function
+
+The `decisionTree()` function implements a classification tree that supports both scale and
+categorical features.
+
+### Usage
+
+```r
+M = decisionTree(X, Y, R);
+```
+
+### Arguments
+
+| Name | Type | Default | Description |
+| :--------- | :-------------- | :--------- | :---------- |
+| X | Matrix[Double] | required | Feature matrix X; note that X needs to be both recoded and dummy coded |
+| Y | Matrix[Double] | required | Label matrix Y; note that Y needs to be both recoded and dummy coded |
+| R | Matrix[Double] | " " | Matrix R which contains the following information for each feature in X: <br> - R[1,]: row vector indicating whether each feature is scale or categorical. 1 indicates <br> a scale feature; any other positive integer gives the number of categories. <br> If R is not provided, all features are assumed to be scale. |
+| bins | Integer | `20` | Number of equi-height bins per scale feature used to choose thresholds |
+| depth | Integer | `25` | Maximum depth of the learned tree |
+| verbose | Boolean | `FALSE` | Whether the algorithm should print information while executing |
+
+### Returns
+
+| Name | Type | Description |
+| :--- | :-----------| :---------- |
+| M | Matrix[Double] | Each column of the matrix corresponds to a node in the learned tree |
+
+### Example
+
+```r
+X = matrix("4.5 4.0 3.0 2.8 3.5
+ 1.9 2.4 1.0 3.4 2.9
+ 2.0 1.1 1.0 4.9 3.4
+ 2.3 5.0 2.0 1.4 1.8
+ 2.1 1.1 3.0 1.0 1.9", rows=5, cols=5)
+Y = matrix("1.0
+ 0.0
+ 0.0
+ 1.0
+ 0.0", rows=5, cols=1)
+R = matrix("1.0 1.0 3.0 1.0 1.0", rows=1, cols=5)
+M = decisionTree(X = X, Y = Y, R = R)
+```
+
+
## `discoverFD`-Function
The `discoverFD`-function finds the functional dependencies.