Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/01/12 14:40:39 UTC

[jira] [Resolved] (SPARK-12773) Impurity and Sample details for each node of a decision tree

     [ https://issues.apache.org/jira/browse/SPARK-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-12773.
-------------------------------
          Resolution: Invalid
    Target Version/s:   (was: 1.5.2)

Please ask questions at user@spark.apache.org

> Impurity and Sample details for each node of a decision tree
> ------------------------------------------------------------
>
>                 Key: SPARK-12773
>                 URL: https://issues.apache.org/jira/browse/SPARK-12773
>             Project: Spark
>          Issue Type: Question
>          Components: ML, MLlib
>    Affects Versions: 1.5.2
>            Reporter: Rahul Tanwani
>
> I would like to understand whether each node in the decision tree calculates or stores the number of samples that satisfy its split criterion. Looking at the code, I found some information about the impurity statistics but nothing about sample counts. scikit-learn exposes both of these metrics. This information would help in cases where multiple decision rules (multiple leaf nodes) yield the same prediction and we want to compare the decision paths against each other.
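
For illustration only, the comparison described in the question could look roughly like the sketch below. It is plain Python and does not use any Spark or scikit-learn API; the helper names (`gini`, `summarize_leaf`) and the leaf data are hypothetical. It assumes Gini impurity as the impurity measure and assumes the training labels that reached each leaf are available.

```python
from collections import Counter

def gini(class_counts):
    """Gini impurity of a node, given its per-class sample counts."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())

def summarize_leaf(labels):
    """Return the majority prediction, sample count, and impurity of a leaf."""
    counts = Counter(labels)
    prediction, _ = counts.most_common(1)[0]
    return {
        "prediction": prediction,
        "samples": len(labels),
        "impurity": gini(counts),
    }

# Hypothetical training labels that ended up in three leaves.
leaves = {
    "leaf_a": ["yes"] * 40 + ["no"] * 10,
    "leaf_b": ["yes"] * 3 + ["no"] * 1,
    "leaf_c": ["no"] * 20,
}
summaries = {name: summarize_leaf(labels) for name, labels in leaves.items()}

# Among leaves that predict "yes", list those backed by more samples first.
yes_leaves = sorted(
    (name for name, s in summaries.items() if s["prediction"] == "yes"),
    key=lambda name: summaries[name]["samples"],
    reverse=True,
)

for name in yes_leaves:
    s = summaries[name]
    print(name, s["samples"], round(s["impurity"], 3))
```

Here both `leaf_a` and `leaf_b` predict "yes", but the sample count shows how much training data supports each rule, which is the kind of relative comparison the question asks about.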


