Posted to issues@spark.apache.org by "Rahul Tanwani (JIRA)" <ji...@apache.org> on 2016/01/12 10:19:39 UTC
[jira] [Created] (SPARK-12773) Impurity and Sample details for each node of a decision tree
Rahul Tanwani created SPARK-12773:
-------------------------------------
Summary: Impurity and Sample details for each node of a decision tree
Key: SPARK-12773
URL: https://issues.apache.org/jira/browse/SPARK-12773
Project: Spark
Issue Type: Question
Components: ML, MLlib
Affects Versions: 1.5.2
Reporter: Rahul Tanwani
I would like to understand whether each node in the decision tree calculates or stores the number of samples that satisfy its split criterion. Looking at the code, I found some information about impurity statistics but nothing about sample counts. scikit-learn exposes both of these metrics. This information would help in cases where multiple decision rules (multiple leaf nodes) yield the same prediction and we want to compare the decision paths against one another.
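To make the request concrete, here is a minimal sketch in plain Python of the two per-node metrics in question. It is not Spark's API or implementation: the `Node` class, the `annotate` and `leaves` helpers, and the toy data are all hypothetical, chosen only to show what a per-node sample count and Gini impurity look like (similar in spirit to the `tree_.n_node_samples` and `tree_.impurity` arrays in scikit-learn).

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


class Node:
    """Hypothetical tree node; a node with feature=None is a leaf."""

    def __init__(self, feature=None, threshold=None, left=None, right=None):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right
        # Filled in by annotate():
        self.n_samples = 0
        self.impurity = 0.0
        self.prediction = None


def annotate(node, rows, labels):
    """Record sample count, impurity, and majority class on every node."""
    node.n_samples = len(labels)
    node.impurity = gini(labels)
    node.prediction = Counter(labels).most_common(1)[0][0] if labels else None
    if node.feature is None:
        return
    # Route each sample to the left child if it satisfies the split rule.
    go_left = [row[node.feature] <= node.threshold for row in rows]
    annotate(node.left,
             [r for r, g in zip(rows, go_left) if g],
             [y for y, g in zip(labels, go_left) if g])
    annotate(node.right,
             [r for r, g in zip(rows, go_left) if not g],
             [y for y, g in zip(labels, go_left) if not g])


def leaves(node):
    """Return the leaf nodes from left to right."""
    if node.feature is None:
        return [node]
    return leaves(node.left) + leaves(node.right)


# Toy data (assumed for illustration): one feature, binary labels.
rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

# Fixed tree: x <= 3.5 goes to a leaf; otherwise split again on x <= 6.5.
root = Node(feature=0, threshold=3.5,
            left=Node(),
            right=Node(feature=0, threshold=6.5, left=Node(), right=Node()))
annotate(root, rows, labels)

for leaf in leaves(root):
    print(leaf.prediction, leaf.n_samples, round(leaf.impurity, 3))
```

In this toy tree the first two leaves both predict class 0, but one is pure and the other is not. That difference is only visible when impurity and sample counts are available per node, which is the comparison described above.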
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org