Posted to issues@madlib.apache.org by "Rahul Iyer (JIRA)" <ji...@apache.org> on 2018/06/13 18:27:00 UTC

[jira] [Created] (MADLIB-1246) Add impurity variable importance to RF

Rahul Iyer created MADLIB-1246:
----------------------------------

             Summary: Add impurity variable importance to RF
                 Key: MADLIB-1246
                 URL: https://issues.apache.org/jira/browse/MADLIB-1246
             Project: Apache MADlib
          Issue Type: New Feature
          Components: Module: Decision Tree, Module: Random Forest
            Reporter: Rahul Iyer
            Assignee: Rahul Iyer
             Fix For: v1.15


From the Breiman resource that we use for random forest:
{quote}Gini importance
{quote}
{quote}Every time a split of a node is made on variable m the gini impurity criterion for the two descendent nodes is less than the parent node. Adding up the gini decreases for each individual variable over all trees in the forest gives a fast variable importance that is often very consistent with the permutation importance measure.
{quote}
We can add a similar measure, called {{impurity_variable_importance}}, to our DT code.
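
A minimal sketch of how such a measure could be accumulated, for illustration only. It is not MADlib code: the function names and the forest representation (each tree as a list of {{(variable, parent_counts, left_counts, right_counts)}} splits) are hypothetical, and weighting each decrease by the number of rows at the node is an assumption that the Breiman text does not spell out.

```python
from collections import defaultdict


def gini(class_counts):
    """Gini impurity of a node, given its per-class row counts."""
    n = sum(class_counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in class_counts)


def split_decrease(parent, left, right):
    """Gini decrease of one split, weighted by rows per node (assumption)."""
    return (sum(parent) * gini(parent)
            - sum(left) * gini(left)
            - sum(right) * gini(right))


def impurity_importance(forest):
    """Add up the Gini decreases per variable over every split in every tree."""
    totals = defaultdict(float)
    for tree in forest:
        for variable, parent, left, right in tree:
            totals[variable] += split_decrease(parent, left, right)
    return dict(totals)


# Hypothetical two-tree forest on a two-class problem.
forest = [
    [("x1", [5, 5], [5, 0], [0, 5])],
    [("x2", [5, 5], [3, 2], [2, 3]),
     ("x1", [3, 2], [3, 0], [0, 2])],
]
importance = impurity_importance(forest)
```

In this toy forest {{x1}} produces pure child nodes in both of its splits, so it ends up with a much larger total than {{x2}}. The totals could be normalized (for example, to sum to 100) before reporting, but that choice is left open here.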



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)