Posted to dev@mahout.apache.org by IKumasa Mukai <ik...@gmail.com> on 2012/02/07 12:22:44 UTC

Re: About "complementary" using mahout to build the random forest

Hi wang-san.

# I have detached this topic and modified the subject.

> Do you know the meaning of the "complementary" option when using Mahout to build a random forest?

Sorry, I don't quite understand your question. Are you asking about
pruning and grafting?

Regards,

2012/2/7 Wang Yue (Commented) (JIRA) <ji...@apache.org>:
>
>    [ https://issues.apache.org/jira/browse/MAHOUT-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201399#comment-13201399 ]
>
> Wang Yue commented on MAHOUT-945:
> ---------------------------------
>
> Hi, Mukai
>  Thanks for your efforts. Your modification looks OK to me. I have a
> question about the decision tree building.
>  Do you know the meaning of the "complementary" option when using Mahout
> to build a random forest?
>
> On Mon, Jan 23, 2012 at 6:55 AM, Ikumasa Mukai (Updated) (JIRA) <
>
>
>
> --
> Regards, Wang Yue
> PhD Starts From 08 Fall
> NUS  Graduate School for Integrative Sciences and Engineering, NUS
> Email: wangyue@nus.edu.sg
> Homepage: https://sites.google.com/site/fayue1015/
> HP:    +65 81022515
>
>
>> The variance calculation of Random forest regression tree
>> ---------------------------------------------------------
>>
>>                 Key: MAHOUT-945
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-945
>>             Project: Mahout
>>          Issue Type: Improvement
>>          Components: Classification
>>    Affects Versions: 0.6
>>            Reporter: Wang Yue
>>              Labels: Regressionsplit.java
>>         Attachments: MAHOUT-945.patch, MAHOUT-945.patch
>>
>>   Original Estimate: 48h
>>  Remaining Estimate: 48h
>>
>> Hi, Mukai
>>   Thanks for your efforts in extending the RF to regression. However, I have a doubt about your implementation in Regressionsplit.java. The variance method is
>> "
>>  private static double variance(double[] s, double[] ss, double[] dataSize) {
>>     double var = 0;
>>     for (int i = 0; i < s.length; i++) {
>>       if (dataSize[i] > 0) {
>>         var += ss[i] - ((s[i] * s[i]) / dataSize[i]);
>>       }
>>     }
>>     return var;
>>   }
>> "
>> In my understanding, the variance should instead be something like
>> var += ss[i]/dataSize[i] - ((s[i] * s[i]) / (dataSize[i]*dataSize[i]));
>> Please correct me if I am wrong. Thanks.
>
>
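
On the variance question in the quoted issue, a minimal sketch may help
compare the two expressions. It assumes that s[i], ss[i], and dataSize[i]
hold the sum, the sum of squares, and the count of the labels in partition i.
Under that assumption, the quoted method accumulates the sum of squared
deviations from each partition's mean, while the proposed expression
accumulates each partition's population variance, so the two differ by a
factor of dataSize[i] per partition. The class name VarianceCheck, the method
names, and the sample values are hypothetical and chosen only for
illustration; only the first loop body mirrors the quoted snippet.

```java
// Hypothetical comparison of the two expressions discussed in MAHOUT-945.
class VarianceCheck {

  // Mirrors the quoted method: per partition, ss - s^2/n, which equals the
  // sum of squared deviations from the partition mean.
  static double sumSquaredDeviations(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        total += ss[i] - (s[i] * s[i]) / dataSize[i];
      }
    }
    return total;
  }

  // The proposed expression: per partition, ss/n - (s/n)^2, which equals the
  // population variance of that partition.
  static double sumOfVariances(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        total += ss[i] / dataSize[i]
            - (s[i] * s[i]) / (dataSize[i] * dataSize[i]);
      }
    }
    return total;
  }

  public static void main(String[] args) {
    // Assumed sample data: partition 0 = {1, 2, 3}, partition 1 = {10, 20}.
    double[] s = {6, 30};        // sums of the labels
    double[] ss = {14, 500};     // sums of the squared labels
    double[] dataSize = {3, 2};  // number of labels per partition

    System.out.println("sum of squared deviations: "
        + sumSquaredDeviations(s, ss, dataSize));
    System.out.println("sum of variances:          "
        + sumOfVariances(s, ss, dataSize));
  }
}
```

With these sample values the first method yields 52 and the second about
25.67. Which of the two quantities a split criterion should use is a design
question that this sketch does not settle.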



-- 
- - - - - - -
IKumasa Mukai at Recruit Co.,Ltd.