Posted to user@mahout.apache.org by Don Pazel <dp...@adconion.com> on 2011/07/12 19:52:39 UTC
Random Forest feature types
From what I can see, the random forest implementation takes either numerical or categorical feature data. That worked fine for me, until I tried to incorporate word or text features. I liked the encoders used in SGD, but they don't seem to apply to random forests. So, did I overlook something simple that would allow me to include word or text features? If not, are there plans (assuming the core algorithm allows) to add these feature types to random forests in the future?
Thanks,
Don Pazel
Re: Random Forest feature types
Posted by deneche abdelhakim <a_...@yahoo.fr>.
I will gladly help with any improvement to Mahout's Decision Forests.
________________________________
From: Ted Dunning <te...@gmail.com>
To: user@mahout.apache.org
Sent: Tuesday, July 12, 2011, 19:00
Subject: Re: Random Forest feature types
The random forest code predates the fancy encoders, so support for them is
limited.
I would expect that you might be able to adapt the code to improve support.
Deneche (the original implementor) would likely be willing to help.
Re: Random Forest feature types
Posted by Ted Dunning <te...@gmail.com>.
The random forest code predates the fancy encoders, so support for them is
limited.
I would expect that you might be able to adapt the code to improve support.
Deneche (the original implementor) would likely be willing to help.
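One possible workaround for the question raised in this thread, sketched here as an illustration only: since the forest is said to accept numerical features, text could be turned into a fixed-length numeric vector before training, using the same hashing idea that the SGD encoders rely on. Nothing below is Mahout API. The class HashedTextFeatures, the method encode, and the parameter numBuckets are hypothetical names made up for this sketch, and the tokenization rule is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper, NOT part of Mahout: hashes text into a fixed-size numeric vector. */
final class HashedTextFeatures {

    private HashedTextFeatures() {}

    /**
     * Encodes free text as a hashed bag-of-words count vector of length numBuckets.
     * Each token is hashed to one bucket; tokens that collide simply add up.
     */
    static double[] encode(String text, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        double[] features = new double[numBuckets];
        if (text == null) {
            return features; // no text: all-zero vector
        }
        // Assumed tokenization: lowercase, split on runs of non-letter, non-digit characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter yields one empty token
            }
            // floorMod keeps the bucket index non-negative even for negative hash codes.
            int bucket = Math.floorMod(token.hashCode(), numBuckets);
            features[bucket] += 1.0;
        }
        return features;
    }

    public static void main(String[] args) {
        // Print one comma-separated row of numeric columns for a sample text field.
        double[] row = encode("Random forests and random text", 16);
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(row[i]);
        }
        System.out.println(line);
    }
}
```

Each of the numBuckets columns would then be treated as an ordinary numerical attribute. A larger numBuckets reduces collisions at the cost of more columns for the trees to consider, so the value is a tuning choice rather than something this thread prescribes.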