Posted to dev@mahout.apache.org by "Suneel Marthi (JIRA)" <ji...@apache.org> on 2014/03/02 22:57:21 UTC
[jira] [Commented] (MAHOUT-1153) Implement streaming random forests
[ https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917581#comment-13917581 ]
Suneel Marthi commented on MAHOUT-1153:
---------------------------------------
[~andytwigg] I understand this has been implemented on Spark and that an implementation is available at http://featurestream.io. Do you think we should start the conversation about rolling this into Mahout?
> Implement streaming random forests
> ----------------------------------
>
> Key: MAHOUT-1153
> URL: https://issues.apache.org/jira/browse/MAHOUT-1153
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Andy Twigg
> Labels: features
> Fix For: Backlog
>
>
> The current random forest implementations are in-core and not scalable. This issue is to add an out-of-core, scalable, streaming implementation. Initially it could be based on [1], using mappers in a master-worker style.
> [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
--
This message was sent by Atlassian JIRA
(v6.2#6252)
Re: [jira] [Commented] (MAHOUT-1153) Implement streaming random forests
Posted by Andy Twigg <an...@gmail.com>.
Yes, we could also consider committing it into the current Mahout
codebase. There are probably some advantages over the current
implementation. What direction are you thinking of?