Posted to dev@mahout.apache.org by "Suneel Marthi (JIRA)" <ji...@apache.org> on 2013/11/05 14:27:18 UTC
[jira] [Commented] (MAHOUT-1153) Implement streaming random forests
[ https://issues.apache.org/jira/browse/MAHOUT-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813912#comment-13813912 ]
Suneel Marthi commented on MAHOUT-1153:
---------------------------------------
Hey Andy,
The GitHub link doesn't work anymore. Do you think this can be part of 0.9?
> Implement streaming random forests
> ----------------------------------
>
> Key: MAHOUT-1153
> URL: https://issues.apache.org/jira/browse/MAHOUT-1153
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Andy Twigg
> Labels: features
> Fix For: Backlog
>
>
> The current random forest implementations are in-core and not scalable. This issue is to add an out-of-core, scalable, streaming implementation. Initially it could be based on [1], using mappers in a master-worker style.
> [1] http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
--
This message was sent by Atlassian JIRA
(v6.1#6144)
Re: [jira] [Commented] (MAHOUT-1153) Implement streaming random forests
Posted by Andy Twigg <an...@gmail.com>.
Hi Suneel,
I spent a significant amount of effort trying to get this working against
0.8, but unfortunately it seemed a bad fit. Instead I wrote a version
against Spark, which is now available as a service: http://featurestream.io
I'm open to open-sourcing it, but I wanted to see what use cases would come
out of it first. If anyone has any good ideas, let me know.
Cheers,
Andy
--
andy.twigg@gmail.com