Posted to dev@mahout.apache.org by Siddharth Prakash Singh <sp...@gmail.com> on 2009/02/20 17:40:51 UTC

Google SoC 2009

Hi all,

I wish to contribute to Mahout as a Google SoC participant this year. I
am interested in implementing a Map/Reduce-enabled machine learning
algorithm.
Any suggestions, please?

Re: Google SoC 2009

Posted by Ted Dunning <te...@gmail.com>.
That is really just the first application of Classifier.  What you are doing
should represent a good opportunity to increase the generality of the
interface, or to point out how a different interface is needed.
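
One way to picture "increasing the generality of the interface" is to make the input type a parameter instead of fixing it to text. The following is only a sketch under that assumption: `Classifier<I>`, `KeywordClassifier` and `MeanThresholdClassifier` are hypothetical names invented for illustration and are not the actual org.apache.mahout.common.Classifier API.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: none of these types exist in Mahout. */
final class GenericClassifierSketch {

    /** Hypothetical generalized interface: the input type I is a parameter. */
    interface Classifier<I> {
        String classify(I input);
    }

    /** Text flavour: labels a tokenized document by the presence of a keyword. */
    static final class KeywordClassifier implements Classifier<List<String>> {
        private final String keyword;

        KeywordClassifier(String keyword) {
            this.keyword = keyword;
        }

        @Override
        public String classify(List<String> tokens) {
            return tokens.contains(keyword) ? "match" : "other";
        }
    }

    /** Numeric flavour: labels a feature vector by comparing its mean to a cutoff. */
    static final class MeanThresholdClassifier implements Classifier<double[]> {
        private final double cutoff;

        MeanThresholdClassifier(double cutoff) {
            this.cutoff = cutoff;
        }

        @Override
        public String classify(double[] features) {
            double sum = 0.0;
            for (double value : features) {
                sum += value;
            }
            // An empty vector is treated as having mean 0.
            double mean = features.length == 0 ? 0.0 : sum / features.length;
            return mean >= cutoff ? "high" : "low";
        }
    }

    public static void main(String[] args) {
        Classifier<List<String>> text = new KeywordClassifier("forest");
        Classifier<double[]> numeric = new MeanThresholdClassifier(2.0);
        System.out.println(text.classify(Arrays.asList("random", "forest")));
        System.out.println(numeric.classify(new double[] {1.0, 2.0, 6.0}));
    }
}
```

With a shape like this, a text classifier and a classifier over arbitrary numeric datasets share one contract, which is the kind of change the RF work could motivate.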

On Tue, Mar 3, 2009 at 3:01 AM, deneche abdelhakim <a_...@yahoo.fr> wrote:

> A question, though: the most basic use of RF is as a classifier. Does that
> mean it must implement the org.apache.mahout.common.Classifier interface? I'm
> not quite sure, but it seems dedicated to classifying text documents, whereas
> RF could be useful for any kind of dataset.
>



-- 
Ted Dunning, CTO
DeepDyve

Re: Google SoC 2009

Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 3, 2009, at 6:01 AM, deneche abdelhakim wrote:
>
> A question, though: the most basic use of RF is as a classifier.
> Does that mean it must implement the org.apache.mahout.common.Classifier
> interface? I'm not quite sure, but it seems dedicated to classifying
> text documents, whereas RF could be useful for any kind of dataset.

We can change or add to the interface if that is appropriate.  We're at
such an early stage that there is no need to be bound by back-compat yet.

-Grant


Re: Google SoC 2009

Posted by deneche abdelhakim <a_...@yahoo.fr>.
I'm seriously considering Random Forests (RF) as my GSoC project. They seem interesting and, judging by how often they have been suggested, they would be very useful to Mahout. I found the following discussion:

http://markmail.org/message/dancn3n76ken6thb

which gives a lot of useful information about RF, and Breiman's web site contains a very clear description of the algorithm and its possible uses.

A question, though: the most basic use of RF is as a classifier. Does that mean it must implement the org.apache.mahout.common.Classifier interface? I'm not quite sure, but that interface seems dedicated to classifying text documents, whereas RF could be useful for any kind of dataset.
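
To make the "RF as a classifier" idea concrete, here is a minimal sketch of the voting step only: a forest labels an input by majority vote over its trees. Everything here is an assumption for illustration. `VectorClassifier`, `Stump` and `Forest` are made-up names, the "trees" are hard-coded single-threshold tests rather than trees grown from bootstrap samples, and nothing is taken from Mahout's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: none of these types exist in Mahout. */
final class ForestVoteSketch {

    /** Hypothetical data-agnostic contract: numeric feature vector in, label out. */
    interface VectorClassifier {
        String classify(double[] features);
    }

    /** Stand-in for a trained tree: one threshold test on one feature. */
    static final class Stump implements VectorClassifier {
        private final int featureIndex;
        private final double threshold;
        private final String labelIfBelow;
        private final String labelIfAtOrAbove;

        Stump(int featureIndex, double threshold,
              String labelIfBelow, String labelIfAtOrAbove) {
            this.featureIndex = featureIndex;
            this.threshold = threshold;
            this.labelIfBelow = labelIfBelow;
            this.labelIfAtOrAbove = labelIfAtOrAbove;
        }

        @Override
        public String classify(double[] features) {
            return features[featureIndex] < threshold ? labelIfBelow : labelIfAtOrAbove;
        }
    }

    /** The forest is itself a classifier: it returns the label most trees voted for. */
    static final class Forest implements VectorClassifier {
        private final List<VectorClassifier> trees;

        Forest(List<VectorClassifier> trees) {
            this.trees = new ArrayList<>(trees);
        }

        @Override
        public String classify(double[] features) {
            Map<String, Integer> votes = new HashMap<>();
            String best = null; // stays null if the forest has no trees
            int bestCount = 0;
            for (VectorClassifier tree : trees) {
                String label = tree.classify(features);
                int count = votes.merge(label, 1, Integer::sum);
                // On ties, the label that reached the top count first is kept.
                if (count > bestCount) {
                    best = label;
                    bestCount = count;
                }
            }
            return best;
        }
    }

    /** Three hand-written stumps, purely for demonstration. */
    static Forest demoForest() {
        return new Forest(Arrays.<VectorClassifier>asList(
                new Stump(0, 5.0, "small", "large"),
                new Stump(1, 2.0, "small", "large"),
                new Stump(0, 7.0, "small", "large")));
    }

    public static void main(String[] args) {
        Forest forest = demoForest();
        System.out.println(forest.classify(new double[] {6.0, 1.0}));
        System.out.println(forest.classify(new double[] {8.0, 3.0}));
    }
}
```

Because the input is just a feature vector, nothing in this shape is tied to text documents, which is the concern raised above.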

--- On Fri 27.2.09, Grant Ingersoll <gs...@apache.org> wrote:

> From: Grant Ingersoll <gs...@apache.org>
> Subject: Re: Google SoC 2009
> To: mahout-dev@lucene.apache.org
> Date: Friday, 27 February 2009, 18:34
> Priority is in the eye of the beholder in Apache land, so
> scratch the itch you are most interested in.  Ultimately,
> we're interested in having a suite of ML libraries, but
> you certainly could do worse than to pick something that has
> proven to be useful, stable and well-used by lots of people
> over time.   I think several of them have been suggested on
> another related thread, but things like neural nets, linear
> regression, random forests, self organizing maps are all of
> interest.
> 
> -Grant

Re: Google SoC 2009

Posted by Grant Ingersoll <gs...@apache.org>.
Priority is in the eye of the beholder in Apache land, so scratch the  
itch you are most interested in.  Ultimately, we're interested in  
having a suite of ML libraries, but you certainly could do worse than  
to pick something that has proven to be useful, stable and well-used  
by lots of people over time.   I think several of them have been  
suggested on another related thread, but things like neural nets,  
linear regression, random forests, self organizing maps are all of  
interest.

-Grant

On Feb 24, 2009, at 12:04 PM, Siddharth Prakash Singh wrote:

> Hi,
>
> No, I don't have any specific interest. I would rather work on
> implementing whichever algorithm has the highest priority.
>
> Awaiting a response.
> Siddharth
>
> On Sat, Feb 21, 2009 at 2:43 AM, Isabel Drost <is...@apache.org> wrote:
>> On Friday 20 February 2009, Siddharth Prakash Singh wrote:
>>> I wish to contribute to Mahout as a Google SoC participant this year. I
>>> am interested in implementing a Map/Reduce-enabled machine learning
>>> algorithm.
>>> Any suggestions, please?
>>
>> Welcome Siddharth. Is there anything machine learning specific that interests
>> you in particular?
>>
>> You can also have a look in the Mahout Wiki as well as the jira to find out
>> more on which algorithms are already available and which are still missing.
>>
>> Isabel
>>
>>
>> --
>> One father is more than a hundred schoolmasters.                -- George Herbert
>>  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
>>  /,`.-'`'    -.  ;-;;,_
>>  |,4-  ) )-,_..;\ (  `'-'
>> '---''(_/--'  `-'\_) (fL)  IM:  <xm...@spaceboyz.net>
>>
>
>
>
> -- 
> Siddharth Prakash Singh
> http://www.spsneo.com

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: Google SoC 2009

Posted by Siddharth Prakash Singh <sp...@gmail.com>.
Hi,

No, I don't have any specific interest. I would rather work on
implementing whichever algorithm has the highest priority.

Awaiting a response.
Siddharth

On Sat, Feb 21, 2009 at 2:43 AM, Isabel Drost <is...@apache.org> wrote:
> On Friday 20 February 2009, Siddharth Prakash Singh wrote:
>> I wish to contribute to Mahout as a Google SoC participant this year. I
>> am interested in implementing a Map/Reduce-enabled machine learning
>> algorithm.
>> Any suggestions, please?
>
> Welcome Siddharth. Is there anything machine learning specific that interests
> you in particular?
>
> You can also have a look in the Mahout Wiki as well as the jira to find out
> more on which algorithms are already available and which are still missing.
>
> Isabel
>
>
> --
> One father is more than a hundred schoolmasters.                -- George Herbert
>  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
>  /,`.-'`'    -.  ;-;;,_
>  |,4-  ) )-,_..;\ (  `'-'
> '---''(_/--'  `-'\_) (fL)  IM:  <xm...@spaceboyz.net>
>



-- 
Siddharth Prakash Singh
http://www.spsneo.com

Re: Google SoC 2009

Posted by Isabel Drost <is...@apache.org>.
On Friday 20 February 2009, Siddharth Prakash Singh wrote:
> I wish to contribute to Mahout as a Google SoC participant this year. I
> am interested in implementing a Map/Reduce-enabled machine learning
> algorithm.
> Any suggestions, please?

Welcome Siddharth. Is there anything machine learning specific that interests 
you in particular?

You can also have a look in the Mahout Wiki as well as the jira to find out 
more on which algorithms are already available and which are still missing.

Isabel 


-- 
One father is more than a hundred schoolmasters.		-- George Herbert
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xm...@spaceboyz.net>