Posted to commits@mahout.apache.org by sr...@apache.org on 2010/05/11 15:26:58 UTC

svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Author: srowen
Date: Tue May 11 13:26:57 2010
New Revision: 943118

URL: http://svn.apache.org/viewvc?rev=943118&view=rev
Log:
Edits from dev list

Modified:
    lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Modified: lucene/mahout/pmc/board-reports/2010/board-report-may.txt
URL: http://svn.apache.org/viewvc/lucene/mahout/pmc/board-reports/2010/board-report-may.txt?rev=943118&r1=943117&r2=943118&view=diff
==============================================================================
--- lucene/mahout/pmc/board-reports/2010/board-report-may.txt (original)
+++ lucene/mahout/pmc/board-reports/2010/board-report-may.txt Tue May 11 13:26:57 2010
@@ -4,15 +4,21 @@ This is the first report from Mahout as 
 recently reported status with Lucene's special April report. We take the
 opportunity to summarize Mahout state and restate recent activity.
 
-Mahout's goal is to build scalable machine learning libraries. "Scalable"
-means Hadoop-based implementations using MapReduce. The "machine learning"
-implemented to date has been primarily in the broad areas of:
+OVERVIEW
+
+Mahout's goal is to build scalable implementations of machine learning and
+data mining algorithms. "Scalable" means designed with exceptional scale in 
+mind, for efficiency and low memory consumption, and in many cases means 
+providing Hadoop-based implementations. The "machine learning" implemented 
+to date has been primarily in the broad areas of:
 
 - Collaborative filtering / recommender engines
 - Clustering
 - Classification
 - Frequent item set mining
 
+CURRENT ACTIVITY
+
 Mahout has created a release approximately every six months, most recently
 releasing version 0.3 in March 2010. The project remains in a state of
 rapid change and evolution, and looks to release 0.4 in September, 2010.
@@ -21,6 +27,11 @@ Recent activity in the project can be vi
 https://issues.apache.org/jira/secure/IssueNavigator.jspa?
   pid=12310751&fixfor=12314396&resolution=1
 
+This month, Mahout will complete migration of website, mailing lists, 
+SVN, and other information to reflect its status as a top-level project.
+
+GOOGLE SUMMER OF CODE
+
 Mahout will mentor five projects as part of Google's Summer of Code 
 program. The projects will add or enhance capability in the specific 
 areas of:
@@ -31,6 +42,9 @@ areas of:
 - Neural network with back propagation learning
 - Eigencuts spectral clustering
 
+MAHOUT IN ACTION
+
 The book "Mahout in Action", published by Manning, continues to be written
 and is approximately half complete. It has received some favorable feedback
 via Manning's early access program.
+



Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Sisir Koppaka <si...@gmail.com>.
Genetic Algorithms are very specific instances of Evolutionary Algorithms
inspired by genes. Evolutionary Computation encompasses the much broader
class of algorithms, including Evolutionary Algorithms (which include
Quantum-Inspired Evolutionary Algos, on which I work and which I will
hopefully port from Mathematica to Java/Mahout one day), Swarm Intelligence,
Learning Classifier Systems (Pittsburgh, Michigan), and so on. Evolutionary
Algorithms seems generic enough yet legitimate enough (no empty shouts) for
our case, doesn't it?

On Tue, May 11, 2010 at 11:51 PM, Sean Owen <sr...@gmail.com> wrote:

> Nah I think it's fine to mention. The old "three Cs" meme (CF,
> Clustering, Classification) is outdated now so might as well fully
> update. If it were something that people just would like someone to
> support someday, I'd say let's not yet claim Mahout encompasses those
> topics. But yeah watchmaker is legitimate enough.
>
> Shall we label this notion "genetic algorithms" or "evolutionary
> algorithms" in general?
>
> On Tue, May 11, 2010 at 7:09 PM, Sisir Koppaka <si...@gmail.com> wrote:
> > Ok...cool. It's just that if it's going on the website by any chance,
> > mentioning evolutionary algorithms(org.apache.mahout.ga.watchmaker) might
> > attract contributors from the area. There are quite interesting algorithms
> > like NSGA-II <http://www.iitk.ac.in/kangal/codes.shtml>, and several memetic
> > algos that could be ported with widely available sources in other
> > languages(usually C and C++).
> >
> > I guess you're right, this isn't in the scope of the board report. It should
> > much rather be on the upcoming site on things-to-do or something of that
> > nature.
> >
> > On Tue, May 11, 2010 at 11:17 PM, Sean Owen <sr...@gmail.com> wrote:
> >
> >> Sure, that's in scope. There's not much you could call evolutionary in
> >> the code base yet, compared to what you see for CF, clustering,
> >> classification, and maybe frequent item set mining, in terms of
> >> quantity and maturity. So I'm just trying to usefully express the
> >> reality of the project's current state, rather than say "it's just
> >> machine learning stuff in general" without precluding things like
> >> this.
> >>
> >> On Tue, May 11, 2010 at 5:33 PM, Sisir Koppaka <sisir.koppaka@gmail.com> wrote:
> >> > Hi,
> >> > Please correct me if I am wrong - but isn't Mahout also into Evolutionary
> >> > Algorithms and Programming? Is it missing in the report by mistake?
> >> >
> >> > Sisir
> >> >
> >>
> >
> >
> >
> > --
> > SK
> >
>



-- 
SK

Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Grant Ingersoll <gs...@apache.org>.
For the first Board report, I think it is useful to provide a bit of context about what Mahout is, but after that, Board reports should mostly be short and to the point, highlighting any key things that happened since the last report or anything that requires Board attention.

-Grant

On May 11, 2010, at 2:21 PM, Sean Owen wrote:

> Nah I think it's fine to mention. The old "three Cs" meme (CF,
> Clustering, Classification) is outdated now so might as well fully
> update. If it were something that people just would like someone to
> support someday, I'd say let's not yet claim Mahout encompasses those
> topics. But yeah watchmaker is legitimate enough.
> 
> Shall we label this notion "genetic algorithms" or "evolutionary
> algorithms" in general?
> 
> On Tue, May 11, 2010 at 7:09 PM, Sisir Koppaka <si...@gmail.com> wrote:
>> Ok...cool. It's just that if it's going on the website by any chance,
>> mentioning evolutionary algorithms(org.apache.mahout.ga.watchmaker) might
>> attract contributors from the area. There are quite interesting algorithms
>> like NSGA-II <http://www.iitk.ac.in/kangal/codes.shtml>, and several memetic
>> algos that could be ported with widely available sources in other
>> languages(usually C and C++).
>> 
>> I guess you're right, this isn't in the scope of the board report. It should
>> much rather be on the upcoming site on things-to-do or something of that
>> nature.
>> 
>> On Tue, May 11, 2010 at 11:17 PM, Sean Owen <sr...@gmail.com> wrote:
>> 
>>> Sure, that's in scope. There's not much you could call evolutionary in
>>> the code base yet, compared to what you see for CF, clustering,
>>> classification, and maybe frequent item set mining, in terms of
>>> quantity and maturity. So I'm just trying to usefully express the
>>> reality of the project's current state, rather than say "it's just
>>> machine learning stuff in general" without precluding things like
>>> this.
>>> 
>>> On Tue, May 11, 2010 at 5:33 PM, Sisir Koppaka <si...@gmail.com>
>>> wrote:
>>>> Hi,
>>>> Please correct me if I am wrong - but isn't Mahout also into Evolutionary
>>>> Algorithms and Programming? Is it missing in the report by mistake?
>>>> 
>>>> Sisir
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> SK
>> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search


Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Sean Owen <sr...@gmail.com>.
Nah I think it's fine to mention. The old "three Cs" meme (CF,
Clustering, Classification) is outdated now so might as well fully
update. If it were something that people just would like someone to
support someday, I'd say let's not yet claim Mahout encompasses those
topics. But yeah watchmaker is legitimate enough.

Shall we label this notion "genetic algorithms" or "evolutionary
algorithms" in general?

On Tue, May 11, 2010 at 7:09 PM, Sisir Koppaka <si...@gmail.com> wrote:
> Ok...cool. It's just that if it's going on the website by any chance,
> mentioning evolutionary algorithms(org.apache.mahout.ga.watchmaker) might
> attract contributors from the area. There are quite interesting algorithms
> like NSGA-II <http://www.iitk.ac.in/kangal/codes.shtml>, and several memetic
> algos that could be ported with widely available sources in other
> languages(usually C and C++).
>
> I guess you're right, this isn't in the scope of the board report. It should
> much rather be on the upcoming site on things-to-do or something of that
> nature.
>
> On Tue, May 11, 2010 at 11:17 PM, Sean Owen <sr...@gmail.com> wrote:
>
>> Sure, that's in scope. There's not much you could call evolutionary in
>> the code base yet, compared to what you see for CF, clustering,
>> classification, and maybe frequent item set mining, in terms of
>> quantity and maturity. So I'm just trying to usefully express the
>> reality of the project's current state, rather than say "it's just
>> machine learning stuff in general" without precluding things like
>> this.
>>
>> On Tue, May 11, 2010 at 5:33 PM, Sisir Koppaka <si...@gmail.com>
>> wrote:
>> > Hi,
>> > Please correct me if I am wrong - but isn't Mahout also into Evolutionary
>> > Algorithms and Programming? Is it missing in the report by mistake?
>> >
>> > Sisir
>> >
>>
>
>
>
> --
> SK
>

Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Sisir Koppaka <si...@gmail.com>.
Ok...cool. It's just that if it's going on the website by any chance,
mentioning evolutionary algorithms(org.apache.mahout.ga.watchmaker) might
attract contributors from the area. There are quite interesting algorithms
like NSGA-II <http://www.iitk.ac.in/kangal/codes.shtml>, and several memetic
algos that could be ported with widely available sources in other
languages(usually C and C++).

I guess you're right, this isn't in the scope of the board report. It should
much rather be on the upcoming site on things-to-do or something of that
nature.

On Tue, May 11, 2010 at 11:17 PM, Sean Owen <sr...@gmail.com> wrote:

> Sure, that's in scope. There's not much you could call evolutionary in
> the code base yet, compared to what you see for CF, clustering,
> classification, and maybe frequent item set mining, in terms of
> quantity and maturity. So I'm just trying to usefully express the
> reality of the project's current state, rather than say "it's just
> machine learning stuff in general" without precluding things like
> this.
>
> On Tue, May 11, 2010 at 5:33 PM, Sisir Koppaka <si...@gmail.com>
> wrote:
> > Hi,
> > Please correct me if I am wrong - but isn't Mahout also into Evolutionary
> > Algorithms and Programming? Is it missing in the report by mistake?
> >
> > Sisir
> >
>



-- 
SK

Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Sean Owen <sr...@gmail.com>.
Sure, that's in scope. There's not much you could call evolutionary in
the code base yet, compared to what you see for CF, clustering,
classification, and maybe frequent item set mining, in terms of
quantity and maturity. So I'm just trying to usefully express the
reality of the project's current state, rather than say "it's just
machine learning stuff in general" without precluding things like
this.

On Tue, May 11, 2010 at 5:33 PM, Sisir Koppaka <si...@gmail.com> wrote:
> Hi,
> Please correct me if I am wrong - but isn't Mahout also into Evolutionary
> Algorithms and Programming? Is it missing in the report by mistake?
>
> Sisir
>

Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Sisir Koppaka <si...@gmail.com>.
Hi,
Please correct me if I am wrong - but isn't Mahout also into Evolutionary
Algorithms and Programming? Is it missing in the report by mistake?

Sisir

Re: svn commit: r943118 - /lucene/mahout/pmc/board-reports/2010/board-report-may.txt

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
+1, the best so far

