Posted to commits@mahout.apache.org by is...@apache.org on 2009/11/19 09:36:32 UTC

svn commit: r882075 - in /lucene/mahout/site: publish/index.html publish/index.pdf src/documentation/content/xdocs/index.xml

Author: isabel
Date: Thu Nov 19 08:36:14 2009
New Revision: 882075

URL: http://svn.apache.org/viewvc?rev=882075&view=rev
Log:
Added some general "Which use cases can you solve with Mahout"
information to the index page. (Thanks to ghostbuster for inspiration.)

Modified:
    lucene/mahout/site/publish/index.html
    lucene/mahout/site/publish/index.pdf
    lucene/mahout/site/src/documentation/content/xdocs/index.xml

Modified: lucene/mahout/site/publish/index.html
URL: http://svn.apache.org/viewvc/lucene/mahout/site/publish/index.html?rev=882075&r1=882074&r2=882075&view=diff
==============================================================================
--- lucene/mahout/site/publish/index.html (original)
+++ lucene/mahout/site/publish/index.html Thu Nov 19 08:36:14 2009
@@ -237,7 +237,7 @@
 <a name="N1000C"></a><a name="Apache Lucene"></a>
 <h2 class="boxed">Apache Lucene Mahout</h2>
 <div class="section">
-<p>Mahout't goal is to build scalable machine learning libraries. With scalable we mean:
+<p>Mahout's goal is to build scalable machine learning libraries. With scalable we mean:
         <ul>
 <li>Scalable to reasonably large data sets. Our core algorithms for clustering, classfication and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However we do not restrict contributions to Hadoop based implementations: Contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms.</li>
             
@@ -248,6 +248,8 @@
 </ul>
       
 </p>
+<p>Currently Mahout supports mainly four use cases: Recommendation mining takes users' behavior and from that tries to find items users might like. Clustering takes e.g. text documents and groups them into groups of topically related documents. Classification learns from exisiting categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart content) and identifies, which individual items usually appear together.
+      </p>
 <p>Interested in helping? See the
         <a href="http://cwiki.apache.org/MAHOUT/">Wiki</a>
         or send us an <a href="./mailinglists.html">email</a>. Also note, we are just getting off the ground, so please
@@ -256,10 +258,10 @@
 </div>
 
     
-<a name="N1002C"></a><a name="Mahout News"></a>
+<a name="N1002F"></a><a name="Mahout News"></a>
 <h2 class="boxed">Mahout News</h2>
 <div class="section">
-<a name="N10032"></a><a name="17+Nov.+2009+-+Apache+Mahout+0.2+released"></a>
+<a name="N10035"></a><a name="17+Nov.+2009+-+Apache+Mahout+0.2+released"></a>
 <h3 class="boxed">17 Nov. 2009 - Apache Mahout 0.2 released</h3>
 <p>The Apache Lucene project is pleased to announce the release of Apache Mahout 0.2.</p>
 <p>
@@ -288,7 +290,7 @@
           <a href="http://www.apache.org/dyn/closer.cgi/lucene/mahout/">Apache Mirrors</a>
         
 </p>
-<a name="N10062"></a><a name="14+August+2009+-+Lucene+at+US+ApacheCon"></a>
+<a name="N10065"></a><a name="14+August+2009+-+Lucene+at+US+ApacheCon"></a>
 <h3 class="boxed">14 August 2009 - Lucene at US ApacheCon</h3>
 <p>
           
@@ -392,7 +394,7 @@
           </li>
         
 </ul>
-<a name="N100E0"></a><a name="07+April+2009+-+Apache+Mahout+0.1+released"></a>
+<a name="N100E3"></a><a name="07+April+2009+-+Apache+Mahout+0.1+released"></a>
 <h3 class="boxed">07 April 2009 - Apache Mahout 0.1 released</h3>
 <p>The Apache Lucene project is pleased to announce the release of Apache Mahout 0.1.
           Apache Mahout is a subproject of Apache Lucene with the goal of delivering scalable
@@ -425,7 +427,7 @@
           <a href="http://www.apache.org/dyn/closer.cgi/lucene/mahout/">Apache Mirrors</a>
         
 </p>
-<a name="N1010D"></a><a name="09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam"></a>
+<a name="N10110"></a><a name="09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in+Amsterdam"></a>
 <h3 class="boxed">09 February 2009 - Lucene at ApacheCon Europe 2009 in Amsterdam</h3>
 <p>
           
@@ -493,7 +495,7 @@
 
         
 </ul>
-<a name="N10160"></a><a name="22+July+2008+-+Lucene+at+ApacheCon+New+Orleans"></a>
+<a name="N10163"></a><a name="22+July+2008+-+Lucene+at+ApacheCon+New+Orleans"></a>
 <h3 class="boxed">22 July 2008 - Lucene at ApacheCon New Orleans</h3>
 <p>
           
@@ -525,14 +527,14 @@
           </li>
         
 </ul>
-<a name="N10192"></a><a name="4+April+2008+-+Mahout+-+Now+with+more+Taste%21"></a>
+<a name="N10195"></a><a name="4+April+2008+-+Mahout+-+Now+with+more+Taste%21"></a>
 <h3 class="boxed">4 April 2008 - Mahout - Now with more Taste!</h3>
 <p>We are pleased to announce that the Taste Collaborative Filtering (<a href="http://taste.sf.net">Taste on
           SourceForge</a>) has donated it's codebase to the Mahout project. In the coming weeks and months we will work
           to bring it into Mahout and then make it run on Hadoop, bringing truly large scale collaborative filtering
           capabilities to our users.
         </p>
-<a name="N101A0"></a><a name="16+March+2008+-+Google+Summer+Of+Code+Projects"></a>
+<a name="N101A3"></a><a name="16+March+2008+-+Google+Summer+Of+Code+Projects"></a>
 <h3 class="boxed">16 March 2008 - Google Summer Of Code Projects</h3>
 <p>The ASF is in the process of creating projects for Google's annual Summer of Code Project. Mahout has a
           number of people willing to be mentors, so if you are a student interested in working on machine learning
@@ -540,7 +542,7 @@
           <a href="http://wiki.apache.org/general/SummerOfCode2008">Summer of Code</a>
           wiki page.
         </p>
-<a name="N101AE"></a><a name="22+January+2008+-+Mahout+launches"></a>
+<a name="N101B1"></a><a name="22+January+2008+-+Mahout+launches"></a>
 <h3 class="boxed">22 January 2008 - Mahout launches</h3>
 <p>The
           <a href="http://lucene.apache.org">Lucene PMC</a>

Modified: lucene/mahout/site/publish/index.pdf
URL: http://svn.apache.org/viewvc/lucene/mahout/site/publish/index.pdf?rev=882075&r1=882074&r2=882075&view=diff
==============================================================================
Binary files - no diff available.

Modified: lucene/mahout/site/src/documentation/content/xdocs/index.xml
URL: http://svn.apache.org/viewvc/lucene/mahout/site/src/documentation/content/xdocs/index.xml?rev=882075&r1=882074&r2=882075&view=diff
==============================================================================
--- lucene/mahout/site/src/documentation/content/xdocs/index.xml (original)
+++ lucene/mahout/site/src/documentation/content/xdocs/index.xml Thu Nov 19 08:36:14 2009
@@ -8,12 +8,14 @@
   <body>
     <section id="Apache Lucene">
       <title>Apache Lucene Mahout</title>
-      <p>Mahout't goal is to build scalable machine learning libraries. With scalable we mean:
+      <p>Mahout's goal is to build scalable machine learning libraries. With scalable we mean:
         <ul><li>Scalable to reasonably large data sets. Our core algorithms for clustering, classfication and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However we do not restrict contributions to Hadoop based implementations: Contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms.</li>
             <li>Scalable to support your business case. Mahout is distributed under a commercially friendly Apache Software license.</li>
             <li>Scalable community. The goal of Mahout is to build a vibrant, responsive, diverse community to facilitate discussions not only on the project itself but also on potential use cases. Come to the mailing lists to find out more.</li>
         </ul>
       </p>
+      <p>Currently Mahout supports mainly four use cases: Recommendation mining takes users' behavior and from that tries to find items users might like. Clustering takes e.g. text documents and groups them into groups of topically related documents. Classification learns from exisiting categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart content) and identifies, which individual items usually appear together.
+      </p>
       <p>Interested in helping? See the
         <a href="http://cwiki.apache.org/MAHOUT/">Wiki</a>
         or send us an <a href="./mailinglists.html">email</a>. Also note, we are just getting off the ground, so please