Posted to commits@mahout.apache.org by sr...@apache.org on 2009/10/16 17:29:46 UTC

svn commit: r825939 [2/2] - in /lucene/mahout/site/src/documentation/content/xdocs: developer-resources.xml index.xml mailinglists.xml releases.xml site.xml systemrequirements.xml tabs.xml taste.xml whoweare.xml

Modified: lucene/mahout/site/src/documentation/content/xdocs/taste.xml
URL: http://svn.apache.org/viewvc/lucene/mahout/site/src/documentation/content/xdocs/taste.xml?rev=825939&r1=825938&r2=825939&view=diff
==============================================================================
--- lucene/mahout/site/src/documentation/content/xdocs/taste.xml (original)
+++ lucene/mahout/site/src/documentation/content/xdocs/taste.xml Fri Oct 16 15:29:45 2009
@@ -1,432 +1,668 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <document>
-<header><title>Apache Mahout - Taste Documentation</title></header>
-<properties>
-<author email="srowen@apache.org">Sean Owen</author>
-</properties>
-<body>
-
-<section id="overview"><title>Overview</title>
-
-<p>Taste is a flexible, fast collaborative filtering engine for Java. The engine takes users'
-preferences for items ("tastes") and returns estimated preferences for other items. For example, a
-site that sells books or CDs could easily use Taste to figure out, from past purchase data, which
-CDs a customer might be interested in listening to.</p>
-
-<p>Taste provides a rich set of components from which you can construct a customized recommender
-system from a selection of algorithms. Taste is designed to be enterprise-ready; it's designed for
-performance, scalability and flexibility.
-Taste is not just for Java; it can be run as an external server which exposes recommendation logic
-to your application via web services and HTTP.</p>
-
-<p>Top-level packages define the Taste interfaces to these key abstractions:</p>
-
-<ul>
-  <li><code>DataModel</code></li>
-  <li><code>UserSimilarity</code> and <code>ItemSimilarity</code></li>
-  <li><code>UserNeighborhood</code></li>
-  <li><code>Recommender</code></li>
-</ul>
-
-<p>Subpackages of <code>org.apache.mahout.cf.taste.impl</code> hold implementations of these interfaces.
-These are the pieces from which you will build your own recommendation engine. That's it!
-For the academically inclined, Taste supports both <em>memory-based</em> and <em>item-based</em>
-recommender systems, <em>slope one</em> recommenders, and a couple other experimental implementations.
-It does not currently support <em>model-based</em> recommenders.</p>
-
-</section>
-
-<section id="architecture"><title>Architecture</title>
-
-<p class="centertext"><img src="images/taste-architecture.png" alt="Taste Architecture" height="1060" width="442"/></p>
-
-<p>This diagram shows the relationship between various Taste components in a user-based recommender.
-An item-based recommender system is similar except that there are no PreferenceInferrers or Neighborhood
-algorithms involved.</p>
-
-<section><title>Recommender</title>
-
-<p>A <code>Recommender</code> is the core abstraction in Taste. Given a <code>DataModel</code>, it can produce
-recommendations. Applications will most likely use the <code>GenericUserBasedRecommender</code> implementation
-or <code>GenericItemBasedRecommender</code>, possibly decorated by <code>CachingRecommender</code>.</p>
-
-</section>
-
-<section><title>DataModel</title>
-
-<p>A <code>DataModel</code> is the interface to information about user preferences. An implementation might
-draw this data from any source, but a database is the most likely source. Taste provides <code>MySQLJDBCDataModel</code>
-to access preference data from a database via JDBC, though many applications will want to write their own.
-Taste also provides a <code>FileDataModel</code>.</p>
-
-<p>There are no abstractions for a user or item in the object model (not anymore). Users and items are identified
-solely by an ID value in the framework. Further, this ID value must be numeric; it is a Java <code>long</code>
-type through the APIs. A <code>Preference</code> object or <code>PreferenceArray</code> object encapsulates
-the relation between user and preferred items (or items and users preferring them).</p>
-
-<p>Finally, Taste supports, in various ways, a so-called "boolean" data model in which users do not express
-preferences of varying strengths for items, but simply express an association or none at all. For example, while 
-users might express a preference from 1 to 5 in the context of a movie recommender site, there may be no
-notion of a preference value between users and pages in the context of recommending pages on a web site: there
-is only a notion of an association, or none, between a user and pages that have been visited.</p>
-
-</section>
-
-<section><title>UserSimilarity, ItemSimilarity</title>
-
-<p>A <code>UserSimilarity</code> defines a notion of similarity between two <code>User</code>s.
-This is a crucial part of a recommendation engine. These are attached to a <code>Neighborhood</code> implementation.
-<code>ItemSimilarity</code>s are analogous, but find similarity between <code>Item</code>s.</p>
-
-</section>
-
-<section><title>UserNeighborhood</title>
-
-<p>In a user-based recommender, recommendations are produced by finding a "neighborhood" of
-similar users near a given user. A <code>UserNeighborhood</code> defines a means of determining
-that neighborhood &#8212; for example, nearest 10 users. Implementations typically need a
-<code>UserSimilarity</code> to operate.</p>
-
-</section>
-
-</section>
-
-<section id="requirements"><title>Requirements</title>
-
-<section><title>Required</title>
-
-<ul>
- <li><a href="http://java.sun.com/j2se/1.5.0/index.jsp">Java / J2SE 6.0</a></li>
-</ul>
-
-</section>
-
-<section><title>Optional</title>
-
-<ul>
- <li><a href="http://ant.apache.org/">Apache Ant</a> 1.5 or later and <a href="http://maven.apache.org">Maven</a>
-  2.0.10 or later, if you want to build from source or build examples. (Mac users note that even OS X 10.5
-  ships with Maven 2.0.6, which will not work.)</li>
- <li>Taste web applications require a <a href="http://java.sun.com/products/servlet/index.jsp">Servlet 2.3+</a>
-  container, such as
-  <a href="http://jakarta.apache.org/tomcat/">Jakarta Tomcat</a>. It may in fact work with older
-  containers with slight modification.</li>
- <li><code>MySQLJDBCDataModel</code> implementation requires a
-  <a href="http://www.mysql.com/products/mysql/">MySQL 4.x</a> (or later) database.
-  Again, it may be made to work with earlier versions or other databases with slight changes.</li>
-
-</ul>
-
-</section>
-
-</section>
-
-<section id="demo"><title>Demo</title>
-
-<p>To build and run the demo, follow the instructions below, which are written for Unix-like operating systems:</p>
-
-<ol>
-  <li>Obtain a copy of the Mahout distribution, either from SVN or as a downloaded archive.</li>
-  <li>Download the "1 Million MovieLens Dataset" from
-   <a href="http://www.grouplens.org/">http://www.grouplens.org/</a>.</li>
-  <li>Unpack the archive and copy <code>movies.dat</code> and <code>ratings.dat</code> to
-   <code>trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens</code> under the Mahout distribution
-   directory.</li>
-  <li>Navigate to the directory where you unpacked the Mahout distribution, and navigate to <code>trunk</code>.</li>
-  <li>Run <code>mvn install</code>, which builds and installs Mahout core to your local repository</li>
-  <li><code>cd taste-web</code></li>
-  <li><code>cp ../examples/target/grouplens.jar ./lib</code></li>
-  <li>Edit <code>recommender.properties</code> and fill in the <code>recommender.class</code>:
-      <code>recommender.class=org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommender</code>
-  </li>
-  <li><code>mvn package</code></li>
-  <li><code>mvn jetty:run-war</code>. You may need to give Maven more memory: in a bash shell, <code>export MAVEN_OPTS=-Xmx1024M</code></li>
-  <li>Get recommendations by accessing the web application in your browser:<br/>
-    <code>http://localhost:8080/RecommenderServlet?userID=1</code><br/>
-    This will produce a simple preference-item ID list which could be consumed by a client application.
-    Get more useful human-readable output with the <code>debug</code> parameter:<br/>
-    <code>http://localhost:8080/RecommenderServlet?userID=1&amp;debug=true</code></li>
-</ol>
-
-<p>Incidentally, Taste's web service interface may then be found at:<br/>
-<code>http://localhost:8080/RecommenderService.jws</code><br/>
-Its WSDL file will be here...<br/>
-<code>http://localhost:8080/RecommenderService.jws?wsdl</code><br/>
-... and you can even access it in your browser via a simple HTTP request:<br/>
-<code>.../RecommenderService.jws?method=recommend&amp;userID=1&amp;howMany=10</code></p>
-
-<p><em>Note: the exact URL where the service is deployed depends on how you deployed the application in your 
-app server. For instance if you deployed it as a .war file called 'mahout-taste-webapp.war', it will deploy
-at a URI whose path begins with /mahout-taste-webapp/ instead.</em></p>
-
-</section>
-
-<section id="examples"><title>Examples</title>
-
-<section><title>User-based Recommender</title>
-
-<p>User-based recommenders are the "original", conventional style of recommender system. They can produce good
-recommendations when tweaked properly; they are not necessarily the fastest recommender systems and
-are thus suitable for small data sets (roughly, less than ten million ratings). We'll start with an example of this.</p>
-
-<p>First, create a <code>DataModel</code> of some kind. Here, we'll use a simple one based
-on data in a file. The file should be in CSV format, with lines of the form <code>userID,itemID,prefValue</code>
-(e.g. "39505,290002,3.5"):</p>
-
-<pre>DataModel model = new FileDataModel(new File("data.txt"));
-</pre>
-
-<p>We'll use the PearsonCorrelationSimilarity implementation of <code>UserSimilarity</code> as our user
-correlation algorithm, and add an optional preference inference algorithm:</p>
-
-<pre>UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model);
-// Optional:
-userSimilarity.setPreferenceInferrer(new AveragingPreferenceInferrer());
-</pre>
-
-<p>Now we create a <code>UserNeighborhood</code> algorithm. Here we use nearest-3:</p>
-
-<pre>UserNeighborhood neighborhood =
-  new NearestNUserNeighborhood(3, userSimilarity, model);
-</pre>
-
-<p>Now we can create our <code>Recommender</code>, and add a caching decorator:</p>
-
-<pre>Recommender recommender =
-  new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
-Recommender cachingRecommender = new CachingRecommender(recommender);
-</pre>
-
-<p>Now we can get 10 recommendations for user ID "1234" &#8212; done!</p>
-
-<pre>List&lt;RecommendedItem&gt; recommendations =
-  cachingRecommender.recommend(1234, 10);
-</pre>
-
-</section>
-
-<section><title>Item-based Recommender</title>
-
-<p>We could have created an item-based recommender instead. Item-based recommenders base recommendations
-not on user similarity, but on item similarity. In theory these are about the same approach to the
-problem, just from different angles. However the similarity of two items is relatively fixed, more so
-than the similarity of two users. So, item-based recommenders can use pre-computed similarity values
-in the computations, which make them much faster. For large data sets, item-based recommenders
-are more appropriate.</p>
-
-<p>Let's start over, again with a <code>FileDataModel</code> to start:</p>
-
-<pre>DataModel model = new FileDataModel(new File("data.txt"));
-</pre>
-
-<p>We'll also need an <code>ItemSimilarity</code>. We could use <code>PearsonCorrelationSimilarity</code>,
-which computes item similarity in realtime, but, this is generally too slow to be useful.
-Instead, in a real application, you would feed a list of pre-computed correlations to
-a <code>GenericItemSimilarity</code>:</p>
-
-<pre>// Construct the list of pre-computed correlations
-Collection&lt;GenericItemSimilarity.ItemItemSimilarity&gt; correlations =
-  ...;
-ItemSimilarity itemSimilarity =
-  new GenericItemSimilarity(correlations);
-
-</pre>
-
-<p>Then we can finish as before to produce recommendations:</p>
-
-<pre>Recommender recommender =
-  new GenericItemBasedRecommender(model, itemSimilarity);
-Recommender cachingRecommender = new CachingRecommender(recommender);
-...
-List&lt;RecommendedItem&gt; recommendations =
-  cachingRecommender.recommend(1234, 10);
-</pre>
-
-</section>
-
-<section><title>Slope-One Recommender</title>
-
-<p>This is a simple yet effective <code>Recommender</code> and we present another example to
-round out the list:</p>
-
-<pre>DataModel model = new FileDataModel(new File("data.txt"));
-// Make a weighted slope one recommender
-Recommender recommender = new SlopeOneRecommender(model);
-Recommender cachingRecommender = new CachingRecommender(recommender);
-</pre>
-
-</section>
-
-</section>
-
-<section id="integration"><title>Integration with your application</title>
-
-<section><title>Direct</title>
-
-<p>You can create a <code>Recommender</code>, as shown above, wherever you like in your Java application, and use it. This
-includes simple Java applications or GUI applications, server applications, and J2EE web applications.</p>
-
-</section>
-
-<section><title>Standalone server</title>
-
-<p>Taste can also be run as an external server, which may be the only option for non-Java applications.
-A Taste Recommender can be exposed as a web application via <code>org.apache.mahout.cf.taste.web.RecommenderServlet</code>,
-and your application can then access recommendations via simple HTTP requests and response, or as a
-full-fledged SOAP web service. See above, and see
-<code>the javadoc</code> for details.</p>
-
-<p>To deploy your <code>Recommender</code> as an external server:</p>
-
-<ol>
-  <li>Obtain a copy of the Mahout distribution, either from SVN or as a downloaded archive.</li>
-  <li>Create an implementation of <code>org.apache.mahout.cf.taste.recommender.Recommender</code> (must have a no-arg constructor).</li>
-  <li>Compile it and create a JAR file containing your implementation.</li>
-  <li>Navigate to the directory where you unpacked the Mahout distribution, and navigate to <code>trunk</code>.</li>
-  <li>Run <code>mvn install</code>, which builds and installs Mahout core to your local repository</li>
-  <li><code>cd taste-web</code></li>
-  <li>Copy your .jar file: <code>cp [your .jar file] ./lib</code></li>
-  <li>Edit <code>recommender.properties</code> and fill in the <code>recommender.class</code> with your Recommender class:
-      <code>recommender.class=[your recommender class]</code>
-  </li>
-  <li><code>mvn package</code></li>
-  <li>Your .war file is now available in the build directory as <code>mahout-taste-webapp.war</code> (which can be renamed).</li>
-</ol>
-
-</section>
-
-</section>
-
-<section id="performance"><title>Performance</title>
-
-<section><title>Runtime Performance</title>
-
-<p>The more data you give Taste, the better. Though Taste is designed for performance, you will undoubtedly run into
-performance issues at some point. For best results, consider passing the following command-line flags to your JVM:</p>
-
-<ul>
-  <li><code>-server</code>: Enables the server VM, which is generally appropriate for long-running,
-  computation-intensive applications.</li>
-  <li><code>-Xms1024m -Xmx1024m</code>: Make the heap as big as possible -- a gigabyte doesn't hurt when dealing
-  with tens of millions of preferences. Taste will generally use as much memory as you give it for caching, which helps
-  performance. Set the initial and max size to the same value to avoid wasting time growing the
-  heap, and to avoid having the JVM run minor collections to avoid growing the heap, which will clear
-  cached values.</li>
-  <li><code>-da -dsa</code>: Disable all assertions.</li>
-  <li><code>-XX:NewRatio=9</code>: Increase heap allocated to 'old' objects, which is most of them in this framework</li>
-  <li><code>-XX:+UseParallelGC -XX:+UseParallelOldGC</code> (multi-processor machines only): Use a GC algorithm designed to take
-  advantage of multiple processors, and designed for throughput. This is a default in J2SE 5.0.</li>
-  <li><code>-XX:+DisableExplicitGC</code>: Disable calls to <code>System.gc()</code>. These calls can only
-  hurt in the presence of modern GC algorithms; they may force Taste to remove cached data needlessly.
-  This flag isn't needed if you're sure your code and third-party code you use doesn't call this method.</li>
-</ul>
-
-<p>Also consider the following tips:</p>
-
-<ul>
-  <li>Use <code>CachingRecommender</code> on top of your custom <code>Recommender</code> implementation.</li>
-  <li>When using <code>JDBCDataModel</code>, make sure you've taken basic steps to optimize the table storing
-  preference data. Create a primary key on the user ID and item ID columns, and an index on them. Set them to
-  be non-null. And so on. Tune your database for lots of concurrent reads! When using JDBC,
-  the database is almost always the bottleneck. Plenty of memory and caching are even more important.</li>
-
-  <li>Also, pooling database connections is essential to performance. If using a J2EE container, it probably
-  provides a way to configure connection pools. If you are creating your own <code>DataSource</code> directly,
-  try wrapping it in <code>org.apache.mahout.cf.taste.impl.model.jdbc.ConnectionPoolDataSource</code></li>
-  <li>See MySQL-specific notes on performance in the javadoc for
-  <code>MySQLJDBCDataModel</code>.</li>
-</ul>
-
-</section>
-
-<section><title>Algorithm Performance: Which One Is Best?</title>
-
-<p>There is no right answer; it depends on your data, your application, environment, and performance needs.
-Taste provides the building blocks from which you can construct the best <code>Recommender</code> for your
-application. The links below provide research on this topic. You will probably need a bit of trial-and-error to find
-a setup that works best. The code sample above provides a good starting point.</p>
-
-<p>Fortunately, Taste provides a way to evaluate the accuracy of your <code>Recommender</code> on your own
-data, in <code>org.apache.mahout.cf.taste.eval</code>:</p>
-
-<pre>DataModel myModel = ...;
-RecommenderBuilder builder = new RecommenderBuilder() {
-    public Recommender buildRecommender(DataModel model) {
-      // build and return the Recommender to evaluate here
-    }
-  };
-RecommenderEvaluator evaluator =
-  new AverageAbsoluteDifferenceRecommenderEvaluator();
-double evaluation = evaluator.evaluate(builder, myModel, 0.9, 1.0);
-</pre>
-
-<p>For "boolean" data model situations, where there are no notions of preference value, the above evaluation
-based on estimated preference does not make sense. In this case, try this kind of evaluation, which presents
-traditional information retrieval figures like precision and recall, which are more meaningful:</p>
-
-<pre>
-...
-RecommenderIRStatsEvaluator evaluator =
-  new GenericRecommenderIRStatsEvaluator();
-IRStatistics stats =
-  evaluator.evaluate(builder, myModel, null, 3,
-                     GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
-                     1.0);
-</pre>
-
-</section>
-
-</section>
-
-<section id="useful"><title>Useful Links</title>
-
-<p>You'll want to look at these packages too, which offer more algorithms and approaches that you
-may find useful:</p>
-
-<ul>
-  <li><a href="http://www.nongnu.org/cofi/">Cofi</a>: A Java-Based Collaborative Filtering Library</li>
-  <li><a href="http://eecs.oregonstate.edu/iis/CoFE/">CoFE</a></li>
-</ul>
-
-<p>Here's a handful of research papers that I've read and found particularly useful:</p>
-
-<blockquote cite="http://research.microsoft.com/research/pubs/view.aspx?tr_id=166"><p>J.S. Breese, D. Heckerman
- and C. Kadie, "<a href="http://research.microsoft.com/research/pubs/view.aspx?tr_id=166">Empirical Analysis of
- Predictive Algorithms for Collaborative Filtering</a>,"
- in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998),
- 1998.</p></blockquote>
-<blockquote cite="http://www10.org/cdrom/papers/519/"><p>B. Sarwar, G. Karypis, J. Konstan and J. Riedl,
- "<a href="http://www10.org/cdrom/papers/519/">Item-based collaborative filtering recommendation
- algorithms</a>," in Proceedings of the Tenth International Conference on the World Wide Web (WWW 10),
- pp. 285-295, 2001.</p></blockquote>
-<blockquote cite="http://doi.acm.org/10.1145/192844.192905"><p>P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl,
- "<a href="http://doi.acm.org/10.1145/192844.192905">GroupLens: an open architecture for
- collaborative filtering of netnews</a>," in Proceedings of the 1994 ACM conference on Computer Supported Cooperative
- Work (CSCW 1994), pp. 175-186, 1994.</p></blockquote>
-<blockquote cite="http://www.grouplens.org/papers/pdf/algs.pdf"><p>J.L. Herlocker, J.A. Konstan,
- A. Borchers and J. Riedl, "<a href="http://www.grouplens.org/papers/pdf/algs.pdf">An algorithmic framework for
- performing collaborative filtering</a>," in Proceedings of the 22nd annual international ACM SIGIR Conference
- on Research and Development in Information Retrieval (SIGIR 99), pp. 230-237, 1999.</p></blockquote>
-
-<blockquote cite="http://materialobjects.com/cf/MovieRecommender.pdf"><p>Clifford Lyon,
- "<a href="http://materialobjects.com/cf/MovieRecommender.pdf">Movie Recommender</a>,"
- CSCI E-280 final project, Harvard University, 2004.</p></blockquote>
-<blockquote cite="http://www.daniel-lemire.com/fr/abstracts/SDM2005.html"><p>Daniel Lemire, Anna Maclachlan,
- "<a href="http://www.daniel-lemire.com/fr/abstracts/SDM2005.html">Slope One Predictors for Online Rating-Based
- Collaborative Filtering</a>," Proceedings of SIAM Data Mining (SDM '05), 2005.</p></blockquote>
-<blockquote cite="http://www.daniel-lemire.com/fr/documents/publications/racofi_nrc.pdf"><p>
- Michelle Anderson, Marcel Ball, Harold Boley, Stephen Greene, Nancy Howse, Daniel Lemire and Sean McGrath,
- "<a href="http://www.daniel-lemire.com/fr/documents/publications/racofi_nrc.pdf">RACOFI: A Rule-Applying Collaborative
- Filtering System</a>," Proceedings of COLA '03, 2003.</p></blockquote>
-
-<p>These links will take you to all the collaborative filtering reading you could ever want!</p>
-
-<ul>
- <li><a href="http://www.paulperry.net/notes/cf.asp">Paul Perry's notes</a></li>
- <li><a href="http://jamesthornton.com/cf/">James Thornton's collaborative filtering resources</a></li>
- <li><a href="http://www.daniel-lemire.com/blog/">Daniel Lemire's blog</a> which frequently covers collaborative filtering topics</li>
-</ul>
+  <header>
+    <title>Apache Mahout - Taste Documentation</title>
+  </header>
+  <properties>
+    <author email="srowen@apache.org">Sean Owen</author>
+  </properties>
+  <body>
+
+    <section id="overview">
+      <title>Overview</title>
+
+      <p>Taste is a flexible, fast collaborative filtering engine for Java. The engine takes users'
+        preferences for items ("tastes") and returns estimated preferences for other items. For example, a
+        site that sells books or CDs could easily use Taste to figure out, from past purchase data, which
+        CDs a customer might be interested in listening to.
+      </p>
+
+      <p>Taste provides a rich set of components from which you can construct a customized recommender
+        system from a selection of algorithms. Taste is designed to be enterprise-ready; it's designed for
+        performance, scalability and flexibility.
+        Taste is not just for Java; it can be run as an external server which exposes recommendation logic
+        to your application via web services and HTTP.
+      </p>
+
+      <p>Top-level packages define the Taste interfaces to these key abstractions:</p>
+
+      <ul>
+        <li>
+          <code>DataModel</code>
+        </li>
+        <li>
+          <code>UserSimilarity</code>
+          and
+          <code>ItemSimilarity</code>
+        </li>
+        <li>
+          <code>UserNeighborhood</code>
+        </li>
+        <li>
+          <code>Recommender</code>
+        </li>
+      </ul>
+
+      <p>Subpackages of
+        <code>org.apache.mahout.cf.taste.impl</code>
+        hold implementations of these interfaces.
+        These are the pieces from which you will build your own recommendation engine. That's it!
+        For the academically inclined, Taste supports both
+        <em>memory-based</em>
+        and
+        <em>item-based</em>
+        recommender systems,
+        <em>slope one</em>
+        recommenders, and a couple other experimental implementations.
+        It does not currently support
+        <em>model-based</em>
+        recommenders.
+      </p>
+
+    </section>
+
+    <section id="architecture">
+      <title>Architecture</title>
+
+      <p class="centertext">
+        <img src="images/taste-architecture.png" alt="Taste Architecture" height="1060" width="442"/>
+      </p>
+
+      <p>This diagram shows the relationship between various Taste components in a user-based recommender.
+        An item-based recommender system is similar except that there are no PreferenceInferrers or Neighborhood
+        algorithms involved.
+      </p>
+
+      <section>
+        <title>Recommender</title>
+
+        <p>A
+          <code>Recommender</code>
+          is the core abstraction in Taste. Given a <code>DataModel</code>, it can produce
+          recommendations. Applications will most likely use the
+          <code>GenericUserBasedRecommender</code>
+          implementation
+          or <code>GenericItemBasedRecommender</code>, possibly decorated by <code>CachingRecommender</code>.
+        </p>
+
+      </section>
+
+      <section>
+        <title>DataModel</title>
+
+        <p>A
+          <code>DataModel</code>
+          is the interface to information about user preferences. An implementation might
+          draw this data from any source, but a database is the most likely source. Taste provides
+          <code>MySQLJDBCDataModel</code>
+          to access preference data from a database via JDBC, though many applications will want to write their own.
+          Taste also provides a <code>FileDataModel</code>.
+        </p>
+
+        <p>There are no abstractions for a user or item in the object model (not anymore). Users and items are
+          identified
+          solely by an ID value in the framework. Further, this ID value must be numeric; it is a Java
+          <code>long</code>
+          type through the APIs. A
+          <code>Preference</code>
+          object or
+          <code>PreferenceArray</code>
+          object encapsulates
+          the relation between user and preferred items (or items and users preferring them).
+        </p>
+
+        <p>Finally, Taste supports, in various ways, a so-called "boolean" data model in which users do not express
+          preferences of varying strengths for items, but simply express an association or none at all. For example,
+          while
+          users might express a preference from 1 to 5 in the context of a movie recommender site, there may be no
+          notion of a preference value between users and pages in the context of recommending pages on a web site: there
+          is only a notion of an association, or none, between a user and pages that have been visited.
+        </p>
+
+      </section>
+
+      <section>
+        <title>UserSimilarity, ItemSimilarity</title>
+
+        <p>A
+          <code>UserSimilarity</code>
+          defines a notion of similarity between two <code>User</code>s.
+          This is a crucial part of a recommendation engine. These are attached to a
+          <code>Neighborhood</code>
+          implementation.
+          <code>ItemSimilarity</code>s are analogous, but find similarity between <code>Item</code>s.
+        </p>
+
+      </section>
+
+      <section>
+        <title>UserNeighborhood</title>
+
+        <p>In a user-based recommender, recommendations are produced by finding a "neighborhood" of
+          similar users near a given user. A
+          <code>UserNeighborhood</code>
+          defines a means of determining
+          that neighborhood &#8212; for example, nearest 10 users. Implementations typically need a
+          <code>UserSimilarity</code>
+          to operate.
+        </p>
+
+      </section>
+
+    </section>
+
+    <section id="requirements">
+      <title>Requirements</title>
+
+      <section>
+        <title>Required</title>
+
+        <ul>
+          <li>
+            <a href="http://java.sun.com/j2se/1.5.0/index.jsp">Java / J2SE 6.0</a>
+          </li>
+        </ul>
+
+      </section>
+
+      <section>
+        <title>Optional</title>
+
+        <ul>
+          <li>
+            <a href="http://ant.apache.org/">Apache Ant</a>
+            1.5 or later and
+            <a href="http://maven.apache.org">Maven</a>
+            2.0.10 or later, if you want to build from source or build examples. (Mac users note that even OS X 10.5
+            ships with Maven 2.0.6, which will not work.)
+          </li>
+          <li>Taste web applications require a
+            <a href="http://java.sun.com/products/servlet/index.jsp">Servlet 2.3+</a>
+            container, such as
+            <a href="http://jakarta.apache.org/tomcat/">Jakarta Tomcat</a>. It may in fact work with older
+            containers with slight modification.
+          </li>
+          <li>
+            <code>MySQLJDBCDataModel</code>
+            implementation requires a
+            <a href="http://www.mysql.com/products/mysql/">MySQL 4.x</a>
+            (or later) database.
+            Again, it may be made to work with earlier versions or other databases with slight changes.
+          </li>
+
+        </ul>
+
+      </section>
+
+    </section>
+
+    <section id="demo">
+      <title>Demo</title>
+
+      <p>To build and run the demo, follow the instructions below, which are written for Unix-like operating systems:
+      </p>
+
+      <ol>
+        <li>Obtain a copy of the Mahout distribution, either from SVN or as a downloaded archive.</li>
+        <li>Download the "1 Million MovieLens Dataset" from
+          <a href="http://www.grouplens.org/">http://www.grouplens.org/</a>.
+        </li>
+        <li>Unpack the archive and copy
+          <code>movies.dat</code>
+          and
+          <code>ratings.dat</code>
+          to
+          <code>trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens</code>
+          under the Mahout distribution
+          directory.
+        </li>
+        <li>Navigate to the directory where you unpacked the Mahout distribution, and navigate to <code>trunk</code>.
+        </li>
+        <li>Run <code>mvn install</code>, which builds and installs Mahout core to your local repository
+        </li>
+        <li>
+          <code>cd taste-web</code>
+        </li>
+        <li>
+          <code>cp ../examples/target/grouplens.jar ./lib</code>
+        </li>
+        <li>Edit
+          <code>recommender.properties</code>
+          and fill in the <code>recommender.class</code>:
+          <code>recommender.class=org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommender</code>
+        </li>
+        <li>
+          <code>mvn package</code>
+        </li>
+        <li><code>mvn jetty:run-war</code>. You may need to give Maven more memory: in a bash shell,
+          <code>export MAVEN_OPTS=-Xmx1024M</code>
+        </li>
+        <li>Get recommendations by accessing the web application in your browser:
+          <br/>
+          <code>http://localhost:8080/RecommenderServlet?userID=1</code>
+          <br/>
+          This will produce a simple preference-item ID list which could be consumed by a client application.
+          Get more useful human-readable output with the
+          <code>debug</code>
+          parameter:
+          <br/>
+          <code>http://localhost:8080/RecommenderServlet?userID=1&amp;debug=true</code>
+        </li>
+      </ol>
+
+      <p>Incidentally, Taste's web service interface may then be found at:
+        <br/>
+        <code>http://localhost:8080/RecommenderService.jws</code>
+        <br/>
+        Its WSDL file will be here...
+        <br/>
+        <code>http://localhost:8080/RecommenderService.jws?wsdl</code>
+        <br/>
+        ... and you can even access it in your browser via a simple HTTP request:
+        <br/>
+        <code>.../RecommenderService.jws?method=recommend&amp;userID=1&amp;howMany=10</code>
+      </p>
+
+      <p>
+        <em>Note: the exact URL where the service is deployed depends on how you deployed the application in your
+          app server. For instance if you deployed it as a .war file called 'mahout-taste-webapp.war', it will deploy
+          at a URI whose path begins with /mahout-taste-webapp/ instead.
+        </em>
+      </p>
+
+    </section>
+
+    <section id="examples">
+      <title>Examples</title>
+
+      <section>
+        <title>User-based Recommender</title>
+
+        <p>User-based recommenders are the "original", conventional style of recommender system. They can produce good
+          recommendations when tweaked properly; they are not necessarily the fastest recommender systems, so they
+          are best suited to small data sets (roughly, fewer than ten million ratings). We'll start with an example of
+          this.
+        </p>
+
+        <p>First, create a
+          <code>DataModel</code>
+          of some kind. Here, we'll use a simple one based
+          on data in a file. The file should be in CSV format, with lines of the form
+          <code>userID,itemID,prefValue</code>
+          (e.g. "39505,290002,3.5"):
+        </p>
+
+        <pre>DataModel model = new FileDataModel(new File("data.txt"));
+        </pre>
+
+        <p>We'll use the PearsonCorrelationSimilarity implementation of
+          <code>UserSimilarity</code>
+          as our user
+          correlation algorithm, and add an optional preference inference algorithm:
+        </p>
+
+        <pre>UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model);
+          // Optional:
+          userSimilarity.setPreferenceInferrer(new AveragingPreferenceInferrer());
+        </pre>
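+        <p>To make the idea of a user correlation concrete, here is a minimal plain-Java sketch of a Pearson
+          correlation computed over the items that two users have both rated. It is illustrative only and is not
+          Taste's PearsonCorrelationSimilarity implementation; the class name PearsonSketch and the map-based
+          representation of preferences (item ID to preference value) are assumptions made for this example.
+        </p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Taste.
class PearsonSketch {

  /** Pearson correlation over co-rated items; NaN when it is undefined. */
  static double pearson(Map<Long, Double> prefsA, Map<Long, Double> prefsB) {
    int n = 0;
    double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
    for (Map.Entry<Long, Double> entry : prefsA.entrySet()) {
      Double other = prefsB.get(entry.getKey());
      if (other == null) {
        continue; // only items rated by both users count
      }
      double a = entry.getValue();
      double b = other;
      n++;
      sumA += a;
      sumB += b;
      sumAA += a * a;
      sumBB += b * b;
      sumAB += a * b;
    }
    if (n == 0) {
      return Double.NaN; // no overlap between the two users
    }
    double covariance = sumAB - sumA * sumB / n;
    double varianceA = sumAA - sumA * sumA / n;
    double varianceB = sumBB - sumB * sumB / n;
    double denominator = Math.sqrt(varianceA * varianceB);
    return denominator == 0.0 ? Double.NaN : covariance / denominator;
  }

  public static void main(String[] args) {
    Map<Long, Double> alice = new HashMap<>();
    alice.put(1L, 1.0);
    alice.put(2L, 2.0);
    alice.put(3L, 3.0);
    Map<Long, Double> bob = new HashMap<>();
    bob.put(1L, 2.0);
    bob.put(2L, 4.0);
    bob.put(3L, 6.0);
    System.out.println(pearson(alice, bob));
  }
}
```

+        <p>A value near 1 means the two users rate their shared items alike, a value near -1 means they rate them
+          in opposite ways, and NaN means there is not enough overlapping data to say.
+        </p>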
+
+        <p>Now we create a
+          <code>UserNeighborhood</code>
+          algorithm. Here we use nearest-3:
+        </p>
+
+        <pre>UserNeighborhood neighborhood =
+          new NearestNUserNeighborhood(3, userSimilarity, model);
+        </pre>
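+        <p>As a rough illustration of what a nearest-N neighborhood does, the sketch below picks the N users with
+          the highest similarity to a target user. It is not the NearestNUserNeighborhood implementation; the class
+          NearestNSketch and its input, a map from other user IDs to an already computed similarity, are
+          assumptions made for this example.
+        </p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Taste.
class NearestNSketch {

  /** Returns up to n user IDs, ordered from most to least similar. */
  static List<Long> nearestN(int n, Map<Long, Double> similarityToTarget) {
    List<Map.Entry<Long, Double>> candidates = new ArrayList<>();
    for (Map.Entry<Long, Double> entry : similarityToTarget.entrySet()) {
      if (!Double.isNaN(entry.getValue())) {
        candidates.add(entry); // skip users whose similarity is undefined
      }
    }
    // Sort by similarity, highest first
    candidates.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
    List<Long> neighbors = new ArrayList<>();
    for (int i = 0; i < Math.min(n, candidates.size()); i++) {
      neighbors.add(candidates.get(i).getKey());
    }
    return neighbors;
  }

  public static void main(String[] args) {
    Map<Long, Double> similarities = new HashMap<>();
    similarities.put(10L, 0.9);
    similarities.put(11L, 0.1);
    similarities.put(13L, 0.5);
    System.out.println(nearestN(2, similarities));
  }
}
```

+        <p>A small N keeps only the closest matches; a larger N draws on more users at the cost of including
+          less similar ones.
+        </p>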
+
+        <p>Now we can create our <code>Recommender</code>, and add a caching decorator:
+        </p>
+
+        <pre>Recommender recommender =
+          new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
+          Recommender cachingRecommender = new CachingRecommender(recommender);
+        </pre>
+
+        <p>Now we can get 10 recommendations for user ID "1234" &#8212; done!</p>
+
+        <pre>List&lt;RecommendedItem&gt; recommendations =
+          cachingRecommender.recommend(1234, 10);
+        </pre>
+
+      </section>
+
+      <section>
+        <title>Item-based Recommender</title>
+
+        <p>We could have created an item-based recommender instead. Item-based recommenders base recommendations
+          not on user similarity, but on item similarity. In theory these are about the same approach to the
+          problem, just from different angles. However, the similarity of two items is relatively fixed, more so
+          than the similarity of two users. So, item-based recommenders can use pre-computed similarity values
+          in the computations, which makes them much faster. For large data sets, item-based recommenders
+          are more appropriate.
+        </p>
+
+        <p>Let's start over, again with a
+          <code>FileDataModel</code>
+          to start:
+        </p>
+
+        <pre>DataModel model = new FileDataModel(new File("data.txt"));
+        </pre>
+
+        <p>We'll also need an <code>ItemSimilarity</code>. We could use <code>PearsonCorrelationSimilarity</code>,
+          which computes item similarity in real time, but this is generally too slow to be useful.
+          Instead, in a real application, you would feed a list of pre-computed correlations to
+          a <code>GenericItemSimilarity</code>:
+        </p>
+
+        <pre>// Construct the list of pre-computed correlations
+          Collection&lt;GenericItemSimilarity.ItemItemSimilarity&gt; correlations =
+          ...;
+          ItemSimilarity itemSimilarity =
+          new GenericItemSimilarity(correlations);
+
+        </pre>
+
+        <p>Then we can finish as before to produce recommendations:</p>
+
+        <pre>Recommender recommender =
+          new GenericItemBasedRecommender(model, itemSimilarity);
+          Recommender cachingRecommender = new CachingRecommender(recommender);
+          ...
+          List&lt;RecommendedItem&gt; recommendations =
+          cachingRecommender.recommend(1234, 10);
+        </pre>
+
+      </section>
+
+      <section>
+        <title>Slope-One Recommender</title>
+
+        <p>This is a simple yet effective
+          <code>Recommender</code>
+          and we present another example to
+          round out the list:
+        </p>
+
+        <pre>DataModel model = new FileDataModel(new File("data.txt"));
+          // Make a weighted slope one recommender
+          Recommender recommender = new SlopeOneRecommender(model);
+          Recommender cachingRecommender = new CachingRecommender(recommender);
+        </pre>
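+        <p>For readers unfamiliar with the technique, the following plain-Java sketch shows the weighted slope-one
+          prediction rule described in the Lemire and Maclachlan paper listed under Useful Links. It is a naive
+          illustration and not how SlopeOneRecommender is implemented; the class SlopeOneSketch and the nested-map
+          data layout (user ID to item ID to rating) are assumptions made for this example.
+        </p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Taste.
class SlopeOneSketch {

  /** Weighted slope-one estimate of userID's rating for targetItem; NaN if none. */
  static double predict(Map<Long, Map<Long, Double>> data, long userID, long targetItem) {
    Map<Long, Double> userPrefs = data.get(userID);
    if (userPrefs == null) {
      return Double.NaN; // unknown user
    }
    double numerator = 0.0;
    long totalCount = 0;
    for (Map.Entry<Long, Double> rated : userPrefs.entrySet()) {
      long item = rated.getKey();
      if (item == targetItem) {
        continue;
      }
      // Sum rating differences (target - item) over users who rated both
      double diffSum = 0.0;
      int count = 0;
      for (Map<Long, Double> other : data.values()) {
        Double targetRating = other.get(targetItem);
        Double itemRating = other.get(item);
        if (targetRating != null && itemRating != null) {
          diffSum += targetRating - itemRating;
          count++;
        }
      }
      if (count > 0) {
        // (average difference + user's rating) weighted by count
        numerator += diffSum + rated.getValue() * count;
        totalCount += count;
      }
    }
    return totalCount == 0 ? Double.NaN : numerator / totalCount;
  }

  public static void main(String[] args) {
    Map<Long, Map<Long, Double>> data = new HashMap<>();
    Map<Long, Double> first = new HashMap<>();
    first.put(1L, 5.0);
    first.put(2L, 3.0);
    Map<Long, Double> second = new HashMap<>();
    second.put(2L, 2.0);
    data.put(101L, first);
    data.put(102L, second);
    System.out.println(predict(data, 102L, 1L));
  }
}
```

+        <p>This version rescans every user for each item pair, which is only reasonable for tiny data sets;
+          a practical implementation would precompute the pairwise differences.
+        </p>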
+
+      </section>
+
+    </section>
+
+    <section id="integration">
+      <title>Integration with your application</title>
+
+      <section>
+        <title>Direct</title>
+
+        <p>You can create a <code>Recommender</code>, as shown above, wherever you like in your Java application, and use
+          it. This
+          includes simple Java applications or GUI applications, server applications, and J2EE web applications.
+        </p>
+
+      </section>
+
+      <section>
+        <title>Standalone server</title>
+
+        <p>Taste can also be run as an external server, which may be the only option for non-Java applications.
+          A Taste Recommender can be exposed as a web application via
+          <code>org.apache.mahout.cf.taste.web.RecommenderServlet</code>,
+          and your application can then access recommendations via simple HTTP requests and responses, or as a
+          full-fledged SOAP web service. See above, and see
+          <code>the javadoc</code>
+          for details.
+        </p>
+
+        <p>To deploy your
+          <code>Recommender</code>
+          as an external server:
+        </p>
+
+        <ol>
+          <li>Obtain a copy of the Mahout distribution, either from SVN or as a downloaded archive.</li>
+          <li>Create an implementation of
+            <code>org.apache.mahout.cf.taste.recommender.Recommender</code>
+            (must have a no-arg constructor).
+          </li>
+          <li>Compile it and create a JAR file containing your implementation.</li>
+          <li>Navigate to the directory where you unpacked the Mahout distribution, and then to <code>trunk</code>.
+          </li>
+          <li>Run <code>mvn install</code>, which builds and installs Mahout core to your local repository.
+          </li>
+          <li>
+            <code>cd taste-web</code>
+          </li>
+          <li>Copy your .jar file:
+            <code>cp [your .jar file] ./lib</code>
+          </li>
+          <li>Edit
+            <code>recommender.properties</code>
+            and fill in the
+            <code>recommender.class</code>
+            with your Recommender class:
+            <code>recommender.class=[your recommender class]</code>
+          </li>
+          <li>
+            <code>mvn package</code>
+          </li>
+          <li>Your .war file is now available in the build directory as
+            <code>mahout-taste-webapp.war</code>
+            (which can be renamed).
+          </li>
+        </ol>
+
+      </section>
+
+    </section>
+
+    <section id="performance">
+      <title>Performance</title>
+
+      <section>
+        <title>Runtime Performance</title>
+
+        <p>The more data you give Taste, the better. Though Taste is designed for performance, you will undoubtedly run
+          into
+          performance issues at some point. For best results, consider passing the following command-line flags to your
+          JVM:
+        </p>
+
+        <ul>
+          <li><code>-server</code>: Enables the server VM, which is generally appropriate for long-running,
+            computation-intensive applications.
+          </li>
+          <li><code>-Xms1024m -Xmx1024m</code>: Make the heap as big as possible -- a gigabyte doesn't hurt when dealing
+            with tens of millions of preferences. Taste will generally use as much memory as you give it for caching, which
+            helps
+            performance. Set the initial and max size to the same value to avoid wasting time growing the
+            heap, and to avoid having the JVM run minor collections to avoid growing the heap, which will clear
+            cached values.
+          </li>
+          <li><code>-da -dsa</code>: Disable all assertions.
+          </li>
+          <li><code>-XX:NewRatio=9</code>: Increase heap allocated to 'old' objects, which is most of them in this
+            framework.
+          </li>
+          <li>
+            <code>-XX:+UseParallelGC -XX:+UseParallelOldGC</code>
+            (multi-processor machines only): Use a GC algorithm designed to take
+            advantage of multiple processors, and designed for throughput. This is a default in J2SE 5.0.
+          </li>
+          <li><code>-XX:+DisableExplicitGC</code>: Disable calls to <code>System.gc()</code>. These calls can only
+            hurt in the presence of modern GC algorithms; they may force Taste to remove cached data needlessly.
+            This flag isn't needed if you're sure your code and third-party code you use doesn't call this method.
+          </li>
+        </ul>
+
+        <p>Also consider the following tips:</p>
+
+        <ul>
+          <li>Use
+            <code>CachingRecommender</code>
+            on top of your custom
+            <code>Recommender</code>
+            implementation.
+          </li>
+          <li>When using <code>JDBCDataModel</code>, make sure you've taken basic steps to optimize the table storing
+            preference data. Create a primary key on the user ID and item ID columns, and an index on them. Set them to
+            be non-null. And so on. Tune your database for lots of concurrent reads! When using JDBC,
+            the database is almost always the bottleneck. Plenty of memory and caching are even more important.
+          </li>
+
+          <li>Also, pooling database connections is essential to performance. If using a J2EE container, it probably
+            provides a way to configure connection pools. If you are creating your own
+            <code>DataSource</code>
+            directly,
+            try wrapping it in
+            <code>org.apache.mahout.cf.taste.impl.model.jdbc.ConnectionPoolDataSource</code>
+          </li>
+          <li>See MySQL-specific notes on performance in the javadoc for
+            <code>MySQLJDBCDataModel</code>.
+          </li>
+        </ul>
+
+      </section>
+
+      <section>
+        <title>Algorithm Performance: Which One Is Best?</title>
+
+        <p>There is no right answer; it depends on your data, your application, environment, and performance needs.
+          Taste provides the building blocks from which you can construct the best
+          <code>Recommender</code>
+          for your
+          application. The links below provide research on this topic. You will probably need a bit of trial-and-error
+          to find
+          a setup that works best. The code sample above provides a good starting point.
+        </p>
+
+        <p>Fortunately, Taste provides a way to evaluate the accuracy of your
+          <code>Recommender</code>
+          on your own
+          data, in <code>org.apache.mahout.cf.taste.eval</code>:
+        </p>
+
+        <pre>DataModel myModel = ...;
+          RecommenderBuilder builder = new RecommenderBuilder() {
+          public Recommender buildRecommender(DataModel model) {
+          // build and return the Recommender to evaluate here
+          }
+          };
+          RecommenderEvaluator evaluator =
+          new AverageAbsoluteDifferenceRecommenderEvaluator();
+          double evaluation = evaluator.evaluate(builder, myModel, 0.9, 1.0);
+        </pre>
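+        <p>The evaluator's name suggests what the resulting score measures: the average absolute difference between
+          estimated and actual preference values, where lower is better. The sketch below shows that metric in
+          isolation. It is an assumption-laden illustration, not the evaluator's code; the class
+          AverageAbsoluteDifferenceSketch and its array-based inputs are invented for this example.
+        </p>

```java
// Hypothetical helper for illustration; not part of Taste.
class AverageAbsoluteDifferenceSketch {

  /** Mean of |estimated - actual| over pairs that have an estimate; NaN if none. */
  static double averageAbsoluteDifference(double[] actual, double[] estimated) {
    if (actual.length != estimated.length) {
      throw new IllegalArgumentException("arrays must have the same length");
    }
    double total = 0.0;
    int count = 0;
    for (int i = 0; i < actual.length; i++) {
      if (Double.isNaN(estimated[i])) {
        continue; // no estimate could be made for this preference
      }
      total += Math.abs(estimated[i] - actual[i]);
      count++;
    }
    return count == 0 ? Double.NaN : total / count;
  }

  public static void main(String[] args) {
    double[] actual = {4.0, 3.0, 5.0};
    double[] estimated = {3.5, 3.0, 4.0};
    System.out.println(averageAbsoluteDifference(actual, estimated));
  }
}
```

+        <p>A score of 0.5 on a 1-to-5 rating scale, for instance, would mean estimates are off by half a point
+          on average.
+        </p>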
+
+        <p>For "boolean" data model situations, where there are no notions of preference value, the above evaluation
+          based on estimated preference does not make sense. In this case, try this kind of evaluation, which presents
+          traditional information retrieval figures like precision and recall, which are more meaningful:
+        </p>
+
+        <pre>
+          ...
+          RecommenderIRStatsEvaluator evaluator =
+          new GenericRecommenderIRStatsEvaluator();
+          IRStatistics stats =
+          evaluator.evaluate(builder, myModel, null, 3,
+          GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
+          1.0);
+        </pre>
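+        <p>As a reminder of what precision and recall mean for a top-N recommendation list, here is a minimal
+          plain-Java sketch. It is not the GenericRecommenderIRStatsEvaluator implementation; the class
+          PrecisionRecallSketch, its method, and the assumption that the recommended IDs are distinct are all
+          made up for this example.
+        </p>

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Taste.
class PrecisionRecallSketch {

  /** Returns {precision, recall} for a list of distinct recommended item IDs. */
  static double[] precisionRecall(List<Long> recommended, Set<Long> relevant) {
    int hits = 0;
    for (Long itemID : recommended) {
      if (relevant.contains(itemID)) {
        hits++; // a recommended item that is actually relevant
      }
    }
    // Precision: fraction of recommended items that are relevant
    double precision = recommended.isEmpty() ? Double.NaN : (double) hits / recommended.size();
    // Recall: fraction of relevant items that were recommended
    double recall = relevant.isEmpty() ? Double.NaN : (double) hits / relevant.size();
    return new double[] {precision, recall};
  }

  public static void main(String[] args) {
    List<Long> recommended = Arrays.asList(1L, 2L, 3L);
    Set<Long> relevant = new HashSet<>(Arrays.asList(2L, 3L, 4L, 5L));
    double[] stats = precisionRecall(recommended, relevant);
    System.out.println("precision=" + stats[0] + ", recall=" + stats[1]);
  }
}
```

+        <p>With boolean data, the "relevant" set would typically be items held out from a user's known
+          preferences before the recommender is asked for its top N.
+        </p>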
+
+      </section>
+
+    </section>
+
+    <section id="useful">
+      <title>Useful Links</title>
+
+      <p>You'll want to look at these packages too, which offer more algorithms and approaches that you
+        may find useful:
+      </p>
+
+      <ul>
+        <li><a href="http://www.nongnu.org/cofi/">Cofi</a>: A Java-Based Collaborative Filtering Library
+        </li>
+        <li>
+          <a href="http://eecs.oregonstate.edu/iis/CoFE/">CoFE</a>
+        </li>
+      </ul>
+
+      <p>Here's a handful of research papers that I've read and found particularly useful:</p>
+
+      <blockquote cite="http://research.microsoft.com/research/pubs/view.aspx?tr_id=166">
+        <p>J.S. Breese, D. Heckerman
+          and C. Kadie, "<a href="http://research.microsoft.com/research/pubs/view.aspx?tr_id=166">Empirical Analysis of
+            Predictive Algorithms for Collaborative Filtering</a>,"
+          in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998),
+          1998.
+        </p>
+      </blockquote>
+      <blockquote cite="http://www10.org/cdrom/papers/519/">
+        <p>B. Sarwar, G. Karypis, J. Konstan and J. Riedl,
+          "<a href="http://www10.org/cdrom/papers/519/">Item-based collaborative filtering recommendation
+            algorithms</a>," in Proceedings of the Tenth International Conference on the World Wide Web (WWW 10),
+          pp. 285-295, 2001.
+        </p>
+      </blockquote>
+      <blockquote cite="http://doi.acm.org/10.1145/192844.192905">
+        <p>P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl,
+          "<a href="http://doi.acm.org/10.1145/192844.192905">GroupLens: an open architecture for
+            collaborative filtering of netnews</a>," in Proceedings of the 1994 ACM conference on Computer Supported
+          Cooperative
+          Work (CSCW 1994), pp. 175-186, 1994.
+        </p>
+      </blockquote>
+      <blockquote cite="http://www.grouplens.org/papers/pdf/algs.pdf">
+        <p>J.L. Herlocker, J.A. Konstan,
+          A. Borchers and J. Riedl, "<a href="http://www.grouplens.org/papers/pdf/algs.pdf">An algorithmic framework for
+            performing collaborative filtering</a>," in Proceedings of the 22nd annual international ACM SIGIR
+          Conference
+          on Research and Development in Information Retrieval (SIGIR 99), pp. 230-237, 1999.
+        </p>
+      </blockquote>
+
+      <blockquote cite="http://materialobjects.com/cf/MovieRecommender.pdf">
+        <p>Clifford Lyon,
+          "<a href="http://materialobjects.com/cf/MovieRecommender.pdf">Movie Recommender</a>,"
+          CSCI E-280 final project, Harvard University, 2004.
+        </p>
+      </blockquote>
+      <blockquote cite="http://www.daniel-lemire.com/fr/abstracts/SDM2005.html">
+        <p>Daniel Lemire, Anna Maclachlan,
+          "<a href="http://www.daniel-lemire.com/fr/abstracts/SDM2005.html">Slope One Predictors for Online Rating-Based
+            Collaborative Filtering</a>," Proceedings of SIAM Data Mining (SDM '05), 2005.
+        </p>
+      </blockquote>
+      <blockquote cite="http://www.daniel-lemire.com/fr/documents/publications/racofi_nrc.pdf">
+        <p>
+          Michelle Anderson, Marcel Ball, Harold Boley, Stephen Greene, Nancy Howse, Daniel Lemire and Sean McGrath,
+          "<a href="http://www.daniel-lemire.com/fr/documents/publications/racofi_nrc.pdf">RACOFI: A Rule-Applying
+          Collaborative
+          Filtering System</a>," Proceedings of COLA '03, 2003.
+        </p>
+      </blockquote>
+
+      <p>These links will take you to all the collaborative filtering reading you could ever want!</p>
+
+      <ul>
+        <li>
+          <a href="http://www.paulperry.net/notes/cf.asp">Paul Perry's notes</a>
+        </li>
+        <li>
+          <a href="http://jamesthornton.com/cf/">James Thornton's collaborative filtering resources</a>
+        </li>
+        <li>
+          <a href="http://www.daniel-lemire.com/blog/">Daniel Lemire's blog</a>
+          which frequently covers collaborative filtering topics
+        </li>
+      </ul>
 
-</section>
-</body>
+    </section>
+  </body>
 </document>

Modified: lucene/mahout/site/src/documentation/content/xdocs/whoweare.xml
URL: http://svn.apache.org/viewvc/lucene/mahout/site/src/documentation/content/xdocs/whoweare.xml?rev=825939&r1=825938&r2=825939&view=diff
==============================================================================
--- lucene/mahout/site/src/documentation/content/xdocs/whoweare.xml (original)
+++ lucene/mahout/site/src/documentation/content/xdocs/whoweare.xml Fri Oct 16 15:29:45 2009
@@ -1,36 +1,89 @@
 <?xml version="1.0"?>
 <document>
-	<header>
-        <title>	Apache Mahout - Who We Are</title>
-	</header>
-<properties>
-</properties>
-<body>
+  <header>
+    <title>Apache Mahout - Who We Are</title>
+  </header>
+  <properties>
+  </properties>
+  <body>
 
-<p>Apache Mahout is maintained by a team of volunteer developers.</p>
+    <p>Apache Mahout is maintained by a team of volunteer developers.</p>
 
-<section id="core"><title>Core Committers</title>
-<ul>
-<li><b><a href="http://www.isabel-drost.de">Isabel Drost</a></b> (isabel@...)</li>
-<li><b><a href="http://www.veoh.com">Ted Dunning</a></b> (tdunning@...)</li>
-<li><b><a href="http://www.windwardsolutions.com">Jeff Eastman</a></b> (jeastman@...)</li>
-<li><b><a href="http://www.jroller.com/page/otis">Otis Gospodnetic</a></b> (otis@...)</li>
-<li><b><a href="http://people.apache.org/list_I.html#gsingers">Grant Ingersoll</a></b> (gsingers@...) </li>
-<li><b>Sean Owen</b> (srowen@...)</li>
-<li><b><a href="http://www.cs.put.poznan.pl/~dweiss">Dawid Weiss</a></b> (dweiss@...)</li>
-<li><b>Karl Wettin</b> (kalle@...)</li>
-<li><b>AbdelHakim Deneche</b> (adeneche@...)</li>
-<li><b>David Hall</b> (dlwh@...)</li>
-</ul>
-</section>
-<section id="emeritus"><title>Emeritus Committers</title>
-<ul>
-	<li><b>Niranjan Balasubramanian</b> (nbalasub@...)</li>
-	<li><b>Erik Hatcher</b> (ehatcher@...)</li>
-	<li><b>Ozgur Yilmazel</b> (oyilmazel@...)</li>
-</ul>	
-</section>
-<p>Note that the email addresses above end with @apache.org.</p>
+    <section id="core">
+      <title>Core Committers</title>
+      <ul>
+        <li>
+          <b>
+            <a href="http://www.isabel-drost.de">Isabel Drost</a>
+          </b>
+          (isabel@...)
+        </li>
+        <li>
+          <b>
+            <a href="http://www.veoh.com">Ted Dunning</a>
+          </b>
+          (tdunning@...)
+        </li>
+        <li>
+          <b>
+            <a href="http://www.windwardsolutions.com">Jeff Eastman</a>
+          </b>
+          (jeastman@...)
+        </li>
+        <li>
+          <b>
+            <a href="http://www.jroller.com/page/otis">Otis Gospodnetic</a>
+          </b>
+          (otis@...)
+        </li>
+        <li>
+          <b>
+            <a href="http://people.apache.org/list_I.html#gsingers">Grant Ingersoll</a>
+          </b>
+          (gsingers@...)
+        </li>
+        <li>
+          <b>Sean Owen</b>
+          (srowen@...)
+        </li>
+        <li>
+          <b>
+            <a href="http://www.cs.put.poznan.pl/~dweiss">Dawid Weiss</a>
+          </b>
+          (dweiss@...)
+        </li>
+        <li>
+          <b>Karl Wettin</b>
+          (kalle@...)
+        </li>
+        <li>
+          <b>AbdelHakim Deneche</b>
+          (adeneche@...)
+        </li>
+        <li>
+          <b>David Hall</b>
+          (dlwh@...)
+        </li>
+      </ul>
+    </section>
+    <section id="emeritus">
+      <title>Emeritus Committers</title>
+      <ul>
+        <li>
+          <b>Niranjan Balasubramanian</b>
+          (nbalasub@...)
+        </li>
+        <li>
+          <b>Erik Hatcher</b>
+          (ehatcher@...)
+        </li>
+        <li>
+          <b>Ozgur Yilmazel</b>
+          (oyilmazel@...)
+        </li>
+      </ul>
+    </section>
+    <p>Note that the email addresses above end with @apache.org.</p>
 
-</body>
+  </body>
 </document>