Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2009/04/15 00:27:28 UTC

[Hadoop Wiki] Update of "PoweredBy" by TedDunning

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by TedDunning:
http://wiki.apache.org/hadoop/PoweredBy

------------------------------------------------------------------------------
  
   * [http://www.weblab.infosci.cornell.edu/ Cornell University Web Lab]
    * Generating web graphs on 100 nodes (dual 2.4GHz Xeon Processor, 2 GB RAM, 72GB Hard Drive)
+ 
+  * [http://www.deepdyve.com Deepdyve]
+   * Elastic cluster with 5-80 nodes 
+   * We use hadoop to create our indexes of deep web content and to provide a high-availability, high-bandwidth storage service for index shards for our search cluster.
  
   * [http://www.enormo.com/ Enormo]
    * 4-node cluster (32 cores, 1TB).
@@ -192, +196 @@

    * We use hadoop to process data relating to people on the web
   * We are also involved with Cascading to help simplify how our data flows through various processing stages
  
+  * [http://code.google.com/p/redpoll/ Redpoll]
+   * Hardware: 35 nodes (2*4 CPU, 10TB disk, 16GB RAM each)
+   * We intend to parallelize some traditional classification and clustering algorithms, such as Naive Bayes, K-Means, and EM, so that they can deal with large-scale data sets (a minimal MapReduce sketch appears at the end of this message).
+ 
   * [http://alpha.search.wikia.com Search Wikia]
    * A project to help develop open-source social search tools. We run a 125-node hadoop cluster.
  
@@ -265, +273 @@

    * 10 node cluster (Dual-Core AMD Opteron 2210, 4GB RAM, 1TB/node storage)
    * Run Naive Bayes classifiers in parallel over crawl data to discover event information
  
-  * [http://code.google.com/p/redpoll/ Redpoll]
-   * Hardware: 35 nodes (2*4cpu 10TB disk 16GB RAM each)
-   * We intent to parallelize some traditional classification, clustering algorithms like Naive Bayes, K-Means, EM so that can deal with large-scale data sets.
  
  ''When applicable, please include details about your cluster hardware and size.''
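
A minimal, hypothetical sketch of the Naive Bayes parallelization mentioned in the Redpoll entry above: a Hadoop MapReduce job (new org.apache.hadoop.mapreduce API) that counts (label, term) pairs over labeled text, which are the sufficient statistics for a Naive Bayes classifier. This is not taken from the wiki page or from Redpoll's code; the input format (one "label<TAB>text" record per line), class names, and paths are assumptions for illustration only.

// Hypothetical sketch, not Redpoll's implementation. Assumes input lines of the
// form "label<TAB>free text ..." and emits ((label:term), count) pairs from which
// Naive Bayes conditional probabilities can later be derived.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NaiveBayesCounts {

  // Emits ((label:term), 1) for every term in a labeled document.
  public static class CountMapper
      extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final Text outKey = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t", 2);
      if (parts.length < 2) {
        return; // skip malformed records
      }
      String label = parts[0];
      StringTokenizer terms = new StringTokenizer(parts[1]);
      while (terms.hasMoreTokens()) {
        outKey.set(label + ":" + terms.nextToken().toLowerCase());
        context.write(outKey, ONE);
      }
    }
  }

  // Sums the per-(label, term) counts; also reused as a combiner.
  public static class SumReducer
      extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable c : counts) {
        sum += c.get();
      }
      context.write(key, new LongWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "naive-bayes-counts");
    job.setJarByClass(NaiveBayesCounts.class);
    job.setMapperClass(CountMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Run as, for example, "hadoop jar naive-bayes-counts.jar NaiveBayesCounts /input/labeled-docs /output/nb-counts" (jar name and paths are illustrative). Class priors and per-term conditional probabilities can then be derived from the emitted counts in a small follow-up job or on a single machine.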