Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/04 10:04:08 UTC

[Hadoop Wiki] Update of "Hbase/PoweredBy" by LarsGeorge

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by LarsGeorge:
http://wiki.apache.org/hadoop/Hbase/PoweredBy

The comment on the change is:
Added description for WorldLingo

------------------------------------------------------------------------------
  
  [http://www.wikia.com/wiki/Wikia Wikia] hosts its user and keyword databases on a cluster of 7 machines.
  
+ [http://www.worldlingo.com/ WorldLingo] - The !WorldLingo Multilingual Archive. We use HBase to store millions of documents that we scan using Map/Reduce jobs to machine translate them into all or selected target languages from our set of available machine translation languages. We currently store 12 million documents but plan to eventually reach the 450 million mark. HBase allows us to scale out as we need to grow our storage capacities. Combined with Hadoop to keep the data replicated and therefore fail-safe, we have a backbone our service can rely on now and in the future.
+ 
  [http://www.yahoo.com/ Yahoo!] uses HBase to store document fingerprints for detecting near-duplicates. We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The table contains millions of rows. We use this for querying duplicated documents with realtime traffic.