Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/12/26 12:39:00 UTC

[Hadoop Wiki] Update of "Hbase/PoweredBy" by udanax

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by udanax.
The comment on this change is: Add my application.
http://wiki.apache.org/hadoop/Hbase/PoweredBy?action=diff&rev1=50&rev2=51

--------------------------------------------------

  
  [[http://www.twitter.com|Twitter]] runs HBase across its entire Hadoop cluster.  HBase provides a distributed, read/write backup of all MySQL tables in Twitter's production backend, allowing engineers to run MapReduce jobs over the data while maintaining the ability to apply periodic row updates (something that is more difficult to do with vanilla HDFS).  A number of applications, including people search, rely on HBase internally for data generation. Additionally, the operations team uses HBase as a time-series database for cluster-wide monitoring/performance data.
  
+ [[http://www.udanax.org|Udanax.org]] (a URL shortener) uses HBase to store URLs and serve real-time lookup requests. It is also used for information-flow analysis. The table is growing by roughly 30 rows per second.
+ 
  [[http://www.veoh.com/|Veoh Networks]] uses HBase to store and process visitor (human) and entity (non-human) profiles, which are used for behavioral targeting, demographic detection, and personalization services.  Our site reads this data in real-time (heavily cached) and submits updates via various batch map/reduce jobs. With 25 million unique visitors a month, storing this data in a traditional RDBMS is not an option. We currently have a 24-node Hadoop/HBase cluster, and our profiling system shares this cluster with our other Hadoop data pipeline processes.
  
  [[http://www.videosurf.com/|VideoSurf]] - "The video search engine that has taught computers to see". We're using HBase to persist various large graphs of data and other statistics. HBase was a real win for us because it let us store substantially larger datasets without the need for manually partitioning the data, and its column-oriented nature allowed us to create schemas that were substantially more efficient for storing and retrieving data.