Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2011/06/24 07:23:55 UTC

[Hadoop Wiki] Update of "Hbase/PoweredBy" by OtisGospodnetic

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by OtisGospodnetic:
http://wiki.apache.org/hadoop/Hbase/PoweredBy?action=diff&rev1=70&rev2=71

  [[http://www.adobe.com|Adobe]] - We currently have about 30 nodes running HDFS, Hadoop and HBase in clusters ranging from 5 to 14 nodes on both production and development. We plan a deployment on an 80-node cluster. We are using HBase in several areas from social services to structured data and processing for internal use. We constantly write data to HBase and run mapreduce jobs to process it, then store it back to HBase or external systems. Our production cluster has been running since Oct 2008.
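
As an illustration of that write-then-process loop, here is a minimal sketch of an HBase-backed mapreduce job using the 0.90-era Java client. The table, family, and column names are invented for illustration, not Adobe's actual schema.

{{{
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableReducer;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class EventRollup {

  // Reads each raw event row and emits a Put destined for the rollup table.
  static class RollupMapper extends TableMapper<ImmutableBytesWritable, Put> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
        throws IOException, InterruptedException {
      Put put = new Put(row.get());
      put.add(Bytes.toBytes("agg"), Bytes.toBytes("seen"), Bytes.toBytes(1L));
      ctx.write(row, put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "event-rollup");
    job.setJarByClass(EventRollup.class);
    // Source: scan the (hypothetical) raw_events table.
    TableMapReduceUtil.initTableMapperJob("raw_events", new Scan(),
        RollupMapper.class, ImmutableBytesWritable.class, Put.class, job);
    // Sink: IdentityTableReducer simply writes the mapper's Puts to this table.
    TableMapReduceUtil.initTableReducerJob("event_rollups",
        IdentityTableReducer.class, job);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
}}}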
  
- [[http://caree.rs|Caree.rs]] - Accelerated hiring platform for HiTech companies. We use HBase and Hadoop for all aspects of our backend - job and company data storage, analytics processing, machine learning algorithms for our hire recommendation engine. Our live production site is directly served from HBase. We use cascading for running offline data processing jobs. 
+ [[http://caree.rs|Caree.rs]] - Accelerated hiring platform for HiTech companies. We use HBase and Hadoop for all aspects of our backend - job and company data storage, analytics processing, machine learning algorithms for our hire recommendation engine. Our live production site is directly served from HBase. We use Cascading for running offline data processing jobs.
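
Serving a live page straight out of HBase typically comes down to one Get per request. A minimal sketch, assuming a hypothetical "jobs" table with an "info:title" column:

{{{
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class JobPageLookup {
  public static String fetchJobTitle(String jobId) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // A real server would reuse or pool HTable instances rather than
    // opening one per request; this is kept simple for illustration.
    HTable table = new HTable(conf, "jobs");
    try {
      Get get = new Get(Bytes.toBytes(jobId));
      get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("title"));
      Result result = table.get(get);
      byte[] title = result.getValue(Bytes.toBytes("info"),
                                     Bytes.toBytes("title"));
      return title == null ? null : Bytes.toString(title);
    } finally {
      table.close();
    }
  }
}
}}}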
  
  [[http://www.drawntoscaleconsulting.com|Drawn to Scale Consulting]] consults on HBase, Hadoop, Distributed Search, and Scalable architectures.
  
@@ -16, +16 @@

  
  [[http://gumgum.com|GumGum]] is an In-Image ad network. We use HBase on an 8-node Amazon EC2 High-CPU Extra Large (c1.xlarge) cluster for both real-time data and analytics. Our production cluster has been running since June 2010.
  
- [[http://www.impetus.com/ |Impetus]] - With a strong focus, established thought leadership and open source contributions in the area of Big Data analytics and consulting services, Impetus uses its Global Delivery Model to help technology businesses and enterprises evaluate and implement solutions tailored to their specific context, without being biased towards a particular solution. [[http://bigdata.impetus.com/# | More info about BigData @Impetus]]
+ [[http://www.impetus.com/|Impetus]] - With a strong focus, established thought leadership and open source contributions in the area of Big Data analytics and consulting services, Impetus uses its Global Delivery Model to help technology businesses and enterprises evaluate and implement solutions tailored to their specific context, without being biased towards a particular solution. [[http://bigdata.impetus.com/#|More info about BigData @Impetus]]
  
- [[http://www.infolinks.com/ |Infolinks]] - Infolinks is an In-Text ad provider. We use HBase to process advertisement selection and user events for our In-Text ad network. The reports generated from HBase are used as feedback for our production system to optimize ad selection. 
+ [[http://www.infolinks.com/|Infolinks]] - Infolinks is an In-Text ad provider. We use HBase to process advertisement selection and user events for our In-Text ad network. The reports generated from HBase are used as feedback for our production system to optimize ad selection.
  
  [[http://www.kalooga.com|Kalooga]] is a discovery service for image galleries. We use Hadoop, HBase and Pig on a 20-node cluster for our crawling, analysis and events processing.
  
@@ -40, +40 @@

  
  [[http://www.readpath.com/|ReadPath]] uses HBase to store several hundred million RSS items and a dictionary for its RSS newsreader. ReadPath is currently running on an 8-node cluster.
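
Storing hundreds of millions of items like this usually comes down to row key design. A hypothetical layout (not necessarily ReadPath's): prefix each row with the feed id and append a reversed timestamp, so a scan over a feed returns the newest items first.

{{{
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class RssItemKey {
  public static Put newItem(String feedId, long publishedMs,
                            String title, String body) {
    // Reversed timestamp: newer items sort lexicographically before older
    // ones, so the head of the feed's key range is always the latest item.
    byte[] rowKey = Bytes.add(Bytes.toBytes(feedId),
                              Bytes.toBytes(Long.MAX_VALUE - publishedMs));
    Put put = new Put(rowKey);
    put.add(Bytes.toBytes("item"), Bytes.toBytes("title"), Bytes.toBytes(title));
    put.add(Bytes.toBytes("item"), Bytes.toBytes("body"), Bytes.toBytes(body));
    return put;
  }
}
}}}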
  
- [[http://resu.me/|resu.me]] - Career network for the net generation. We use HBase and Hadoop for all aspects of our backend - user and resume data storage, analytics processing, machine learning algorithms for our job recommendation engine. Our live production site is directly served from HBase. We use cascading for running offline data processing jobs.  
+ [[http://resu.me/|resu.me]] - Career network for the net generation. We use HBase and Hadoop for all aspects of our backend - user and resume data storage, analytics processing, machine learning algorithms for our job recommendation engine. Our live production site is directly served from HBase. We use Cascading for running offline data processing jobs.
  
  [[http://www.runa.com/|Runa Inc.]] offers a SaaS that enables online merchants to offer dynamic per-consumer, per-product promotions embedded in their website. To implement this, we collect the click streams of all their visitors and, together with the merchant's rules, determine what promotion to offer a visitor at different points while browsing the merchant's website. So we have lots of data and have to do lots of off-line and real-time analytics. HBase is the core for us. We also use Clojure and our own open-sourced distributed processing framework, Swarmiji. The HBase community has been key to our forward movement with HBase. We're looking for experienced developers to join us to help make things go even faster!
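
The off-line side of such click-stream analytics might look like the following bounded scan; the "clickstream" table and the visitor-id row key prefix are assumptions for illustration only.

{{{
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ClickStreamScan {
  public static void scanVisitor(String visitorId, long fromMs, long toMs)
      throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "clickstream");
    // Rows are assumed keyed "visitorId:...": scan only that key range.
    Scan scan = new Scan(Bytes.toBytes(visitorId + ":"));
    scan.setStopRow(Bytes.toBytes(visitorId + ";")); // ';' sorts just after ':'
    scan.setTimeRange(fromMs, toMs);                 // only cells in the window
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // Feed each click event into the promotion-rules evaluation here.
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}
}}}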
+ 
+ [[http://www.sematext.com/|Sematext]] runs [[http://www.sematext.com/search-analytics/index.html|Search Analytics]], a service that uses HBase to store search activity and MapReduce to produce reports showing user search behaviour and experience.
+ 
+ [[http://www.sematext.com/|Sematext]] runs [[http://www.sematext.com/spm/index.html|Scalable Performance Monitoring]] (SPM), a service that uses HBase to store performance data over time, crunch it with the help of MapReduce, and display it in a visually rich browser-based UI. Interestingly, SPM features [[http://www.sematext.com/spm/hbase-performance-monitoring/index.html|SPM for HBase]], which is specifically designed to monitor all HBase performance metrics.
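
A common HBase layout for this kind of time-series data, sketched here with invented names rather than SPM's actual schema, keys each row by metric id plus timestamp, so charting a time window is a single bounded scan.

{{{
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class MetricWindowScan {
  public static Scan windowScan(String metricId, long fromMs, long toMs) {
    // Bytes.toBytes(long) is big-endian, so for non-negative epoch millis
    // lexicographic row key order matches chronological order.
    byte[] start = Bytes.add(Bytes.toBytes(metricId), Bytes.toBytes(fromMs));
    byte[] stop  = Bytes.add(Bytes.toBytes(metricId), Bytes.toBytes(toMs));
    Scan scan = new Scan(start, stop);  // scans the half-open range [start, stop)
    scan.addFamily(Bytes.toBytes("d")); // a short family name saves bytes per cell
    return scan;
  }
}
}}}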
  
  [[http://www.socialmedia.com/|SocialMedia]] uses HBase to store and process user events which allows us to provide near-realtime user metrics and reporting. HBase forms the heart of our Advertising Network data storage and management system. We use HBase as a data source and sink for both realtime request cycle queries and as a backend for mapreduce analysis.
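
Near-realtime metrics of this kind are commonly built on HBase's atomic counters. A minimal sketch, assuming a hypothetical "campaign_metrics" table:

{{{
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class EventCounter {
  public static long recordImpression(String campaignId) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "campaign_metrics");
    try {
      // The increment is applied atomically on the region server, so it is
      // safe under many concurrent writers and returns the new total.
      return table.incrementColumnValue(Bytes.toBytes(campaignId),
          Bytes.toBytes("m"), Bytes.toBytes("impressions"), 1L);
    } finally {
      table.close();
    }
  }
}
}}}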
  
@@ -70, +74 @@

  
  [[http://www.yahoo.com/|Yahoo!]] uses HBase to store document fingerprints for detecting near-duplicates. We have a cluster of a few nodes that runs HDFS, mapreduce, and HBase. The table contains millions of rows. We use it to query for duplicate documents against realtime traffic.
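
One way such a fingerprint lookup can work, sketched here as an assumption rather than Yahoo!'s actual design, is to use the fingerprint itself as the row key, so the near-duplicate check becomes a single atomic probe-and-insert.

{{{
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class FingerprintStore {
  // Returns true if this fingerprint was already stored (a near-duplicate).
  public static boolean seenBefore(HTable table, byte[] fingerprint,
                                   String docId) throws Exception {
    Put put = new Put(fingerprint);
    put.add(Bytes.toBytes("doc"), Bytes.toBytes("id"), Bytes.toBytes(docId));
    // checkAndPut with a null expected value means "insert only if absent",
    // so the existence probe and the insert happen atomically server-side.
    boolean inserted = table.checkAndPut(fingerprint, Bytes.toBytes("doc"),
        Bytes.toBytes("id"), null, put);
    return !inserted;
  }
}
}}}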
  
- [[http://h50146.www5.hp.com/products/software/security/icewall/eng/|HP IceWall SSO]] - is a web-based single sign-on solution and uses HBase to store user data to authenticate users. We have supported RDB and LDAP previously but have newly supported HBase with a view to authenticate over tens of millions of users and devices. 
+ [[http://h50146.www5.hp.com/products/software/security/icewall/eng/|HP IceWall SSO]] is a web-based single sign-on solution that uses HBase to store user data for authenticating users. We supported RDB and LDAP previously, and have newly added HBase support with a view to authenticating over tens of millions of users and devices.
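
An authentication lookup of this kind reduces to a single Get per login attempt. A minimal sketch with a hypothetical "users" table and an "auth:pwhash" column (not IceWall's actual schema):

{{{
import java.util.Arrays;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class UserAuth {
  public static boolean authenticate(HTable users, String userId,
                                     byte[] presentedHash) throws Exception {
    Get get = new Get(Bytes.toBytes(userId));
    get.addColumn(Bytes.toBytes("auth"), Bytes.toBytes("pwhash"));
    Result result = users.get(get);
    byte[] stored = result.getValue(Bytes.toBytes("auth"),
                                    Bytes.toBytes("pwhash"));
    // Missing row or missing column means no such user; otherwise compare
    // the stored password hash with the one presented at login.
    return stored != null && Arrays.equals(stored, presentedHash);
  }
}
}}}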