Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/12/15 01:21:57 UTC

[Hadoop Wiki] Update of "Hive/PoweredBy" by Avlan Makis

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/PoweredBy" page has been changed by Avlan Makis.
The comment on this change is: Added a site that uses Hive to process huge amounts of data.
http://wiki.apache.org/hadoop/Hive/PoweredBy?action=diff&rev1=24&rev2=25

--------------------------------------------------

  Applications and organizations using Hive include (alphabetically):
  
-  *  [[http://www.bizo.com|Bizo]]
+  * [[http://www.bizo.com|Bizo]]
+ 
  We use Hive for reporting and ad hoc queries.
  
-  *  [[http://www.chitika.com|Chitika]]
+  * [[http://www.chitika.com|Chitika]]
- We use Hive for data mining and analysis on our 435M monthly global users. 
  
+ We use Hive for data mining and analysis on our 435M monthly global users.
+ 
-  *  [[http://www.cnet.com|CNET]]
+  * [[http://www.cnet.com|CNET]]
+ 
  We use Hive for data mining, internal log analysis and ad hoc queries.
  
-  *  [[http://www.digg.com|Digg]]
+  * [[http://www.digg.com|Digg]]
+ 
  We use Hive for data mining, internal log analysis, R&D, and reporting/analytics.
  
-  *  [[http://www.eharmony.com|eHarmony]]
+  * [[http://www.eharmony.com|eHarmony]]
- We use Hive for Matching Trends, Model Building, In-Depth Analytics, as well as Ad-Hoc Analysis. 
  
+ We use Hive for Matching Trends, Model Building, In-Depth Analytics, as well as Ad-Hoc Analysis.
-  *  [[http://www.facebook.com|Facebook]]
- We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning.
- Currently have a 640 machine cluster with ~5000 cores and 2PB raw storage. Each (commodity) node has 8 cores and 4 TB of storage.
  
+  * [[http://www.facebook.com|Facebook]]
+ 
+ We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning. Currently have a 640 machine cluster with ~5000 cores and 2PB raw storage. Each (commodity) node has 8 cores and 4 TB of storage.
+ 
-  *  [[http://www.grooveshark.com|Grooveshark]]
+  * [[http://www.grooveshark.com|Grooveshark]]
+ 
  We use Hive for user analytics, dataset cleaning, and machine learning R&D.
  
-  *  [[http://www.hi5.com|hi5]]
+  * [[http://www.hi5.com|hi5]]
+ 
  We use Hive for analytics, machine learning and social graph analysis.
  
   * [[http://dev.hubspot.com/|HubSpot]]
+ 
  We use Hive as part of a larger Hadoop pipeline to serve near-realtime web analytics.
  
-  *  [[http://www.last.fm|Last.fm]]
+  * [[http://www.last.fm|Last.fm]]
+ 
  We use Hive for various ad hoc queries.
  
+  * [[http://www.medhelp.org/find-a-doctor|MedHelp Find a Doctor]]
+ 
+ We implemented Hive to analyse data on large numbers of doctors across the United States, and for internal analytics on over 1M pageviews/day.
+ 
-  *  [[http://www.rocketfuelinc.com/|Rocket Fuel]]
+  * [[http://www.rocketfuelinc.com/|Rocket Fuel]]
+ 
  We use Hive to host all our fact and dimension data. Off this warehouse, we do reporting, analytics, machine learning and model building, and various ad hoc queries.
  
   * [[http://www.saaspulse.com/|SaaSPulse]]
+ 
  We use Hive for analytics, machine learning and customer interaction analysis of web applications.
  
   * [[http://www.scribd.com/|Scribd]]
+ 
  We use Hive for machine learning, data mining, ad-hoc querying, and both internal and user-facing analytics.
  
-  *  TaoBao (www dot taobao dot com)
+  * TaoBao (www dot taobao dot com)
+ 
  We use Hive for data mining, internal log analysis and ad-hoc queries. We also do extensive development work on Hive.
  
-  *  [[http://www.trendingtopics.org|Trending Topics]]
+  * [[http://www.trendingtopics.org|Trending Topics]]
+ 
  Hot Wikipedia Topics, Served Fresh Daily.  Powered by Cloudera Hadoop Distribution & Hive on EC2.  We use Hive for log data normalization and building sample datasets for trend detection R&D.
  
-  *  [[http://www.videoegg.com|VideoEgg]]
+  * [[http://www.videoegg.com|VideoEgg]]
+ 
  We use Hive as the core database for our data warehouse where we track and analyze all the usage data of the ads across our network.