Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/12/15 01:21:57 UTC
[Hadoop Wiki] Update of "Hive/PoweredBy" by Avlan Makis
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/PoweredBy" page has been changed by Avlan Makis.
The comment on this change is: Added a site that uses Hive to process huge amounts of data.
http://wiki.apache.org/hadoop/Hive/PoweredBy?action=diff&rev1=24&rev2=25
--------------------------------------------------
Applications and organizations using Hive include (alphabetically):
- * [[http://www.bizo.com|Bizo]]
+ * [[http://www.bizo.com|Bizo]]
+
We use Hive for reporting and ad hoc queries.
- * [[http://www.chitika.com|Chitika]]
+ * [[http://www.chitika.com|Chitika]]
- We use Hive for data mining and analysis on our 435M monthly global users.
+ We use Hive for data mining and analysis on our 435M monthly global users.
+
- * [[http://www.cnet.com|CNET]]
+ * [[http://www.cnet.com|CNET]]
+
We use Hive for data mining, internal log analysis and ad hoc queries.
- * [[http://www.digg.com|Digg]]
+ * [[http://www.digg.com|Digg]]
+
We use Hive for data mining, internal log analysis, R&D, and reporting/analytics.
- * [[http://www.eharmony.com|eHarmony]]
+ * [[http://www.eharmony.com|eHarmony]]
- We use Hive for Matching Trends, Model Building, In-Depth Analytics, as well as Ad-Hoc Analysis.
+ We use Hive for Matching Trends, Model Building, In-Depth Analytics, as well as Ad-Hoc Analysis.
- * [[http://www.facebook.com|Facebook]]
- We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning.
- Currently have a 640 machine cluster with ~5000 cores and 2PB raw storage. Each (commodity) node has 8 cores and 4 TB of storage.
+ * [[http://www.facebook.com|Facebook]]
+
+ We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning. Currently have a 640 machine cluster with ~5000 cores and 2PB raw storage. Each (commodity) node has 8 cores and 4 TB of storage.
+
- * [[http://www.grooveshark.com|Grooveshark]]
+ * [[http://www.grooveshark.com|Grooveshark]]
+
We use Hive for user analytics, dataset cleaning, and machine learning R&D.
- * [[http://www.hi5.com|hi5]]
+ * [[http://www.hi5.com|hi5]]
+
We use Hive for analytics, machine learning and social graph analysis.
* [[http://dev.hubspot.com/|HubSpot]]
+
We use Hive as part of a larger Hadoop pipeline to serve near-realtime web analytics.
- * [[http://www.last.fm|Last.fm]]
+ * [[http://www.last.fm|Last.fm]]
+
We use Hive for various ad hoc queries.
+ * [[http://www.medhelp.org/find-a-doctor|MedHelp Find a Doctor]]
+
+ We use Hive to analyze data on a large number of doctors across the United States, and for internal analytics on over 1M pageviews/day.
+
- * [[http://www.rocketfuelinc.com/|Rocket Fuel]]
+ * [[http://www.rocketfuelinc.com/|Rocket Fuel]]
+
We use Hive to host all our fact and dimension data. Off this warehouse, we do reporting, analytics, machine learning and model building, and various ad hoc queries.
* [[http://www.saaspulse.com/|SaaSPulse]]
+
We use Hive for analytics, machine learning and customer interaction analysis of web applications.
* [[http://www.scribd.com/|Scribd]]
+
We use Hive for machine learning, data mining, ad-hoc querying, and both internal and user-facing analytics.
- * TaoBao (www dot taobao dot com)
+ * TaoBao (www dot taobao dot com)
+
We use Hive for data mining, internal log analysis and ad-hoc queries. We also do extensive development work on Hive itself.
- * [[http://www.trendingtopics.org|Trending Topics]]
+ * [[http://www.trendingtopics.org|Trending Topics]]
+
Hot Wikipedia Topics, Served Fresh Daily. Powered by Cloudera Hadoop Distribution & Hive on EC2. We use Hive for log data normalization and building sample datasets for trend detection R&D.
- * [[http://www.videoegg.com|VideoEgg]]
+ * [[http://www.videoegg.com|VideoEgg]]
+
We use Hive as the core database for our data warehouse where we track and analyze all the usage data of the ads across our network.