Posted to user@cassandra.apache.org by Alain RODRIGUEZ <ar...@gmail.com> on 2012/01/06 11:35:23 UTC

Hadoop + Cassandra

Hello.

I have a 4-node cluster running Cassandra (without DataStax Brisk) in
production.

Now I want to add Hadoop (and maybe Pig / Hive?) to be able to perform
some analytics.
I don't know how to get started. Is there a tutorial explaining how to
install, configure and use Hadoop, and the advantages of running it as a
Cassandra overlay or on separate nodes
http://www.slideshare.net/jeromatron/cassandrahadoop-4399672 ?

I am already able to read a lot of statistics in real time thanks to
Cassandra alone and to the way I model my CFs, but I also have a lot of raw
data that I would like to use to get more statistics.

I'll be glad to hear about anything interesting you learnt from your own
experience with Hadoop + Cassandra.

Thanks in advance.

Re: Hadoop + Cassandra

Posted by Jeremy Hanna <je...@gmail.com>.
I would first look at http://wiki.apache.org/cassandra/HadoopSupport - you'll want to look in the section on cluster configuration. DataStax also has a product (http://www.datastax.com/products/enterprise) that makes it pretty simple to use Hadoop with Cassandra if you don't mind paying for it. Where I work, we've been using Hadoop with Cassandra for almost a year now, but we're looking into using DataStax Enterprise right now.
