Posted to user@cassandra.apache.org by Alexandru Dan Sicoe <si...@googlemail.com> on 2011/02/23 21:45:20 UTC

Is Cassandra suitable for my problem?

Hello,

I'm currently doing my master's project. I need to store lots of time series
data of any type (String, int, booleans, arrays of the previous) with a high
write rate (20 MBytes/sec -> 170 TBytes/year; note that it is not running
continuously) but less strict read requirements. This is monitoring data from
a vast distributed network. The queries will be something like: give me this
data between Time1 and Time2.
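
As a rough check on the figures above, here is a minimal sketch of the arithmetic. It assumes decimal units (1 TB = 1,000,000 MB) and a 365-day year; the function names are only illustrative, and the duty cycle is inferred from the two numbers rather than a measured value.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (assumes a 365-day year)


def yearly_volume_tb(rate_mb_per_s: float, duty_cycle: float = 1.0) -> float:
    """Data written per year in TB, assuming 1 TB = 1,000,000 MB."""
    return rate_mb_per_s * SECONDS_PER_YEAR * duty_cycle / 1_000_000


def implied_duty_cycle(rate_mb_per_s: float, yearly_tb: float) -> float:
    """Fraction of the year the system would have to be writing."""
    return yearly_tb / yearly_volume_tb(rate_mb_per_s)


continuous_tb = yearly_volume_tb(20)        # volume if writing non-stop
duty = implied_duty_cycle(20, 170)          # fraction of time actually writing

print(f"continuous: {continuous_tb:.2f} TB/year, implied duty cycle: {duty:.1%}")
```

Under these assumptions, writing non-stop at 20 MBytes/sec would give roughly 630 TBytes/year, so 170 TBytes/year corresponds to writing a bit more than a quarter of the time.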

The hardware that I have available is between 2 and 5 hosts.

Questions:

    Should I use Cassandra?

    Any suggestions on how to structure the data? (I read Cloudkick's blog
    https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/ but I
    found that it doesn't give much detail.)
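
To make the second question concrete, here is a sketch of one commonly suggested layout for "between Time1 and Time2" queries: one row per source per time bucket, with entries kept in timestamp order inside the row. This is plain in-memory Python, not Cassandra client code, and it is not taken from the Cloudkick post; `BUCKET_SECONDS`, `row_key`, and `BucketedSeries` are hypothetical names, and the one-hour bucket width is an arbitrary assumption.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

BUCKET_SECONDS = 3600  # hypothetical bucket width: one row per source per hour


def row_key(source: str, ts: int) -> tuple:
    """Row key = (source, start of the bucket that contains ts)."""
    return (source, ts - ts % BUCKET_SECONDS)


class BucketedSeries:
    """In-memory stand-in for a store with time-bucketed rows."""

    def __init__(self):
        # row key -> (sorted timestamps, values in the same order)
        self.rows = defaultdict(lambda: ([], []))

    def write(self, source: str, ts: int, value) -> None:
        times, values = self.rows[row_key(source, ts)]
        i = bisect_right(times, ts)  # keep timestamps sorted within the row
        times.insert(i, ts)
        values.insert(i, value)

    def read(self, source: str, t1: int, t2: int) -> list:
        """Return (ts, value) pairs with t1 <= ts <= t2, in time order."""
        out = []
        bucket = t1 - t1 % BUCKET_SECONDS
        while bucket <= t2:  # visit only the buckets overlapping [t1, t2]
            row = self.rows.get((source, bucket))
            if row:
                times, values = row
                lo, hi = bisect_left(times, t1), bisect_right(times, t2)
                out.extend(zip(times[lo:hi], values[lo:hi]))
            bucket += BUCKET_SECONDS
        return out


series = BucketedSeries()
series.write("host1", 10, "up")       # values can be of any type
series.write("host1", 3700, 42)
series.write("host1", 7300, [True, False])
print(series.read("host1", 0, 4000))
```

The bucket width is the main design choice in such a layout: it bounds how large a single row can grow while still letting a range query touch only the buckets that overlap the requested interval.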


Any help is much appreciated,

Alex

Re: Is Cassandra suitable for my problem?

Posted by Ritesh Tijoriwala <ti...@gmail.com>.
Hi Alexandru,
I feel Cassandra can certainly be used to solve the problem you have, but if
your requirements are not very strict, you need very high throughput, and it's
okay for you to lose some data occasionally due to a machine crash, then I
recommend you look at Redis (http://redis.io/). It is a high-performance
key/value store with very high throughput that is also used for analytics.

thanks,
Ritesh


RE: Is Cassandra suitable for my problem?

Posted by Prasanna Jayapalan <pj...@evidentsoftware.com>.
Hi Alexandru,

We at EvidentSoftware (http://www.evidentsoftware.com/) have a monitoring
solution that stores time series information in Cassandra and also in
neo4j. See this blog post:
http://www.evidentsoftware.com/evident-clearstone-5-implements-cassandra-and-neo4j-as-an-elastic-data-store/.
Can you share more details about your use case, so that we can give you some
guidance?



Prasanna


