Posted to user@cassandra.apache.org by yaw <ya...@gmail.com> on 2010/07/02 15:10:09 UTC

Storing application logs into Cassandra / design question

Hi all,
I'd like to store my application's logs in Cassandra.

I need to query logs by date (the last X logs) or by user (the last X logs
for user Y), and I want to distribute the data across several servers.


I think the best design is the following:

Each log identifier is a time-based UUID.


A CF with key = UUID and *Random Partitioner* will contain the log message =>
lets me split the real data evenly between nodes

A CF with key = UUID and *order-preserving partitioner* lets me get the
last X logs

A CF with key = userID whose column names are UUIDs (sorted by UUID) => lets
me get the last X logs of user Y
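
To make the per-user index concrete, here is a minimal in-memory sketch in Python. It is not Cassandra client code: the dictionaries `logs` and `user_logs` and the functions `store_log` and `last_logs_for_user` are hypothetical stand-ins for the column families described above. In Cassandra itself a column comparator such as TimeUUIDType would keep the columns sorted; the sketch sorts explicitly instead.

```python
import uuid
from collections import defaultdict

# Hypothetical stand-ins for two column families:
#   logs:      row key = log UUID -> log message
#   user_logs: row key = user ID  -> {log UUID: b""}  (column names are UUIDs)
logs = {}
user_logs = defaultdict(dict)

def store_log(user_id, message, log_id=None):
    """Write the message once, then index it under the user's row."""
    if log_id is None:
        log_id = uuid.uuid1()  # version 1 = time-based UUID
    logs[log_id] = message
    user_logs[user_id][log_id] = b""  # value unused; the column name is the index
    return log_id

def last_logs_for_user(user_id, count):
    """Return the newest `count` messages for one user."""
    # Order column names by the timestamp embedded in each UUID, newest first.
    columns = sorted(user_logs[user_id], key=lambda u: u.time, reverse=True)
    return [logs[log_id] for log_id in columns[:count]]
```

Reading "the last X logs of user Y" then touches one index row plus X message rows, whichever node each row lives on.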


Am I right?

Many thanks

Re: Storing application logs into Cassandra / design question

Posted by yaw <ya...@gmail.com>.

Perfectly right, Nick.

So I suppose that if I want to keep RandomPartitioner (I understand it is
the best choice for high-volume applications), I could design the database
like this:

A CF with key = UUID will contain the log message details => lets me split
the real data evenly between nodes

A CF with key = 'Date' whose column names are UUIDs (sorted by UUID) => lets
me get the last X logs of the day, and the data is approximately well
distributed...
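
A rough Python sketch of this date-keyed index, again in memory rather than against a real cluster. The names `day_key`, `logs_by_day`, `store_log` and `last_logs_of_day` are made up for illustration, and the "YYYY-MM-DD" row-key format is an assumption; the thread only says key = 'Date'.

```python
import datetime
import uuid
from collections import defaultdict

# Version-1 UUID timestamps count 100 ns intervals since this date.
UUID_EPOCH = datetime.datetime(1582, 10, 15)

def day_key(log_id):
    """Row key for the date index: the UTC day encoded in a time-based UUID."""
    moment = UUID_EPOCH + datetime.timedelta(microseconds=log_id.time // 10)
    return moment.strftime("%Y-%m-%d")  # assumed key format

logs = {}                        # row key = log UUID -> message details
logs_by_day = defaultdict(dict)  # row key = day -> {log UUID: b""}

def store_log(message, log_id=None):
    if log_id is None:
        log_id = uuid.uuid1()
    logs[log_id] = message
    logs_by_day[day_key(log_id)][log_id] = b""
    return log_id

def last_logs_of_day(day, count):
    """Return the newest `count` messages written on the given day."""
    columns = sorted(logs_by_day[day], key=lambda u: u.time, reverse=True)
    return [logs[log_id] for log_id in columns[:count]]
```

One design point to keep in mind: each day's index is a single row, so all index writes for the current day go to the nodes that own that row key. A finer bucket (per hour, for instance) would spread those writes over more rows.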

Many thanks,
yaw


Re: Storing application logs into Cassandra / design question

Posted by Микола Стрєбков <ni...@mykola.org>.

On 02.07.10 16:10, yaw wrote:
> Hi all,
> I'd like to store logs of my application into cassandra.
>  
> I need to query logs by date (last X logs) or  user (give me last X logs
> for  user Y )  and I want to dispatch data among several servers.
>  
>  
> I think  the best design way  is  following :
>  
> Each  log identifier is a time based UUID.
>  
>  
> A CF with key = UUID /  *Random Partitioner*  will contain log message
> => allows me to split real data  evenly between nodes
>  
> A CF with key = UUID   and *order*-*preserving partitioner * allow me to
> get last X logs
>  
> A CF with key = userID   and columns name are UUIDs (UUID sorted) =>  
> allow me to get last X logs  of user Y
> 
> Am I right ?

No: you can have only one partitioner per cluster. See:

http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/

http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
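
The partitioner decides which node owns each row key, so it is a property of the whole ring rather than of a single column family. The Python sketch below is only an illustration of that idea, not Cassandra code: the node names, ring tokens and function names are invented, and the token range is simplified. What it does reflect is that RandomPartitioner derives the token from an MD5 hash of the key, while an order-preserving partitioner uses the key itself.

```python
import hashlib
from bisect import bisect_left

def owner(token, ring):
    """ring: sorted list of (token, node); a node owns tokens up to its own."""
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    return ring[bisect_left(tokens, token) % len(ring)][1]

def random_token(key):
    # MD5 hash of the key, read as a 128-bit integer (simplified range).
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def order_preserving_token(key):
    # The key itself is the token, so key order equals ring order.
    return key

# Hypothetical three-node rings, one per partitioner type.
random_ring = [((i + 1) * 2**128 // 3 - 1, "node%d" % i) for i in range(3)]
ordered_ring = [("2010-04", "node0"), ("2010-08", "node1"), ("2010-12", "node2")]

# Sixty consecutive time-like keys, as a log writer would produce them.
keys = ["2010-07-02T15:10:%02d" % s for s in range(60)]
random_owners = {owner(random_token(k), random_ring) for k in keys}
ordered_owners = {owner(order_preserving_token(k), ordered_ring) for k in keys}
```

With the order-preserving ring every one of these consecutive keys lands on the same node, which allows range scans but concentrates the load; with hashed tokens the same keys scatter across the ring.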

-- 
Mykola Stryebkov
Blog: http://mykola.org/blog/
Public key: http://mykola.org/pubkey.txt
fpr: 0226 54EE C1FF 8636 36EF 2AC9 BCE9 CFC7 9CF4 6747