Posted to user@cassandra.apache.org by Maxim Potekhin <po...@bnl.gov> on 2011/12/15 03:34:12 UTC

Best way to implement indexing for high-cardinality values?

I now have a CF with extremely skinny rows (in the current implementation),
and the application will need to query by more than one column value.
The problem is that in many cases the values will be high cardinality.
Another factor is that I want to rotate data in and out of the system
in one-day buckets -- LILO in effect. The date will be one of the columns
as well.
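
A rough sketch of the one-day bucket rotation described above, in plain Python with no Cassandra client involved. The class and method names are hypothetical, and it assumes days arrive in non-decreasing order; it only illustrates the "oldest bucket out first" behaviour, not how the data would be laid out in a CF.

```python
from collections import OrderedDict
from datetime import date


class DayBuckets:
    """Hypothetical in-memory model: keep at most `max_days` one-day buckets."""

    def __init__(self, max_days):
        self.max_days = max_days
        self.buckets = OrderedDict()  # date -> {row_key: columns}

    def insert(self, day, row_key, columns):
        # Assumption: days are inserted in non-decreasing order.
        if day not in self.buckets:
            self.buckets[day] = {}
            # Drop whole buckets, oldest first, once the window is exceeded.
            while len(self.buckets) > self.max_days:
                self.buckets.popitem(last=False)
        self.buckets[day][row_key] = columns

    def days(self):
        return list(self.buckets)


store = DayBuckets(max_days=2)
store.insert(date(2011, 12, 13), "r1", {"state": "done"})
store.insert(date(2011, 12, 14), "r2", {"state": "done"})
store.insert(date(2011, 12, 15), "r3", {"state": "failed"})
print(store.days())  # the 2011-12-13 bucket has been rotated out
```

Dropping a whole bucket at once is the point of the design: expiry never has to touch individual rows.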

I had 9 indexes in mind, but I think I can pare it down to 5. At least
one of the columns I will need to query by has values that are guaranteed
to be unique -- there are effectively two ways to identify data for very
different parts of the complete system. Indexing on that would be bad,
wouldn't it?
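
To make the concern about the unique-valued column concrete, here is a minimal sketch in plain Python. It is not Cassandra's index implementation; the column names `job_id` and `status` and the helper `build_index` are made up for illustration. It shows that an inverted index over a unique column ends up with one entry per row, while a low-cardinality column yields only a few entries.

```python
from collections import defaultdict


def build_index(rows, column):
    """Hypothetical inverted index: column value -> list of row keys."""
    index = defaultdict(list)
    for row_key, columns in rows.items():
        index[columns[column]].append(row_key)
    return dict(index)


# Made-up sample data: job_id is unique per row, status is low cardinality.
rows = {
    "r1": {"job_id": "a1", "status": "done"},
    "r2": {"job_id": "b2", "status": "done"},
    "r3": {"job_id": "c3", "status": "failed"},
}

unique_index = build_index(rows, "job_id")
status_index = build_index(rows, "status")

# One index entry per row for the unique column, far fewer for status.
print(len(unique_index), len(status_index))  # 3 2
```

With unique values the index is as large as the data itself, so it behaves like a second lookup table keyed by the alternate identifier.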

Any advice would be appreciated.

Thanks

Maxim