Posted to user@cassandra.apache.org by Kevin <th...@gmail.com> on 2011/03/11 07:43:03 UTC

Secondary indices: Why low cardinality?

There's pretty limited information on Cassandra's built-in secondary index
facility as it is, but trying to find out why the secondary index has to have
low cardinality has been like finding a needle in a haystack... one that is
floating somewhere in the Atlantic.

 

Can someone explain why low cardinality is advised for the secondary index?
Has this been confirmed by anyone else besides DataStax?

 


Re: Secondary indices: Why low cardinality?

Posted by Robert Coli <rc...@digg.com>.
On Thu, Mar 10, 2011 at 10:43 PM, Kevin <th...@gmail.com> wrote:
>
> Can someone explain why low cardinality is advised for the secondary index?

The brief answer to your question is "because it is a local secondary index."

https://issues.apache.org/jira/browse/CASSANDRA-749

has a pretty thorough discussion of why local secondary indexes were
chosen for the initial secondary index implementation.

https://issues.apache.org/jira/browse/CASSANDRA-749?focusedCommentId=12858175&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12858175
" (quoting Stu Hood)
I think we agree that both approaches have their merits. The vast
difference between their best use cases needs to be considered as we
decide on a query API. In particular:

Local indexes are better for:

    * Low cardinality fields
    * Filtering of values in base order

Distributed indexes are better for:

    * High cardinality fields
    * Querying of values in index order
"
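
To make the "local" part concrete, here is a rough toy model in Python. It is
only a sketch and not Cassandra's actual read path: the Node class, the
modulo "partitioner", and the query() helper are all made up for
illustration. The one idea it is meant to show is that each node indexes only
the rows it holds, so a lookup by indexed value cannot be routed to a single
node and has to ask around.

```python
from collections import defaultdict


class Node:
    """Hypothetical node: holds some rows plus an index over ONLY those rows."""

    def __init__(self):
        self.rows = {}
        self.index = defaultdict(set)  # indexed value -> row keys on this node

    def put(self, key, value):
        self.rows[key] = value
        self.index[value].add(key)

    def lookup(self, value):
        # .get() avoids creating empty entries in the defaultdict
        return sorted(self.index.get(value, ()))


def build_cluster(num_nodes, rows):
    """Place rows by row key (toy partitioner), never by the indexed value."""
    nodes = [Node() for _ in range(num_nodes)]
    for key, value in rows.items():
        nodes[key % num_nodes].put(key, value)
    return nodes


def query(nodes, value):
    """Ask every node; return (matching keys, nodes contacted, nodes with a hit)."""
    matches, contacted, useful = [], 0, 0
    for node in nodes:
        contacted += 1
        keys = node.lookup(value)
        if keys:
            useful += 1
        matches.extend(keys)
    return sorted(matches), contacted, useful


# Low cardinality: 1000 rows, only 4 distinct indexed values.
low = {k: "state_%d" % ((k // 10) % 4) for k in range(1000)}
low_keys, low_contacted, low_useful = query(build_cluster(10, low), "state_0")
print(len(low_keys), low_contacted, low_useful)

# High cardinality: 1000 rows, every indexed value is unique.
high = {k: "user_%d" % k for k in range(1000)}
high_keys, high_contacted, high_useful = query(build_cluster(10, high), "user_123")
print(high_keys, high_contacted, high_useful)
```

In this toy setup the low cardinality lookup gets matching rows back from
every node it asks, while the high cardinality lookup asks the same ten nodes
and only one of them has anything to return. A distributed index would
instead be partitioned by the indexed value, so that second lookup could go
straight to the node owning "user_123".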

=Rob