Posted to user@cassandra.apache.org by Peter Harrison <ch...@gmail.com> on 2010/09/13 01:56:52 UTC

Automatic Indexing on Composite Keys & Unions

I'm looking at how to search on several separate fields. Before 0.7.0
this meant maintaining separate indexes through explicit mutations by
the client. Automatic secondary indexes were recently added in 0.7.0.
I have several indexes, and their row keys are composite keys. All the
indexes are ordered first by date, and then by another field. This
means that each index row key needs to be a concatenation of the date
(as a long: the number of days since 1 Jan 1970) and, for example, the
record type. Each row of the index holds columns containing the row
keys of every record for that date and that type.
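To make the key layout concrete, here is a minimal Python sketch of how
such a composite index row key could be built. It is only an
illustration: the function names, the 8-byte big-endian encoding of the
day count, and the in-memory dict standing in for an index column
family are all my assumptions, not anything Cassandra provides.

```python
import struct
from datetime import date

EPOCH = date(1970, 1, 1)

def days_since_epoch(d: date) -> int:
    """Number of whole days between 1 Jan 1970 and d."""
    return (d - EPOCH).days

def index_row_key(d: date, record_type: str) -> bytes:
    """Composite key: the day count as an 8-byte big-endian long,
    followed by the record type (encoding choice is an assumption)."""
    return struct.pack(">q", days_since_epoch(d)) + record_type.encode("utf-8")

# Hypothetical in-memory stand-in for one index column family:
# index row key -> set of record row keys (the "columns" of the row).
index = {}

def index_record(d: date, record_type: str, record_key: str) -> None:
    """Add a record's row key to the index row for its date and type."""
    index.setdefault(index_row_key(d, record_type), set()).add(record_key)

def lookup(d: date, record_type: str) -> list:
    """Return the record row keys for one date and type, sorted."""
    return sorted(index.get(index_row_key(d, record_type), set()))
```

For example, after index_record(date(2010, 9, 13), "invoice", "rec-1"),
calling lookup(date(2010, 9, 13), "invoice") reads a single index row.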

Finding the records for a certain date and type is therefore as easy
as looking up the right index row. It would be good to extend the
automatic indexing to support composite indexes like this.

Another useful feature would be unions of indexes. That is, if you
have two indexes, you can look up the first index for results, and
then perform a union with the results of another index. Again, if
these are segmented by date you can easily do unions without loading
large datasets into memory; you can load the sets for one day at a
time, perform a union, then move to the next day. The point is that
the union would be better done on the Cassandra cluster than by
pulling the data sets back for the client to do the work.
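The day-at-a-time union described above might look roughly like this
in Python. This is a sketch under assumptions: the two indexes are
modelled as plain dicts mapping a date to a set of record keys, and
union_by_day is a hypothetical name, not an existing Cassandra call.

```python
from datetime import date, timedelta

def union_by_day(index_a, index_b, start, end):
    """Yield (day, keys) for each day in [start, end] that has results.

    Only the two sets for the current day are held in memory at once,
    which is the point of segmenting the indexes by date.
    """
    day = start
    while day <= end:
        keys = index_a.get(day, set()) | index_b.get(day, set())
        if keys:
            yield day, sorted(keys)
        day += timedelta(days=1)
```

Because it is a generator, the caller can consume one day's union and
discard it before the next day is loaded.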

If this use case is too specific to my domain, perhaps another approach
is to allow Java plug-ins, which would act a little like stored
procedures in the SQL world, only they would be compiled bytecode
installed on each node and loaded at runtime. We would still enforce
separation of concerns by limiting what a plug-in can do; that is, it
would be restricted to working with the same interface as the client,
so it would be unable to mess with on-disk formats.

Would these ideas be of general use to many people?