Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/05/19 04:52:46 UTC

[jira] Issue Comment Edited: (CASSANDRA-79) Multi-table support

    [ https://issues.apache.org/jira/browse/CASSANDRA-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710589#action_12710589 ] 

Jonathan Ellis edited comment on CASSANDRA-79 at 5/18/09 7:52 PM:
------------------------------------------------------------------

From what I have seen, the underlying mechanics for multiple table support are already there.  We just need to find all the places that assume, for convenience, that there is only one table, and make them stop doing so.

DatabaseDescriptor parses and stores the results of the config xml.  Right now the shortcut for "get the table name" is DD.getTables().get(0).  I suggest auditing the calls to getTables() to see if any are actually using it for more than that.  (I don't think there are.)  Then start fixing them.  When you are done, get rid of it so nobody is tempted to be lazy like that again. :)
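
To make the audit target concrete, here is a minimal sketch of the pattern to remove and one possible replacement.  DescriptorSketch, requireTable, and the table names "Table1" and "System" are hypothetical stand-ins for illustration, not the real DatabaseDescriptor API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for DatabaseDescriptor; not Cassandra's real class.
final class DescriptorSketch {
    // Assumed example config: two tables instead of one.
    private static final List<String> TABLES = Arrays.asList("Table1", "System");

    static List<String> getTables() {
        return TABLES;
    }

    // The shortcut to audit: silently picks the first configured table.
    static String singleTableShortcut() {
        return getTables().get(0);
    }

    // One possible replacement: the caller must name the table,
    // and an unknown name fails loudly instead of defaulting.
    static String requireTable(String name) {
        if (!TABLES.contains(name)) {
            throw new IllegalArgumentException("Unknown table: " + name);
        }
        return name;
    }
}
```

Once every caller goes through something like requireTable, the shortcut has no users left and can be deleted.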

There are two classes of fixes: client API fixes and internal uses.  The client ones are probably easier in general.  What should happen is, the client gives Cassandra a table name as part of any API call, and that is passed to one of the handler methods (e.g., ReadVerbHandler, RowMutationVerbHandler).  Those will have the table name as part of the Command object they read off the wire.  So just start including the table name down the call stack.
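
A rough sketch of what "including the table name down the call stack" could look like.  ReadCommandSketch and ReadHandlerSketch are hypothetical and only loosely modeled on the role of the Command object and ReadVerbHandler; the nested map stands in for the real storage layer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical command object; the real one carries more fields.
final class ReadCommandSketch {
    final String table;
    final String key;

    ReadCommandSketch(String table, String key) {
        this.table = table;
        this.key = key;
    }
}

// Hypothetical handler playing the role of a verb handler.
final class ReadHandlerSketch {
    // table name -> (row key -> value); stands in for real storage.
    private final Map<String, Map<String, String>> storage = new HashMap<>();

    void put(String table, String key, String value) {
        storage.computeIfAbsent(table, t -> new HashMap<>()).put(key, value);
    }

    // The handler takes the table from the command and passes it on,
    // rather than asking a global for "the" table.
    String handle(ReadCommandSketch command) {
        return readRow(command.table, command.key);
    }

    private String readRow(String table, String key) {
        Map<String, String> rows = storage.get(table);
        if (rows == null) {
            throw new IllegalArgumentException("Unknown table: " + table);
        }
        return rows.get(key);
    }
}
```

With the table name carried explicitly, the same row key can live in two tables without the lookups colliding.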

The internal ones are a bit harder but only a little.  Often an object will need the table name in a place where its caller does not know the table either, e.g. ColumnFamily.getColumnComparator.  Here you'll need to add an instance variable containing either the table name or a reference to the parent Table object.  Adding factory methods to Table such as Table.getColumnFamily may be convenient.
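
Here is one way the "reference to the parent Table" option might look, as a sketch only.  TableSketch, ColumnFamilySketch, and defineColumnFamily are hypothetical names, and the real Table and ColumnFamily classes look different; the point is just that the column family remembers its table so callers do not have to.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Table.
final class TableSketch {
    private final String name;
    // column family name -> comparator configured for it in this table
    private final Map<String, Comparator<String>> comparators = new HashMap<>();

    TableSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void defineColumnFamily(String cfName, Comparator<String> comparator) {
        comparators.put(cfName, comparator);
    }

    Comparator<String> comparatorFor(String cfName) {
        Comparator<String> comparator = comparators.get(cfName);
        if (comparator == null) {
            throw new IllegalArgumentException("Unknown column family: " + cfName);
        }
        return comparator;
    }

    // Factory method in the spirit of Table.getColumnFamily:
    // the new object is handed its parent table at construction.
    ColumnFamilySketch getColumnFamily(String cfName) {
        comparatorFor(cfName); // fail early if the CF is not defined
        return new ColumnFamilySketch(this, cfName);
    }
}

// Hypothetical stand-in for ColumnFamily.
final class ColumnFamilySketch {
    private final TableSketch table; // the added instance variable
    private final String name;

    ColumnFamilySketch(TableSketch table, String name) {
        this.table = table;
        this.name = name;
    }

    String getTableName() {
        return table.getName();
    }

    // No table argument needed: the parent reference supplies it.
    Comparator<String> getColumnComparator() {
        return table.comparatorFor(name);
    }
}
```

Holding the Table reference rather than just its name saves a lookup on each call, at the cost of a slightly tighter coupling between the two classes.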

Some of DatabaseDescriptor itself needs to stop assuming only one table.  This will not be much code.  applicationColumnFamilies_ is one place.  At the very least, that needs another layer of Map<tablename, appCFs>, like tableToCFMetaDataMap_ has.  If you're more ambitious you could try moving those into the Table object as additional cleanup.
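
The extra Map layer could be as small as the sketch below.  I am assuming here that the application column families are a set of names; the actual type of applicationColumnFamilies_ may differ, and AppColumnFamiliesSketch and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the per-table layer; not the real DatabaseDescriptor.
final class AppColumnFamiliesSketch {
    // table name -> application column family names (assumed to be a set of strings)
    private final Map<String, Set<String>> applicationColumnFamilies = new HashMap<>();

    void register(String table, String columnFamily) {
        applicationColumnFamilies
                .computeIfAbsent(table, t -> new HashSet<>())
                .add(columnFamily);
    }

    // Callers must now say which table they mean.
    Set<String> getApplicationColumnFamilies(String table) {
        Set<String> cfs = applicationColumnFamilies.get(table);
        if (cfs == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(cfs);
    }
}
```

Returning an empty set for an unknown table is a choice made for the sketch; throwing, as a stricter descriptor might, would be just as reasonable.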

Please leave any questions here in case anyone else wants to help. :)

      was (Author: jbellis):
    From what I have seen the underlying mechanics for multiple table support are already there.  We just need to make all the places that assume there is only one for convenience, stop doing so.

DatabaseDescriptor parses and stores the results of the config xml.  Right now the shortcut for "get the table name" is DD.getTables().get(0).  I suggest auditing the calls to getTables() and see if any are actually using it for more than that.  (I don't think there are.)  Then get rid of it so nobody is tempted to do that anymore and start fixing them.

There are two classes of fixes.  Client API fixes and internal uses.  The client ones are probably easier in general.  What should happen is, the client gives cassandra a table name as part of any API call, and that is passed to one of the handler methods (e.g., ReadVerbHandler, RowMutationVerbHandler).  Those will have the table name as part of the Command object they read off the wire.  So just start including the table name down the call stack.

The internal ones are a bit harder but only a little.  Often an object will need the table name in a place where its caller does not know the table either, e.g. ColumnFamily.getColumnComparator.  Here you'll need to add an instance variable containing either the table name or a reference to the parent Table object.  Adding factory methods  to Table such as Table.getColumnFamily may be convenient.

Some of DatabaseDescriptor itself needs to stop assuming only one table.  This will not be much code.  applicationColumnFamilies_ is one place.  at the very least that needs another layer of Map<tablename, appCFs> like tableToCFMetaDataMap_ does.  If you're more ambitious you could try moving those into the Table object as additional cleanup.

Please leave any questions here in case anyone else wants to help. :)
  
> Multi-table support
> -------------------
>
>                 Key: CASSANDRA-79
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-79
>             Project: Cassandra
>          Issue Type: New Feature
>    Affects Versions: trunk
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> Cassandra has preliminary support for multiple tables (namespaces / sets of ColumnFamilies) but a lot of the code assumes there is only one.  Multitable support is important for allowing multiple applications to run on a single cluster.  It's also useful to cleanly separate "system" columnfamilies from application data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.