Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2009/05/06 17:58:31 UTC

[Cassandra Wiki] Update of "ThriftInterface" by EricEvans

The following page has been changed by EricEvans:
http://wiki.apache.org/cassandra/ThriftInterface

The comment on the change is:
imported from confluence wiki

New page:
The most common way to access Cassandra is via the [http://incubator.apache.org/thrift/ Thrift] interface.

In short, Thrift allows you to easily set up service clients and servers in various programming languages. It generates code from a Thrift file describing the service. See Cassandra's Thrift file [https://svn.apache.org/repos/asf/incubator/cassandra/trunk/interface/cassandra.thrift here].

Let's see how we can use a generated Python client to access Cassandra.

 1. Install [http://incubator.apache.org/thrift/download/ Thrift]
 1. cd cassandra/interface
 1. thrift -gen py cassandra.thrift
 1. cd gen-py/cassandra

Run the generated `Cassandra-remote` script without arguments to get usage information:

{{{
Usage: ./Cassandra-remote [-h host:port] [-u url] [-f[ramed]] function [arg1 [arg2...]]
}}}
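
To drive the script from Python instead of a shell, the usage line above can be turned into an argument list. This is only a minimal sketch: `build_remote_argv` is a hypothetical helper that is not part of the generated code, and `localhost:9160` is a placeholder for your own `<hostname>:<ThriftPort>`.

```python
def build_remote_argv(host_port, function, *args, script="./Cassandra-remote"):
    """Return the argument list for one Cassandra-remote call (hypothetical helper)."""
    # Every argument is passed as a string, just as it would be on the command line.
    return [script, "-h", host_port, function] + [str(a) for a in args]


# "localhost:9160" is a placeholder address, not a value taken from this page.
argv = build_remote_argv("localhost:9160", "insert", "users", "1",
                         "base_attributes:email", "ted@example.com", 0)
print(argv)
```

The resulting list could then be handed to `subprocess.run(argv)` from within `gen-py/cassandra`, assuming a Cassandra node is reachable at that address.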

The examples below use the following schema, specified in your `conf/storage-conf.xml`:

{{{ 
<Tables>
    <Table Name = "users">
        <ColumnFamily Index="Name">base_attributes</ColumnFamily>
        <ColumnFamily Index="Name">extended_attributes</ColumnFamily>
        <ColumnFamily ColumnType="Super" Index="Name">edges</ColumnFamily>
    </Table>
</Tables>
}}} 

== insert ==

To get started, we'll insert some data into the `users` table:

{{{ 
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '1' 'base_attributes:email' 'ted@example.com' 0
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '1' 'base_attributes:age' '25' 0
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '1' 'edges:friends:2' '1' 0
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '1' 'edges:friends:4' '1' 0
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '1' 'edges:groups:1' '1' 0
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> insert 'users' '2' 'base_attributes:email' 'bill@example.com' 0
None
}}}

Note that the first two calls add data to the `email` and `age` columns in the `base_attributes` column family, while the third call adds data to the `2` column of the `friends` super column of the `edges` column family.  All of the calls use a timestamp of 0.  There are now two rows in this table, with key values of `1` and `2`.
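
The colon-separated column paths used above can be read as either `family:column` or `family:super_column:column`. The following sketch illustrates that reading; `split_column_path` is a hypothetical helper written for this page, not a function of the generated client.

```python
def split_column_path(path):
    """Split 'family:column' or 'family:super_column:column' into named parts."""
    parts = path.split(":")
    if len(parts) == 2:
        family, column = parts
        return {"column_family": family, "super_column": None, "column": column}
    if len(parts) == 3:
        family, super_column, column = parts
        return {"column_family": family, "super_column": super_column, "column": column}
    raise ValueError("expected 2 or 3 colon-separated parts, got %r" % path)


print(split_column_path("base_attributes:email"))
print(split_column_path("edges:friends:2"))
```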


== get_slice ==

Selects a set of columns from a specific column family of a row.  The exact semantics of `start` and `count` are not fully clear, but the observed behavior is as follows.  If you set `start` to a value less than 0, you get all of the columns, no matter what value you give `count`.  If you set `start` to a value of zero or greater, setting `count` to `i` will return the first `i` columns.  Returns a list of dicts, with each dict containing `{columnName, value, timestamp}`.

Some examples on our table:

{{{ 
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_slice 'users' '1' 'base_attributes' -1 0
[ {'columnName': 'email', 'value': 'ted@example.com', 'timestamp': 0},
  {'columnName': 'age', 'value': '25', 'timestamp': 0}]
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_slice 'users' '1' 'base_attributes' 0 1
[{'columnName': 'email', 'value': 'ted@example.com', 'timestamp': 0}]
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_slice 'users' '1' 'base_attributes' 0 2
[ {'columnName': 'email', 'value': 'ted@example.com', 'timestamp': 0},
  {'columnName': 'age', 'value': '25', 'timestamp': 0}] 
}}}
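
The `start` and `count` behavior seen in these examples can be modelled in plain Python. This is a sketch of the observed behavior only, not the server's implementation; `slice_columns` is a hypothetical name, and treating a positive `start` as an offset is an assumption, since the examples only show `start` values of -1 and 0.

```python
def slice_columns(columns, start, count):
    """Model of the observed get_slice behavior (not the real implementation)."""
    if start < 0:
        # A negative start returns every column, whatever count is.
        return list(columns)
    # Assumption: a non-negative start acts as an offset into the column list.
    return list(columns[start:start + count])


base_attributes = [
    {"columnName": "email", "value": "ted@example.com", "timestamp": 0},
    {"columnName": "age", "value": "25", "timestamp": 0},
]

print(slice_columns(base_attributes, -1, 0))
print(slice_columns(base_attributes, 0, 1))
```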

== get_column ==

Returns a dict containing `{columnName, value, timestamp}` for a specific column of a row.

Some examples on our table:

{{{ 
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_column 'users' '1' 'base_attributes:age'
{'columnName': 'age', 'value': '25', 'timestamp': 0}
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_column 'users' '1' 'edges:friends:2'
{'columnName': '2', 'value': '1', 'timestamp': 0}
}}} 

== get_column_count ==

Returns the number of columns for a particular row and column family.

An example on our table:

{{{
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_column_count 'users' '1' 'base_attributes'
2
}}} 

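== batch_insert ==

Inserts several columns for one row in a single call.  The argument is a `batch_mutation_t` that names the table, the row key, and a map from column family to a list of `column_t` values:
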
{{{
> ./Cassandra-remote -h <hostname>:<ThriftPort> batch_insert "batch_mutation_t({'table':'users', 'key':'3', 'cfmap': {'base_attributes': [column_t({'columnName': 'email', 'value': 'napoleon@example.com', 'timestamp': 0}), column_t({'columnName': 'age', 'value': '45', 'timestamp': 0}) ] } })"
None
}}} 
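
Writing the `batch_mutation_t(...)` argument by hand is error-prone, so it may help to generate the string. The sketch below builds text in the same form as the example above; `column_literal` and `batch_mutation_literal` are hypothetical helpers, and it is an assumption that the script accepts the spacing produced by Python's `repr`.

```python
def column_literal(name, value, timestamp=0):
    """Render one column as the text 'column_t({...})'."""
    fields = {"columnName": name, "value": value, "timestamp": timestamp}
    return "column_t(%r)" % (fields,)


def batch_mutation_literal(table, key, cfmap):
    """Render a batch_mutation_t argument string.

    cfmap maps a column family name to a list of (name, value, timestamp) tuples.
    """
    families = []
    for family, columns in cfmap.items():
        rendered = ", ".join(column_literal(*column) for column in columns)
        families.append("%r: [%s]" % (family, rendered))
    return "batch_mutation_t({'table': %r, 'key': %r, 'cfmap': {%s}})" % (
        table, key, ", ".join(families))


literal = batch_mutation_literal("users", "3", {
    "base_attributes": [("email", "napoleon@example.com", 0), ("age", "45", 0)],
})
print(literal)
```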

== batch_insert_blocking ==

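== remove ==

Removes a column from a row.  In the example below, a subsequent `get_column` on the removed column returns an empty value with a timestamp of 0: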

{{{ 
> ./Cassandra-remote -h <hostname>:<ThriftPort> remove 'users' '1' 'base_attributes:email'
None
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_column 'users' '1' 'base_attributes:email'
{'columnName': 'email', 'value': '', 'timestamp': 0}
}}} 

== get_slice_super ==

{{{ 
> ./Cassandra-remote -h <hostname>:<ThriftPort> get_slice_super 'users' '1' 'edges' -1 0
[ {'name': 'friends', 'columns': [{'columnName': '2', 'value': '1', 'timestamp': 0}, {'columnName': '4', 'value': '1', 'timestamp': 0}]},
  {'name': 'groups', 'columns': [{'columnName': '1', 'value': '1', 'timestamp': 0}]}]
}}}
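
The shape of this result follows from the nesting of a super column family: super column name, then column name, then value and timestamp. The sketch below models that nesting with plain dicts and rebuilds the same output; `slice_super` and the `edges` dict are illustrative stand-ins, not part of the Thrift interface.

```python
def slice_super(super_columns):
    """Turn {super_column: {column: (value, timestamp)}} into the list form shown above."""
    result = []
    for name, columns in super_columns.items():
        result.append({
            "name": name,
            "columns": [
                {"columnName": column, "value": value, "timestamp": timestamp}
                for column, (value, timestamp) in columns.items()
            ],
        })
    return result


# Contents of the 'edges' column family for row '1' after the inserts on this page.
edges = {
    "friends": {"2": ("1", 0), "4": ("1", 0)},
    "groups": {"1": ("1", 0)},
}

print(slice_super(edges))
```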