Posted to commits@cassandra.apache.org by "Peter Lin (JIRA)" <ji...@apache.org> on 2012/11/14 17:04:12 UTC

[jira] [Commented] (CASSANDRA-4964) Return column metadata for dynamic columns

    [ https://issues.apache.org/jira/browse/CASSANDRA-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497181#comment-13497181 ] 

Peter Lin commented on CASSANDRA-4964:
--------------------------------------

I read that page a few months back. I "believe" I have a rough understanding of how Cassandra transposes wide rows into multiple rows, as the page describes: "That is how CQL3 allows access to wide rows: by transposing one internal wide rows into multiple CQL3 rows, one per cell of the wide row. This is however just a different way to view the same information."

For the sake of better understanding, say I define a column family using the first form.

create column family clicks
    with key_validation_class = UTF8Type
     and comparator = DateType
     and default_validation_class = UTF8Type;

When I insert dynamic columns with CQL, the name has to be a DateType and the value has to be a string. Say I also insert dynamic columns with Thrift, using a String column name and a BigDecimal value. Obviously that doesn't conform to date/utf8, but there's nothing stopping me from doing that with Thrift. For me, that flexibility is one of the nice features of Cassandra.

If I try to query all columns with "select * from clicks", Cassandra will happily run the query and return the results to me. However, a driver like FluentCassandra wouldn't know how to deserialize a column whose name or value doesn't match the default types. That leaves me 2 options: the first is to not use CQL for situations where the contents of a row vary in type; the second is to only select columns that match the default name/value types.
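
To illustrate the problem, here is a minimal Python sketch of a client that decodes every column with the default types from the column family definition. It does not talk to Cassandra at all. The helper names (encode_date, decode_with_defaults, try_decode) are hypothetical, and the byte layouts are assumptions: an 8-byte big-endian millisecond timestamp for DateType, and a 4-byte scale followed by the unscaled integer for a BigDecimal.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_date(dt):
    # Assumed DateType layout: 8-byte big-endian signed long, ms since epoch.
    millis = (dt - EPOCH) // timedelta(milliseconds=1)
    return struct.pack(">q", millis)


def decode_date(raw):
    if len(raw) != 8:
        raise ValueError("expected 8 bytes for a date, got %d" % len(raw))
    (millis,) = struct.unpack(">q", raw)
    return EPOCH + timedelta(milliseconds=millis)


def decode_with_defaults(name_raw, value_raw):
    # The client only knows the declared defaults: DateType name, UTF8Type value.
    return decode_date(name_raw), value_raw.decode("utf-8")


def try_decode(column):
    # Return the decoded (name, value) pair, or a message if decoding fails.
    try:
        return decode_with_defaults(*column)
    except ValueError as exc:  # UnicodeDecodeError is a subclass of ValueError
        return "cannot decode: %s" % exc


clicked_at = datetime(2012, 11, 14, 17, 4, 12, tzinfo=timezone.utc)

# Column written through CQL: conforms to the declared defaults.
cql_column = (encode_date(clicked_at), "home page".encode("utf-8"))

# Column written through Thrift: String name, BigDecimal-like value (19.99).
thrift_column = (
    "user_agent".encode("utf-8"),
    struct.pack(">i", 2) + (1999).to_bytes(2, "big", signed=True),
)

for column in (cql_column, thrift_column):
    print(try_decode(column))
```

The first column decodes cleanly; the second fails because nothing in the result tells the client that its name is a string rather than a date. A name that happened to be exactly 8 bytes long would be worse, since it would decode silently into a meaningless date.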

Even though Cassandra transposes wide rows into multiple rows for the storage engine, is the metadata about the type present? In one of my use cases, I built a temporal database on top of Cassandra. My temporal database only uses Thrift and helper classes to read/write, so my classes always know what type a dynamic column is.

In this other use case, where inserts and queries are made with both Thrift and CQL, returning type metadata for the name in org.apache.cassandra.thrift.Column could make things easier. That's assuming the metadata is stored in the transposed internal storage format.
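
As a rough sketch of what such metadata would make possible on the client side, the Python below picks a decoder per column from type names returned alongside the raw bytes. Everything here is hypothetical: the decode_column function, the DECODERS table, and the idea that each column carries a name type and a value type are illustrations of the proposal, not the existing Thrift API. The byte layouts are the same assumptions as before (8-byte millisecond timestamp for DateType, 4-byte scale plus unscaled integer for DecimalType).

```python
import struct
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_date(raw):
    # Assumed DateType layout: 8-byte big-endian signed long, ms since epoch.
    (millis,) = struct.unpack(">q", raw)
    return EPOCH + timedelta(milliseconds=millis)


def decode_utf8(raw):
    return raw.decode("utf-8")


def decode_decimal(raw):
    # Assumed DecimalType layout: 4-byte scale, then the unscaled integer.
    (scale,) = struct.unpack(">i", raw[:4])
    unscaled = int.from_bytes(raw[4:], "big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


# Hypothetical lookup from a returned type name to a decoder.
DECODERS = {
    "DateType": decode_date,
    "UTF8Type": decode_utf8,
    "DecimalType": decode_decimal,
}


def decode_column(name_raw, value_raw, name_type, value_type):
    # Use the per-column type names instead of the column family defaults.
    return DECODERS[name_type](name_raw), DECODERS[value_type](value_raw)


# A Thrift-inserted column with a String name and a BigDecimal value (19.99).
raw_value = struct.pack(">i", 2) + (1999).to_bytes(2, "big", signed=True)
print(decode_column(b"price", raw_value, "UTF8Type", "DecimalType"))
```

With the type names available per column, a driver would not need to guess or fall back to the defaults declared for the column family.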
                
> Return column metadata for dynamic columns
> ------------------------------------------
>
>                 Key: CASSANDRA-4964
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4964
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1.6
>            Reporter: Peter Lin
>
> Currently, org.apache.cassandra.thrift.CqlMetadata doesn't return the column name and value metadata for dynamic columns. If I execute a query against a dynamic column that was inserted through thrift or hector, the name and/or value type may not be the same as the default types declared in the column family definition.
> If the dynamic column was inserted through CQL, it will conform to the defined default types for column name and value. Even in that case, it is still nice to have the metadata returned. That will facilitate developing tools for CQL and make it easier on people writing drivers for Cassandra.
> I'm willing to contribute to this, if someone points me to the right place. I've read a lot of the core cassandra code, but I haven't gone through all of CQL yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira