Posted to commits@cassandra.apache.org by "Samphel Norden (JIRA)" <ji...@apache.org> on 2014/07/03 16:32:24 UTC

[jira] [Commented] (CASSANDRA-7494) CQL support to return first column of each row

    [ https://issues.apache.org/jira/browse/CASSANDRA-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051507#comment-14051507 ] 

Samphel Norden commented on CASSANDRA-7494:
-------------------------------------------

A slice query through Hector/Thrift already supports fetching the first n cells of a row. Since this capability already exists, is there a specific reason why it cannot be ported over to CQL?

> CQL support to return first column of each row
> ----------------------------------------------
>
>                 Key: CASSANDRA-7494
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7494
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: fedora 64bit 
>            Reporter: Samphel Norden
>
> This JIRA is a request to support a query like
> select first 5 columns of each row where <whereclause>
> Currently in CQL, a LIMIT clause applies across all rows of the result; it is not a per-partition-key limit.
> More details below.
> If we create a table as follows
> CREATE TABLE xy (
> a int,
> b int,
> c int,
> d int,
> value int,
> PRIMARY KEY ((a, b), c, d)
> ) WITH CLUSTERING ORDER BY (c DESC, d ASC)
> with data = 
>  a | b | c    | d   | value
> ---+---+------+-----+-------
>  1 | 2 | 2007 | 307 |   950
>  1 | 2 | 2006 | 305 |   900
>  1 | 1 | 1006 | 205 |   800
>  1 | 1 | 1005 | 105 |   700
> Within each partition the rows are sorted by c descending. Assuming c is a timestamp, the idea is to store the latest timestamp first. Hence, if we pull a single column from each row for a given set of rows, we want it to be the one with the latest 'c' for each row.
> In other words: 
> select first 1 value from xy where a=1 and b in (1,2)
> should return a single "value" for each rowkey
>  a | b | c    | d   | value
> ---+---+------+-----+-------
>  1 | 1 | 1006 | 205 |   800
>  1 | 2 | 2007 | 307 |   950
> I realize that if we do individual queries such as
> select a,b,c,value from xy where a=1 and b=1 limit 1;
>  a | b | c    | value
> ---+---+------+-------
>  1 | 1 | 1006 |   800
> (1 rows)
> cqlsh:> select a,b,c,value from xy where a=1 and b=2 limit 1;
>  a | b | c    | value
> ---+---+------+-------
>  1 | 2 | 2007 |   950
> we get the desired result. However, this is highly inefficient, since we would need to fire a separate query per row. A construct that returns a single column for each given row in one query would be very helpful.
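
The requested per-partition semantics can be illustrated with a small client-side Python sketch. This is not a Cassandra or CQL API: the function name `first_n_per_partition` and the in-memory `ROWS` list are hypothetical, and the code only shows what result a "first n per partition" query would be expected to return for the sample data in the description, assuming the clustering order (c DESC, d ASC) given there.

```python
from collections import defaultdict

# Sample rows from the issue description: (a, b, c, d, value).
ROWS = [
    (1, 2, 2007, 307, 950),
    (1, 2, 2006, 305, 900),
    (1, 1, 1006, 205, 800),
    (1, 1, 1005, 105, 700),
]


def first_n_per_partition(rows, n):
    """Return the first n rows of each (a, b) partition.

    Hypothetical helper: it mimics the requested semantics in memory
    and does not talk to a Cassandra cluster.
    """
    # Group rows by the partition key (a, b).
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row[0], row[1])].append(row)

    result = []
    for key in sorted(partitions):
        # Apply the clustering order from the table definition:
        # c descending, then d ascending.
        ordered = sorted(partitions[key], key=lambda r: (-r[2], r[3]))
        # Keep only the first n rows of this partition.
        result.extend(ordered[:n])
    return result


if __name__ == "__main__":
    for row in first_n_per_partition(ROWS, 1):
        print(row)
```

With n = 1 the sketch yields one row per partition, the one with the largest c, which is the result the description asks for. A global LIMIT 1 would instead return a single row overall.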


