Posted to commits@cassandra.apache.org by "Benjamin Lerer (JIRA)" <ji...@apache.org> on 2015/12/21 12:16:47 UTC

[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

    [ https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066337#comment-15066337 ] 

Benjamin Lerer commented on CASSANDRA-10707:
--------------------------------------------

The main difficulty of this ticket is paging. Between the client and the coordinator node, pages are returned based on the grouping, but internally the data is paged by number of rows.
For example, if a {{GROUP BY}} query is used with a page size of 5000, the first page returned to the client must contain the aggregates for the first 5000 groups, or fewer if fewer than 5000 groups exist. As each group can be composed of a large number of rows, the coordinator node cannot load them all at once without risking OOM errors; instead it needs to request pages of data from the other nodes until it has enough groups. One of the problems is that a group can only be known to be complete once the next group is reached or the data is exhausted.
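
To make the paging problem above concrete, here is a minimal Python sketch. It is not Cassandra code and not the proposed implementation. The names {{row_pages}} and {{first_group_page}}, the (key, value) row shape, and the use of max() as the aggregate are all assumptions made for illustration. It only shows how a coordinator could keep pulling row-based pages until enough groups are complete, closing a group only when the next group starts or the data runs out.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (group key, value)


def row_pages(rows: List[Row], rows_per_page: int) -> Iterator[List[Row]]:
    """Simulate internal paging: yield the rows in pages of rows_per_page."""
    for start in range(0, len(rows), rows_per_page):
        yield rows[start:start + rows_per_page]


def first_group_page(pages: Iterable[List[Row]],
                     group_page_size: int) -> List[Tuple[str, int]]:
    """Return (key, max(value)) for up to group_page_size groups.

    Assumes group_page_size >= 1 and that rows of the same group are
    contiguous, as they would be when grouping on the partition key.
    """
    groups: List[Tuple[str, int]] = []
    current: Optional[List] = None  # [key, running max] of the open group

    for page in pages:  # a new row page is requested only when needed
        for key, value in page:
            if current is not None and key == current[0]:
                # Same group: keep aggregating, it may span several row pages.
                current[1] = max(current[1], value)
                continue
            if current is not None:
                # A new key was reached, so the previous group is complete.
                groups.append((current[0], current[1]))
                if len(groups) == group_page_size:
                    # The row just read belongs to the next client page; a
                    # real implementation would have to keep that state.
                    return groups
            current = [key, value]

    if current is not None:
        # Data exhausted: the last open group is complete as well.
        groups.append((current[0], current[1]))
    return groups


rows = [("a", 1), ("a", 5), ("a", 3), ("b", 2), ("b", 7), ("c", 4)]
print(first_group_page(row_pages(rows, 2), 2))
```

With two rows per internal page and a client page size of two groups, group "a" spans two row pages, and group "b" is emitted only after the first row of "c" has been read.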

> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra supports aggregate functions, it makes sense to support {{GROUP BY}} on {{SELECT}} statements.
> It should be possible to group either at the partition level or at the clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey, clustering0, clustering1; 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)