Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2015/02/24 00:33:12 UTC

[jira] [Commented] (CASSANDRA-8855) Batching SELECTs

    [ https://issues.apache.org/jira/browse/CASSANDRA-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334081#comment-14334081 ] 

Sylvain Lebresne commented on CASSANDRA-8855:
---------------------------------------------

Well, if we want to support that, the proper syntax is just to allow {{OR}} clauses, as in:
{noformat}
SELECT value FROM events WHERE event_type='myEvent' AND ((time > '2011-02-03' AND time <= '2012-01-01') OR (time > '2012-02-03' AND time <= '2013-01-01'));
{noformat}
And technically speaking, doing so is really just about allowing and handling the new syntax, since we have everything we need internally to do this. The only reason we haven't implemented it so far is that supporting {{OR}} properly in {{SelectStatement}} will require a bit of refactoring (though it has probably gotten better on trunk). We also won't really be able to support every type of {{OR}} clause, so we need to be clear on what we support and what we don't.
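
To illustrate one shape of {{OR}} that seems tractable (a disjunction of ranges on a single clustering column, like the two {{time}} ranges above), here is a minimal Python sketch that reduces such a disjunction to a sorted list of non-overlapping ranges. It is only an illustration, not Cassandra code: {{normalize_ranges}} and the {{(exclusive_start, inclusive_end)}} tuple representation are assumptions made for the example, and ISO date strings are assumed so that plain string comparison orders them correctly.

```python
from typing import List, Tuple

# Hypothetical representation: (exclusive_start, inclusive_end),
# mirroring "time > start AND time <= end" from the query above.
Range = Tuple[str, str]


def normalize_ranges(ranges: List[Range]) -> List[Range]:
    """Sort ranges and merge any that overlap or touch, dropping empty ones."""
    merged: List[Range] = []
    # Skip empty ranges (start >= end), then process in ascending start order.
    for start, end in sorted(r for r in ranges if r[0] < r[1]):
        if merged and start <= merged[-1][1]:
            # (a, b] and (c, d] with c <= b cover one contiguous span.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(normalize_ranges([
        ("2012-02-03", "2013-01-01"),
        ("2011-02-03", "2012-01-01"),
    ]))
```

With the two ranges from the example, nothing overlaps, so the sketch simply returns them in ascending order; overlapping inputs would collapse into a single range.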

bq. In addition, how about supporting batch for multiple SELECTs across tables

As you can easily parallelize such queries yourself client side, I don't see much upside to it, and it's a *lot* of complication. So I'm fairly strongly opposed to the idea (but in any case, it's a pretty separate question, so let's focus on one thing per ticket).
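
For reference, client-side parallelization can be sketched roughly as follows in Python. This is a hedged illustration rather than a recommended implementation: {{fetch_ranges}} is a hypothetical helper, and {{execute}} is an assumed callable taking a query string and a tuple of parameters and returning an iterable of rows (for instance, a thin wrapper around whatever driver session the application uses). The {{%s}} placeholder style is likewise an assumption about that wrapper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List, Sequence, Tuple

# One query per time range; placeholder style is an assumption about `execute`.
QUERY = (
    "SELECT value FROM events "
    "WHERE event_type = %s AND time > %s AND time <= %s"
)

# Hypothetical signature: execute(query, params) -> iterable of rows.
Execute = Callable[[str, Tuple[Any, ...]], Iterable[Any]]


def fetch_ranges(
    execute: Execute,
    event_type: str,
    ranges: Sequence[Tuple[str, str]],
    max_workers: int = 4,
) -> List[Any]:
    """Run one SELECT per range concurrently and concatenate the rows.

    Rows are returned grouped in the same order as `ranges`, regardless
    of which query finishes first.
    """
    if not ranges:
        return []
    workers = min(max_workers, len(ranges))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(execute, QUERY, (event_type, start, end))
            for start, end in ranges
        ]
        # Collect in submission order so the output order is deterministic.
        return [row for future in futures for row in future.result()]
```

A caller would pass its own {{execute}} function plus the list of {{(start, end)}} pairs, e.g. {{fetch_ranges(execute, 'myEvent', [('2011-02-03', '2012-01-01'), ('2012-02-03', '2013-01-01')])}}. This still costs one round trip per range, but the round trips overlap in time.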



>  Batching SELECTs 
> ------------------
>
>                 Key: CASSANDRA-8855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8855
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jay Patel
>
> SELECT’s IN clause allows batching selects for multiple partition keys of a given table. Can we consider supporting batch select for multiple column ranges for a given partition key?
> For instance, we would like to batch the two (or more) SELECTs below for a given partition key “event_type” and different ranges of “time”:
> SELECT value
> FROM events
> WHERE event_type = 'myEvent'
>   AND time > '2011-02-03'
>   AND time <= '2012-01-01'
> SELECT value
> FROM events
> WHERE event_type = 'myEvent'
>   AND time > '2012-02-03'
>   AND time <= '2013-01-01'
> One way to optimize these is to fire multiple SELECTs in parallel and asynchronously from the application, but by batching them we can do further optimizations, such as avoiding multiple round trips from the app server to C*, and even from the coordinator to the replicas. Once the request is received by the target replicas, we can return all the ranges requested for a particular partition key in one shot.
> This will be very useful for some of the use cases we're working on. I can take a first cut at this if there are no concerns.
> In addition, how about supporting batch for multiple SELECTs across tables? I think that will require more changes in ResultSet and may not have a lot of opportunities for optimization. However, it will at least help to avoid multiple round trips from the app server to C*.
> Thoughts welcome. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)