You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Sorin Manolache <so...@gmail.com> on 2013/06/05 15:29:19 UTC

partition key of composite type and "where partition_key in (...)" clause

Hello,

Is it possible to use the "where partition_key in (...)" clause if the 
partition key has a composite type?

I have a schema as follows:

create table tbl (
    k1 int,
    k2 varchar,
    k3 varchar,
    m blob,
    primary key((k1, k2), k3)
)

I would like to be able to do something like

select m from tbl where (k1, k2) in ((0, 'abc'), (1, 'xyz'));

I think though that I'll have to use the more verbose

begin unlogged batch
select m from tbl where k1=0 and k2='abc'
select m from tbl where k1=1 and k2='xyz'
apply batch;

Thanks,
Sorin

Re: partition key of composite type and "where partition_key in (...)" clause

Posted by Sorin Manolache <so...@gmail.com>.
On 2013-06-06 11:36, Sylvain Lebresne wrote:
> We indeed don't support that kind of syntax. We might someday
> (seehttps://issues.apache.org/jira/browse/CASSANDRA-4851
> <http://issues.apache.org/jira/browse/CASSANDRA-4851>, even though the
> description is more general, this kind of would introduce that kind of
> syntax).
>
> *But* another solution is just to do 2 (or more) separate queries
> (instead of using a IN). And *no*, in general using a IN will not really
> be much faster in that case because Cassandra doesn't really optimize
> this kind of IN queries (IN queries on the partition key that is).
> Provided you make sure to parallelize the queries client side, it may
> even end up being faster in practice because the client will be able to
> start processing query responses as they come back.
>
> --
> Sylvain
>

Thank you, Eric and Sylvain.

Sorin

>
> On Wed, Jun 5, 2013 at 7:18 PM, Eric Stevens <mightye@gmail.com
> <ma...@gmail.com>> wrote:
>
>     Hi Sorin,
>
>     I'm not aware of CQL supporting Cartesian set notation, and like you
>     when I tried it, the CQL parser bailed at the first parenthesis -
>     "*(*k1, k2)"
>
>     You cannot use SELECT in a BATCH statement, see
>     http://cassandra.apache.org/doc/cql/CQL.html#BATCH
>
>     So I'm not sure what your options are other than to manually
>     composite your K1, K2 pairs into a separate column, /which is part
>     of the primary key /(necessary for the IN() clause to work).  If you
>     still need to query separately on k1 and/or k2, you can also
>     preserve these as independent columns with indices.
>
>     cqlsh:test> drop table tbl;
>     cqlsh:test> create table tbl (
>              ...   k1k2 text,
>              ...   k1 int,
>              ...   k2 text,
>              ...   k3 text,
>              ...   m blob,
>              ...   PRIMARY KEY (k1k2, k3)
>              ... );
>     cqlsh:test> CREATE INDEX ON tbl (k1);
>     cqlsh:test> CREATE INDEX ON tbl (k2);
>     cqlsh:test>
>     cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (0, 'abc',
>     'def', 'deadbeef', '0|abc');
>     cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (1, 'xyz',
>     'uvw', '8badf00d', '1|xyz');
>     cqlsh:test> SELECT * FROM tbl WHERE k1k2 IN ('0|abc','1|xyz');
>
>       k1k2  | k3  | k1 | k2  | m
>     -------+-----+----+-----+------------
>       0|abc | def |  0 | abc | 0xdeadbeef
>       1|xyz | uvw |  1 | xyz | 0x8badf00d
>
>     cqlsh:test> SELECT * FROM tbl WHERE k1=0;
>
>       k1k2  | k3  | k1 | k2  | m
>     -------+-----+----+-----+------------
>       0|abc | def |  0 | abc | 0xdeadbeef
>
>
>     -Eric Stevens
>     ProtectWise, Inc.
>
>
>
>     On Wed, Jun 5, 2013 at 9:29 AM, Sorin Manolache <sorinm@gmail.com
>     <ma...@gmail.com>> wrote:
>
>         Hello,
>
>         Is it possible to use the "where partition_key in (...)" clause
>         if the partition key has a composite type?
>
>         I have a schema as follows:
>
>         create table tbl (
>             k1 int,
>             k2 varchar,
>             k3 varchar,
>             m blob,
>             primary key((k1, k2), k3)
>         )
>
>         I would like to be able to do something like
>
>         select m from tbl where (k1, k2) in ((0, 'abc'), (1, 'xyz'));
>
>         I think though that I'll have to use the more verbose
>
>         begin unlogged batch
>         select m from tbl where k1=0 and k2='abc'
>         select m from tbl where k1=1 and k2='xyz'
>         apply batch;
>
>         Thanks,
>         Sorin
>
>
>


Re: partition key of composite type and "where partition_key in (...)" clause

Posted by Sylvain Lebresne <sy...@datastax.com>.
We indeed don't support that kind of syntax. We might someday (seehttps://
issues.apache.org/jira/browse/CASSANDRA-4851, even though the description
is more general, this kind of would introduce that kind of syntax).

*But* another solution is just to do 2 (or more) separate queries (instead
of using a IN). And *no*, in general using a IN will not really be much
faster in that case because Cassandra doesn't really optimize this kind of
IN queries (IN queries on the partition key that is). Provided you make
sure to parallelize the queries client side, it may even end up being
faster in practice because the client will be able to start processing
query responses as they come back.

--
Sylvain


On Wed, Jun 5, 2013 at 7:18 PM, Eric Stevens <mi...@gmail.com> wrote:

> Hi Sorin,
>
> I'm not aware of CQL supporting Cartesian set notation, and like you when
> I tried it, the CQL parser bailed at the first parenthesis - "*(*k1, k2)"
>
> You cannot use SELECT in a BATCH statement, see
> http://cassandra.apache.org/doc/cql/CQL.html#BATCH
>
> So I'm not sure what your options are other than to manually composite
> your K1, K2 pairs into a separate column, *which is part of the primary
> key *(necessary for the IN() clause to work).  If you still need to query
> separately on k1 and/or k2, you can also preserve these as independent
> columns with indices.
>
> cqlsh:test> drop table tbl;
> cqlsh:test> create table tbl (
>         ...   k1k2 text,
>         ...   k1 int,
>         ...   k2 text,
>         ...   k3 text,
>         ...   m blob,
>         ...   PRIMARY KEY (k1k2, k3)
>         ... );
> cqlsh:test> CREATE INDEX ON tbl (k1);
> cqlsh:test> CREATE INDEX ON tbl (k2);
> cqlsh:test>
> cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (0, 'abc', 'def',
> 'deadbeef', '0|abc');
> cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (1, 'xyz', 'uvw',
> '8badf00d', '1|xyz');
> cqlsh:test> SELECT * FROM tbl WHERE k1k2 IN ('0|abc','1|xyz');
>
>  k1k2  | k3  | k1 | k2  | m
> -------+-----+----+-----+------------
>  0|abc | def |  0 | abc | 0xdeadbeef
>  1|xyz | uvw |  1 | xyz | 0x8badf00d
>
> cqlsh:test> SELECT * FROM tbl WHERE k1=0;
>
>  k1k2  | k3  | k1 | k2  | m
> -------+-----+----+-----+------------
>  0|abc | def |  0 | abc | 0xdeadbeef
>
>
> -Eric Stevens
> ProtectWise, Inc.
>
>
>
> On Wed, Jun 5, 2013 at 9:29 AM, Sorin Manolache <so...@gmail.com> wrote:
>
>> Hello,
>>
>> Is it possible to use the "where partition_key in (...)" clause if the
>> partition key has a composite type?
>>
>> I have a schema as follows:
>>
>> create table tbl (
>>    k1 int,
>>    k2 varchar,
>>    k3 varchar,
>>    m blob,
>>    primary key((k1, k2), k3)
>> )
>>
>> I would like to be able to do something like
>>
>> select m from tbl where (k1, k2) in ((0, 'abc'), (1, 'xyz'));
>>
>> I think though that I'll have to use the more verbose
>>
>> begin unlogged batch
>> select m from tbl where k1=0 and k2='abc'
>> select m from tbl where k1=1 and k2='xyz'
>> apply batch;
>>
>> Thanks,
>> Sorin
>>
>
>

Re: partition key of composite type and "where partition_key in (...)" clause

Posted by Eric Stevens <mi...@gmail.com>.
Hi Sorin,

I'm not aware of CQL supporting Cartesian set notation, and like you when I
tried it, the CQL parser bailed at the first parenthesis - "*(*k1, k2)"

You cannot use SELECT in a BATCH statement, see
http://cassandra.apache.org/doc/cql/CQL.html#BATCH

So I'm not sure what your options are other than to manually composite your
K1, K2 pairs into a separate column, *which is part of the primary key
*(necessary
for the IN() clause to work).  If you still need to query separately on k1
and/or k2, you can also preserve these as independent columns with indices.

cqlsh:test> drop table tbl;
cqlsh:test> create table tbl (
        ...   k1k2 text,
        ...   k1 int,
        ...   k2 text,
        ...   k3 text,
        ...   m blob,
        ...   PRIMARY KEY (k1k2, k3)
        ... );
cqlsh:test> CREATE INDEX ON tbl (k1);
cqlsh:test> CREATE INDEX ON tbl (k2);
cqlsh:test>
cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (0, 'abc', 'def',
'deadbeef', '0|abc');
cqlsh:test> INSERT INTO tbl (k1, k2, k3, m, k1k2) VALUES (1, 'xyz', 'uvw',
'8badf00d', '1|xyz');
cqlsh:test> SELECT * FROM tbl WHERE k1k2 IN ('0|abc','1|xyz');

 k1k2  | k3  | k1 | k2  | m
-------+-----+----+-----+------------
 0|abc | def |  0 | abc | 0xdeadbeef
 1|xyz | uvw |  1 | xyz | 0x8badf00d

cqlsh:test> SELECT * FROM tbl WHERE k1=0;

 k1k2  | k3  | k1 | k2  | m
-------+-----+----+-----+------------
 0|abc | def |  0 | abc | 0xdeadbeef


-Eric Stevens
ProtectWise, Inc.



On Wed, Jun 5, 2013 at 9:29 AM, Sorin Manolache <so...@gmail.com> wrote:

> Hello,
>
> Is it possible to use the "where partition_key in (...)" clause if the
> partition key has a composite type?
>
> I have a schema as follows:
>
> create table tbl (
>    k1 int,
>    k2 varchar,
>    k3 varchar,
>    m blob,
>    primary key((k1, k2), k3)
> )
>
> I would like to be able to do something like
>
> select m from tbl where (k1, k2) in ((0, 'abc'), (1, 'xyz'));
>
> I think though that I'll have to use the more verbose
>
> begin unlogged batch
> select m from tbl where k1=0 and k2='abc'
> select m from tbl where k1=1 and k2='xyz'
> apply batch;
>
> Thanks,
> Sorin
>