Posted to user@cassandra.apache.org by Jimmy Lin <y2...@gmail.com> on 2013/07/24 07:18:09 UTC

get all row keys of a table using CQL3

hi,
I want to fetch all the row keys of a table using CQL3:

e.g
select id from mytable limit 9999999


#1
For this query, does the node need to wait for all rows to be returned from all
other nodes before returning the data to the client (I am using Astyanax)?
In other words, will this operation put a lot of load on the initial
node receiving the request?


#2
If my table is big, I have to make sure the limit is set to a big enough
number that I get all the results. It seems I would have to do a
count(*) first to be sure...
Is there an alternative that always returns all the rows?

#3
If my id is a timeuuid, is it better to combine the results of a couple of
queries like the following to obtain all keys?
e.g.
select id from mytable where id < minTimeuuid('2013-02-02 10:00+0000')
limit 20000
+
select id from mytable where id > maxTimeuuid('2013-02-02 10:00+0000')
limit 20000

thanks

Re: get all row keys of a table using CQL3

Posted by aaron morton <aa...@thelastpickle.com>.
> I guess my question #1 still stands: does this query create a big load on the initial node that receives the request, since it still has to wait for all the results to come back from the other nodes before returning to the client?
Sort of.
The coordinator always has to wait. Only one node will return the actual data; the others will return a digest of the data. So there is not huge memory pressure for this type of read.

In general though you should page the results to reduce the size of the read. 
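
One way to picture paging, as a minimal Python sketch rather than anything taken from this thread: the client asks for a fixed-size page, remembers the last key it saw, and then asks for the next page of keys whose token is larger. Everything below is hypothetical scaffolding. `toy_token` is a stand-in hash, not Cassandra's partitioner, and `make_fake_table` simulates the cluster in memory so the loop can run without a driver.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List, Optional

# Signature of a page fetcher: (last_key or None, page_size) -> list of keys.
FetchPage = Callable[[Optional[str], int], List[str]]


def toy_token(key: str) -> int:
    """Stand-in for the partitioner's token(); NOT Cassandra's real hash."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def make_fake_table(keys: Iterable[str]) -> FetchPage:
    """Simulate a table in memory (hypothetical helper for this sketch)."""
    ordered = sorted(keys, key=toy_token)  # rows come back in token order

    def fetch_page(last_key: Optional[str], page_size: int) -> List[str]:
        if last_key is None:
            candidates = ordered
        else:
            bound = toy_token(last_key)
            # Mirrors: WHERE token(id) > token(last_key)
            candidates = [k for k in ordered if toy_token(k) > bound]
        return candidates[:page_size]  # mirrors: LIMIT page_size

    return fetch_page


def iter_all_keys(fetch_page: FetchPage, page_size: int) -> Iterator[str]:
    """Yield every key, one bounded page at a time."""
    last_key: Optional[str] = None
    while True:
        page = fetch_page(last_key, page_size)
        if not page:
            return  # an empty page means the scan is complete
        yield from page
        last_key = page[-1]  # cursor for the next request


if __name__ == "__main__":
    fetch = make_fake_table(f"user{i}" for i in range(25))
    print(sum(1 for _ in iter_all_keys(fetch, page_size=10)))
```

With this shape, each request should touch at most `page_size` rows, so no single response has to carry every key in the table.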

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/07/2013, at 5:57 PM, Jimmy Lin <y2...@gmail.com> wrote:

> Hi Blake,
> Ah okay, the token function is nice.
>  
> But I am still a bit confused by the phrase "page through all rows".
> select id from mytable where token(id) > token(12345)
> Will it return all rows whose partition key's corresponding token is > 12345?
> I guess my question #1 still stands: does this query create a big load on the initial node that receives the request, since it still has to wait for all the results to come back from the other nodes before returning to the client?
>  
> thanks


Re: get all row keys of a table using CQL3

Posted by Jimmy Lin <y2...@gmail.com>.
Hi Blake,
Ah okay, the token function is nice.

But I am still a bit confused by the phrase "page through all rows".
select id from mytable where token(id) > token(12345)
Will it return all rows whose partition key's corresponding token is >
12345?
I guess my question #1 still stands: does this query create a big load
on the initial node that receives the request, since it still has to wait
for all the results to come back from the other nodes before returning to the client?

thanks





On Tue, Jul 23, 2013 at 10:34 PM, Blake Eggleston <bl...@grapheffect.com> wrote:

> Hi Jimmy,
>
> Check out the token function:
>
>
> http://www.datastax.com/docs/1.1/dml/using_cql#paging-through-non-ordered-partitioner-results
>
> You can use it to page through your rows.
>
> Blake

Re: get all row keys of a table using CQL3

Posted by Blake Eggleston <bl...@grapheffect.com>.
Hi Jimmy,

Check out the token function:

http://www.datastax.com/docs/1.1/dml/using_cql#paging-through-non-ordered-partitioner-results

You can use it to page through your rows.

Blake
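
For illustration, the two query shapes that token-based paging typically uses can be sketched as below. This is a hypothetical helper, not part of any driver. The table and column names come from the original question, and the `?` bind marker for the last key of the previous page is an assumption about how the client would prepare the statement.

```python
def first_page_cql(table: str, key: str, page_size: int) -> str:
    """CQL for the first page: no lower bound, just a LIMIT."""
    return f"SELECT {key} FROM {table} LIMIT {page_size}"


def next_page_cql(table: str, key: str, page_size: int) -> str:
    """CQL for later pages: resume after the last key's token.

    The '?' is a bind marker to be filled with the last key returned
    by the previous page (assumed to be bound by the client driver).
    """
    return (
        f"SELECT {key} FROM {table} "
        f"WHERE token({key}) > token(?) LIMIT {page_size}"
    )


if __name__ == "__main__":
    print(first_page_cql("mytable", "id", 1000))
    print(next_page_cql("mytable", "id", 1000))
```

Rows come back in token order rather than key order, so the last key of each page serves as the cursor for the next one.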
