Posted to user@cassandra.apache.org by Techy Teck <co...@gmail.com> on 2013/04/21 22:04:07 UTC

How to find total number of rows in Cassandra database?

I have inserted 1000 rows into a Cassandra database. Now I am trying to find
out how many rows were actually inserted, using the CLI mode.

In an RDBMS, I can run this SQL:

    SELECT count(*) from TABLE;

and it gives me the total row count for that table.

How do I do the same thing in Cassandra?

I am running Cassandra 1.2.3.

Re: How to find total number of rows in Cassandra database?

Posted by aaron morton <aa...@thelastpickle.com>.
cassandra-cli has some good online help.

There is no feature to count rows, as Cassandra does not keep a row count, but if it's only 1,000 rows, try using list;

You can also see an estimate of the number of rows by using nodetool cfstats.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/04/2013, at 6:50 PM, Techy Teck <co...@gmail.com> wrote:

> Is there any way to see how many rows there are using CLI mode, if I don't want to use CQL mode?


Re: How to find total number of rows in Cassandra database?

Posted by Techy Teck <co...@gmail.com>.
Is there any way to see how many rows there are using CLI mode, if I don't
want to use CQL mode?


On Mon, Apr 22, 2013 at 12:13 AM, Nikolay Mihaylov <nm...@nmmm.nu> wrote:

> Hi,
>
> It is important to know that counting rows in Cassandra is extremely
> expensive.
> Here are my 5 cents:
>
> In one of my early projects we made a separate column family with just a
> single row.
> We inserted each row key from the other CF into this row as a column key.
>
> Then, once a day or so, we did get_count().
>
> However, because get_count() became way too slow,
> we split the keys across several rows, e.g. 1024 rows.
> It is still way too slow, but we do not need it to be real-time.
>
> In our "second" project we decided to use Cassandra counters.
> However, in order to count only distinct items, we need to read before we
> write.
> This degrades insert performance, so we made a special CF with hashes and
> other stuff.
> Insert performance is still slow: 2 seconds or so for 500-600 counters
> (note that a single insert is OK, but we need to do 500-600 per batch, and
> 100-200 batches per second).
>
> Finally, we researched probabilistic counters and decided to use them.
> We also decided to write the project in Python, and we have not done
> proper tests yet.
>
> This is our final "take". It uses a modified HyperLogLog, so we do not
> need to read at all.
>
> https://github.com/nmmmnu/CubicHyperLogLog
>
> We tested the library very well, but not with real live data.
> A version for Redis is included too, for easy testing.
>
> Nikolay.

Re: How to find total number of rows in Cassandra database?

Posted by Nikolay Mihaylov <nm...@nmmm.nu>.
Hi,

It is important to know that counting rows in Cassandra is extremely expensive.
Here are my 5 cents:

In one of my early projects we made a separate column family with just a
single row.
We inserted each row key from the other CF into this row as a column key.

Then, once a day or so, we did get_count().

However, because get_count() became way too slow,
we split the keys across several rows, e.g. 1024 rows.
It is still way too slow, but we do not need it to be real-time.
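
The sharded index described above could be sketched roughly as follows. This is an in-memory illustration only, not the project's actual code: the names ShardedKeyIndex and shard_for are hypothetical, CRC32 as the sharding hash is an assumption, and plain Python sets stand in for the index rows of the column family.

```python
import zlib

NUM_SHARDS = 1024  # number of index rows, taken from the message above


def shard_for(row_key: str) -> int:
    """Pick an index row for a key with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(row_key.encode("utf-8")) % NUM_SHARDS


class ShardedKeyIndex:
    """In-memory stand-in for the index CF: one set of column names per index row."""

    def __init__(self) -> None:
        self.rows = [set() for _ in range(NUM_SHARDS)]

    def insert(self, row_key: str) -> None:
        # Writing the same column name twice is idempotent, so no read is needed.
        self.rows[shard_for(row_key)].add(row_key)

    def count(self) -> int:
        # Plays the role of one get_count() per index row, summed on the client.
        return sum(len(columns) for columns in self.rows)


index = ShardedKeyIndex()
for i in range(1000):
    index.insert(f"user-{i}")
index.insert("user-0")  # a duplicate insert does not change the total
print(index.count())    # prints 1000
```

Counting still has to touch every index row, which is consistent with the remark that this stays slow; sharding only keeps any single row from growing without bound.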

In our "second" project we decided to use Cassandra counters.
However, in order to count only distinct items, we need to read before we
write.
This degrades insert performance, so we made a special CF with hashes and
other stuff.
Insert performance is still slow: 2 seconds or so for 500-600 counters
(note that a single insert is OK, but we need to do 500-600 per batch, and
100-200 batches per second).
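
To illustrate why a distinct count on top of counters forces a read before every write, here is a minimal in-memory sketch. It is not the project's code: DistinctCounter is a hypothetical name, the set of MD5 digests is only a guess at what the "special CF with hashes" might hold, and a plain integer stands in for a Cassandra counter column.

```python
import hashlib


class DistinctCounter:
    """In-memory stand-in: `seen` plays the lookup CF, `total` a counter column."""

    def __init__(self) -> None:
        self.seen = set()
        self.total = 0
        self.reads = 0  # how many lookups were needed so far

    def add(self, item: str) -> bool:
        digest = hashlib.md5(item.encode("utf-8")).hexdigest()
        self.reads += 1          # the read that every single insert has to pay for
        if digest in self.seen:
            return False         # already counted, leave the counter alone
        self.seen.add(digest)
        self.total += 1          # counter increment, only for unseen items
        return True


counter = DistinctCounter()
for item in ["a", "b", "a", "c", "b"]:
    counter.add(item)
print(counter.total, counter.reads)
```

The number of reads grows with the number of inserts, not with the number of distinct items, which is the cost the message describes.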

Finally, we researched probabilistic counters and decided to use them.
We also decided to write the project in Python, and we have not done proper
tests yet.

This is our final "take". It uses a modified HyperLogLog, so we do not need
to read at all.

https://github.com/nmmmnu/CubicHyperLogLog

We tested the library very well, but not with real live data.
A version for Redis is included too, for easy testing.
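
For readers unfamiliar with the idea, below is a minimal sketch of a plain, textbook HyperLogLog. It is not the modified variant from the linked library and does not use that library's API; the class name, the SHA-1 based 64-bit hash, and the choice of p = 12 (4096 registers) are all assumptions made for illustration. The point it shows is that adding an item only updates a register and never needs to check whether the item was seen before.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with 2**p registers and a 64-bit hash."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        width = 64 - self.p
        idx = h >> width                             # first p bits select a register
        rest = h & ((1 << width) - 1)                # remaining bits
        rank = width - rest.bit_length() + 1         # position of the leftmost 1-bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank               # write only, no existence check

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias constant, valid for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(1000):
    hll.add(f"row-{i}")
for i in range(1000):
    hll.add(f"row-{i}")          # duplicates leave the registers unchanged
print(round(hll.estimate()))     # an approximation of 1000, not an exact count
```

With 4096 registers the expected relative error is on the order of 1 to 2 percent, so the result is an estimate of the distinct count rather than an exact answer.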

Nikolay.





On Mon, Apr 22, 2013 at 2:19 AM, Utkarsh Sengar <ut...@gmail.com> wrote:

> The difference between cqlsh and the CLI is nicely documented by the
> DataStax folks here: http://www.datastax.com/support-forums/topic/cli-vs-cql
>
> Thanks,
> -Utkarsh

Re: How to find total number of rows in Cassandra database?

Posted by Utkarsh Sengar <ut...@gmail.com>.
The difference between cqlsh and the CLI is nicely documented by the
DataStax folks here: http://www.datastax.com/support-forums/topic/cli-vs-cql

Thanks,
-Utkarsh


On Sun, Apr 21, 2013 at 1:39 PM, Techy Teck <co...@gmail.com> wrote:

> Yes, that helps a lot. I have always wondered: what is the
> difference between the CLI and CQL?

Re: How to find total number of rows in Cassandra database?

Posted by Techy Teck <co...@gmail.com>.
Yes, that helps a lot. I have always wondered: what is the
difference between the CLI and CQL?



On Sun, Apr 21, 2013 at 1:30 PM, Utkarsh Sengar <ut...@gmail.com> wrote:

> Using cqlsh you can do:
>
> SELECT COUNT(*) FROM columnfamily LIMIT 5000;
>
> Does that help?
>
> Read more: http://www.datastax.com/docs/1.0/references/cql/SELECT
>
> Thanks,
> -Utkarsh

Re: How to find total number of rows in Cassandra database?

Posted by Utkarsh Sengar <ut...@gmail.com>.
Using cqlsh you can do:

SELECT COUNT(*) FROM columnfamily LIMIT 5000;

Does that help?

Read more: http://www.datastax.com/docs/1.0/references/cql/SELECT

Thanks,
-Utkarsh



On Sun, Apr 21, 2013 at 1:04 PM, Techy Teck <co...@gmail.com> wrote:

> I have inserted 1000 rows into a Cassandra database. Now I am trying to find
> out how many rows were actually inserted, using the CLI mode.
>
> In an RDBMS, I can run this SQL:
>
>     SELECT count(*) from TABLE;
>
> and it gives me the total row count for that table.
>
> How do I do the same thing in Cassandra?
>
> I am running Cassandra 1.2.3.
>


