Posted to user@cassandra.apache.org by onmstester onmstester <on...@zoho.com> on 2018/04/07 04:14:51 UTC

write latency on single partition table


I've defined a table like this:

create table test (
    hours int,
    key1 int,
    value1 varchar,
    primary key (hours, key1)
);


For any given hour, every input is written to a single partition. I need to group by some 500K records in that partition for a report with an expected response time of less than 1 second, so putting key1 in the partition key would create 500K partitions, which would be slow on reads.

Although this approach gives a response time of < 1 second on reads, the write latency increased surprisingly: for this table the write latency reported by cfstats is more than 100 ms, whereas for other tables, which access thousands of partitions while writing during 1 hour, the write latency is 0.02 ms. I was expecting writes to the test table to be faster than writes to the other tables, because only one node and one partition are ever accessed, so no memtable switch should happen and all writes should be local to a single node?!

Should I add another key to my partition key to distribute the data across all of the nodes?
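
To make that last question concrete, here is a minimal Python sketch of what adding another key to the partition key could look like. Everything beyond the original table is an assumption made for illustration only: the table name test_bucketed, the bucket column, the bucket count of 8, and deriving the bucket as key1 modulo the bucket count. None of this is confirmed or recommended anywhere in the thread.

```python
# Hypothetical sketch: spread one hour of data over several partitions by
# adding a client-computed "bucket" column to the partition key.
# The table name, the bucket column and NUM_BUCKETS are all assumptions.

NUM_BUCKETS = 8  # assumed value; would need tuning against the cluster size

# CQL for the assumed bucketed variant of the original table.
CREATE_TABLE_CQL = """
create table test_bucketed (
    hours int,
    bucket int,
    key1 int,
    value1 varchar,
    primary key ((hours, bucket), key1)
);
"""

INSERT_CQL = (
    "insert into test_bucketed (hours, bucket, key1, value1) "
    "values (?, ?, ?, ?);"
)

# Reading one hour now takes one query per bucket instead of one in total.
SELECT_CQL = (
    "select key1, value1 from test_bucketed "
    "where hours = ? and bucket = ?;"
)


def bucket_for(key1: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a clustering key to a bucket in the range [0, num_buckets)."""
    return key1 % num_buckets


def insert_params(hours: int, key1: int, value1: str) -> tuple:
    """Bind values for INSERT_CQL, with the bucket derived from key1."""
    return (hours, bucket_for(key1), key1, value1)


def select_params(hours: int, num_buckets: int = NUM_BUCKETS) -> list:
    """Bind values for SELECT_CQL: one (hours, bucket) pair per bucket."""
    return [(hours, bucket) for bucket in range(num_buckets)]


if __name__ == "__main__":
    print(insert_params(12, 42, "example"))
    print(select_params(12))
```

With this layout, and assuming key1 values are spread evenly modulo 8, the 500K records of one hour would land in 8 partitions of roughly 62.5K rows each, and the report would have to merge the results of 8 reads instead of issuing a single one.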



Re: write latency on single partition table

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi,

Questioning whether the latency is related to the number of records is a
good guess indeed. It might be, but I don't think so, given the max 50 MB
partition size. It should be possible to fetch a partition of this size,
probably in less than 1 second.
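
As a rough sanity check of the figures quoted in this thread (some 500K
records per hourly partition, partition size under 50 MB), the sketch below
works out the implied average record size and write rate. It assumes
1 MB = 1024 * 1024 bytes and writes spread evenly over the hour; both are
simplifications. The 50 MB figure is an upper bound, so the per-record size
is an upper bound too.

```python
# Back-of-the-envelope figures derived from the numbers given in the thread.
RECORDS_PER_HOUR = 500_000               # "some 500K records" per partition
MAX_PARTITION_BYTES = 50 * 1024 * 1024   # "less than 50MB"; assumes 1 MB = 2**20 bytes
SECONDS_PER_HOUR = 3600


def avg_record_bytes(partition_bytes: int, records: int) -> float:
    """Average bytes per record for a partition of the given size."""
    return partition_bytes / records


def writes_per_second(records: int, seconds: int) -> float:
    """Average write rate, assuming writes are spread evenly over the period."""
    return records / seconds


if __name__ == "__main__":
    size = avg_record_bytes(MAX_PARTITION_BYTES, RECORDS_PER_HOUR)
    rate = writes_per_second(RECORDS_PER_HOUR, SECONDS_PER_HOUR)
    print(f"at most ~{size:.0f} bytes per record")
    print(f"~{rate:.0f} writes per second on average")
```

With these inputs that comes to at most about 105 bytes per record and
about 139 writes per second on average, which on their own look like modest
figures.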

It is possible to trace a query, see how it performs throughout the
distinct internal processes, and find what takes time. There are multiple
ways to do so:

- 'TRACING ON' in cqlsh, then run a problematic query (pay attention to
the consistency level, which is ONE by default; use the one used by the
application that is facing latencies).
- 'nodetool settraceprobability 0.001' (here, be careful with the
implications of setting this value too high: queries are traced inside
Cassandra, potentially generating a heavy load).


Other interesting global info:

- 'nodetool cfhistograms' (or 'nodetool tablehistograms' in recent versions) -
for more precise statistics, on percentiles for example
- 'nodetool cfstats' (or 'nodetool tablestats') - detailed information on how
the table/queries are performing on the node
- 'nodetool tpstats' - thread pool statistics. Look for pending, dropped or
blocked tasks; generally, those are not a good sign :).


If you suspect tombstones, you can use sstablemetadata to check the
tombstone ratio. It can also be related to poor caching, the number of
SSTables hit on disk, or inefficient bloom filters, for example. There are
other possible reasons for slow reads.

When it comes to the read path, multiple parts come into play and the
overall result is a bit complex to troubleshoot. Still, try to narrow down
the scope by eliminating possibilities one by one, or detect the issue
directly by using tracing to find out where most of the latency comes from.

If you find something weird but unclear to you, post here again and we will
hopefully be able to help with extra information on the part that is slow :).

C*heers!
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2018-04-07 6:05 GMT+01:00 onmstester onmstester <on...@zoho.com>:

> The size is less than 50MB

Re: write latency on single partition table

Posted by onmstester onmstester <on...@zoho.com>.
The size is less than 50MB


---- On Sat, 07 Apr 2018 09:09:41 +0430 Laxmikant Upadhyay <laxmikant.hcl@gmail.com> wrote ----

> It seems your partition is too large. What is the size of the value field? Try to keep your partition size within 100 MB.

Re: write latency on single partition table

Posted by Laxmikant Upadhyay <la...@gmail.com>.
It seems your partition is too large. What is the size of the value field?
Try to keep your partition size within 100 MB.
