Posted to user@cassandra.apache.org by Nazario Parsacala <do...@gmail.com> on 2015/10/08 23:27:51 UTC

Cassandra query degradation with high frequency updated tables.

Hi,

So we are developing a system that computes profiles of the things it observes. The observations come in the form of events. Each thing it observes has an id, and each thing contains a set of subthings, each with a measurement of some kind. There are roughly 500 subthings within each thing, and we receive events containing measurements of these 500 subthings every 10 seconds or so.

So as we receive events, we read the old profile value, calculate the new profile based on the new value, and save it back. We use the following schema to hold the profile.

CREATE TABLE myprofile (
    id text,
    month text,
    day text,
    hour text,
    subthings text,
    lastvalue double,
    count int,
    stddev double,
 PRIMARY KEY ((id, month, day, hour), subthings)
) WITH CLUSTERING ORDER BY (subthings ASC);


This profile is then used for certain analytics, either in the context of the 'thing' as a whole or in the context of a specific thing and subthing.

A profile can be defined as monthly, daily, or hourly. In the monthly case, month is set to the current month (i.e. 'Oct') and day and hour are set to the empty string ''.
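
For illustration, the per-event read-modify-write cycle against a monthly profile looks roughly like this (a sketch; the numeric values and the 'subthing-17' key are made up):

select subthings, lastvalue, count, stddev from myprofile
  where id='1' and month='Oct' and day='' and hour='';

update myprofile set lastvalue=42.0, count=101, stddev=3.7
  where id='1' and month='Oct' and day='' and hour='' and subthings='subthing-17';

Since an update in CQL is an upsert, the same statement covers both the first write and every subsequent recomputation.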


The problem we have observed is that over time (actually in just a matter of hours) query response for the monthly profile degrades hugely. At the start it responds in 10-100 ms; after a couple of hours it goes to 2000-3000 ms. If you leave it for a couple of days you start experiencing read timeouts. The query is basically just:

select * from myprofile where id='1' and month='Oct' and day='' and hour='';

This returns only about 500 rows or so.


I believe this is caused by the fact that multiple updates are done to this specific partition. So what do you think can be done to resolve this?

BTW, I am using Cassandra 2.2.1. And since this is a test, it is just running on a single node.





Re: Cassandra query degradation with high frequency updated tables.

Posted by Tyler Hobbs <ty...@datastax.com>.
That looks like CASSANDRA-10478
<https://issues.apache.org/jira/browse/CASSANDRA-10478>, which will
probably result in 2.2.3 being released shortly.  I'm not sure how that
affects performance, but as mentioned in the ticket, you can add
"disk_access_mode: standard" to cassandra.yaml to avoid it.

If you still see performance problems after that, can you try tracing the
query with cqlsh?
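
For reference, tracing in cqlsh is just (substituting your actual query):

TRACING ON;
SELECT * FROM myprofile WHERE id='1' AND month='Oct' AND day='' AND hour='';
TRACING OFF;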

On Fri, Oct 9, 2015 at 12:01 PM, Nazario Parsacala <do...@gmail.com>
wrote:

> So I upgraded to 2.2.2 and changed the compaction strategy from
> DateTieredCompactionStrategy to LeveledCompactionStrategy. But the problem
> still exists. At the start we were getting responses of around 80 ms to a
> couple of hundred ms. But after 1.5 hours of running, it is now hitting
> 1447 ms. I think this will degrade some more as time progresses. I will let
> this run a couple of hours more and will also try to force compaction.
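>
> For reference, the strategy change itself was just along these lines (a
> sketch against the simplified myprofile table from my first mail):
>
> ALTER TABLE myprofile
>   WITH compaction = {'class': 'LeveledCompactionStrategy'};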
>
> BTW, with 2.2.2 I am getting the following exceptions. Not sure if there
> is already a bug report on this.
>
> Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
> 	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	... 17 common frames omitted
> WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-42,5,main]: {}
> java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.2.jar:2.2.2]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	... 4 common frames omitted
> Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
> 	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
>
>
>
>
> On Oct 9, 2015, at 9:26 AM, Carlos Alonso <in...@mrcalonso.com> wrote:
>
> Yeah, I was about to suggest the compaction strategy too. Leveled
> compaction sounds like a better fit when records are being updated
>
> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>
> On 8 October 2015 at 22:35, Tyler Hobbs <ty...@datastax.com> wrote:
>
>> Upgrade to 2.2.2.  Your sstables are probably not compacting due to
>> CASSANDRA-10270 <https://issues.apache.org/jira/browse/CASSANDRA-10270>,
>> which was fixed in 2.2.2.
>>
>> Additionally, you may want to look into using leveled compaction (
>> http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).
>>
>>
>> [snip: original post trimmed; it is quoted in full at the top of this thread]
>>
>
>
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Cassandra query degradation with high frequency updated tables.

Posted by Brice Dutheil <br...@gmail.com>.
What do you mean by *And since this is a test, this is just running on a
single node.*? What is the hardware spec?

Also, from the schema in CASSANDRA-10502
<https://issues.apache.org/jira/browse/CASSANDRA-10502> there are a lot of
maps. What is the size of these maps? I've seen Cassandra have trouble
with large collections (2.0 / 2.1).
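
To illustrate what I mean (a made-up table, not the actual schema from the
ticket), columns like:

CREATE TABLE profile_with_map (
    id text PRIMARY KEY,
    measurements map<text, double>
);

A non-frozen collection is always read back in its entirety, so maps with
hundreds or thousands of entries can hurt read latency badly.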

-- Brice

On Sat, Oct 10, 2015 at 12:34 AM, Nazario Parsacala <do...@gmail.com>
wrote:

> I will send the jstat output later.
>
> I have created the ticket:
>
> https://issues.apache.org/jira/browse/CASSANDRA-10502
>
>
>
>
>
> [snip: deeper quoted messages trimmed; the query traces are quoted in full
> in the next message, and the stack trace in the first reply above]

Re: Cassandra query degradation with high frequency updated tables.

Posted by Nazario Parsacala <do...@gmail.com>.
I will send the jstat output later.

I have created the ticket:

https://issues.apache.org/jira/browse/CASSANDRA-10502





> On Oct 9, 2015, at 5:20 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> 
> I'd be curious to see GC logs.  
> 
> jstat -gccause <pid>
> 
> On Fri, Oct 9, 2015 at 2:16 PM Tyler Hobbs <tyler@datastax.com <ma...@datastax.com>> wrote:
> Hmm, it seems off to me that the merge step is taking 1 to 2 seconds, especially when there are only ~500 cells from one sstable and the memtables.  Can you open a ticket (https://issues.apache.org/jira/browse/CASSANDRA <https://issues.apache.org/jira/browse/CASSANDRA>) with your schema, details on your data layout, and these traces?
> 
> On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala <dodongjuan@gmail.com <ma...@gmail.com>> wrote:
> 
> 
> So the trace is varying a lot, and does not seem to correlate with the response times seen by the client. Maybe DataStax Java driver related..? (not likely). Just check out the results.
> 
> 
> Below is the one I took when, from the client (Java application) perspective, it was returning data in about 1100 ms.
> 
> 
> 
> Tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63
> 
>  activity                                                                                                                                                                                                             | timestamp                  | source        | source_elapsed
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
>                                                                                                                                                                                                    Execute CQL3 query | 2015-10-09 15:31:28.700000 | 172.31.17.129 |              0
>  Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            101
>                                                                                                                                                                             Preparing statement [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            334
>                                                                                                                                   Executing single-partition query on processinfometric_profile [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            692
>                                                                                                                                                                    Acquiring sstable references [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            713
>                                                                                                                                                                     Merging memtable tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            726
>                                                                                                                                                                   Key cache hit for sstable 209 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3143
>                                                                                                                                                     Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3169
>                                                                                                                                                                   Key cache hit for sstable 208 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3691
>                                                                                                                                                     Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3713
>                                                                                                                       Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3807
>                                                                                                                                                      Merging data from memtables and 2 sstables [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3818
>                                                                                                                                                             Read 462 live and 0 tombstone cells [SharedPool-Worker-3] | 2015-10-09 15:31:29.611000 | 172.31.17.129 |         910723
>                                                                                                                                                                                                      Request complete | 2015-10-09 15:31:29.649251 | 172.31.17.129 |         949251
> 
> 
> 
> 
> Below is when this was around 1400 ms. But the trace data seems to look faster..?
> 
> 
> 
> Tracing session: 7c591550-6ebf-11e5-9493-9131aba66d63
> 
>  activity                                                                                                                                                                                                               | timestamp                  | source        | source_elapsed
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
>                                                                                                                                                                                                      Execute CQL3 query | 2015-10-09 15:54:00.869000 | 172.31.17.129 |              0
>  Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |            122
>                                                                                                                                                                             Preparing statement [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |            265
>                                                                                                                                     Executing single-partition query on processinfometric_profile [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            673
>                                                                                                                                                                      Acquiring sstable references [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            695
>                                                                                                                                                                       Merging memtable tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            709
>                                                                                                                                                                     Key cache hit for sstable 242 [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3134
>                                                                                                                                                       Seeking to partition beginning in data file [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3155
>                                                                                                                         Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3259
>                                                                                                                                                        Merging data from memtables and 1 sstables [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3270
>                                                                                                                                                               Read 462 live and 0 tombstone cells [SharedPool-Worker-9] | 2015-10-09 15:54:02.070000 | 172.31.17.129 |         201640
>                                                                                                                                                                                                        Request complete | 2015-10-09 15:54:02.111294 | 172.31.17.129 |         242294
> 
> 
> 
> 
> This is when it was 1600 ms.
> 
> Tracing session: 9c0ea900-6ec4-11e5-9493-9131aba66d63
> 
>  activity                                                                                                                                                                                                               | timestamp                  | source        | source_elapsed
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
>                                                                                                                                                                                                      Execute CQL3 query | 2015-10-09 16:30:41.552000 | 172.31.17.129 |              0
>  Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |             99
>                                                                                                                                                                             Preparing statement [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            215
>                                                                                                                                   Executing single-partition query on processinfometric_profile [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            507
>                                                                                                                                                                    Acquiring sstable references [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            533
>                                                                                                                                                                     Merging memtable tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            551
>                                                                                                                                                                   Key cache hit for sstable 302 [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           3875
>                                                                                                                                                     Seeking to partition beginning in data file [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           3902
>                                                                                                                       Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           4050
>                                                                                                                                                      Merging data from memtables and 1 sstables [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           4068
>                                                                                                                                                             Read 468 live and 0 tombstone cells [SharedPool-Worker-164] | 2015-10-09 16:30:43.269000 | 172.31.17.129 |         717277
>                                                                                                                                                                                                        Request complete | 2015-10-09 16:30:43.307559 | 172.31.17.129 |         755559
> 
> 
> 
> 
> The last one is now at 2300 ms.
> 
> 
> Tracing session: 7d89a1e0-6ec6-11e5-9493-9131aba66d63
> 
>  activity                                                                                                                                                                                                             | timestamp                  | source        | source_elapsed
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
>                                                                                                                                                                                                    Execute CQL3 query | 2015-10-09 16:44:09.342000 | 172.31.17.129 |              0
>  Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |             87
>                                                                                                                                                                             Preparing statement [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |            277
>                                                                                                                                   Executing single-partition query on processinfometric_profile [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            456
>                                                                                                                                                                    Acquiring sstable references [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            473
>                                                                                                                                                                     Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            485
>                                                                                                                                                                   Key cache hit for sstable 328 [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           2851
>                                                                                                                                                     Seeking to partition beginning in data file [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           2869
>                                                                                                                       Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           3005
>                                                                                                                                                      Merging data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-09 16:44:09.345001 | 172.31.17.129 |           3017
>                                                                                                                                                             Read 468 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-09 16:44:11.329000 | 172.31.17.129 |         987011
>                                                                                                                                                                                                      Request complete | 2015-10-09 16:44:11.388637 | 172.31.17.129 |          46637
> 
> 
> 
> 
>> On Oct 9, 2015, at 2:48 PM, Nazario Parsacala <dodongjuan@gmail.com <ma...@gmail.com>> wrote:
>> 
>> Compaction did not help either.
>> 
>> 
>> 
>>> On Oct 9, 2015, at 1:01 PM, Nazario Parsacala <dodongjuan@gmail.com <ma...@gmail.com>> wrote:
>>> 
>>> So I upgraded to 2.2.2 and change the compaction strategy from DateTieredCompactionStrategy to LeveledCompactionStrategy. But the problem still exists.
>>> At the start we were getting responses around 80 to a couple of hundred of ms. But after 1.5 hours of running, it is now hitting 1447 ms. I think this will degrade some more as time progresses. I will let this run a couple of hours more  and will also try to force compaction.
>>> 
>>> BTW, with 2.2.2 I am getting the following exceptions. Not sure if there is already a bug report on this.
>>> 
>>> Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
>>> 	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	... 17 common frames omitted
>>> WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-42,5,main]: {}
>>> java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
>>> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>>> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>>> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
>>> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	... 4 common frames omitted
>>> Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
>>> 	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
>>> 
>>> 
>>> 
>>> 
>>>> On Oct 9, 2015, at 9:26 AM, Carlos Alonso <info@mrcalonso.com <ma...@mrcalonso.com>> wrote:
>>>> 
>>>> Yeah, I was about to suggest the compaction strategy too. Leveled compaction sounds like a better fit when records are being updated
>>>> 
>>>> Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
>>>> 
>>>> On 8 October 2015 at 22:35, Tyler Hobbs <tyler@datastax.com> wrote:
>>>> Upgrade to 2.2.2.  Your sstables are probably not compacting due to CASSANDRA-10270 <https://issues.apache.org/jira/browse/CASSANDRA-10270>, which was fixed in 2.2.2.
>>>> 
>>>> Additionally, you may want to look into using leveled compaction (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).
>>>> 
>>>> -- 
>>>> Tyler Hobbs
>>>> DataStax <http://datastax.com/>
>>>> 
>>> 
>> 
> 
> 
> 
> 
> -- 
> Tyler Hobbs
> DataStax <http://datastax.com/>


Re: Cassandra query degradation with high frequency updated tables.

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
I'd be curious to see GC logs.

jstat -gccause <pid>
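
A sampling interval (ms) and count can be added to watch it over time, e.g.:

jstat -gccause <pid> 1000 60

For a persistent GC log on Java 8, the stock HotSpot flags work (the log
path here is just an example; cassandra-env.sh is the usual place to add
them):

-Xloggc:/var/log/cassandra/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps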

On Fri, Oct 9, 2015 at 2:16 PM Tyler Hobbs <ty...@datastax.com> wrote:

> Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
> especially when there are only ~500 cells from one sstable and the
> memtables.  Can you open a ticket (
> https://issues.apache.org/jira/browse/CASSANDRA) with your schema,
> details on your data layout, and these traces?
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>

Re: Cassandra query degradation with high frequency updated tables.

Posted by Tyler Hobbs <ty...@datastax.com>.
Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
especially when there are only ~500 cells from one sstable and the
memtables.  Can you open a ticket (
https://issues.apache.org/jira/browse/CASSANDRA) with your schema, details
on your data layout, and these traces?
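
For the schema and data-layout parts, something like the following from
cqlsh and nodetool should cover it ("myks" is a stand-in for the actual
keyspace name):

DESCRIBE TABLE myks.processinfometric_profile;
nodetool cfstats myks.processinfometric_profile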

On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala <do...@gmail.com>
wrote:

> On Oct 9, 2015, at 2:48 PM, Nazario Parsacala <do...@gmail.com>
> wrote:
>
> Compaction did not help too.
>
>
>
> On Oct 9, 2015, at 1:01 PM, Nazario Parsacala <do...@gmail.com>
> wrote:
>
> So I upgraded to 2.2.2 and changed the compaction strategy from
> DateTieredCompactionStrategy to LeveledCompactionStrategy, but the
> problem still exists. At the start we were getting responses of around
> 80 to a couple of hundred ms, but after 1.5 hours of running it is now
> hitting 1447 ms. I think this will degrade some more as time
> progresses. I will let this run a couple more hours and will also try
> to force compaction.
>
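> (A sketch of those two steps, with "myks" standing in for the actual
> keyspace name:
>
> ALTER TABLE myks.processinfometric_profile
>     WITH compaction = {'class': 'LeveledCompactionStrategy'};
>
> nodetool compact myks processinfometric_profile
> )
>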
> BTW, with 2.2.2 I am getting the following exceptions. Not sure if there
> is already a bug report on this.
>
> Caused by: java.io.IOException: Seek position 182054 is not within mmap
> segment (seg offs: 0, length: 182054)
> at
> org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> ... 17 common frames omitted
> WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221
> AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread
> Thread[SharedPool-Worker-42,5,main]: {}
> java.lang.RuntimeException:
> org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: Seek position 182054 is not within mmap segment (seg
> offs: 0, length: 182054)
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_60]
> at
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> [apache-cassandra-2.2.2.jar:2.2.2]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException:
> java.io.IOException: Seek position 182054 is not within mmap segment (seg
> offs: 0, length: 182054)
> at
> org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> ... 4 common frames omitted
> Caused by: java.io.IOException: Seek position 182054 is not within mmap
> segment (seg offs: 0, length: 182054)
> at
> org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
> at
> org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244)
> ~[apache-cassandra-2.2.2.jar:2.2.2]
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Cassandra query degradation with high frequency updated tables.

Posted by Nazario Parsacala <do...@gmail.com>.

So the trace is varying a lot, and it does not seem to correlate with the latency the client sees. Maybe it is DataStax Java driver related? (Not likely.) Just check out the results.
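
Traces like these come from cqlsh session tracing; a sketch of capturing
one is TRACING ON before the query, and an old session can be pulled back
out of the built-in system_traces.events table by its session id (the id
below is the first one shown):

TRACING ON;
select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute='';

SELECT activity, source_elapsed FROM system_traces.events WHERE session_id = 566477c0-6ebc-11e5-9493-9131aba66d63;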


Below is the trace I took when, from the client (Java application) perspective, the query was returning data in about 1100 ms.



Tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63

 activity                                                                                                                                                                                                             | timestamp                  | source        | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
                                                                                                                                                                                                   Execute CQL3 query | 2015-10-09 15:31:28.700000 | 172.31.17.129 |              0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            101
                                                                                                                                                                            Preparing statement [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            334
                                                                                                                                  Executing single-partition query on processinfometric_profile [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            692
                                                                                                                                                                   Acquiring sstable references [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            713
                                                                                                                                                                    Merging memtable tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 |            726
                                                                                                                                                                  Key cache hit for sstable 209 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3143
                                                                                                                                                    Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3169
                                                                                                                                                                  Key cache hit for sstable 208 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3691
                                                                                                                                                    Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3713
                                                                                                                      Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3807
                                                                                                                                                     Merging data from memtables and 2 sstables [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 |           3818
                                                                                                                                                            Read 462 live and 0 tombstone cells [SharedPool-Worker-3] | 2015-10-09 15:31:29.611000 | 172.31.17.129 |         910723
                                                                                                                                                                                                     Request complete | 2015-10-09 15:31:29.649251 | 172.31.17.129 |         949251
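
(source_elapsed is in microseconds, so nearly all of the time sits between
the last two steps: 910723 - 3818 = 906905 us, i.e. ~0.9 s between "Merging
data from memtables and 2 sstables" and "Read 462 live and 0 tombstone
cells" -- which lines up with the ~1100 ms seen at the client.)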




Below is one taken when the client was seeing around 1400 ms, yet the trace data looks faster.



Tracing session: 7c591550-6ebf-11e5-9493-9131aba66d63

 activity                                                                                                                                                                                                               | timestamp                  | source        | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
                                                                                                                                                                                                     Execute CQL3 query | 2015-10-09 15:54:00.869000 | 172.31.17.129 |              0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |            122
                                                                                                                                                                            Preparing statement [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 |            265
                                                                                                                                    Executing single-partition query on processinfometric_profile [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            673
                                                                                                                                                                     Acquiring sstable references [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            695
                                                                                                                                                                      Merging memtable tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 |            709
                                                                                                                                                                    Key cache hit for sstable 242 [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3134
                                                                                                                                                      Seeking to partition beginning in data file [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3155
                                                                                                                        Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3259
                                                                                                                                                       Merging data from memtables and 1 sstables [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 |           3270
                                                                                                                                                              Read 462 live and 0 tombstone cells [SharedPool-Worker-9] | 2015-10-09 15:54:02.070000 | 172.31.17.129 |         201640
                                                                                                                                                                                                       Request complete | 2015-10-09 15:54:02.111294 | 172.31.17.129 |         242294
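
(Here the whole trace only accounts for 242294 us ~= 242 ms, so roughly
1400 - 242 ~= 1150 ms happened outside the traced read -- queueing, GC
pauses, or the client path -- which would explain the trace not tracking
what the client sees.)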




This one is from when the client was seeing around 1600 ms.

Tracing session: 9c0ea900-6ec4-11e5-9493-9131aba66d63

 activity                                                                                                                                                                                                               | timestamp                  | source        | source_elapsed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
                                                                                                                                                                                                     Execute CQL3 query | 2015-10-09 16:30:41.552000 | 172.31.17.129 |              0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |             99
                                                                                                                                                                            Preparing statement [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            215
                                                                                                                                  Executing single-partition query on processinfometric_profile [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            507
                                                                                                                                                                   Acquiring sstable references [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            533
                                                                                                                                                                    Merging memtable tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 |            551
                                                                                                                                                                  Key cache hit for sstable 302 [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           3875
                                                                                                                                                    Seeking to partition beginning in data file [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           3902
                                                                                                                      Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           4050
                                                                                                                                                     Merging data from memtables and 1 sstables [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 |           4068
                                                                                                                                                            Read 468 live and 0 tombstone cells [SharedPool-Worker-164] | 2015-10-09 16:30:43.269000 | 172.31.17.129 |         717277
                                                                                                                                                                                                       Request complete | 2015-10-09 16:30:43.307559 | 172.31.17.129 |         755559




The latest run of the same query is now at 2300 ms.


Tracing session: 7d89a1e0-6ec6-11e5-9493-9131aba66d63

 activity                                                                                                                                                                                                             | timestamp                  | source        | source_elapsed
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+----------------
                                                                                                                                                                                                   Execute CQL3 query | 2015-10-09 16:44:09.342000 | 172.31.17.129 |              0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |             87
                                                                                                                                                                            Preparing statement [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 |            277
                                                                                                                                  Executing single-partition query on processinfometric_profile [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            456
                                                                                                                                                                   Acquiring sstable references [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            473
                                                                                                                                                                    Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 |            485
                                                                                                                                                                  Key cache hit for sstable 328 [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           2851
                                                                                                                                                    Seeking to partition beginning in data file [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           2869
                                                                                                                      Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 |           3005
                                                                                                                                                     Merging data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-09 16:44:09.345001 | 172.31.17.129 |           3017
                                                                                                                                                            Read 468 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-09 16:44:11.329000 | 172.31.17.129 |         987011
                                                                                                                                                                                                     Request complete | 2015-10-09 16:44:11.388637 | 172.31.17.129 |          46637
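
In both traces essentially all of the time is spent between "Merging data from memtables and 1 sstables" and "Read 468 live and 0 tombstone cells": source_elapsed jumps from about 4,000 microseconds to 717,277 in the first trace and to 987,011 in the second. So the partition is located quickly (key cache hit, one sstable, no tombstones) and the cost is in reading through the partition data itself.

For anyone who wants to reproduce this, the traces above can be captured in cqlsh roughly like this (a minimal sketch using the table from this thread):

TRACING ON;
select * from processinfometric_profile
  where profilecontext='GENERIC' and id='1'
    and month='Oct' and day='' and hour='' and minute='';
TRACING OFF;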






Re: Cassandra query degradation with high frequency updated tables.

Posted by Nazario Parsacala <do...@gmail.com>.
Compaction did not help either.
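
For reference, the flush and major compaction were forced by hand on this single test node along these lines (a sketch; <mykeyspace> is a placeholder, since the keyspace name does not appear in this thread):

nodetool flush <mykeyspace> myprofile
nodetool compact <mykeyspace> myprofile

Even with the partition down to a single sstable, as the traces show, the query time did not recover.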





Re: Cassandra query degradation with high frequency updated tables.

Posted by Nazario Parsacala <do...@gmail.com>.
So I upgraded to 2.2.2 and changed the compaction strategy from DateTieredCompactionStrategy to LeveledCompactionStrategy, but the problem still exists.
At the start we were getting responses of around 80 ms to a couple of hundred ms, but after 1.5 hours of running it is now hitting 1447 ms. I expect this to degrade further as time progresses. I will let it run a couple more hours and will also try to force compaction.
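
The switch itself was a one-line schema change, roughly (table name as in my original post):

ALTER TABLE myprofile
  WITH compaction = {'class': 'LeveledCompactionStrategy'};

Existing sstables are reorganized into levels in the background after this.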

BTW, with 2.2.2 I am getting the following exceptions. Not sure if there is already a bug report on this.

Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
	... 17 common frames omitted
WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-42,5,main]: {}
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.2.jar:2.2.2]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183) ~[apache-cassandra-2.2.2.jar:2.2.2]
	... 4 common frames omitted
Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
	at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
	at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
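
Note that the reported seek position (182054) is exactly equal to the mmap segment length (182054), i.e. one past the last valid offset, which looks more like a boundary condition in the mmap code than genuine on-disk corruption. If it were real corruption, the usual recovery step would be something like (a sketch; <mykeyspace> is a placeholder):

nodetool scrub <mykeyspace> processinfometric_profile

but given the boundary arithmetic above, this may well be a Cassandra bug rather than a bad sstable.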






Re: Cassandra query degradation with high frequency updated tables.

Posted by Carlos Alonso <in...@mrcalonso.com>.
Yeah, I was about to suggest the compaction strategy too. Leveled
compaction sounds like a better fit when records are being updated.
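
A quick way to check whether reads are in fact spanning many sstables
(a sketch; <mykeyspace> is a placeholder):

nodetool cfstats <mykeyspace>.myprofile       # "SSTable count" for the table
nodetool cfhistograms <mykeyspace> myprofile  # sstables touched per read

If reads touch only one or two sstables and are still slow, the cost is
inside the partition rather than across sstables.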

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>


Re: Cassandra query degradation with high frequency updated tables.

Posted by Tyler Hobbs <ty...@datastax.com>.
Upgrade to 2.2.2.  Your sstables are probably not compacting due to
CASSANDRA-10270 <https://issues.apache.org/jira/browse/CASSANDRA-10270>,
which was fixed in 2.2.2.

Additionally, you may want to look into using leveled compaction (
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).
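
To verify whether compactions are running at all (as opposed to stuck or
skipped), a couple of quick checks (a sketch):

nodetool compactionstats    # pending / in-progress compaction tasks
nodetool compactionhistory  # what has actually been compacted recently

A pending count that only ever grows would be consistent with the
sstables-not-compacting behaviour described above.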



-- 
Tyler Hobbs
DataStax <http://datastax.com/>