Posted to user@cassandra.apache.org by Thomas Julian <th...@zoho.com> on 2016/09/29 09:30:52 UTC

Optimising the data model for reads

Hello,



I have created a column family for User File Management.


CREATE TABLE "UserFile" (
    "USERID" bigint,
    "FILEID" text,
    "FILETYPE" int,
    "FOLDER_UID" text,
    "FILEPATHINFO" text,
    "JSONCOLUMN" text,
    PRIMARY KEY ("USERID", "FILEID")
);



Sample Entry



(4*************003, 3f9**************************6a1, null, 2 , [{"FOLDER_TYPE":"-1","UID":"1","FOLDER":"\"HOME\""}] ,{"filename":"untitled","size":1,"kind":-1,"where":""})




Queries:



Select "USERID","FILEID","FILETYPE","FOLDER_UID","JSONCOLUMN" from "UserFile" where "USERID"=&lt;value&gt; and "FILEID" in (&lt;value&gt;,&lt;value&gt;,...)



Select "USERID","FILEID","FILEPATHINFO" from "UserFile" where "USERID"=&lt;value&gt; and "FILEID" in (&lt;value&gt;,&lt;value&gt;,...) 



This column family worked perfectly in our lab: I was able to fetch results for the queries above in less than 10ms. I deployed it to production (Cassandra 2.1.13) and it worked well for a month or two, but now the queries sometimes take 5s to 10s. On analysing further, I found that a few users are deleting files very frequently, which generates too many tombstones. I have left gc_grace_seconds at the default of 10 days, and I have chosen SizeTieredCompactionStrategy. I want to optimise this data model for read efficiency.
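(One way to see the tombstones being scanned is query tracing in cqlsh; the IDs below are placeholders, not real values:)

TRACING ON;
SELECT "USERID","FILEID","FILETYPE","FOLDER_UID","JSONCOLUMN" FROM "UserFile"
WHERE "USERID" = 42 AND "FILEID" IN ('a1','b2');
-- the trace output includes the number of live and tombstone cells read
TRACING OFF;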



Any help is much appreciated.



Best Regards,

Julian.







Re: Optimising the data model for reads

Posted by Romain Hardouin <ro...@yahoo.fr>.
Hi Julian,
The problem with deletes here is that you can *read* potentially many tombstones. I mean you have two concerns:

1. Avoiding reading tombstones during a query
2. Evicting tombstones as quickly as possible to reclaim disk space

The first point is a data model consideration. Generally speaking, to avoid reading tombstones we have to think about ordering. Let's take an example not related to your data model: say you have an "updated_at" column and you always want to read the newest data (e.g. < 7 days) while the oldest data is TTL'ed (tombstones). If you order your data by "updated_at DESC" (and the TTL is > 7 days and there are no manual deletes), you won't read tombstones.
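For illustration, a minimal sketch of such a time-ordered table (the names and the TTL value are made up for the example, they are not from Julian's model):

-- Hypothetical table: newest rows first, old rows expire via TTL.
CREATE TABLE events (
    userid bigint,
    updated_at timestamp,
    payload text,
    PRIMARY KEY (userid, updated_at)
) WITH CLUSTERING ORDER BY (updated_at DESC)
  AND default_time_to_live = 1209600;  -- 14 days, comfortably > the 7-day read window

-- Reading newest-first stops before ever reaching expired (tombstoned) rows:
SELECT * FROM events WHERE userid = 42 LIMIT 100;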
The second point depends on many factors: gc_grace, compaction strategy, compaction throughput, number of compactors, IO performance, #CPUs, ...

Also, with such a data model, you will have unbalanced data distribution. What if a user has 1,000,000 files or more? You can use a composite partition key to avoid that: PRIMARY KEY ((userid, fileid), ...). The data distribution will be much better, and on top of that you won't read tombstones when a file is deleted (because you won't query that partition key at all). *However, if you always read many files per user, each query will hit many nodes.* You have to decide depending on the query pattern, the average/max number of files per user, the average/max file size, etc.
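A sketch of that composite-key variant of your table (the "_v2" name is just for the example):

-- Hypothetical variant: one partition per (user, file) pair.
CREATE TABLE "UserFile_v2" (
    "USERID" bigint,
    "FILEID" text,
    "FILETYPE" int,
    "FOLDER_UID" text,
    "FILEPATHINFO" text,
    "JSONCOLUMN" text,
    PRIMARY KEY (("USERID", "FILEID"))
);

-- The same read now fans out to one partition per file (potentially several
-- nodes), and a deleted file's tombstones are simply never queried:
SELECT "FILETYPE","JSONCOLUMN" FROM "UserFile_v2"
WHERE "USERID" = 42 AND "FILEID" IN ('a1','b2');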
Regarding the compaction strategy, LCS is good for read-heavy workloads, but you need good disk IO and enough CPUs/vCPUs (watch out if your write workload is quite heavy). LCS compacts frequently, so *if tombstones are evictable* they will be evicted faster than with STCS. As you mentioned, you have 10 days of gc_grace, so you might consider lowering this value if maintenance repairs run within a few hours/days.
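Both changes can be applied online; for example (the 3-day value is only a sketch, gc_grace_seconds must stay longer than your repair cycle):

-- Hypothetical settings, adjust to your repair schedule.
ALTER TABLE "UserFile"
WITH compaction = { 'class' : 'LeveledCompactionStrategy' }
AND gc_grace_seconds = 259200;  -- 3 days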
LCS does a good job with updates, and that gives me an idea: what about soft deletes? A clustering column "status int" could do the trick. Let's say 1 => "live file", 2 => "to delete". When a user deletes a file, you set the "status" to 2 and write the userid and fileid in a table "files_to_delete" (the partition key can be the date of the day if there are not millions of deletions per day). Then a batch job can run during off-peak hours to delete, i.e. add a tombstone on, the files to delete. In read queries you would have to add "WHERE status = 1 AND ...". Again, it's just an idea that crossed my mind, I never tested this model, but maybe you can think about it. The bonus is that you can "undelete" a file as long as the batch job has not been triggered.
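A rough sketch of what that could look like (untested, as said above; the table and column layout are illustrative):

-- Hypothetical soft-delete layout: status is the first clustering column,
-- so reads with status = 1 never scan the "to delete" section of a partition.
CREATE TABLE "UserFileSoft" (
    "USERID" bigint,
    "STATUS" int,          -- 1 => live file, 2 => to delete
    "FILEID" text,
    "JSONCOLUMN" text,
    PRIMARY KEY ("USERID", "STATUS", "FILEID")
);

CREATE TABLE files_to_delete (
    day    text,           -- e.g. '2016-09-29', one partition per day
    userid bigint,
    fileid text,
    PRIMARY KEY (day, userid, fileid)
);

-- The read path only touches live files:
SELECT "FILEID","JSONCOLUMN" FROM "UserFileSoft"
WHERE "USERID" = 42 AND "STATUS" = 1;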
Best,
Romain 
