Posted to user@cassandra.apache.org by "qihuang.zheng" <qi...@fraudmetrix.cn> on 2015/11/12 14:20:46 UTC

Data.db too large, and still too large after sstableloader

We took a snapshot and found that some Data.db files are too large:
[qihuang.zheng@spark047219 5]$ find . -type f -size +800M -print0 | xargs -0 ls -lh
-rw-r--r--. 2 qihuang.zheng users 1.5G Oct 28 14:49 ./forseti/velocity/forseti-velocity-jb-103631-Data.db
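For context, the snapshot was taken with nodetool along these lines (a minimal sketch; the tag name here is only a placeholder):

nodetool snapshot -t mysnapshot forseti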


After running sstableloader to the new cluster, one node still has this large file:
[qihuang.zheng@spark047243 velocity]$ ll -rth | grep Data
-rw-r--r--. 1 admin admin 46M Nov 12 18:22 forseti-velocity-jb-21-Data.db
-rw-r--r--. 1 admin admin 156M Nov 12 18:22 forseti-velocity-jb-22-Data.db
-rw-r--r--. 1 admin admin 2.6M Nov 12 18:22 forseti-velocity-jb-23-Data.db
-rw-r--r--. 1 admin admin 162M Nov 12 18:22 forseti-velocity-jb-24-Data.db
-rw-r--r--. 1 admin admin 1.5G Nov 12 18:22 forseti-velocity-jb-25-Data.db  <- the big file is still here
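For reference, the load was run roughly like this (the host is a placeholder; -d takes one or more live contact points in the target cluster, and the path must end in keyspace/table):

sstableloader -d <target-node-ip> /path/to/forseti/velocity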


It seems sstableloader doesn't split the file very well. Why can't sstableloader split this big SSTable into smaller files on the new cluster?
I tried running sstablesplit on the snapshot before sstableloader, but that process is too slow.
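In case it helps, this is roughly how sstablesplit was invoked (a sketch; -s is the maximum output SSTable size in MB, --no-snapshot skips the pre-split snapshot, and the files must not be in use by a running node):

sstablesplit --no-snapshot -s 256 forseti-velocity-jb-103631-Data.db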



Thanks, qihuang.zheng