Posted to user@cassandra.apache.org by "qihuang.zheng" <qi...@fraudmetrix.cn> on 2015/10/29 09:08:31 UTC
nodetool status Load not same with disk used
On some of our nodes the Load is far too large, while on others it is normal.
[qihuang.zheng@cass047221 forseti]$ /usr/install/cassandra/bin/nodetool status
--  Address         Load       Tokens  Owns  Host ID                               Rack
UN  192.168.47.221  2.66 TB    256     8.7%  87e100ed-85c4-44cb-9d9f-2d602d016038  RAC1
UN  192.168.47.204  614.58 GB  256     8.2%  91ad3d42-4207-46fe-8188-34c3f0b2dbd2  RAC1
I checked the node with the df command and found that only 715G of disk is used.
[qihuang.zheng@cass047221 forseti]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        20G  8.6G   11G  47% /
tmpfs            16G     0   16G   0% /dev/shm
/dev/sda1       190M   58M  123M  32% /boot
/dev/sda4       3.5T  715G  2.6T  22% /home
And this is the disk usage on a normal node:
[qihuang.zheng@cass047204 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            16G     0   16G   0% /dev/shm
/dev/sda1       485M   57M  403M  13% /boot
/dev/mapper/VolGroup-lv_home
                3.4T  659G  2.6T  21% /home
How is the Load value in nodetool status calculated? Shouldn't it be based on SSTable file size, which in turn should match the disk space used?
Thanks, qihuang.zheng
Re: nodetool status Load not same with disk used
Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Oct 29, 2015 at 1:08 AM, qihuang.zheng <qihuang.zheng@fraudmetrix.cn> wrote:
> *We have some nodes Load too large, but some are normal. *
>
tl;dr - Clear the snapshots on the nodes which are too large.
Longer :
Are you sure that the nodes which are too large differ in the actual *data*
size, or do they just contain snapshots?
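One rough way to answer that question on a node is to walk the data directory and separate live SSTable files from files that now exist only under a snapshots directory. The Python sketch below is an illustration, not a Cassandra tool: the function name snapshot_usage is made up here, and it assumes the usual <keyspace>/<table>/snapshots/<name>/ layout under whatever data directory you pass in. It counts each inode once, so a snapshot file that is still hard-linked to a live SSTable is not counted as extra space.

```python
import os


def snapshot_usage(data_dir):
    """Return (live_bytes, snapshot_only_bytes) for a data directory.

    Hypothetical helper: assumes snapshot files live under a path
    component named "snapshots". Sizes are apparent sizes (st_size).
    """
    live = {}       # (device, inode) -> size for files outside snapshots
    snapshots = {}  # (device, inode) -> size for files inside snapshots

    for dirpath, _dirnames, filenames in os.walk(data_dir):
        parts = os.path.relpath(dirpath, data_dir).split(os.sep)
        in_snapshot = "snapshots" in parts
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if in_snapshot:
                snapshots[key] = st.st_size
            else:
                live[key] = st.st_size

    live_bytes = sum(live.values())
    # Only inodes with no remaining live link cost extra disk space.
    snapshot_only_bytes = sum(
        size for key, size in snapshots.items() if key not in live
    )
    return live_bytes, snapshot_only_bytes
```

Calling snapshot_usage on the node's data directory (the path depends on your install) and comparing the two numbers between a "too large" node and a normal one should show whether snapshots account for the difference.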
Cassandra snapshots are hard links to SSTables, which means a number of odd
things :
1) Snapshots grow in actual disk usage over time, as they only consume
"extra" disk space when the SSTable they are a hard link to is removed from
the data directory.
2) Unless you use du --apparent-size, the order in which du sees files
determines which file is counted as using the disk, so you might see weird
results from du in the data directory if you are also involving the
snapshots.
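To make point 1 concrete, here is a small Python sketch of the hard-link behaviour using a throwaway temporary directory. The file names and the function demo_hard_link_snapshot are made up for illustration; removing the file stands in for Cassandra deleting an SSTable (for example after compaction). It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def demo_hard_link_snapshot():
    """Show that a hard-linked 'snapshot' shares storage with the original
    and keeps the data alive after the original name is removed."""
    with tempfile.TemporaryDirectory() as root:
        live = os.path.join(root, "data.db")
        snap_dir = os.path.join(root, "snapshots", "snap1")
        os.makedirs(snap_dir)

        with open(live, "wb") as f:
            f.write(b"x" * 4096)

        snap = os.path.join(snap_dir, "data.db")
        os.link(live, snap)  # the snapshot is just a second name

        same_inode = os.stat(live).st_ino == os.stat(snap).st_ino
        links_before = os.stat(snap).st_nlink  # two names, one copy on disk

        os.remove(live)  # the live SSTable goes away

        links_after = os.stat(snap).st_nlink   # the snapshot now owns the data
        size_after = os.stat(snap).st_size     # nothing was freed
        return same_inode, links_before, links_after, size_after
```

While both names exist, the snapshot costs essentially nothing. Once the live name is removed, the snapshot is the only thing keeping those bytes on disk, which is why snapshot space appears to grow over time.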
--apparent-size
print apparent sizes, rather than disk usage; although the
apparent size is usually smaller, it may be larger due to
holes in (`sparse') files, internal fragmentation, indirect
blocks, and the like
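The difference between apparent size and disk usage described in that excerpt can be seen from Python as well. This is a sketch with made-up helper names; it assumes a Unix-like system where os.stat exposes st_blocks in 512-byte units, and the allocated figure for a sparse file depends on the filesystem.

```python
import os


def make_sparse_file(path, size):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is what du --apparent-size reports per file;
    allocated_bytes is the space actually reserved on disk.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

On filesystems that support sparse files, a file created with make_sparse_file typically reports a large apparent size and a much smaller allocated size.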
=Rob