Posted to commits@cassandra.apache.org by "Mike Torra (JIRA)" <ji...@apache.org> on 2016/08/30 12:55:21 UTC
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate

    [ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448933#comment-15448933 ] 

Mike Torra commented on CASSANDRA-10430:
----------------------------------------

I am running a relatively small cluster on DataStax Community Cassandra 3.5 in EC2, and I regularly experience this issue. After restarting all nodes, and even without running repair, the reported 'Load' eventually diverges quite a bit. Is it really supposed to reflect disk usage? I have found that it does not, so I rely on collectd to report disk usage on my nodes instead.

$ nodetool status
Datacenter: ap-southeast
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.76.137.45    72.89 GB   256          100.0%            47747850-8edb-4f26-9fc0-41cbd9763dd3  1a
UN  52.77.178.30    63.64 GB   256          100.0%            e9817aff-0d12-489e-aa6e-4960e0c43404  1a
UN  52.77.175.217   82.93 GB   256          100.0%            56f44708-cd29-4937-8450-86fe8dbc7445  1b
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.31.103.14    64.44 GB   256          100.0%            26479115-07fa-4d3a-bbb5-cf491b509946  1b
UN  52.30.151.214   36.61 GB   256          100.0%            298b143c-a2a9-45bb-b9c2-68675a0a46e0  1c
UN  52.210.34.43    48.55 GB   256          100.0%            a723bdf4-8575-4adf-ae13-891deb4bc986  1a
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.205.224.43   141.15 GB  256          100.0%            35b4cf08-fb44-4b2e-869d-707b939e646d  1e
UN  52.204.232.195  1.15 TB    256          100.0%            dfb048f4-c61f-4b77-9d24-5cbf9080a923  1d
UN  52.205.186.242  797.57 GB  256          100.0%            71204c7a-6455-441c-a6a3-282672e01736  1b
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.26.238.177   76.71 GB   256          100.0%            15e0550a-4798-4dc1-95b2-b5749ebece56  2c
UN  52.43.246.80    59.43 GB   256          100.0%            28b009e3-928e-457c-98cb-c39c201b3a7f  2a
UN  52.42.227.38    99.6 GB    256          100.0%            49ec7e6d-b392-464f-918b-09e0cc329c31  2b
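
To put a number on how far the reported Load varies within a datacenter, the output above can be parsed into bytes and compared. This is only a rough sketch: the function names parse_status and load_spread are hypothetical, the sample is the us-east block from above, and it assumes that the "GB"/"TB" units are binary multiples (1 GB = 1024^3 bytes), which I believe is how this Cassandra version formats sizes but have not verified.

```python
import re

# Assumption: units are binary multiples (1 KB = 1024 bytes). Newer Cassandra
# releases print KiB/MiB/GiB/TiB, so both spellings are accepted here.
UNITS = {
    "bytes": 1,
    "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}

# Matches node rows such as "UN  52.205.224.43   141.15 GB  256 ..."
ROW = re.compile(r"^[UD][NLJM]\s+(\S+)\s+([\d.]+)\s+(bytes|[KMGT]i?B)\s")


def parse_status(text):
    """Return {datacenter: {address: load_in_bytes}} from `nodetool status` text."""
    result = {}
    dc = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            dc = line.split(":", 1)[1].strip()
            result[dc] = {}
            continue
        m = ROW.match(line)
        if m and dc is not None:
            addr, value, unit = m.groups()
            result[dc][addr] = int(float(value) * UNITS[unit])
    return result


def load_spread(loads):
    """Ratio of the largest to the smallest reported load in one datacenter."""
    smallest = min(loads.values())
    if smallest == 0:
        return float("inf")
    return max(loads.values()) / smallest


SAMPLE = """\
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.205.224.43   141.15 GB  256          100.0%            35b4cf08-fb44-4b2e-869d-707b939e646d  1e
UN  52.204.232.195  1.15 TB    256          100.0%            dfb048f4-c61f-4b77-9d24-5cbf9080a923  1d
UN  52.205.186.242  797.57 GB  256          100.0%            71204c7a-6455-441c-a6a3-282672e01736  1b
"""

if __name__ == "__main__":
    for dc, loads in parse_status(SAMPLE).items():
        print(dc, round(load_spread(loads), 2))
```

Since every node here owns 100% of the data, a ratio far from 1 within a datacenter suggests the Load figure is off. The same byte values could then be compared against whatever the disk-usage monitoring reports for each node.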

> "Load" report from "nodetool status" is inaccurate
> --------------------------------------------------
>
>                 Key: CASSANDRA-10430
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>         Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes enabled. 
>            Reporter: julia zhang
>         Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status reports unbalanced load across the cluster.
> $ nodetool status mykeyspace
> ==========================
> ||Status||Address||Load||Tokens||Owns (effective)||Host ID||Rack||
> |UN|10.1.1.1|1.13 TB|256|48.5%|a4477534-a5c6-4e3e-9108-17a69aebcfc0|RAC1|
> |UN|10.1.1.2|2.58 TB|256|50.5%|1a7c3864-879f-48c5-8dde-bc00cf4b23e6|RAC2|
> |UN|10.1.1.3|1.49 TB|256|51.5%|27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7|RAC1|
> |UN|10.1.1.4|250.97 GB|256|51.9%|9898a278-2fe6-4da2-b6dc-392e5fda51e6|RAC3|
> |UN|10.1.1.5|1.88 TB|256|49.5%|04aa9ce1-c1c3-4886-8d72-270b024b49b9|RAC2|
> |UN|10.1.1.6|1.3 TB|256|48.1%|6d5d48e6-d188-4f88-808d-dcdbb39fdca5|RAC3|
> It seems that only 10.1.1.4 reports the correct "Load". There are no hints in the cluster, and the report remains the same after running "nodetool cleanup" on each node. "nodetool cfstats" shows that the number of keys is evenly distributed, and the Cassandra data disk on each node reports about the same usage.
> "nodetool status" reports these inaccurately large storage loads until we restart each node; after the restart, the "Load" report matches what we see on disk.
> We did not see this behavior until upgrading to v2.1.9.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)