Posted to user@cassandra.apache.org by Abdul Patel <ab...@gmail.com> on 2021/09/17 04:48:03 UTC

High disk usage Cassandra 3.11.7

Hello

We run Cassandra with LeveledCompactionStrategy and recently found the
filesystem almost 90% full, although the data is only about 10 million
records. Would a manual compaction work? I'm not sure it's recommended, and
disk space is also a constraint. I tried removing and re-adding one node,
and its data is now at 20GB, which looks appropriate.
So is removing and re-adding a node the only way to reclaim space?

Re: High disk usage Cassandra 3.11.7

Posted by Dipan Shah <di...@hotmail.com>.
Hello Abdul,

Adding to what Bowen already shared for snapshots.

Assuming that you're not just amplifying disk usage by updating/deleting the same data many times, these are the things you should check:

  *   Manual snapshots
     *   Check (nodetool listsnapshots) and remove (nodetool clearsnapshot) unwanted snapshots
  *   Automatic snapshots
     *   You can have unwanted snapshots if auto snapshot is enabled and you're frequently dropping, truncating or scrubbing tables. Check if that is the case
  *   Incremental backups
     *   Check if you have enabled incremental backups. Those files do not get deleted on their own and need to be cleaned out regularly
  *   Not running cleanup after adding new nodes to the cluster
     *   Check if you have recently added nodes to the cluster and missed running cleanups after that
  *   Compaction failing due to low disk space
     *   Cassandra will not be able to compact data (and free up space) if it does not have the disk space required to rewrite files. Check system.log for compaction errors
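If it helps, the snapshot and incremental-backup checks above can be sketched roughly as below. The directory layout is a mock stand-in for your actual data_file_directories path (the keyspace/table names are made up), and the nodetool commands are shown as comments since they need a live node:

```shell
# Mock layout mimicking a Cassandra data directory, to show where
# snapshots and incremental backups accumulate (names are illustrative).
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/ks1/events-abc123/snapshots/1631850000"
mkdir -p "$DATA_DIR/ks1/events-abc123/backups"
dd if=/dev/zero of="$DATA_DIR/ks1/events-abc123/snapshots/1631850000/md-1-big-Data.db" \
   bs=1024 count=100 2>/dev/null
dd if=/dev/zero of="$DATA_DIR/ks1/events-abc123/backups/md-2-big-Data.db" \
   bs=1024 count=50 2>/dev/null

# Space held by snapshots (roughly what 'nodetool listsnapshots' reports;
# remove unwanted ones with 'nodetool clearsnapshot -t <name>'):
du -sk "$DATA_DIR"/*/*/snapshots

# Space held by incremental backups -- these are never deleted automatically:
du -sk "$DATA_DIR"/*/*/backups
```

On a real node you would run the same du against each data_file_directories entry, run 'nodetool cleanup <keyspace>' after topology changes, and grep system.log for compaction errors.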

Thanks,

Dipan Shah


Re: High disk usage Cassandra 3.11.7

Posted by Bowen Song <bo...@bso.ng>.
Is there any reason not to use TTL? No compaction strategy will cope well
with frequent massive deletions. In fact, a queue-like data model is a
well-known Cassandra anti-pattern.
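As a rough illustration of why explicit deletes inflate disk usage (assuming the table uses the default gc_grace_seconds of 864000, i.e. 10 days): deleted rows persist as tombstones and cannot be purged by compaction until gc_grace_seconds has elapsed, so a table whose live window is 48 hours can carry several times its live size in dead data:

```shell
# Back-of-envelope: how long deleted data can linger on disk with the
# default gc_grace_seconds, versus a 48-hour live window.
LIVE_WINDOW_H=48
GC_GRACE_S=864000                            # Cassandra default (10 days)
GC_GRACE_H=$((GC_GRACE_S / 3600))            # 240 hours
RETAINED_H=$((LIVE_WINDOW_H + GC_GRACE_H))   # data + tombstone lifetime
AMPLIFICATION=$((RETAINED_H / LIVE_WINDOW_H))
echo "worst-case on-disk multiplier: ~${AMPLIFICATION}x"
```

With TTL plus TWCS, by contrast, whole expired SSTables are simply dropped, so none of this tombstone retention applies.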


Re: High disk usage Cassandra 3.11.7

Posted by Abdul Patel <ab...@gmail.com>.
TWCS is best suited for TTL-based expiry, not for explicit deletes, correct?



Re: High disk usage Cassandra 3.11.7

Posted by Abdul Patel <ab...@gmail.com>.
The 48-hour deletion job removes data older than 48 hours.
LCS was chosen because this is mostly a write-once, read-many application.

>>>>

Re: High disk usage Cassandra 3.11.7

Posted by Bowen Song <bo...@bso.ng>.
Congratulations! You've just found the cause. Does all data get deleted
48 hours after it is inserted? If so, are you sure LCS is the right
compaction strategy for this table? TWCS sounds like a much better fit
for this purpose.


Re: High disk usage Cassandra 3.11.7

Posted by Abdul Patel <ab...@gmail.com>.
Thanks.
The application deletes data older than 48 hours.
Auto compaction runs, but since the disk is full the error log only says
there is not enough space to run compaction.



Re: High disk usage Cassandra 3.11.7

Posted by Bowen Song <bo...@bso.ng>.
If major compaction is failing due to disk space constraints, you could
copy the files to another server and run a major compaction there instead
(i.e. start Cassandra on the new server without joining the existing
cluster). If you must replace the node, at least use the
'-Dcassandra.replace_address=...' parameter instead of 'nodetool
decommission' followed by re-adding, because the latter changes the token
ranges on the node, and that makes troubleshooting harder.

22GB of data amplifying to nearly 300GB sounds highly unlikely to me;
there must be something else going on. Have you turned off auto
compaction? Did you change the default parameters (namely 'fanout_size')
for LCS? If this doesn't give you a clue, have a look at the SSTable data
files: do you notice anything unusual, for example too many small files,
or some extraordinarily large ones? Also have a look at the logs; is
there anything unusual? And do you know the application logic? Does it do
a lot of deletes or updates (including 'upserts')? Writes with TTL? Does
the table have a default TTL?
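A minimal sketch of the replace-address route (the IP address is a placeholder, and a throwaway file stands in for /etc/cassandra/cassandra-env.sh so nothing on the system is touched):

```shell
# On the replacement host, before the first start, add the replace flag to
# cassandra-env.sh so the node bootstraps into the dead node's token ranges.
ENV_FILE=$(mktemp)            # stand-in for /etc/cassandra/cassandra-env.sh
DEAD_NODE_IP=10.0.0.12        # placeholder for the node being replaced
echo "JVM_OPTS=\"\$JVM_OPTS -Dcassandra.replace_address_first_boot=${DEAD_NODE_IP}\"" >> "$ENV_FILE"
cat "$ENV_FILE"
# Start Cassandra; once bootstrap completes, remove the flag again so a
# later restart doesn't attempt another replacement.
```

Unlike decommission followed by re-adding, the replacement node inherits the dead node's token ranges, so before/after disk usage stays directly comparable.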


Re: High disk usage Cassandra 3.11.7

Posted by Abdul Patel <ab...@gmail.com>.
Close to 300GB of data. nodetool decommission/removenode and adding back
one node brought it down to 22GB.
Can't run a major compaction as there isn't much space left.


Re: High disk usage Cassandra 3.11.7

Posted by Bowen Song <bo...@bso.ng>.
Okay, so how big exactly is the data on disk? You said removing and
adding a new node gives you 20GB on disk; was that done via the
'-Dcassandra.replace_address=...' parameter? If not, the new node will
almost certainly have a different token range and will not be directly
comparable to the existing node if you have uneven partitions or a small
number of partitions in the table. Also, try a major compaction; it's a
lot easier than replacing a node.



Re: High disk usage Cassandra 3.11.7

Posted by Abdul Patel <ab...@gmail.com>.
Yes, I checked and cleared all snapshots, and I also had incremental
backups in the backup folder, which I removed. It's purely data.



Re: High disk usage Cassandra 3.11.7

Posted by Bowen Song <bo...@bso.ng>.
Assuming your total disk space is a lot bigger than 50GB (accounting for
disk space amplification, commit log, logs, OS data, etc.), I would
suspect the disk space is being used by something else. Have you checked
that the space is actually being used by the Cassandra data directory?
If so, have a look at the 'nodetool listsnapshots' command output as well.

