Posted to user@cassandra.apache.org by Hari Rajendhran <ha...@tcs.com> on 2014/04/07 12:56:37 UTC

Cassandra Disk storage capacity

Hi Team,

We have a 3-node Apache Cassandra 2.0.4 cluster installed in our lab. We have set the data directory to /var/lib/cassandra/data. What is the maximum
disk storage that will be used for Cassandra data storage?

Note : /var partition has a storage capacity of 40GB.

My question is whether Cassandra will use the entire / directory for data storage?
If not, how do we specify multiple directories for data storage?

 


 
Best Regards
Hari Krishnan Rajendhran
Hadoop Admin
DESS-ABIM ,Chennai BIGDATA Galaxy
Tata Consultancy Services
Cell:- 9677985515
Mailto: hari.rajendhran@tcs.com
Website: http://www.tcs.com
____________________________________________
Experience certainty.	IT Services
Business Solutions
Consulting
____________________________________________
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you



Re: Cassandra Disk storage capacity

Posted by Prem Yadav <ip...@gmail.com>.
You can specify multiple data directories in cassandra.yaml.
For example:
data_file_directories:
  - /var/lib/cass1
  - /var/lib/cass2
  - /<some_other_mountpoint>



Re: Cassandra Disk storage capacity

Posted by Bèrto ëd Sèra <be...@gmail.com>.
I guess there is a misunderstanding here:
> I am confused why cassandra uses the entire disk space ( / Directory) even
> when we specify /var/lib/cassandra/data as the directory in Cassandra.yaml
> file

C* will use the entire MOUNTPOINT, which is not necessarily your entire
total disk space. If you have a separate mountpoint for /var/lib/cassandra/data,
it is my understanding that C* will fill that.
You still have stuff located outside of it, though, such as the logs, to begin
with.
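
One way to see which filesystem a data directory actually lives on is to walk up from the directory until a mountpoint is reached. The following is a minimal Python sketch using only the standard library; the function name find_mountpoint is made up for illustration, and the path is the one from this thread.

```python
import os
import shutil


def find_mountpoint(path):
    """Walk up from path until a directory that is a mountpoint is found."""
    current = os.path.realpath(path)
    # The filesystem root is always a mountpoint, so the loop terminates.
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


if __name__ == "__main__":
    data_dir = "/var/lib/cassandra/data"  # path taken from the thread
    mount = find_mountpoint(data_dir)
    usage = shutil.disk_usage(mount)
    print(f"{data_dir} lives on mountpoint {mount}")
    print(f"total={usage.total} used={usage.used} free={usage.free} (bytes)")
```

If the reported mountpoint is / rather than /var or /var/lib/cassandra/data, the data directory shares its space with everything else on the root filesystem.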

Cheers
Bèrto





-- 
==============================
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.

Re: Cassandra Disk storage capacity

Posted by Jan Kesten <j....@enercast.de>.
On 07.04.2014 13:24, Hari Rajendhran wrote:
> 1) I am confused why cassandra uses the entire disk space ( / 
> Directory) even when we specify /var/lib/cassandra/data as the 
> directory in Cassandra.yaml file
> 2) Is it only during compaction ,cassandra will use the entire Disk 
> space ?
> 3) What is the best way to monitor the cassandra Disk usage ?? is 
> there a opensource monitoring tool for this ??

Hi,

If your / and /var/lib/cassandra/data are on different disks (or
partitions), only /var/lib/cassandra/data will get filled entirely. Often
this is not the case by default and you will have to create these
mountpoints yourself. Also keep the commitlogs on a
separate disk from the data to improve performance.

The extra space is only needed during compaction, but Cassandra will
start compactions by itself, so you must keep this free space
available all the time. This applies to SizeTieredCompaction; Leveled
or Hybrid compaction is "cheaper" on disk space.

For the last point - there are many tools to monitor your servers inside 
your cluster. Nagios, Hyperic HQ and OpenNMS are some of them - you can 
define alerts which keep you up to date.
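
As a rough illustration of the kind of check such a tool would run, here is a minimal Python sketch that reports how full the filesystem behind a path is. It is not a replacement for the tools named above; the function names and the 0.5 default threshold (taken from the 50% free-space guidance for size-tiered compaction in this thread) are assumptions for the example.

```python
import shutil


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero; treat them as empty.
        return 0.0
    return usage.used / usage.total


def check_disk(path, warn_at=0.5):
    """Return a (status, fraction) pair; status is 'OK' or 'WARNING'."""
    fraction = usage_fraction(path)
    status = "WARNING" if fraction >= warn_at else "OK"
    return status, fraction


if __name__ == "__main__":
    # "." is only a placeholder; point this at the Cassandra data directory.
    status, fraction = check_disk(".")
    print(f"{status}: {fraction:.0%} of the filesystem is used")
```

Run periodically (for example from cron), a check like this could feed whatever alerting is already in place.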

Cheers,
jan

Re: Cassandra Disk storage capacity

Posted by Hari Rajendhran <ha...@tcs.com>.
Hi,

Thanks for the update 

I still have a few questions that need to be clarified:

1) I am confused about why Cassandra uses the entire disk space (the / directory) even when we specify /var/lib/cassandra/data as the directory in the cassandra.yaml file.
2) Is it only during compaction that Cassandra will use the entire disk space?
3) What is the best way to monitor Cassandra disk usage? Is there an open-source monitoring tool for this?




Re: Cassandra Disk storage capacity

Posted by Jan Kesten <j....@enercast.de>.
Hi Hari,

C* will use your entire space - that is something one should monitor.
Depending on your choice of compaction strategy, your data_dir should not
be filled up entirely - in the worst case compaction will need as much space
as the sstables on disk, therefore 50% should be free space.
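
Applied to the 40GB /var partition from the original question, the 50% rule works out as follows. This is only a back-of-the-envelope Python sketch; the function name is hypothetical, and the result is an upper bound, since anything else stored on the same partition also counts against it.

```python
def usable_capacity_gb(partition_gb, free_fraction=0.5):
    """Space left for live sstables if free_fraction must stay free."""
    if not 0.0 <= free_fraction < 1.0:
        raise ValueError("free_fraction must be in [0, 1)")
    return partition_gb * (1.0 - free_fraction)


# 40GB partition, half of it reserved as compaction headroom.
print(usable_capacity_gb(40))  # 20.0
```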

The parameters used for on-disk storage are commitlog_directory,
data_file_directories and saved_caches_directory. The parameter
data_file_directories is plural; you can easily put more than one
directory here (and you should do this instead of using RAID).
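
Note that extra directories only add capacity when they sit on different disks or partitions; several directories on the same filesystem share one pool of free space. A rough Python sketch of that check, grouping directories by the device ID reported by os.stat (the function names are made up for illustration, and the device ID is only a heuristic on setups such as bind mounts):

```python
import os
import shutil


def free_bytes_by_device(directories):
    """Map each distinct filesystem (st_dev) to its free bytes.

    Directories on the same filesystem are counted only once.
    """
    free = {}
    for directory in directories:
        device = os.stat(directory).st_dev
        if device not in free:
            free[device] = shutil.disk_usage(directory).free
    return free


def total_free_bytes(directories):
    """Sum the free space over the distinct filesystems involved."""
    return sum(free_bytes_by_device(directories).values())
```

Calling total_free_bytes with the entries listed under data_file_directories gives a quick estimate of how much room the node really has.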

Cheers,
Jan


-- 
Jan Kesten, mailto:j.kesten@enercast.de
Tel.: +49 561/4739664-0 FAX: -9
enercast GmbH Friedrich-Ebert-Str. 104 D-34119 Kassel       HRB15471
http://www.enercast.de Online forecasts for renewable energy
Management: Dipl. Ing. Thomas Landgraf, Bernd Kratz

This e-mail and any attachment may contain confidential and/or privileged information. If you are not the named addressee or if this transmission has been addressed to you in error, please notify us immediately by reply e-mail and then delete this e-mail and any attachment from your system. Please understand that you must not copy this e-mail or any attachment or disclose the contents to any other person. Thank you for your cooperation.


RE: Cassandra Disk storage capacity

Posted by Romain HARDOUIN <ro...@urssaf.fr>.
Hi,

See data_file_directories and commitlog_directory in the settings file 
cassandra.yaml.

Cheers,

Romain
