Posted to user@cassandra.apache.org by Parag Patel <pp...@clearpoolgroup.com> on 2014/04/09 12:06:52 UTC
Commitlog questions
1) Why is the default 4GB? Has anyone changed this? What are some aspects to consider when determining the commitlog size?
2) If the commitlog is in periodic mode, there is a property to set a time interval to flush the incoming mutations to disk. This implies that there is a queue inside Cassandra to hold this data in memory until it is flushed.
a. Is there a name for this queue?
b. Is there a limit for this queue?
c. Are there any tuning parameters for this queue?
Thanks,
Parag
Re: Commitlog questions
Posted by Russell Hatch <rh...@datastax.com>.
>
> If the commitlog is in periodic mode and the fsync happens every 10
> seconds, Cassandra is storing the stuff that needs to be sync'd somewhere
> for a period of 10 seconds. I'm talking about before it even hits any
> disk. This has to be in memory, correct?
The information you are referring to is stored in the OS page cache[1] so
it's not part of Cassandra's memory, though I imagine Cassandra will keep a
small handle of some kind on the mutation for making the system fsync[2]
call when appropriate.
[1] http://en.wikipedia.org/wiki/Page_cache
[2] http://linux.die.net/man/2/fsync
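To make the page cache point concrete, here is a minimal sketch of the generic write-then-fsync pattern. It is not Cassandra's code; the helper name append_record and the file name segment.log are made up for illustration. A plain write returns once the bytes are in the OS page cache, and only the fsync call asks the kernel to push them to the device.

```python
import os
import tempfile


def append_record(fd: int, payload: bytes, sync: bool) -> int:
    """Append payload to fd; optionally force it to stable storage."""
    written = os.write(fd, payload)  # bytes now sit in the OS page cache
    if sync:
        os.fsync(fd)                 # ask the kernel to flush them to the device
    return written


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "segment.log")  # hypothetical file name
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        n1 = append_record(fd, b"mutation-1\n", sync=False)  # readable, not yet forced to disk
        n2 = append_record(fd, b"mutation-2\n", sync=True)   # fsync covers both records
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        contents = f.read()
```

Both records can be read back right after the write calls, whether or not fsync ran; the difference only matters if the machine loses power before the kernel writes the dirty pages out.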
Thanks,
Russ
On Thu, Apr 10, 2014 at 1:11 PM, Parag Patel <pp...@clearpoolgroup.com> wrote:
> Oleg,
>
> Thanks for the response. If the commitlog is in periodic mode and the
> fsync happens every 10 seconds, Cassandra is storing the stuff that needs
> to be sync'd somewhere for a period of 10 seconds. I'm talking about
> before it even hits any disk. This has to be in memory, correct?
>
> Parag
RE: Commitlog questions
Posted by Parag Patel <pp...@clearpoolgroup.com>.
Oleg,
Thanks for the response. If the commitlog is in periodic mode and the fsync happens every 10 seconds, Cassandra is storing the stuff that needs to be sync'd somewhere for a period of 10 seconds. I'm talking about before it even hits any disk. This has to be in memory, correct?
Parag
-----Original Message-----
From: Oleg Dulin [mailto:oleg.dulin@gmail.com]
Sent: Wednesday, April 09, 2014 10:42 AM
To: user@cassandra.apache.org
Subject: Re: Commitlog questions
Parag:
To answer your questions:
1) The default is just that, a default. I wouldn't advise raising it, though: the bigger it is, the longer it takes to restart the node.
2) I think they just use fsync. There is no queue. All files in Cassandra use java.nio buffers, but they need to be fsynced periodically. Look at the commitlog_sync parameters in the cassandra.yaml file; the comments there explain how it works. I believe the difference between periodic and batch is just this: if it is periodic, it will fsync every 10 seconds; if it is batch, it will fsync if there were any changes within a time window.
On 2014-04-09 10:06:52 +0000, Parag Patel said:
>
> 1) Why is the default 4GB? Has anyone changed this? What are
> some aspects to consider when determining the commitlog size?
> 2) If the commitlog is in periodic mode, there is a property
> to set a time interval to flush the incoming mutations to disk.
> This implies that there is a queue inside Cassandra to hold this
> data in memory until it is flushed.
>    a. Is there a name for this queue?
>    b. Is there a limit for this queue?
>    c. Are there any tuning parameters for this queue?
>
> Thanks,
> Parag
--
Regards,
Oleg Dulin
http://www.olegdulin.com
Re: Commitlog questions
Posted by Oleg Dulin <ol...@gmail.com>.
Parag:
To answer your questions:
1) The default is just that, a default. I wouldn't advise raising it,
though: the bigger it is, the longer it takes to restart the node.
2) I think they just use fsync. There is no queue. All files in
Cassandra use java.nio buffers, but they need to be fsynced
periodically. Look at the commitlog_sync parameters in the cassandra.yaml
file; the comments there explain how it works. I believe the difference
between periodic and batch is just this: if it is periodic, it will
fsync every 10 seconds; if it is batch, it will fsync if there were any
changes within a time window.
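A rough way to picture the two modes is the toy model below. It does no real I/O and is not how Cassandra implements it. ToyCommitLog and its fields are hypothetical names. It assumes, following the comments in cassandra.yaml, that periodic mode acknowledges a write right away and syncs on a timer, while batch mode holds the acknowledgement until the sync has happened. Two simplifications: the periodic check piggybacks on append instead of running on a background timer, and batch mode syncs on every append rather than grouping writes within a window.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyCommitLog:
    """Toy model of commit log sync modes; no real I/O is performed."""
    mode: str                  # "periodic" or "batch"
    period_ms: int = 10_000    # stands in for the periodic sync interval
    last_sync_ms: int = 0
    unsynced: List[str] = field(default_factory=list)
    synced: List[str] = field(default_factory=list)

    def sync(self, now_ms: int) -> None:
        # Stand-in for fsync: everything pending becomes durable.
        self.synced.extend(self.unsynced)
        self.unsynced.clear()
        self.last_sync_ms = now_ms

    def append(self, mutation: str, now_ms: int) -> bool:
        """Record a mutation; return True if it was durable when acknowledged."""
        self.unsynced.append(mutation)
        if self.mode == "batch":
            # Batch: the acknowledgement waits for the sync.
            self.sync(now_ms)
        elif now_ms - self.last_sync_ms >= self.period_ms:
            # Periodic: sync only once the interval has elapsed.
            self.sync(now_ms)
        return mutation in self.synced


# Writes arrive at 0 s, 4 s, 8 s and 12 s.
periodic = ToyCommitLog(mode="periodic")
periodic_durable = [periodic.append(f"m{t}", now_ms=t * 4_000) for t in range(4)]

batch = ToyCommitLog(mode="batch")
batch_durable = [batch.append(f"m{t}", now_ms=t * 4_000) for t in range(4)]
```

In the periodic run the first three writes are acknowledged while still unsynced, which is the window in which a crash could lose them; in the batch run every write is durable by the time it is acknowledged.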
On 2014-04-09 10:06:52 +0000, Parag Patel said:
>
> 1) Why is the default 4GB? Has anyone changed this? What are some
> aspects to consider when determining the commitlog size?
> 2) If the commitlog is in periodic mode, there is a property to
> set a time interval to flush the incoming mutations to disk. This
> implies that there is a queue inside Cassandra to hold this data in
> memory until it is flushed.
>    a. Is there a name for this queue?
>    b. Is there a limit for this queue?
>    c. Are there any tuning parameters for this queue?
>
> Thanks,
> Parag
--
Regards,
Oleg Dulin
http://www.olegdulin.com
Re: Commitlog questions
Posted by Panagiotis Garefalakis <pa...@gmail.com>.
The incoming mutations are written per column in a Memtable (an in-memory
cache). The default size for this table is 64MB, if I recall correctly.
For more information take a look here:
https://wiki.apache.org/cassandra/MemtableSSTable
http://wiki.apache.org/cassandra/MemtableThresholds
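As a rough illustration of the idea only, a memtable can be pictured as an in-memory map of row key to columns that is written out in sorted order once it passes a size threshold. The class ToyMemtable, the byte accounting and the tiny threshold below are all made up for the sketch; Cassandra's real accounting and flush triggers are more involved.

```python
from typing import Dict, List, Tuple


class ToyMemtable:
    """Toy in-memory table that 'flushes' when a size threshold is reached."""

    def __init__(self, flush_threshold_bytes: int) -> None:
        self.flush_threshold_bytes = flush_threshold_bytes
        self.rows: Dict[str, Dict[str, str]] = {}  # row key -> {column: value}
        self.size_bytes = 0
        # Each flushed batch stands in for one SSTable on disk.
        self.flushed: List[List[Tuple[str, Dict[str, str]]]] = []

    def apply(self, row_key: str, column: str, value: str) -> None:
        self.rows.setdefault(row_key, {})[column] = value
        # Crude size estimate: overwrites are counted again.
        self.size_bytes += len(row_key) + len(column) + len(value)
        if self.size_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self) -> None:
        # Rows are written out sorted by row key, then memory is released.
        self.flushed.append(sorted(self.rows.items()))
        self.rows = {}
        self.size_bytes = 0


table = ToyMemtable(flush_threshold_bytes=20)  # tiny threshold for the demo
table.apply("k2", "c1", "bbbb")  # 8 bytes accounted
table.apply("k1", "c1", "aaaa")  # 16 bytes accounted
table.apply("k1", "c2", "cccc")  # 24 bytes accounted, crosses 20 and flushes
```

After the third write the in-memory map is empty again and the flushed batch holds k1 before k2, each with its columns.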
Regards,
Panagiotis
On Wed, Apr 9, 2014 at 8:44 PM, Robert Coli <rc...@eventbrite.com> wrote:
> On Wed, Apr 9, 2014 at 3:06 AM, Parag Patel <pp...@clearpoolgroup.com> wrote:
>
>> <some questions about the commitlog and related assumptions>
>>
>
> https://issues.apache.org/jira/browse/CASSANDRA-6764
>
> You might wish to get in contact with the reporter here, who has similar
> questions!
>
> =Rob
>
>
Re: Commitlog questions
Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Apr 9, 2014 at 3:06 AM, Parag Patel <pp...@clearpoolgroup.com> wrote:
> <some questions about the commitlog and related assumptions>
>
https://issues.apache.org/jira/browse/CASSANDRA-6764
You might wish to get in contact with the reporter here, who has similar
questions!
=Rob