Posted to user@cassandra.apache.org by Abdul Haq Shaik <ab...@gmail.com> on 2011/07/21 09:42:46 UTC

Memtables stored in which location

Hi,

Can you please let me know where exactly the memtables are stored? I would
like to know their physical location.

Re: Memtables stored in which location

Posted by CASSANDRA learner <ca...@gmail.com>.
Thanks, Aaron and samal, for your quick responses. This is going to be helpful.

On Thu, Jul 21, 2011 at 4:15 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Try the project wiki here
> http://wiki.apache.org/cassandra/ArchitectureOverview or the my own blog
> here
> http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
>
> There is also a list of articles on the wiki here
> http://wiki.apache.org/cassandra/ArticlesAndPresentations
>
> in short, writes got to the commit log first, then the memtable in memory,
> which is later flushed to disk. A read is from potentially multiple sstables
> and memtables.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21 Jul 2011, at 21:17, CASSANDRA learner wrote:
>
> Hi,
>
> You r right but i too have some concerns...
>
> Any ways , some where memtable has to be stored right, like we say memtable
> data is flushed to create sstable on disk.
> Exactly from which location or memory it will be getting from. is it like
> an objects streams or like it is storing the values in commitlog.
> my next question is , data is written to commit log. all the data is
> available here, and the sstable are getting created on disk, then where and
> when these memtables are coming into picture
>
> On Thu, Jul 21, 2011 at 1:44 PM, samal <sa...@wakya.in> wrote:
>
>> SSTable is stored on disk not memtable.
>>
>> Memtable is memory representation of data, which is on flush to create
>> SSTable on disk.
>>
>> This is the location where SSTable is stored
>> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L71
>>
>>
>> Where as Commitlog which is back up (log) for memtable replaying store in
>> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L75
>> location.
>>
>> Once the all memtable is flushed to disk, new commit log segment is
>> created.
>>
>> On Thu, Jul 21, 2011 at 1:12 PM, Abdul Haq Shaik <
>> abdulsk.cassandra@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Can you please let me know where exactly the memtables are getting
>>> stored. I wanted to know the physical location
>>>
>>
>>
>
>

Re: Memtables stored in which location

Posted by aaron morton <aa...@thelastpickle.com>.
Try the project wiki here http://wiki.apache.org/cassandra/ArchitectureOverview or my own blog here
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/

There is also a list of articles on the wiki here http://wiki.apache.org/cassandra/ArticlesAndPresentations

In short, writes go to the commit log first, then to the memtable in memory, which is later flushed to disk as an SSTable. A read potentially draws on multiple SSTables and memtables.
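
To make that ordering concrete, here is a minimal toy sketch in Python. It is not Cassandra's code: the class name ToyNode, the flush_threshold parameter, and the use of lists and dicts in place of files are all assumptions made for illustration. Real reads also reconcile values by timestamp, which this sketch replaces with a simple "newest first" lookup.

```python
class ToyNode:
    """Toy model of the write and read path; not Cassandra's real code."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stands in for the append-only file on disk
        self.memtable = {}     # lives in process memory only
        self.sstables = []     # each flush adds one read-only "file"
        self.flush_threshold = flush_threshold  # hypothetical threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. commit log first
        self.memtable[key] = value            # 2. then the memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. the memtable is written out as a sorted, read-only SSTable
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        # Check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None


node = ToyNode(flush_threshold=2)
node.write("a", 1)
node.write("b", 2)  # reaches the threshold, so the memtable is flushed
node.write("a", 3)  # the newer value for "a" now sits in the memtable
```

After these three writes, the value of "b" is served from the flushed SSTable while the latest value of "a" is served from the memtable, which is why a read may have to consult both.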

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 21 Jul 2011, at 21:17, CASSANDRA learner wrote:

> Hi,
> 
> You r right but i too have some concerns...
> 
> Any ways , some where memtable has to be stored right, like we say memtable data is flushed to create sstable on disk.
> Exactly from which location or memory it will be getting from. is it like an objects streams or like it is storing the values in commitlog.
> my next question is , data is written to commit log. all the data is available here, and the sstable are getting created on disk, then where and when these memtables are coming into picture
> 
> On Thu, Jul 21, 2011 at 1:44 PM, samal <sa...@wakya.in> wrote:
> SSTable is stored on disk not memtable.
> 
> Memtable is memory representation of data, which is on flush to create SSTable on disk.
> 
> This is the location where SSTable is stored
> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L71
> 
> 
> Where as Commitlog which is back up (log) for memtable replaying store in
> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L75
> location.
> 
> Once the all memtable is flushed to disk, new commit log segment is created.
> 
> On Thu, Jul 21, 2011 at 1:12 PM, Abdul Haq Shaik <ab...@gmail.com> wrote:
> Hi,
> 
> Can you please let me know where exactly the memtables are getting stored. I wanted to know the physical location
> 
> 


Re: Memtables stored in which location

Posted by aaron morton <aa...@thelastpickle.com>.
In the order you listed them: the data file with rows and columns, the bloom filter for the rows in the data file, the index for the rows in the data file, and the statistics.
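
As a rough illustration, the file names from the question can be split into their parts like this. The layout (column family, a format tag such as "f", a generation number, then the component) is inferred only from the example names in this thread and may differ between versions; describe_sstable_file and ROLES are hypothetical names.

```python
# Component descriptions as given in the reply above.
ROLES = {
    "Data": "data file with rows and columns",
    "Filter": "bloom filter for the rows in the data file",
    "Index": "index for the rows in the data file",
}


def describe_sstable_file(filename):
    """Split a name like 'CFname-f-1-Data.db' into its parts (assumed layout)."""
    stem = filename[:-len(".db")] if filename.endswith(".db") else filename
    # Split from the right so a column family name containing '-' survives.
    column_family, format_tag, generation, component = stem.rsplit("-", 3)
    if component.startswith("Statistic"):
        role = "statistics"
    else:
        role = ROLES.get(component, "unknown")
    return {
        "column_family": column_family,
        "format_tag": format_tag,
        "generation": int(generation),
        "component": component,
        "role": role,
    }


info = describe_sstable_file("CFname-f-1-Filter.db")
```

All files that share the same column family and generation number belong to the same SSTable.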

Cheers

 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 21 Jul 2011, at 23:26, Nilabja Banerjee wrote:

> One more thing I want to ask here ...in the data folder of cassandra, for each columnfamily four type of .db files are generated. for example:  CFname-f-1-Data.db, CFname-f-1-Filter.db, CFname-f-1-Index.db, CFname-f-1-Statistic.db, 
> 
> What are these extensions are? 
> 
> Thank you
> 
> 
> 
> On 21 July 2011 16:11, samal <sa...@wakya.in> wrote:
> 
> Any ways , some where memtable has to be stored right, like we say memtable data is flushed to create sstable on disk.
> Exactly from which location or memory it will be getting from. is it like an objects streams or like it is storing the values in commitlog.
> 
> A Memtable is Cassandra's in-memory representation of key/value pairs.
>  
> my next question is , data is written to commit log. all the data is available here, and the sstable are getting created on disk, then where and when these memtables are coming into picture
> 
> Commitlog is append only file which record write sequentially, more[2], can be thought as check sum file, which to used to recalculate data for memtables in case of crash.
> A write first hits the CommitLog, then Cassandra stores/writes values to in-memory data structures called Memtables. The Memtables are flushed to disk whenever one of the configurable thresholds is met.[3] 
> For each column family there is corresponding memtable.
> There is generally one commitlog file for all CF.
> 
> SSTables are immutable once written to disk cannot be modified. It will only be replaced by new SSTable after compaction
> 
> 
> [1]http://wiki.apache.org/cassandra/ArchitectureOverview
> [2]http://wiki.apache.org/cassandra/ArchitectureCommitLog
> [3]http://wiki.apache.org/cassandra/MemtableThresholds
> 
> 


Re: Memtables stored in which location

Posted by Nilabja Banerjee <ni...@gmail.com>.
One more thing I want to ask here: in the Cassandra data folder, four types
of .db files are generated for each column family, for example:
CFname-f-1-Data.db, CFname-f-1-Filter.db, CFname-f-1-Index.db,
CFname-f-1-Statistic.db.

What do these file name suffixes mean?

Thank you



On 21 July 2011 16:11, samal <sa...@wakya.in> wrote:

>
> Any ways , some where memtable has to be stored right, like we say memtable
> data is flushed to create sstable on disk.
>
>> Exactly from which location or memory it will be getting from. is it like
>> an objects streams or like it is storing the values in commitlog.
>>
>
> A Memtable is Cassandra's in-memory representation of key/value pairs.
>
>
>> my next question is , data is written to commit log. all the data is
>> available here, and the sstable are getting created on disk, then where and
>> when these memtables are coming into picture
>
>
> Commitlog is append only file which record write sequentially, more[2], can
> be thought as check sum file, which to used to recalculate data for
> memtables in case of crash.
> A write first hits the *CommitLog*, then Cassandra stores/writes values to
> in-memory data structures called Memtables. The Memtables are flushed to
> disk whenever one of the configurable thresholds is met.[3] <http://wiki.apache.org/cassandra/MemtableThresholds>
> For each column family there is corresponding memtable.
> There is generally one commitlog file for all CF.
>
> SSTables are immutable once written to disk cannot be modified. It will
> only be replaced by new SSTable after compaction
>
>
> [1]http://wiki.apache.org/cassandra/ArchitectureOverview
> [2]http://wiki.apache.org/cassandra/ArchitectureCommitLog
> [3]http://wiki.apache.org/cassandra/MemtableThresholds
>
>

Re: Memtables stored in which location

Posted by CASSANDRA learner <ca...@gmail.com>.
Thanks, samal. I understand it now.

On Thu, Jul 21, 2011 at 4:11 PM, samal <sa...@wakya.in> wrote:

>
> Any ways , some where memtable has to be stored right, like we say memtable
> data is flushed to create sstable on disk.
>
>> Exactly from which location or memory it will be getting from. is it like
>> an objects streams or like it is storing the values in commitlog.
>>
>
> A Memtable is Cassandra's in-memory representation of key/value pairs.
>
>
>> my next question is , data is written to commit log. all the data is
>> available here, and the sstable are getting created on disk, then where and
>> when these memtables are coming into picture
>
>
> Commitlog is append only file which record write sequentially, more[2], can
> be thought as check sum file, which to used to recalculate data for
> memtables in case of crash.
> A write first hits the *CommitLog*, then Cassandra stores/writes values to
> in-memory data structures called Memtables. The Memtables are flushed to
> disk whenever one of the configurable thresholds is met.[3] <http://wiki.apache.org/cassandra/MemtableThresholds>
> For each column family there is corresponding memtable.
> There is generally one commitlog file for all CF.
>
> SSTables are immutable once written to disk cannot be modified. It will
> only be replaced by new SSTable after compaction
>
>
> [1]http://wiki.apache.org/cassandra/ArchitectureOverview
> [2]http://wiki.apache.org/cassandra/ArchitectureCommitLog
> [3]http://wiki.apache.org/cassandra/MemtableThresholds
>
>

Re: Memtables stored in which location

Posted by samal <sa...@wakya.in>.
> Any ways , some where memtable has to be stored right, like we say memtable
> data is flushed to create sstable on disk.

> Exactly from which location or memory it will be getting from. is it like
> an objects streams or like it is storing the values in commitlog.
>

A memtable is Cassandra's in-memory representation of key/value pairs. It
lives only in the memory of the Cassandra process, so it has no location on
disk.


> my next question is , data is written to commit log. all the data is
> available here, and the sstable are getting created on disk, then where and
> when these memtables are coming into picture


The commit log is an append-only file that records writes sequentially (more
in [2]). It can be thought of as a recovery log, which is replayed to rebuild
the memtables after a crash.
A write first hits the commit log, then Cassandra writes the values to
in-memory data structures called memtables. The memtables are flushed to
disk whenever one of the configurable thresholds is met [3].
For each column family there is a corresponding memtable.
There is generally one commit log shared by all column families.
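
A small sketch of that recovery idea, assuming a simplified log in which every entry is a (column family, key, value) tuple. The replay function and the sample data are hypothetical and only show the principle: one shared log, one memtable per column family.

```python
from collections import defaultdict


def replay(commit_log):
    """Rebuild per-column-family memtables from one shared log of writes."""
    memtables = defaultdict(dict)
    for column_family, key, value in commit_log:
        # Entries are applied in log order, so a later write overwrites
        # an earlier one for the same key.
        memtables[column_family][key] = value
    return dict(memtables)


# One shared log holding writes for two column families.
commit_log = [
    ("Users", "alice", "v1"),
    ("Events", "e1", "login"),
    ("Users", "alice", "v2"),
]
recovered = replay(commit_log)
```

Because the log is applied in order, the rebuilt memtables hold the same contents they had before the crash.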

SSTables are immutable: once written to disk they cannot be modified. They
are only replaced by new SSTables after compaction.
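
To illustrate, a toy compaction that never edits its inputs and instead produces one new SSTable. This is a sketch under simplifying assumptions: the compact function is hypothetical, and "the later SSTable wins" stands in for the timestamp comparison and tombstone handling that real compaction performs.

```python
def compact(sstables):
    """Merge SSTables (given oldest first) into one new, sorted SSTable."""
    merged = {}
    for sstable in sstables:
        for key, value in sstable:
            merged[key] = value  # a value from a newer SSTable replaces older
    return tuple(sorted(merged.items()))


# Tuples model immutability: the inputs cannot be changed in place.
old_1 = (("a", 1), ("b", 2))
old_2 = (("b", 20), ("c", 3))
new_sstable = compact([old_1, old_2])
```

The old SSTables stay untouched until the new one is complete, after which they can be dropped.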


[1]http://wiki.apache.org/cassandra/ArchitectureOverview
[2]http://wiki.apache.org/cassandra/ArchitectureCommitLog
[3]http://wiki.apache.org/cassandra/MemtableThresholds

Re: Memtables stored in which location

Posted by CASSANDRA learner <ca...@gmail.com>.
Hi,

You are right, but I still have some concerns.

The memtable has to be stored somewhere, right? We say that memtable data is
flushed to create an SSTable on disk.
From exactly which location or memory is it read? Is it like an object
stream, or are the values stored in the commit log?
My next question: data is written to the commit log, so all the data is
available there, and the SSTables are created on disk. Then where and when do
the memtables come into the picture?

On Thu, Jul 21, 2011 at 1:44 PM, samal <sa...@wakya.in> wrote:

> SSTable is stored on disk not memtable.
>
> Memtable is memory representation of data, which is on flush to create
> SSTable on disk.
>
> This is the location where SSTable is stored
> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L71
>
>
> Where as Commitlog which is back up (log) for memtable replaying store in
> https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L75
> location.
>
> Once the all memtable is flushed to disk, new commit log segment is
> created.
>
> On Thu, Jul 21, 2011 at 1:12 PM, Abdul Haq Shaik <
> abdulsk.cassandra@gmail.com> wrote:
>
>> Hi,
>>
>> Can you please let me know where exactly the memtables are getting stored.
>> I wanted to know the physical location
>>
>
>

Re: Memtables stored in which location

Posted by samal <sa...@wakya.in>.
SSTables are stored on disk; memtables are not.

A memtable is the in-memory representation of the data, which is flushed to
create an SSTable on disk.

This is the setting for the location where SSTables are stored:
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L71

The commit log, which is a backup (log) used to replay writes into the
memtables, is stored in the location set at
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L75

Once all memtables are flushed to disk, a new commit log segment is created.
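
For reference, a hedged sketch of reading those two locations from a cassandra.yaml fragment. It assumes the linked lines are the data_file_directories and commitlog_directory settings (trunk has moved since the links were posted), and the paths in SAMPLE_YAML are illustrative only. The hand-written parser is deliberately minimal so the example needs nothing outside the standard library; a real YAML parser is the better choice for an actual config file.

```python
# Illustrative fragment; the paths are examples, not taken from this thread.
SAMPLE_YAML = """\
# directories where Cassandra should store data on disk
data_file_directories:
    - /var/lib/cassandra/data
# commit log
commitlog_directory: /var/lib/cassandra/commitlog
"""


def storage_locations(yaml_text):
    """Return (data directories, commit log directory) from a YAML fragment."""
    data_dirs, commitlog_dir, in_data_list = [], None, False
    for raw in yaml_text.splitlines():
        stripped = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not stripped:
            continue
        if stripped.startswith("data_file_directories:"):
            in_data_list = True
        elif in_data_list and stripped.startswith("- "):
            data_dirs.append(stripped[2:].strip())
        else:
            in_data_list = False
            if stripped.startswith("commitlog_directory:"):
                commitlog_dir = stripped.split(":", 1)[1].strip()
    return data_dirs, commitlog_dir


data_dirs, commitlog_dir = storage_locations(SAMPLE_YAML)
```

Neither setting names a place for memtables, which matches the point above: they exist only in memory.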

On Thu, Jul 21, 2011 at 1:12 PM, Abdul Haq Shaik <
abdulsk.cassandra@gmail.com> wrote:

> Hi,
>
> Can you please let me know where exactly the memtables are getting stored.
> I wanted to know the physical location
>