Posted to user@cassandra.apache.org by Ravikumar Govindarajan <ra...@gmail.com> on 2012/12/04 11:59:45 UTC

Help on MMap of SSTables

Our current SSTable sizes are far greater than RAM (150 GB of data, 32 GB
RAM). Currently we run with the mlockall and mmap_index_only options and don't
experience swapping at all.

We use wide rows and size-tiered-compaction, so a given key will definitely
be spread across multiple sstables. Will MMapping data files be detrimental
for reads, in this case?

In general, when should we opt for MMap data files and what are the factors
that need special attention when enabling the same?

--
Ravi

Re: Help on MMap of SSTables

Posted by Edward Capriolo <ed...@gmail.com>.
This issue has to be looked at from both a micro and a macro level. On the
micro level the "best" way is workload specific. On the macro level this
mostly boils down to data and memory size.

Compactions are going to churn cache; this is unavoidable. IMHO solid state
makes the micro optimization meaningless in the big picture. Not that we
should not consider tweaking flags, but it is hard to believe anything like
that is a game changer.


Re: Help on MMap of SSTables

Posted by Rob Coli <rc...@palominodb.com>.
On Thu, Dec 6, 2012 at 7:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
> So for memory mapped files, compaction can do a madvise SEQUENTIAL instead
> of current DONTNEED flag after detecting appropriate OS versions. Will this
> help?
>
>
> AFAIK Compaction does use memory mapped file access.

The history :

https://issues.apache.org/jira/browse/CASSANDRA-1470

=Rob

-- 
=Robert Coli
AIM&GTALK - rcoli@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb

Re: Help on MMap of SSTables

Posted by aaron morton <aa...@thelastpickle.com>.
> So for memory mapped files, compaction can do a madvise SEQUENTIAL instead of current DONTNEED flag after detecting appropriate OS versions. Will this help?

AFAIK Compaction does use memory mapped file access. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com


Re: Help on MMap of SSTables

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Thanks Aaron,

After going through
https://issues.apache.org/jira/browse/CASSANDRA-1470
I found the implementation in the CLibrary.trySkipCache() method, which uses
the fadvise DONTNEED flag.

I also came across the link mentioned in the JIRA:
http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html?showComment=1303235497682#c2572106601600642254

It says that kernel versions 2.6.29 and above implement madvise SEQUENTIAL in
a better manner.

So for memory-mapped files, could compaction use madvise SEQUENTIAL instead
of the current DONTNEED flag after detecting an appropriate OS version? Would
this help?
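
To make the question concrete, here is a minimal Python sketch of what a
madvise SEQUENTIAL hint on a memory-mapped file looks like. It is not
Cassandra code; the function name scan_sequentially and the demo file are
hypothetical, and the hint is skipped where the platform does not expose it.
Whether the hint actually helps compaction is the open question above; the
sketch only shows the call.

```python
import mmap
import os
import tempfile


def scan_sequentially(path):
    """Map a file, hint sequential access, and touch one byte per page."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # madvise() exists on Python 3.8+ where the OS supports it;
        # the hint is advisory, so skipping it does not change results.
        if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
            m.madvise(mmap.MADV_SEQUENTIAL)
        touched = 0
        for offset in range(0, len(m), mmap.PAGESIZE):
            touched += m[offset]  # reading a byte faults the page in
        return touched


# Hypothetical demo file: four pages filled with the byte 0x01.
with tempfile.NamedTemporaryFile(delete=False) as seq_tmp:
    seq_tmp.write(b"\x01" * (4 * mmap.PAGESIZE))
    seq_path = seq_tmp.name

seq_result = scan_sequentially(seq_path)
os.unlink(seq_path)
```

The hint changes how the kernel reads ahead and reclaims pages, not what the
program reads, so the only way to judge it is to measure cache behaviour
under a real workload.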

--
Ravi


Re: Help on MMap of SSTables

Posted by aaron morton <aa...@thelastpickle.com>.
Background http://en.wikipedia.org/wiki/Memory-mapped_file

> Is it going to load only relevant pages per SSTable on read or is it going to load an entire SSTable on first access?
It will load what is requested, and maybe some additional data taking into account the amount of memory available for caches. 

> Say suppose compaction kicks in. Will it then evict hot MMapped pages for read and substitute it with a lot of pages involving full SSTables?

Some file access in Cassandra, such as compaction, hints to the OS that the reads should not be cached. Technically it uses posix_fadvise, if you want to look it up.
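
As a rough illustration of that kind of hint, here is a minimal Python sketch that reads a file and then tells the OS the pages need not stay cached. It is not the Cassandra implementation; the function name read_without_caching and the demo file are hypothetical, and the hint is skipped on platforms without posix_fadvise.

```python
import os
import tempfile


def read_without_caching(path, chunk_size=64 * 1024):
    """Read a file front to back, then advise the OS to drop its pages."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            total += len(buf)
        if hasattr(os, "posix_fadvise"):
            # offset 0 with length 0 covers the whole file; the call is
            # advisory, so the kernel is free to ignore it.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total


# Hypothetical demo file of 256 KiB.
with tempfile.NamedTemporaryFile(delete=False) as fadv_tmp:
    fadv_tmp.write(b"x" * (256 * 1024))
    fadv_path = fadv_tmp.name

fadv_total = read_without_caching(fadv_path)
os.unlink(fadv_path)
```

The point of the hint is that a large one-off scan should not push hot read pages out of the page cache.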

Cheers


-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com


Re: Help on MMap of SSTables

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Thanks Aaron,

I am not quite clear on how MMap loads SSTables, other than the fact that it
kicks in only on first access.

Is it going to load only the relevant pages per SSTable on read, or is it
going to load an entire SSTable on first access?

Suppose compaction kicks in. Will it then evict hot MMapped pages used for
reads and replace them with a lot of pages from full SSTables?

--
Ravi

Re: Help on MMap of SSTables

Posted by aaron morton <aa...@thelastpickle.com>.
> Will MMapping data files be detrimental for reads, in this case?
No. 

> In general, when should we opt for MMap data files and what are the factors that need special attention when enabling the same?
mmapping is the default, so I would say use it until you have a reason not to. 

mmapping will map the entire file, but pages of data are read into memory on demand and purged when space is needed. 
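
A minimal Python sketch of that behaviour, for illustration only (this is not Cassandra code; read_slice and the demo file are hypothetical names). The whole file is mapped, but only the pages backing the requested slice need to be brought into memory.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Map the entire file, but touch only the requested byte range."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mapping alone reads no data.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies the bytes out and faults in only the pages
            # that back this range.
            return m[offset:offset + length]


# Hypothetical demo file: three pages filled with 'a', 'b' and 'c'.
with tempfile.NamedTemporaryFile(delete=False) as map_tmp:
    for fill in (b"a", b"b", b"c"):
        map_tmp.write(fill * mmap.PAGESIZE)
    map_path = map_tmp.name

map_chunk = read_slice(map_path, mmap.PAGESIZE, 4)
os.unlink(map_path)
```

Here map_chunk holds four bytes from the second page; the first and third pages are mapped but never read by this code.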

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com
