Posted to commits@cassandra.apache.org by "Yang Yang (JIRA)" <ji...@apache.org> on 2011/02/09 08:07:57 UTC

[jira] Commented: (CASSANDRA-1214) Force linux to not swap the JVM

    [ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992350#comment-12992350 ] 

Yang Yang commented on CASSANDRA-1214:
--------------------------------------

Jonathan:

Why was MCL_CURRENT chosen? I thought you would want to use MCL_FUTURE (setting aside the discussion above that the two constants seem to have the same value).

With MCL_CURRENT, SSTables that you mmap() later can presumably still be paged out. Or am I misunderstanding it?

Thanks
Yang
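
For reference, here is a minimal sketch of the distinction the question is about. Per the mlockall(2) man page, MCL_CURRENT locks the pages mapped into the process at the time of the call, while MCL_FUTURE also locks pages that become mapped afterwards, which would include a file mmap()'d later. This is not the code from the attached patches (the attachment names suggest those go through JNA). The helper names lock_flags and try_mlockall are hypothetical, and the numeric constants are an assumption taken from Linux's generic headers, so check <sys/mman.h> on the target platform.

```python
import ctypes
import ctypes.util

# ASSUMPTION: values from Linux's generic <sys/mman.h> (e.g. x86, ARM).
# Some architectures define different values, so verify against the header.
MCL_CURRENT = 1  # lock pages that are mapped at the time of the call
MCL_FUTURE = 2   # also lock pages that get mapped after the call


def lock_flags(include_future: bool) -> int:
    """Build the flags argument for mlockall(2) (hypothetical helper)."""
    flags = MCL_CURRENT
    if include_future:
        flags |= MCL_FUTURE
    return flags


def try_mlockall(include_future: bool) -> int:
    """Call mlockall(2); return 0 on success, or errno on failure."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.mlockall(lock_flags(include_future)) != 0:
        return ctypes.get_errno()
    return 0
```

Calling try_mlockall(False) corresponds to the MCL_CURRENT-only choice and try_mlockall(True) to MCL_CURRENT | MCL_FUTURE. Either call can be expected to fail with ENOMEM or EPERM for an unprivileged process unless RLIMIT_MEMLOCK has been raised.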

> Force linux to not swap the JVM
> -------------------------------
>
>                 Key: CASSANDRA-1214
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: James Golick
>            Assignee: Jonathan Ellis
>             Fix For: 0.6.5, 0.7 beta 2
>
>         Attachments: 1214-v3.txt, 1214-v4.txt, Read Throughput with mmap.jpg, mlockall-jna.patch.txt, trunk-1214.txt
>
>
> The way mmap()'d IO is handled in Cassandra is dangerous. It allocates potentially massive buffers without any care for bounding the total size of the program's buffers. As the node's dataset grows, this *will* lead to swapping and instability.
> This is a dangerous and wrong default for a couple of reasons.
> 1) People are likely to test Cassandra with the default settings. This issue is insidious because it only appears when you have sufficient data in a certain node, there is absolutely no way to control it, and it doesn't at all respect the memory limits that you give to the JVM.
> That can all be ascertained by reading the code, and people should certainly do their homework, but nevertheless, Cassandra should ship with sane defaults that don't break down when you cross some magic unknown threshold.
> 2) It's deceptive. Unless you are extremely careful with capacity planning, you will get bitten by this. Most people won't really be able to use this in production, so why get them excited about performance that they can't actually have?

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira