Posted to dev@cassandra.apache.org by Peter Schuller <pe...@infidyne.com> on 2010/12/10 21:45:04 UTC

mmap:ed i/o and buffer sizes

I was looking closer at sliced_buffer_size_in_kb and
column_index_size_in_kb and reached the conclusion that for the
purpose of I/O, these are irrelevant when using mmap:ed I/O mode
(which makes sense, since there is no way to use a "buffer size" when
all you're doing is touching memory). The only effect is that
column_index_size_in_kb still affects the size at which indexing
triggers, which is as advertised.
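
To illustrate the distinction above, here is a minimal Java sketch (not
Cassandra's actual code) that reads the same byte range two ways. The class
and method names (MmapVsBuffered, readBuffered, readMapped) are hypothetical
and exist only for this example. In the buffered path the caller chooses a
buffer size, which caps the size of each read request; in the mapped path
there is no such parameter, and the kernel decides how pages are faulted in.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class MmapVsBuffered {

    // Buffered mode: every read() call requests at most bufferSize bytes,
    // so the configured buffer size bounds the size of each I/O request.
    static byte[] readBuffered(Path file, long offset, int length, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] out = new byte[length];
        byte[] buf = new byte[bufferSize];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            int done = 0;
            while (done < length) {
                int n = raf.read(buf, 0, Math.min(buf.length, length - done));
                if (n < 0) {
                    throw new IOException("unexpected end of file");
                }
                System.arraycopy(buf, 0, out, done, n);
                done += n;
            }
        }
        return out;
    }

    // mmap mode: there is no buffer-size knob. Copying out of the mapping
    // only touches memory; the kernel faults pages in (and applies its own
    // read-ahead policy) as they are accessed.
    static byte[] readMapped(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] out = new byte[length];
            map.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        tmp.toFile().deleteOnExit();

        // Write some recognizable test data.
        byte[] data = new byte[256 * 1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(tmp, data);

        // 64 KB is an arbitrary buffer size chosen for the example.
        byte[] a = readBuffered(tmp, 4096, 100_000, 64 * 1024);
        byte[] b = readMapped(tmp, 4096, 100_000);
        System.out.println("same bytes: " + Arrays.equals(a, b));
    }
}
```

Both paths return identical bytes; the only difference is who decides how
much is fetched from disk per request.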

Firstly, can anyone confirm/deny my interpretation?

Secondly, has anyone done testing as to the effects of mmap():ed I/O
on the efficiency (in terms of disk seeks) of reads on large data
sets? The CPU benefits of mmap() may be negated when disk-bound if the
read-ahead logic of the kernel interacts sub-optimally with
Cassandra's use-case. Potentially even reading more than a single page
could imply multiple seeks (assuming a loaded system with other I/O in
the queue) if there is no read-ahead until the first successive
access.

I have not checked what actually does happen, nor have I benchmarked
for comparison. But I'd be interested in hearing if people have
already addressed this in the past.

-- 
/ Peter Schuller

Re: mmap:ed i/o and buffer sizes

Posted by Jonathan Ellis <jb...@gmail.com>.

On Fri, Dec 10, 2010 at 2:45 PM, Peter Schuller <peter.schuller@infidyne.com> wrote:

> I was looking closer at sliced_buffer_size_in_kb and
> column_index_size_in_kb and reached the conclusion that for the
> purpose of I/O, these are irrelevant when using mmap:ed I/O mode
> (which makes sense, since there is no way to use a "buffer size" when
> all you're doing is touching memory). The only effect is that
> column_index_size_in_kb still affects the size at which indexing
> triggers, which is as advertised.
>
> Firstly, can anyone confirm/deny my interpretation?
>

That's correct.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com