You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@couchdb.apache.org by Wout Mertens <wo...@gmail.com> on 2010/06/16 12:09:43 UTC

Is CouchDB Cache-Oblivious?

Hi all,

I recently stumbled across this article:
http://queue.acm.org/detail.cfm?id=1814327

Basically, the author is arguing that often, the effect of virtual memory is ignored, at great cost. (it's somewhat abrasive but it gets the point across). The comments were very helpful in showing that if you want data structures that behave well on multi-tiered memory, you need to use "Cache Oblivious" algorithms.

I only have a vague idea of the in-memory and on-disk data structures that CouchDB has (I know that they're B-trees), so I'm here to ask if CouchDB's algorithms have been designed with paging penalties in mind.

One interesting paper popped up in a search for cache oblivious B-trees:

http://www.cs.sunysb.edu/~bender/pub/PODS06-BFK.pdf
===============
ABSTRACT
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally
• when keys are long or of variable length,
• when keys are compressed, even when using front compression, the standard B-tree compression scheme,
• for range queries, and
• with respect to memory effects such as disk prefetching.

This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways:
• The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.
• It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.
• It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.
• It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.
===============

Is this what CouchDB implements? If not, would implementing this be considered?

Thanks,

Wout.