Posted to user@couchdb.apache.org by Stefan Klein <st...@gmail.com> on 2014/01/20 21:19:34 UTC

Clarification on "UUIDs Configuration"

Hello all,

There is a note at
http://docs.couchdb.org/en/latest/config/misc.html#uuids-configuration about
a performance impact of random document ids.
If the document ids are not sequential, larger portions of the B-tree need
to be rewritten.
Does this apply only to inserts, or also to updates?

I guess on updates the path (the intermediate B-tree nodes) to the document
needs to be rewritten anyway, so the document id should not matter.


thank you,
Stefan

Re: Clarification on "UUIDs Configuration"

Posted by Adam Kocoloski <ko...@apache.org>.
On Jan 20, 2014, at 3:36 PM, Jens Alfke <je...@couchbase.com> wrote:

> 
> On Jan 20, 2014, at 12:19 PM, Stefan Klein <st...@gmail.com> wrote:
> 
>> a performance impact of random document ids.
>> If the document ids are not sequential, larger portions of the b-tree need
>> to be rewritten.
>> Is this related only to inserts or also to updates?
> 
> It only applies to inserts, because if nodes are added to the b-tree in random order, more rebalancing will be necessary. Adding them in sequential order is more efficient.
> 
> Updates don't change the structure of the tree (only the contents of leaf nodes) so their ordering doesn't matter as much.
> 
> —Jens

Well, at the end of the day the goal is for documents that are mutated concurrently to share long common ID prefixes. If they do, they share many of the same inner nodes on their respective paths to the root, and we can optimize away the extra rewrites of those inner nodes.
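
To make the shared-inner-nodes point concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not CouchDB's actual storage code: the tree is perfectly packed, the fan-out of 32 and the total of 1,000,000 documents are arbitrary, and a document is identified by the rank of its ID in sorted order (so IDs with a long common prefix have neighbouring ranks). It counts how many distinct nodes lie on the leaf-to-root paths of a batch, which is the number of nodes a single group commit would have to rewrite.

```python
import random

def nodes_rewritten(positions, fanout=32, total=1_000_000):
    """Count distinct nodes on the leaf-to-root paths of the given documents.

    `positions` are ranks of document IDs in sorted order. The tree is an
    idealized, perfectly packed one with `fanout` children per node
    (an assumption for illustration only).
    """
    touched = set()
    level, width, span = 0, total, fanout
    while True:
        width = -(-width // fanout)      # nodes at this level (ceiling division)
        touched.update((level, p // span) for p in positions)
        if width <= 1:                   # this level is the root
            break
        level += 1
        span *= fanout                   # each higher node covers more keys
    return len(touched)

rng = random.Random(0)
clustered = range(500_000, 500_100)              # 100 IDs sharing a long prefix
scattered = rng.sample(range(1_000_000), 100)    # 100 IDs spread at random

print("clustered batch:", nodes_rewritten(clustered))
print("scattered batch:", nodes_rewritten(scattered))
```

In this model the clustered batch touches only a handful of nodes, because almost every path is shared, while the scattered batch touches close to one leaf per document plus many distinct inner nodes.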

The easiest place to achieve this is during insertion by a judicious choice of document ID, but if for some reason you have a subset of documents in your database which are "hot" (i.e., frequently updated relative to the others) and you can afford to update them via _bulk_docs then it would make sense to give that document class a common ID prefix so that you can benefit from this group commit optimization.
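
As an illustration of the "hot" document idea, the following Python sketch builds a single `_bulk_docs` request for a group of documents that share an ID prefix. The prefix `hot:`, the database name `mydb`, the server URL, the document names, and the `_rev` values are all hypothetical placeholders; real updates need the current `_rev` of each document. The request is only constructed here, not sent.

```python
import json
import urllib.request

HOT_PREFIX = "hot:"  # hypothetical prefix shared by frequently updated docs

def hot_id(name):
    """Build an ID that sorts next to the other hot documents."""
    return f"{HOT_PREFIX}{name}"

def build_bulk_update(base_url, db, docs):
    """Prepare one POST to /{db}/_bulk_docs carrying all docs in one batch."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder documents; the _rev values are made up for the example.
docs = [
    {"_id": hot_id("counter-a"), "_rev": "1-aaa", "value": 11},
    {"_id": hot_id("counter-b"), "_rev": "1-bbb", "value": 7},
    {"_id": hot_id("counter-c"), "_rev": "1-ccc", "value": 42},
]

request = build_bulk_update("http://localhost:5984", "mydb", docs)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send the batch to a running server.
```

Because all three IDs start with the same prefix, they sort next to each other, so their updates can share most of the path to the root when they are written together.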

Adam

Re: Clarification on "UUIDs Configuration"

Posted by Jens Alfke <je...@couchbase.com>.
On Jan 20, 2014, at 12:19 PM, Stefan Klein <st...@gmail.com> wrote:

> a performance impact of random document ids.
> If the document ids are not sequential, larger portions of the b-tree need
> to be rewritten.
> Is this related only to inserts or also to updates?

It only applies to inserts, because if nodes are added to the b-tree in random order, more rebalancing will be necessary. Adding them in sequential order is more efficient.

Updates don't change the structure of the tree (only the contents of leaf nodes) so their ordering doesn't matter as much.
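
A rough way to see the difference for inserts is to check where new IDs would land among existing ones. The Python sketch below is a simplified model, not CouchDB's implementation: the leaf size of 50 keys is arbitrary, the tree is treated as a static sorted list, and `sequential_ids` is a hypothetical stand-in for a sequential scheme (one fixed random prefix plus a counter).

```python
import bisect
import random

rng = random.Random(1)  # seeded so the run is repeatable

def random_id():
    """A 32-hex-character ID with no relation to any other ID."""
    return f"{rng.getrandbits(128):032x}"

def sequential_ids(n):
    """Hypothetical sequential scheme: shared 26-hex prefix + 6-hex counter."""
    prefix = f"{rng.getrandbits(104):026x}"
    return [f"{prefix}{i:06x}" for i in range(1, n + 1)]

def leaves_hit(existing_sorted, new_ids, leaf_size=50):
    """Count distinct leaves (runs of `leaf_size` keys) the new IDs fall into."""
    return len({
        bisect.bisect_left(existing_sorted, doc_id) // leaf_size
        for doc_id in new_ids
    })

existing = sorted(random_id() for _ in range(10_000))
random_batch = [random_id() for _ in range(100)]
sequential_batch = sequential_ids(100)

print("random ids:    ", leaves_hit(existing, random_batch))
print("sequential ids:", leaves_hit(existing, sequential_batch))
```

The sequential batch lands in a single spot in the key space, so only one region of the tree has to grow, while the random batch is spread over many leaves that each need rewriting and possibly splitting.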

—Jens