Posted to notifications@couchdb.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/05/03 17:58:04 UTC

[jira] [Commented] (COUCHDB-3298) Improve couch_btree:chunkify logic

    [ https://issues.apache.org/jira/browse/COUCHDB-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995337#comment-15995337 ] 

ASF subversion and git services commented on COUCHDB-3298:
----------------------------------------------------------

Commit e3b5f40d9130a6d347b02f31a74fd5300b41bc6e in couchdb's branch refs/heads/COUCHDB-3298-optimize-writing-kv-nodes from [~paul.joseph.davis]
[ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=e3b5f40 ]

Optimize KV node append writes

As it turns out, the original change in COUCHDB-3298 ends up hurting
disk usage when a view emits large amounts of data (i.e., more than
half of the btree chunk size). The cause is that instead of writing
single-element nodes it prefers to write kv nodes with three elements.
While we might normally prefer this in memory, with our append-only
storage it generates significantly more garbage on disk.

We can show this with a few trivial examples. Imagine we write KVs a
through f. The following two patterns show the nodes as we write each
new KV.

    Before 3298:

    []
    [a]
    [a, b]
    [a, b]', [c]
    [a, b]', [c, d]
    [a, b]', [c, d]', [e]
    [a, b]', [c, d]', [e, f]

    After 3298:

    []
    [a]
    [a, b]
    [a, b, c]
    [a, b]', [c, d]
    [a, b]', [c, d, e]
    [a, b]', [c, d]', [e, f]

The thing to realize here is which of these nodes end up as garbage. In
the first example the orphaned nodes are [a], [a, b], [c], [c, d], and
[e], whereas in the second case they are [a], [a, b], [a, b, c],
[c, d], and [c, d, e]. As a quick aside, the reason [a, b] and [c, d]
are orphaned at all is due to how a btree update works: when adding c,
we read [a, b] into memory, append c, and then during our node write we
call chunkify, which gives us back [a, b], [c] and leads us to writing
[a, b] a second time.
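
To make the orphan accounting concrete, here is a small Python sketch
that replays the append workload above and collects the orphaned nodes
for both policies. The real code is Erlang in couch_btree; the two-KV
node capacity and the chunkify helpers are simplifying assumptions.

    CAPACITY = 2  # stand-in for the real byte-based chunk threshold

    def chunkify_before_3298(kvs):
        # Old behavior: split into nodes of at most CAPACITY elements.
        return [kvs[i:i + CAPACITY] for i in range(0, len(kvs), CAPACITY)]

    def chunkify_after_3298(kvs):
        # COUCHDB-3298 behavior (simplified): let the trailing node hold
        # one extra element rather than splitting off a singleton.
        if len(kvs) <= CAPACITY + 1:
            return [kvs]
        return [kvs[:CAPACITY]] + chunkify_after_3298(kvs[CAPACITY:])

    def simulate(chunkify, keys):
        live = []      # kv nodes currently reachable on disk
        garbage = []   # orphaned copies left behind by rewrites
        for key in keys:
            last = live.pop() if live else []
            if last:
                garbage.append(last)  # its on-disk copy is now unreachable
            live.extend(chunkify(last + [key]))  # every chunk is a new write
        return live, garbage

    for name, policy in [("before 3298", chunkify_before_3298),
                         ("after 3298", chunkify_after_3298)]:
        live, garbage = simulate(policy, list("abcdef"))
        print(name + ":", garbage)
    # before 3298: [['a'], ['a', 'b'], ['c'], ['c', 'd'], ['e']]
    # after 3298: [['a'], ['a', 'b'], ['a', 'b', 'c'], ['c', 'd'], ['c', 'd', 'e']]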

This patch changes the write function to recognize when we are merely
appending KVs, which saves us the extra write and the garbage it
generates. Its node patterns look like this:

    []
    [a]
    [a, b]
    [a, b], [c]
    [a, b], [c, d]
    [a, b], [c, d], [e]
    [a, b], [c, d], [e, f]

This means we only end up generating [a], [c], and [e] as garbage (with
respect to kv nodes; kp nodes retain their historical behavior).
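
Under the same toy model, the patched append path can be sketched as
below. This is a simplification of the actual Erlang change, not the
code itself: a full trailing node is left untouched and the appended KV
starts a fresh node, so only half-full nodes are ever rewritten.

    CAPACITY = 2  # same stand-in for the chunk threshold as above

    def append_kv(live, garbage, key):
        if live and len(live[-1]) < CAPACITY:
            old = live.pop()
            garbage.append(old)     # only the half-full node is rewritten
            live.append(old + [key])
        else:
            live.append([key])      # a full trailing node is never touched

    live, garbage = [], []
    for key in "abcdef":
        append_kv(live, garbage, key)
    print(live)     # [['a', 'b'], ['c', 'd'], ['e', 'f']]
    print(garbage)  # [['a'], ['c'], ['e']]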


> Improve couch_btree:chunkify logic
> ----------------------------------
>
>                 Key: COUCHDB-3298
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-3298
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Paul Joseph Davis
>
> The current chunkify has problems when reduce functions create large values: it will produce chunks (i.e., kp nodes) that contain a single key. In some pathological cases this can create long chains of nodes that never branch.
> The old chunkify would also try to create nodes with an even number of bytes in each chunk. Given that we don't re-use chunks, it makes more sense to pack our chunks as close to the threshold as possible so that we create fewer branches in our tree.
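
As an illustration of the packing strategy described above, a greedy
chunkify might look like the following Python sketch (not couch_btree's
actual Erlang; the threshold value and size function are assumptions):
fill each chunk as close to the threshold as possible instead of
evening out the bytes, so fewer, fuller nodes are produced.

    THRESHOLD = 1279  # illustrative byte budget; couch_btree defines its own

    def chunkify(items, size_of, threshold=THRESHOLD):
        # Greedy packing: start a new chunk only when adding the next item
        # would push the current chunk past the threshold. An oversized
        # item still gets a chunk to itself, but it no longer forces its
        # neighbors into single-key chunks.
        chunks, current, current_size = [], [], 0
        for item in items:
            item_size = size_of(item)
            if current and current_size + item_size > threshold:
                chunks.append(current)
                current, current_size = [], 0
            current.append(item)
            current_size += item_size
        if current:
            chunks.append(current)
        return chunks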



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)