Posted to notifications@couchdb.apache.org by davisp <gi...@git.apache.org> on 2017/02/11 21:41:42 UTC

[GitHub] couchdb-couch pull request #228: Couchdb 3298 improve couch btree chunkify

GitHub user davisp opened a pull request:

    https://github.com/apache/couchdb-couch/pull/228

    Couchdb 3298 improve couch btree chunkify

    This PR makes two small changes to the couch_btree:chunkify/1 function.
    
    First, instead of producing a larger number of evenly sized chunks, it now produces as few chunks as possible while keeping each one under the configurable chunk threshold. The final chunk may therefore be less full than the others.
    
    Second, it no longer returns a final chunk that contains a single key. In some pathological cases we could end up with a branch of the tree in which several consecutive levels of nodes each held a single key. This happens when a reduce function returns values that are larger than the chunk threshold.
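    
    The two behaviours described above can be illustrated with a minimal sketch. This is not the code from the PR: the module name chunk_sketch, the explicit Threshold argument, and the use of erlang:external_size/1 as the size measure are all assumptions made for illustration.
    
    ```erlang
    %% Hypothetical sketch only; not the couch_btree implementation.
    -module(chunk_sketch).
    -export([chunkify/2]).
    
    %% Items is a list of terms. Threshold is an assumed per-chunk size
    %% limit in bytes, passed explicitly instead of being read from config.
    chunkify(Items, Threshold) ->
        chunkify(Items, Threshold, [], 0, []).
    
    %% Input exhausted and nothing pending: return the chunks in order.
    chunkify([], _Threshold, [], _Size, Chunks) ->
        lists:reverse(Chunks);
    %% Input exhausted with exactly one pending item and an earlier chunk:
    %% append the item to that chunk instead of emitting a one-item chunk.
    chunkify([], _Threshold, [Item], _Size, [Prev | Rest]) ->
        lists:reverse(Rest, [Prev ++ [Item]]);
    %% Input exhausted: emit whatever is pending as the final chunk.
    chunkify([], _Threshold, Pending, _Size, Chunks) ->
        lists:reverse([lists:reverse(Pending) | Chunks]);
    chunkify([Item | Tail], Threshold, Pending, Size, Chunks) ->
        ItemSize = erlang:external_size(Item),
        case Pending =/= [] andalso Size + ItemSize > Threshold of
            true ->
                %% Item does not fit: close the current chunk and start a
                %% new one that begins with Item.
                Chunk = lists:reverse(Pending),
                chunkify(Tail, Threshold, [Item], ItemSize, [Chunk | Chunks]);
            false ->
                %% Item fits (or the chunk is empty): keep filling.
                chunkify(Tail, Threshold, [Item | Pending], Size + ItemSize, Chunks)
        end.
    ```
    
    In this sketch each chunk is filled greedily, so only the last chunk can end up short, and a trailing single item is appended to the previous chunk even though that chunk then exceeds the threshold. For example, chunk_sketch:chunkify([binary:copy(<<"x">>, 100) || _ <- lists:seq(1, 5)], 250) should yield two chunks holding 2 and 3 items.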

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/couchdb-couch COUCHDB-3298-improve-couch-btree-chunkify

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #228
    
----
commit 8556adbb98e79a09ec254967ee6acf3bef8d1fb6
Author: Paul J. Davis <pa...@gmail.com>
Date:   2017-02-11T21:26:26Z

    Make couch_btree:chunkify/1 prefer fewer chunks
    
    This changes couch_btree:chunkify/1 to produce fewer larger chunks
    rather than creating chunks of even-ish size.
    
    COUCHDB-3298

commit ff9fb7112ee5250af01e1b38c8cfa9caed152ae7
Author: Paul J. Davis <pa...@gmail.com>
Date:   2017-02-11T21:29:14Z

    Ensure multi-item chunks in couch_btree:chunkify/1
    
    If the last element of a chunk has a huge reduction it was possible to
    return a btree node that had a single key. This prevents the edge case
    by forcing it into the previous chunk. Without this we can end up with a
    case where a path in the tree can extend for many levels with only a
    single key in each node.
    
    COUCHDB-3298

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #228: Couchdb 3298 improve couch btree chunkify

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/couchdb-couch/pull/228



[GitHub] couchdb-couch issue #228: Couchdb 3298 improve couch btree chunkify

Posted by iilyak <gi...@git.apache.org>.
GitHub user iilyak commented on the issue:

    https://github.com/apache/couchdb-couch/pull/228
  
    Tested manually with the following:
    ```
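    % Ten items; a separate name is used because the shell will not rebind LL.
    LL0 = couch_btree:chunkify(
        [{list_to_binary(couch_util:to_hex(crypto:strong_rand_bytes(10))), binary:copy(<<"x">>, 1300)}
            || _ <- lists:seq(1, 10)]).
    length(LL0).
    
    % Build N {Key, Value} pairs with random hex keys and Size-byte values.
    Generator = fun(Size, N) ->
        [{list_to_binary(couch_util:to_hex(crypto:strong_rand_bytes(10))), binary:copy(<<"x">>, Size)}
            || _ <- lists:seq(1, N)]
    end.
    
    % Three items; report the chunk count and the size of the last chunk.
    LL = couch_btree:chunkify(Generator(1300, 3)).
    [Last | _] = lists:reverse(LL).
    [{"Number of chunks", length(LL)}, {"Elements in last chunk", length(Last)}].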
    ```
    
    The output without this change is:
    ```
    [
      {"Number of chunks",5},
      {"Elements in last chunk",2}
    ]
    ```
    
    The output with this change is:
    ```
    [
      {"Number of chunks",1},
      {"Elements in last chunk",3}
    ]
    ```
    
    +1.

