You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by nickva <gi...@git.apache.org> on 2017/03/15 18:08:08 UTC

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

GitHub user nickva opened a pull request:

    https://github.com/apache/couchdb-couch/pull/239

    An ETS based couch_lru

    The interface is the same as previous couch_lru.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch ets-based-lru

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #239
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    @davisp @eiri 
    
    I tried a 5 minute eprof run. At default 500 max_dbs_open. Cluster had 20 continuous 1-to-n replications. 
    
    ```
    ==== master + ETS lru (this PR)
    
    eprof:start_profiling([couch_server], {couch_lru, '_', '_'}), timer:sleep(300000), eprof:stop_profiling(), eprof:analyze(), eprof:stop().
    
    ****** Process <4641.251.0>    -- 100.00 % of profiled time ***
    FUNCTION               CALLS        %    TIME  [uS / CALLS]
    --------               -----  -------    ----  [----------]
    couch_lru:insert/2       564     1.46   11437  [     20.28]
    couch_lru:close_int/3   1800     5.36   42068  [     23.37]
    couch_lru:close/1        450     6.29   49341  [    109.65]
    couch_lru:update/2     26542    86.90  681987  [     25.69]
    ---------------------  -----  -------  ------  [----------]
    Total:                 29356  100.00%  784833  [     26.74]
    
    
    ==== master + monotonic counter
    (https://github.com/cloudant/couchdb-couch/commit/4b49e71c14837c0f1b63e5d0ba2fea34c5fd997e)
    
    eprof:start_profiling([couch_server], {couch_lru, '_', '_'}), timer:sleep(300000), eprof:stop_profiling(), eprof:analyze(), eprof:stop().
    
    ****** Process <4641.251.0>    -- 100.00 % of profiled time ***
    FUNCTION               CALLS        %    TIME  [uS / CALLS]
    --------               -----  -------    ----  [----------]
    couch_lru:close_int/2    323     1.50   13817  [     42.78]
    couch_lru:insert/2       621     2.36   21723  [     34.98]
    couch_lru:close/1        320     3.90   35962  [    112.38]
    couch_lru:update/2     25953    92.24  849899  [     32.75]
    ---------------------  -----  -------  ------  [----------]
    Total:                 27217  100.00%  921401  [     33.85]
    
    
    ==== current master
    
    eprof:start_profiling([couch_server], {couch_lru, '_', '_'}), timer:sleep(300000), eprof:stop_profiling(), eprof:analyze(), eprof:stop().
    
    ****** Process <4641.251.0>    -- 100.00 % of profiled time ***
    FUNCTION               CALLS        %     TIME  [uS / CALLS]
    --------               -----  -------     ----  [----------]
    couch_lru:close_int/2    513     1.60    22893  [     44.63]
    couch_lru:insert/2       727     2.06    29407  [     40.45]
    couch_lru:close/1        507     3.90    55623  [    109.71]
    couch_lru:update/2     35114    92.44  1318808  [     37.56]
    ---------------------  -----  -------  -------  [----------]
    Total:                 36861  100.00%  1426731  [     38.71]
    
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106270691
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    --- End diff --
    
    Though thinking about it would change the logic. Notice now `Count` is a globally incrementing counter. This change would make so that the count is increment per db shard.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106263970
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    --- End diff --
    
    This possibly could be replaced with:
    ```
    case ets:update_counter(Dbs, DbName, 1, {DbName, Count}) of
        Count -> 
            true = ets:insert(Updates, {{Count, DbName}});
        NewCount ->
                true = ets:delete(Updates, {NewCount - 1, DbName}),
                true = ets:insert(Updates, {{Count, DbName}})
       end,
      {Count +1, Updates, Dbs}.
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106269371
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    --- End diff --
    
    Good ideas, Ilya


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by sagelywizard <gi...@git.apache.org>.
Github user sagelywizard commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    :+1:


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106264428
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    --- End diff --
    
    If we use update_counter we don't have to pass Count around.
    Also  `true = ets:insert(Updates, {{Count, DbName}});` happen in both branches so could be moved outside of a case statement.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    I ended up not using `update_counter` because update_counter/4 is not available in R16B which CouchDB still supports. Tried simply replacing update_element with update_counter/3 but that didn't seem to provide any visible speed improvement.
    
    @eiri thank you for the test module. I updated it to work with ETS table, but kept the same structure.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by eiri <gi...@git.apache.org>.
Github user eiri commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    Travis failed with infamous `gnutls_handshake() failed`, but fwiw local tests all passed:
    ```
    <skip>
    module 'chttpd_endpoints_tests'
    =======================================================
      All 1066 tests passed.
    ```
    
    I'm +1 for this change, with us adding `sys_db` into lru improved performance there is a good thing&trade; Just wait for other reviewers to agree before merging. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/couchdb-couch/pull/239


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106261898
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    +    case ets:lookup(Dbs, DbName) of
    +        [] ->
    +            true = ets:insert(Updates, {{Count, DbName}}),
    --- End diff --
    
    why be ye usin a tuple key?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch pull request #239: An ETS based couch_lru

Posted by nickva <gi...@git.apache.org>.
Github user nickva commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch/pull/239#discussion_r106262038
  
    --- Diff: src/couch_lru.erl ---
    @@ -16,48 +16,57 @@
     -include_lib("couch/include/couch_db.hrl").
     
     new() ->
    -    {gb_trees:empty(), dict:new()}.
    -
    -insert(DbName, {Tree0, Dict0}) ->
    -    Lru = erlang:now(),
    -    {gb_trees:insert(Lru, DbName, Tree0), dict:store(DbName, Lru, Dict0)}.
    -
    -update(DbName, {Tree0, Dict0}) ->
    -    case dict:find(DbName, Dict0) of
    -    {ok, Old} ->
    -        New = erlang:now(),
    -        Tree = gb_trees:insert(New, DbName, gb_trees:delete(Old, Tree0)),
    -        Dict = dict:store(DbName, New, Dict0),
    -        {Tree, Dict};
    -    error ->
    -        % We closed this database before processing the update.  Ignore
    -        {Tree0, Dict0}
    +    Updates = ets:new(couch_lru_updates, [ordered_set]),
    +    Dbs = ets:new(couch_lru_dbs, [set]),
    +    {0, Updates, Dbs}.
    +
    +insert(DbName, {Count, Updates, Dbs}) ->
    +    update(DbName, {Count, Updates, Dbs}).
    +
    +update(DbName, {Count, Updates, Dbs}) ->
    +    case ets:lookup(Dbs, DbName) of
    +        [] ->
    +            true = ets:insert(Updates, {{Count, DbName}}),
    --- End diff --
    
    Ha! Caught my trick. To allow iterating easier without doing an extra lookup in close_int


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by sagelywizard <gi...@git.apache.org>.
Github user sagelywizard commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    That's a rebase.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch issue #239: An ETS based couch_lru

Posted by eiri <gi...@git.apache.org>.
Github user eiri commented on the issue:

    https://github.com/apache/couchdb-couch/pull/239
  
    Prof looks promising. Can you add tests, just to be on a safe side? Feel free to pull from here: https://github.com/cloudant/couchdb-couch/commit/4358b8d678ab15d0153530a41f68ff7867b633a5


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---