Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/03/23 12:21:41 UTC

[GitHub] [couchdb] jiangphcn opened a new pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

jiangphcn opened a new pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707
 
 
   ## Overview
   
   CouchDB 4.0 already uses FoundationDB directories and key indirection to structure its data model. A single key/value pair maps DbKey to DbPrefix, and all other key/value pairs of the database are stored under DbPrefix rather than DbKey. This decouples the database name from the data stored in that database. The current implementation of `DbKey -> DbPrefix` is `{?ALL_DBS, DbName} -> {?DBS, DbName}`, so inspecting FoundationDB with a tool such as fdbcli shows entries of the following form:
   
   ```
   {?ALL_DBS, DbName} -> {?DBS, DbName}
   {?DBS, DbName, other part of key} -> <value>
   ```
   
   To support soft deletion, and in particular to allow a database to be deleted and re-created multiple times, we need a different DbPrefix each time for the same DbKey/DbName. The proposed change is to use a unique value allocated by the High Contention Allocator (HCA) algorithm.
   
   
   ```
       DbPrefixAllocator = erlfdb_hca:create(?ERLFDB_EXTEND(DbId, <<"hca">>)),
       DbPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
       erlfdb:set(Tx, DbKey, DbPrefix),
   ```
   
   The data in FoundationDB looks like:
   
   ```
   {?ALL_DBS, DbName} -> <unique key allocated by hca>
   {<unique key allocated by hca>, other part of key} -> <value>
   ```
   
   The HCA algorithm hands out a unique key quickly while avoiding conflicts. More importantly, the allocated key is short, which saves space because `DbPrefix` appears in almost every key/value pair of a database.
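
To make the indirection concrete, here is a toy model of the layout described above. It is only a sketch under loud assumptions: a plain Erlang map stands in for the FoundationDB keyspace, an incrementing counter stands in for `erlfdb_hca`, and the module name `db_prefix_sketch`, its functions, and the integer values of the tags are hypothetical, not taken from the PR.

```erlang
-module(db_prefix_sketch).
-export([new/0, create_db/2, delete_db/2, put_doc/4, get_doc/3]).

%% Toy model only: a map stands in for the FoundationDB keyspace and a
%% counter stands in for erlfdb_hca. The tag values are illustrative.
-define(ALL_DBS, 1).
-define(DBS, 2).

%% Empty state: no keys stored, next prefix id is 0.
new() ->
    #{next_id => 0, kvs => #{}}.

%% Allocate a fresh prefix and point {?ALL_DBS, DbName} at it.
create_db(DbName, #{next_id := Id, kvs := KVs} = St) ->
    DbPrefix = {?DBS, Id},
    St#{next_id := Id + 1, kvs := KVs#{{?ALL_DBS, DbName} => DbPrefix}}.

%% Soft delete: drop only the pointer; keys under the old prefix remain.
delete_db(DbName, #{kvs := KVs} = St) ->
    St#{kvs := maps:remove({?ALL_DBS, DbName}, KVs)}.

%% Store a value under the prefix the database name currently points at.
put_doc(DbName, DocId, Body, #{kvs := KVs} = St) ->
    DbPrefix = maps:get({?ALL_DBS, DbName}, KVs),
    St#{kvs := KVs#{{DbPrefix, DocId} => Body}}.

%% Look a value up through the same indirection.
get_doc(DbName, DocId, #{kvs := KVs}) ->
    DbPrefix = maps:get({?ALL_DBS, DbName}, KVs),
    maps:get({DbPrefix, DocId}, KVs, not_found).
```

In this model, deleting and re-creating a database with the same name yields a new prefix, so data written before the deletion is no longer reachable through the name but still sits under the old prefix, which is the property soft deletion relies on.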
   
   ## Testing recommendations
   
   All existing CRUD test cases for databases should pass.
   
   ## Related Issues or Pull Requests
   
   https://github.com/apache/couchdb/pull/2666
   
   ## Checklist
   
   - [X] Code is written and works correctly
   - [X] Changes are covered by tests
   - [ ] Any new configurable parameters are documented in `rel/overlay/etc/default.ini`
   - [ ] A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [couchdb] nickva commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#discussion_r396566069
 
 

 ##########
 File path: src/fabric/src/fabric2_fdb.erl
 ##########
 @@ -177,10 +177,13 @@ create(#{} = Db0, Options) ->
         layer_prefix := LayerPrefix
     } = Db = ensure_current(Db0, false),
 
-    % Eventually DbPrefix will be HCA allocated. For now
-    % we're just using the DbName so that debugging is easier.
     DbKey = erlfdb_tuple:pack({?ALL_DBS, DbName}, LayerPrefix),
-    DbPrefix = erlfdb_tuple:pack({?DBS, DbName}, LayerPrefix),
+    DefDbPref = ?DEFAULT_DB_PREFIX,
+    AllDbPrefix = erlfdb_util:get(Options, db_prefix, DefDbPref),
+    DbId = erlfdb_tuple:pack({AllDbPrefix}, AllDbPrefix),
+    DbPrefixAllocator = erlfdb_hca:create(erlfdb_tuple:pack({DbId}, <<"hca">>)),
+    AllocPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
+    DbPrefix = erlfdb_tuple:pack({?DBS, AllocPrefix}, LayerPrefix),
 
 Review comment:
   I wonder if we could just keep the HCA counters right under the `LayerPrefix`?
   
   As in:
   
   ```
   HCA = erlfdb_hca:create(erlfdb_tuple:pack({?DB_HCA}, LayerPrefix))
   ```
   
   And then define `DB_HCA` in fabric2.hrl alongside `ALL_DBS`.
   
   ```
   ...
   -define(ALL_DBS, 1).
   -define(DB_HCA, 2).
   ```
   
   Then we don't even need the `DEFAULT_DB_PREFIX` since we'd always be creating these instances under `LayerPrefix` as well. In other words, they won't clash with FDB directory nodes.
   
   Would that work? 

[GitHub] [couchdb] jiangphcn commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
jiangphcn commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#issuecomment-603037901
 
 
   Thanks again, @nickva. I should have squashed the two commits earlier. I have just squashed them, so this PR now has a single commit :-)

[GitHub] [couchdb] jiangphcn commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
jiangphcn commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#issuecomment-603018623
 
 
   thanks @nickva 

[GitHub] [couchdb] jiangphcn commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
jiangphcn commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#discussion_r396870210
 
 

 ##########
 File path: src/fabric/src/fabric2_fdb.erl
 ##########
 @@ -177,10 +177,13 @@ create(#{} = Db0, Options) ->
         layer_prefix := LayerPrefix
     } = Db = ensure_current(Db0, false),
 
-    % Eventually DbPrefix will be HCA allocated. For now
-    % we're just using the DbName so that debugging is easier.
     DbKey = erlfdb_tuple:pack({?ALL_DBS, DbName}, LayerPrefix),
-    DbPrefix = erlfdb_tuple:pack({?DBS, DbName}, LayerPrefix),
+    DefDbPref = ?DEFAULT_DB_PREFIX,
+    AllDbPrefix = erlfdb_util:get(Options, db_prefix, DefDbPref),
+    DbId = erlfdb_tuple:pack({AllDbPrefix}, AllDbPrefix),
+    DbPrefixAllocator = erlfdb_hca:create(erlfdb_tuple:pack({DbId}, <<"hca">>)),
+    AllocPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
+    DbPrefix = erlfdb_tuple:pack({?DBS, AllocPrefix}, LayerPrefix),
 
 Review comment:
   Thanks @nickva and @davisp. I simplified the creation of the HCA according to your suggestion.

[GitHub] [couchdb] davisp commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
davisp commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#discussion_r396576000
 
 

 ##########
 File path: src/fabric/src/fabric2_fdb.erl
 ##########
 @@ -177,10 +177,13 @@ create(#{} = Db0, Options) ->
         layer_prefix := LayerPrefix
     } = Db = ensure_current(Db0, false),
 
-    % Eventually DbPrefix will be HCA allocated. For now
-    % we're just using the DbName so that debugging is easier.
     DbKey = erlfdb_tuple:pack({?ALL_DBS, DbName}, LayerPrefix),
-    DbPrefix = erlfdb_tuple:pack({?DBS, DbName}, LayerPrefix),
+    DefDbPref = ?DEFAULT_DB_PREFIX,
+    AllDbPrefix = erlfdb_util:get(Options, db_prefix, DefDbPref),
+    DbId = erlfdb_tuple:pack({AllDbPrefix}, AllDbPrefix),
+    DbPrefixAllocator = erlfdb_hca:create(erlfdb_tuple:pack({DbId}, <<"hca">>)),
+    AllocPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
+    DbPrefix = erlfdb_tuple:pack({?DBS, AllocPrefix}, LayerPrefix),
 
 Review comment:
   That's the correct approach. The directory prefix is only defined there because it's used for layer coordination. We just need to create our own subspace, such as `DB_HCA` as mentioned, and all HCA operations just take that key as a parameter, if memory serves.

[GitHub] [couchdb] nickva commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#issuecomment-603021791
 
 
   I wonder if we can still rebase, squash, and force-push. It might be nicer to have just one commit.

[GitHub] [couchdb] nickva commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707#discussion_r396566069
 
 

 ##########
 File path: src/fabric/src/fabric2_fdb.erl
 ##########
 @@ -177,10 +177,13 @@ create(#{} = Db0, Options) ->
         layer_prefix := LayerPrefix
     } = Db = ensure_current(Db0, false),
 
-    % Eventually DbPrefix will be HCA allocated. For now
-    % we're just using the DbName so that debugging is easier.
     DbKey = erlfdb_tuple:pack({?ALL_DBS, DbName}, LayerPrefix),
-    DbPrefix = erlfdb_tuple:pack({?DBS, DbName}, LayerPrefix),
+    DefDbPref = ?DEFAULT_DB_PREFIX,
+    AllDbPrefix = erlfdb_util:get(Options, db_prefix, DefDbPref),
+    DbId = erlfdb_tuple:pack({AllDbPrefix}, AllDbPrefix),
+    DbPrefixAllocator = erlfdb_hca:create(erlfdb_tuple:pack({DbId}, <<"hca">>)),
+    AllocPrefix = erlfdb_hca:allocate(DbPrefixAllocator, Tx),
+    DbPrefix = erlfdb_tuple:pack({?DBS, AllocPrefix}, LayerPrefix),
 
 Review comment:
   I wonder if we could just keep the HCA counters right under the `LayerPrefix`?
   
   As in:
   
   ```
   HCA = erlfdb_hca:create(erlfdb_tuple:pack({?DB_HCA}, LayerPrefix)),
   AllocPrefix = erlfdb_hca:allocate(HCA, Tx),
   DbPrefix = erlfdb_tuple:pack({?DBS, AllocPrefix}, LayerPrefix),
   ```
   
   And then define `DB_HCA` in fabric2.hrl alongside `ALL_DBS`.
   
   ```
   ...
   -define(ALL_DBS, 1).
   -define(DB_HCA, 2).
   ```
   
   Then we don't even need the `DEFAULT_DB_PREFIX` since we'd always be creating these instances under `LayerPrefix` as well. In other words, they won't clash with FDB directory nodes.
   
   Would that work? 

[GitHub] [couchdb] jiangphcn merged pull request #2707: Set DbPrefix with value allocated by erlfdb_hca

Posted by GitBox <gi...@apache.org>.
jiangphcn merged pull request #2707: Set DbPrefix with value allocated by erlfdb_hca
URL: https://github.com/apache/couchdb/pull/2707
 
 
   
