You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2020/11/23 08:11:13 UTC

[GitHub] [incubator-pegasus] empiredan opened a new pull request #646: feat: support costing memtable and index and filter blocks to block c…

empiredan opened a new pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646


   ### What problem does this PR solve? <!--add issue link with summary if exists-->
   
   While multiple services are deployed on a node, memory usage of each service has to be restricted. As for pegasus, memory is consumed by both rDSN and rocksdb. We first restrict rocksdb's memory since it's easier to implement.
   
   ### What is changed and how it works?
   
   This PR involves some rocksdb's options, to manage memtable, block cache, index and filter blocks into block cache size, by which we can take precise control over the memory used by rocksdb.
   
   If `rocksdb_enable_write_buffer_manager = true`, memtable will be managed into block cache size, and `rocksdb_block_cache_capacity` will be considered as the total memory assigned to rocksdb,  `rocksdb_total_size_across_write_buffer` is the maximum memory assigned to memtables. 
   
   If `rocksdb_cache_index_and_filter_blocks = true`, index and filter blocks will be stored in block cache, and meanwhile you'd better use partitioned filters; If `rocksdb_cache_index_and_filter_blocks = false`, you have to use `rocksdb_max_open_files` to take control.
   
   As for details of both write buffer manager and index and filter blocks, see the docs as below: 
   https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB
   https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager
   https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters
   
   ### Check List <!--REMOVE the items that are not applicable-->
   
   Tests <!-- At least one of them must be included. -->
   
   - Unit test
   - Integration test
   - Manual test (add detailed scripts or steps below)
   - No code
   
   Code changes
   
   - Has exported function/method change
   - Has exported variable/fields change
   - Has interface methods change
   - Has persistent data change
   
   Side effects
   
   - Possible performance regression
   - Increased code complexity
   - Breaking backward compatibility
   
   Related changes
   
   - Need to cherry-pick to the release branch
   - Need to update the documentation
   - Need to be included in the release note
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538960813



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(

Review comment:
       uint64_t




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r529295670



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server

Review comment:
       ```suggestion
           // of memtables and block caches used by this server
   ```

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                    "pegasus.server",
+                    "rocksdb_total_size_across_write_buffer",
+                    0,
+                    "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64, total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                    static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+            "pegasus.server",
+            "rocksdb_max_open_files",
+            -1,
+            "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);
+
+    std::string index_type = dsn_config_get_value_string(
+            "pegasus.server",
+            "rocksdb_index_type",
+            "kBinarySearch",
+            "The index type that will be used for this table.");
+    const std::unordered_map<std::string, rocksdb::BlockBasedTableOptions::IndexType>
+    index_type_string_map = {
+        {"kBinarySearch", rocksdb::BlockBasedTableOptions::IndexType::kBinarySearch},
+        {"kHashSearch", rocksdb::BlockBasedTableOptions::IndexType::kHashSearch},
+        {"kTwoLevelIndexSearch",
+         rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch},
+        {"kBinarySearchWithFirstKey",
+         rocksdb::BlockBasedTableOptions::IndexType::kBinarySearchWithFirstKey}
+    };
+    auto index_type_item = index_type_string_map.find(index_type);
+    dassert(index_type_item != index_type_string_map.end(),
+            "[pegasus.server]rocksdb_index_type shold be one among kBinarySearch, "
+            "kHashSearch, kTwoLevelIndexSearch or kBinarySearchWithFirstKey.");
+    tbl_opts.index_type = index_type_item->second;
+    ddebug("rocksdb_index_type = %s", index_type.c_str());

Review comment:
       This is irrelevant to the topic of this PR. But I do curious if you test the performance effect of this configuration.
   
   I'd suggest this configuration naming to:
   
   ```
   [pegasus.server]
   rocksdb_index_type = binary_search | hash_search | two_level_index_search | binary_search_with_first_key
   ```
   
   without prefix "k".

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                    "pegasus.server",
+                    "rocksdb_total_size_across_write_buffer",
+                    0,
+                    "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64, total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                    static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+            "pegasus.server",
+            "rocksdb_max_open_files",
+            -1,
+            "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);

Review comment:
       Does that actually take effect in your case? We do not recommend a deployment with too large storage size per replica (around 3GB~5GB is prefered). Usually, it won't exceed the default rocksdb "max_open_files" limit.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r530156047



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                    "pegasus.server",
+                    "rocksdb_total_size_across_write_buffer",
+                    0,
+                    "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64, total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                    static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+            "pegasus.server",
+            "rocksdb_max_open_files",
+            -1,
+            "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);

Review comment:
       OK, sounds very reasonable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r540067160



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }

Review comment:
       Yes.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538957043



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+        "pegasus.server",
+        "rocksdb_max_open_files",
+        -1,
+        "number of open files that can be used by a replica(namely a DB instance)");

Review comment:
       number of opened files...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538961400



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }

Review comment:
       use uint64_t, so you can remove this check




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538961900



##########
File path: src/server/pegasus_server_impl.cpp
##########
@@ -68,6 +68,7 @@ static bool chkpt_init_from_dir(const char *name, int64_t &decree)
 std::shared_ptr<rocksdb::RateLimiter> pegasus_server_impl::_s_rate_limiter;
 int64_t pegasus_server_impl::_rocksdb_limiter_last_total_through;
 std::shared_ptr<rocksdb::Cache> pegasus_server_impl::_s_block_cache;
+std::shared_ptr<rocksdb::WriteBufferManager> pegasus_server_impl::_s_write_buffer_manager;

Review comment:
       what about unique_ptr?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r539805436



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);

Review comment:
       Ok, I see. I've replaced all ddebug.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538958481



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+        "pegasus.server",
+        "rocksdb_max_open_files",
+        -1,
+        "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);
+
+    std::string index_type =
+        dsn_config_get_value_string("pegasus.server",
+                                    "rocksdb_index_type",
+                                    "binary_search",
+                                    "The index type that will be used for this table.");
+    const std::unordered_map<std::string, rocksdb::BlockBasedTableOptions::IndexType>

Review comment:
       maybe you can make index_type_string_map as a global static variable
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r530353919



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server

Review comment:
       Ok, fixed~




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r539807185



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }

Review comment:
       Ok, I use uint64_t instead. BTW, after I set total_size_across_write_buffer with a negative, I tried to start pegasus_server and I found total_size_across_write_buffer=0, is this as expected ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r530149162



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                    "pegasus.server",
+                    "rocksdb_total_size_across_write_buffer",
+                    0,
+                    "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64, total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                    static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+            "pegasus.server",
+            "rocksdb_max_open_files",
+            -1,
+            "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);

Review comment:
       That's right, keeping per-replica  (i.e. a rocksdb instance) size not very large is the best practice. However, our case is somewhat complex. We deploy the services on our customer's private cluster for security. Some customer's configuration, for example, say max-open-files per process, is small (e.g. 65536). A case that once we experienced is making import faster, we set bulk-load mode, but it was failed at last due to too-many-open-files.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r530149511



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,130 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager = dsn_config_get_value_bool(
+                "pegasus.server",
+                "rocksdb_enable_write_buffer_manager",
+                false,
+                "enable write buffer manager to limit total memory "
+                "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and bloack caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                    "pegasus.server",
+                    "rocksdb_total_size_across_write_buffer",
+                    0,
+                    "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64, total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                    static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+            "pegasus.server",
+            "rocksdb_max_open_files",
+            -1,
+            "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);
+
+    std::string index_type = dsn_config_get_value_string(
+            "pegasus.server",
+            "rocksdb_index_type",
+            "kBinarySearch",
+            "The index type that will be used for this table.");
+    const std::unordered_map<std::string, rocksdb::BlockBasedTableOptions::IndexType>
+    index_type_string_map = {
+        {"kBinarySearch", rocksdb::BlockBasedTableOptions::IndexType::kBinarySearch},
+        {"kHashSearch", rocksdb::BlockBasedTableOptions::IndexType::kHashSearch},
+        {"kTwoLevelIndexSearch",
+         rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch},
+        {"kBinarySearchWithFirstKey",
+         rocksdb::BlockBasedTableOptions::IndexType::kBinarySearchWithFirstKey}
+    };
+    auto index_type_item = index_type_string_map.find(index_type);
+    dassert(index_type_item != index_type_string_map.end(),
+            "[pegasus.server]rocksdb_index_type shold be one among kBinarySearch, "
+            "kHashSearch, kTwoLevelIndexSearch or kBinarySearchWithFirstKey.");
+    tbl_opts.index_type = index_type_item->second;
+    ddebug("rocksdb_index_type = %s", index_type.c_str());

Review comment:
       Actually, since I'd decided to cache index and filter blocks, I just took this opportunity to involve partitioned indexes in. However we've been developing pegasus based on v1.12.3, whose rocksdb's version is 5.9.2 and not high enough to support index and filter blocks cached well. After several related core-dumps had happened we had to disable all the options about the cache and partitioned indexes. Until I saw the version of rocksdb used by latest pegasus had reached 6.6.4, I thought those options could be enabled again albeit disabled still by default. Unfortunately we've not done any performance test since.  I think these options should be enabled by users if necessary




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r540062717



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(

Review comment:
       Agree with @empiredan 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#issuecomment-733689877


   > > If `rocksdb_cache_index_and_filter_blocks = false`, you have to use `rocksdb_max_open_files` to take control.
   > 
   > Why is the two configs relevant?
   
   If `rocksdb_cache_index_and_filter_blocks=false`, table reader will pre-load index and filter blocks from files, whose number can be limited by `rocksdb_max_open_files`, to control the memory consumption.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r538156772



##########
File path: src/server/config.ini
##########
@@ -323,6 +323,18 @@
   rocksdb_limiter_max_write_megabytes_per_sec = 500
   rocksdb_limiter_enable_auto_tune = false
 
+  rocksdb_enable_write_buffer_manager = false
+  rocksdb_total_size_across_write_buffer =

Review comment:
       ```suggestion
     rocksdb_total_size_across_write_buffer = 0
   ```
   set default value here.

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);

Review comment:
       ```suggestion
       ddebug_replica("rocksdb_enable_write_buffer_manager = {}", enable_write_buffer_manager);
   ```
   `%d` is not a boolean formatter. Use ddebug_replica instead. It uses fmtlib internally.
   
   ---
   Btw, we are refactoring the configuration definitions from `dsn_config_get_value_xxx` to the flag API:
   https://github.com/XiaoMi/rdsn/blob/master/include/dsn/utility/flags.h
   We have not yet finished refactoring entirely. Please note that your new code will someday be moved to other places.

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }

Review comment:
       ```suggestion
               dcheck_gt_replica(total_size_across_write_buffer, 0);
   ```
   
   A misconfigured server should not be able to bootstrap. You can use `dsn_config_get_value_uint64` to read config.

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+        "pegasus.server",
+        "rocksdb_max_open_files",
+        -1,

Review comment:
       ```suggestion
           -1, /*always keep files opened, default by rocksdb*/
   ```

##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(
+        "pegasus.server",
+        "rocksdb_max_open_files",
+        -1,
+        "number of open files that can be used by a replica(namely a DB instance)");
+    _db_opts.max_open_files = static_cast<int>(max_open_files);
+    ddebug("rocksdb_max_open_files = %d", _db_opts.max_open_files);
+
+    std::string index_type =
+        dsn_config_get_value_string("pegasus.server",
+                                    "rocksdb_index_type",
+                                    "binary_search",
+                                    "The index type that will be used for this table.");
+    const std::unordered_map<std::string, rocksdb::BlockBasedTableOptions::IndexType>
+        index_type_string_map = {
+            {"binary_search", rocksdb::BlockBasedTableOptions::IndexType::kBinarySearch},
+            {"hash_search", rocksdb::BlockBasedTableOptions::IndexType::kHashSearch},
+            {"two_level_index_search",
+             rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch},
+            {"binary_search_with_first_key",
+             rocksdb::BlockBasedTableOptions::IndexType::kBinarySearchWithFirstKey}};
+    auto index_type_item = index_type_string_map.find(index_type);
+    dassert(index_type_item != index_type_string_map.end(),
+            "[pegasus.server]rocksdb_index_type shold be one among binary_search, "

Review comment:
       ```suggestion
               "[pegasus.server]rocksdb_index_type should be one among binary_search, "
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r539810835



##########
File path: src/server/pegasus_server_impl.cpp
##########
@@ -68,6 +68,7 @@ static bool chkpt_init_from_dir(const char *name, int64_t &decree)
 std::shared_ptr<rocksdb::RateLimiter> pegasus_server_impl::_s_rate_limiter;
 int64_t pegasus_server_impl::_rocksdb_limiter_last_total_through;
 std::shared_ptr<rocksdb::Cache> pegasus_server_impl::_s_block_cache;
+std::shared_ptr<rocksdb::WriteBufferManager> pegasus_server_impl::_s_write_buffer_manager;

Review comment:
       In include/rocksdb/options.h, it is defined as std::shared_ptr<WriteBufferManager> as a member of rocksdb::DBOptions, so we'd better keep shared_ptr here ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] empiredan commented on a change in pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
empiredan commented on a change in pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#discussion_r539808859



##########
File path: src/server/pegasus_server_impl_init.cpp
##########
@@ -299,6 +300,128 @@ pegasus_server_impl::pegasus_server_impl(dsn::replication::replica *r)
         _db_opts.rate_limiter = _s_rate_limiter;
     }
 
+    bool enable_write_buffer_manager =
+        dsn_config_get_value_bool("pegasus.server",
+                                  "rocksdb_enable_write_buffer_manager",
+                                  false,
+                                  "enable write buffer manager to limit total memory "
+                                  "used by memtables and block caches across multiple replicas");
+    ddebug("rocksdb_enable_write_buffer_manager = %d", enable_write_buffer_manager);
+    if (enable_write_buffer_manager) {
+        // If write buffer manager is enabled, all replicas(one DB instance for each
+        // replica) on this server will share the same write buffer manager object,
+        // thus the same block cache object. It's convenient to control the total memory
+        // of memtables and block caches used by this server
+        static std::once_flag flag;
+        std::call_once(flag, [&]() {
+            int64_t total_size_across_write_buffer = dsn_config_get_value_int64(
+                "pegasus.server",
+                "rocksdb_total_size_across_write_buffer",
+                0,
+                "total size limit used by memtables across multiple replicas");
+            ddebug("rocksdb_total_size_across_write_buffer = %" PRId64,
+                   total_size_across_write_buffer);
+            if (total_size_across_write_buffer < 0) {
+                total_size_across_write_buffer = 0;
+            }
+            _s_write_buffer_manager = std::make_shared<rocksdb::WriteBufferManager>(
+                static_cast<size_t>(total_size_across_write_buffer), tbl_opts.block_cache);
+        });
+        _db_opts.write_buffer_manager = _s_write_buffer_manager;
+    }
+
+    int64_t max_open_files = dsn_config_get_value_int64(

Review comment:
       Actually max_open_files is defined as int type in include/rocksdb/options.h, and -1 means files opened are always kept open. So here it has to be signed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
neverchanje commented on pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646#issuecomment-732787130


   >  If `rocksdb_cache_index_and_filter_blocks = false`, you have to use `rocksdb_max_open_files` to take control.
   
   Why is the two configs relevant?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 merged pull request #646: feat: support costing memtable and index and filter blocks to block c…

Posted by GitBox <gi...@apache.org>.
levy5307 merged pull request #646:
URL: https://github.com/apache/incubator-pegasus/pull/646


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org