You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/12/02 10:26:53 UTC

[GitHub] [incubator-doris] Skysheepwang opened a new pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Skysheepwang opened a new pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008


   ## Proposed changes
   #4995
   **Implementation of Separated Page Cache**
   - Add config "index_page_cache_ratio" to set the ratio of capacity of index page cache
   - Change the member of StoragePageCache to maintain two type of cache
   - Change the interface of StoragePageCache for selecting type of cache
   - Change the usage of page cache in read_and_decompress_page in page_io.cpp
     - add page type as argument
     - check if current page type is available in StoragePageCache (cover the situation of ratio == 0 or 1)
   - Add type as argument in superior call of read_and_decompress_page
   - Change Unit Test
   
   ## Types of changes
   
   What types of changes does your code introduce to Doris?
   _Put an `x` in the boxes that apply_
   
   - [] Bugfix (non-breaking change which fixes an issue)
   - [x] New feature (non-breaking change which adds functionality)
   - [] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [] Documentation Update (if none of the other choices apply)
   - [x] Code refactor (Modify the code structure, format the code, etc...)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
   
   - [x] I have create an issue on (Fix #4995), and have described the bug/feature there in detail
   - [x] Compiling and unit tests pass locally with my changes
   - [x] I have added tests that prove my fix is effective or that my feature works
   - [] If this change need a document change, I have updated the document
   - [x] Any dependent changes have been merged
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r539999599



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,51 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    }
+    else {

Review comment:
       ```suggestion
       } else if (index_cache_percentage == 100) {
           _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
       } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
           _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
           _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
       } else {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r539347142



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,70 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    }
+    else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
     return true;

Review comment:
       How about
   ```
   auto cache = _get_page_cache(page_type);
   auto lru_handle = cache->lookup(key.encode());
   if (lru_handle == nullptr) {
       return false;
   }
   *handle = PageCacheHandle(cache, lru_handle);
   return true;
   ```

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,70 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    }
+    else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->insert(key.encode(), data.data, data.size, deleter, priority);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);

Review comment:
       ```suggestion
       auto cache = _get_page_cache(page_type);
       Cache::Handle* lru_handle = cache->insert(key.encode(), data.data, data.size, deleter, priority);
       *handle = PageCacheHandle(cache, lru_handle);
   ```

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,70 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    }
+    else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->insert(key.encode(), data.data, data.size, deleter, priority);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
+}
+
+bool StoragePageCache::is_cache_available(segment_v2::PageTypePB page_type) {
+    if (_get_page_cache(page_type) == nullptr) {
+        return false;
+    }
+    return true;

Review comment:
       ```suggestion
       return _get_page_cache(page_type) != nullptr;
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r542993713



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,48 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    } else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    } else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    auto cache = _get_page_cache(page_type);
+    auto lru_handle = cache->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(cache, lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    auto cache = _get_page_cache(page_type);

Review comment:
       Same as above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538964830



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
-    if (lru_handle == nullptr) {
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;

Review comment:
       Sorry, it should be `Cache* get_cache(segment_v2::PageTypePB page_type)` to get Cache at first




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538049137



##########
File path: be/src/olap/page_cache.h
##########
@@ -53,36 +54,44 @@ class StoragePageCache {
     };
 
     // Create global instance of this class
-    static void create_global_cache(size_t capacity);
+    static void create_global_cache(size_t capacity, double index_cache_ratio);
 
     // Return global instance.
     // Client should call create_global_cache before.
     static StoragePageCache* instance() { return _s_instance; }
 
-    StoragePageCache(size_t capacity);
+    StoragePageCache(size_t capacity, double index_cache_ratio);
 
     // Lookup the given page in the cache.
     //
     // If the page is found, the cache entry will be written into handle.
     // PageCacheHandle will release cache entry to cache when it
     // destructs.
     //
+    // Cache type selection is determined by page_type argument
+    //
     // Return true if entry is found, otherwise return false.
-    bool lookup(const CacheKey& key, PageCacheHandle* handle);
+    bool lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type);
 
     // Insert a page with key into this cache.
     // Given handle will be set to valid reference.
     // This function is thread-safe, and when two clients insert two same key
     // concurrently, this function can assure that only one page is cached.
     // The in_memory page will have higher priority.
     void insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                bool in_memory = false);
+                segment_v2::PageTypePB page_type, bool in_memory = false);
+
+    // Page cache available check.
+    // When ratio is set to 0 or 1, the index or data cache will not be allocated.
+    bool is_cache_available(segment_v2::PageTypePB page_type);

Review comment:
       Better to define it as a static function




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538053224



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   

Review comment:
       What if the other values passed, how about add `CHECK(false) << "xxx"` in `else` scope?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-750698702


   Hi @Skysheepwang, Sorry for late review. Please resolve the conflict~


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-745986785


   hi @Skysheepwang Could you please rebase the trunk to make UT the NewDorisTest happy?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538054989



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
-    if (lru_handle == nullptr) {
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;
+    default:
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;
+    default:
+        break;
+    }
+}
+
+bool StoragePageCache::is_cache_available(segment_v2::PageTypePB page_type) {
+    if (page_type == segment_v2::DATA_PAGE && _index_cache_ratio == 1) {
+        return false;
+    }
+    else if (page_type == segment_v2::INDEX_PAGE && _index_cache_ratio == 0) {
+        return false;
+    }

Review comment:
       If you do the refactor as mentioned above, you can simply judge whether the return `Cache* ` is nullptr or not




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r544144905



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,48 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    } else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    } else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    auto cache = _get_page_cache(page_type);

Review comment:
       Ok




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman merged pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r539769020



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,70 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    }
+    else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    lru_handle = _get_page_cache(page_type)->insert(key.encode(), data.data, data.size, deleter, priority);
+    *handle = PageCacheHandle(_get_page_cache(page_type), lru_handle);
+}
+
+bool StoragePageCache::is_cache_available(segment_v2::PageTypePB page_type) {
+    if (_get_page_cache(page_type) == nullptr) {
+        return false;
+    }
+    return true;

Review comment:
       Thank you! I think it is a really good change. I will take it and change the method to an inline member method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r542993632



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,48 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    } else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    } else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    auto cache = _get_page_cache(page_type);

Review comment:
       Actually not because the `is_cache_available()` check will be called before using the `lookup()` method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-748087247


   > @morningman I 've already rebased but there is still something wrong in UT (Incubator Doris), but my local UT was passed so could you please help me check the problem?
   
   It seems our proxy of github has some problem. I checked the UT, it works.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538947011



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {

Review comment:
       I think it is a good idea to use the percentage as index cache ratio because of the safety problem as you mentioned and could also avoid using the type cast. I will change the definition. Thank you very much!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r542388914



##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,48 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    } else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    } else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    auto cache = _get_page_cache(page_type);
+    auto lru_handle = cache->lookup(key.encode());
     if (lru_handle == nullptr) {
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    *handle = PageCacheHandle(cache, lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    auto cache = _get_page_cache(page_type);

Review comment:
       cache may be null

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,48 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, int32_t index_cache_percentage) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_percentage);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, int32_t index_cache_percentage)
+        : _index_cache_percentage(index_cache_percentage) {
+    if (index_cache_percentage == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    } else if (index_cache_percentage == 100) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    } else if (index_cache_percentage > 0 && index_cache_percentage < 100) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity * (100 - index_cache_percentage) / 100));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity * index_cache_percentage / 100));
+    } else {
+        CHECK(false) << "invalid index page cache percentage";
+    }
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    auto cache = _get_page_cache(page_type);

Review comment:
       the cache maybe null if the index_page_cache_percentage is 0 or 100 

##########
File path: be/src/olap/rowset/segment_v2/page_io.cpp
##########
@@ -112,7 +112,7 @@ Status PageIO::read_and_decompress_page(const PageReadOptions& opts, PageHandle*
     auto cache = StoragePageCache::instance();
     PageCacheHandle cache_handle;
     StoragePageCache::CacheKey cache_key(opts.rblock->path(), opts.page_pointer.offset);
-    if (opts.use_page_cache && cache->lookup(cache_key, &cache_handle)) {
+    if (opts.use_page_cache && cache->is_cache_available(opts.type) && cache->lookup(cache_key, &cache_handle, opts.type)) {

Review comment:
       I think the check of `is_cache_available()` can be put into `lookup()` method for easy to use?
   As well as the following `insert` logic




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-747985078


   @morningman I 've already rebased but there is still something wrong in UT (Incubator Doris), but my local UT was passed so could you please help me check the problem?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-753737730


   > Hi @Skysheepwang, Sorry for late review. Please resolve the conflict~
   
   @morningman conflicts have been resolved.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r538048284



##########
File path: be/src/common/config.h
##########
@@ -260,6 +260,8 @@ CONF_Int64(index_stream_cache_capacity, "10737418240");
 
 // Cache for storage page size
 CONF_String(storage_page_cache_limit, "20G");
+// Ratio for index page cache

Review comment:
       I think it's better to give more description, such as what does the left size used for?

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   

Review comment:
       What if the other values passed, how about add `CHECK(false) << "xxx"` here?

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {

Review comment:
       For double type values, better to use expression like `std::abs(index_cache_ratio) < 1e-6` to judge whether they are equaled. If you think it's too ugly, you can use percentagle int value instead.

##########
File path: be/src/olap/page_cache.h
##########
@@ -53,36 +54,44 @@ class StoragePageCache {
     };
 
     // Create global instance of this class
-    static void create_global_cache(size_t capacity);
+    static void create_global_cache(size_t capacity, double index_cache_ratio);
 
     // Return global instance.
     // Client should call create_global_cache before.
     static StoragePageCache* instance() { return _s_instance; }
 
-    StoragePageCache(size_t capacity);
+    StoragePageCache(size_t capacity, double index_cache_ratio);
 
     // Lookup the given page in the cache.
     //
     // If the page is found, the cache entry will be written into handle.
     // PageCacheHandle will release cache entry to cache when it
     // destructs.
     //
+    // Cache type selection is determined by page_type argument
+    //
     // Return true if entry is found, otherwise return false.
-    bool lookup(const CacheKey& key, PageCacheHandle* handle);
+    bool lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type);
 
     // Insert a page with key into this cache.
     // Given handle will be set to valid reference.
     // This function is thread-safe, and when two clients insert two same key
     // concurrently, this function can assure that only one page is cached.
     // The in_memory page will have higher priority.
     void insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                bool in_memory = false);
+                segment_v2::PageTypePB page_type, bool in_memory = false);
+
+    // Page cache available check.
+    // When ratio is set to 0 or 1, the index or data cache will not be allocated.
+    bool is_cache_available(segment_v2::PageTypePB page_type);

Review comment:
       Better to define it as a static finction

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
-    if (lru_handle == nullptr) {
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;

Review comment:
       There are some duplicate code, how about define a function like `Cache::Handle* get_cache(segment_v2::PageTypePB page_type)` to get `lru_handle ` at first, then do the same thing as before?

##########
File path: docs/en/administrator-guide/config/be_config.md
##########
@@ -695,6 +695,8 @@ Indicates how many tablets in this data directory failed to load. At the same ti
 
 ### `storage_page_cache_limit`
 
+### `index_page_cache_ratio`

Review comment:
       Add descriptions.

##########
File path: docs/zh-CN/administrator-guide/config/be_config.md
##########
@@ -693,6 +693,8 @@ load tablets from header failed, failed tablets size: xxx, path=xxx
 
 ### `storage_page_cache_limit`
 
+### `index_page_cache_ratio`

Review comment:
       Add descriptions.

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
-    if (lru_handle == nullptr) {
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;
+    default:
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;

Review comment:
       Same

##########
File path: be/src/olap/page_cache.cpp
##########
@@ -21,35 +21,81 @@ namespace doris {
 
 StoragePageCache* StoragePageCache::_s_instance = nullptr;
 
-void StoragePageCache::create_global_cache(size_t capacity) {
+void StoragePageCache::create_global_cache(size_t capacity, double index_cache_ratio) {
     DCHECK(_s_instance == nullptr);
-    static StoragePageCache instance(capacity);
+    static StoragePageCache instance(capacity, index_cache_ratio);
     _s_instance = &instance;
 }
 
-StoragePageCache::StoragePageCache(size_t capacity)
-        : _cache(new_lru_cache("StoragePageCache", capacity)) {}
+StoragePageCache::StoragePageCache(size_t capacity, double index_cache_ratio)
+        : _index_cache_ratio(index_cache_ratio) {
+    if (index_cache_ratio == 0) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", capacity));
+    }
+    else if (index_cache_ratio == 1) {
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", capacity));
+    }
+    else if (index_cache_ratio > 0 && index_cache_ratio < 1) {
+        _data_page_cache = std::unique_ptr<Cache>(new_lru_cache("DataPageCache", static_cast<size_t>(capacity * (1 - index_cache_ratio))));
+        _index_page_cache = std::unique_ptr<Cache>(new_lru_cache("IndexPageCache", static_cast<size_t>(capacity * index_cache_ratio)));
+    }   
+}
 
-bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle) {
-    auto lru_handle = _cache->lookup(key.encode());
-    if (lru_handle == nullptr) {
+bool StoragePageCache::lookup(const CacheKey& key, PageCacheHandle* handle, segment_v2::PageTypePB page_type) {
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->lookup(key.encode());
+        if (lru_handle == nullptr) {
+            return false;
+        }
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;
+    default:
         return false;
     }
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
     return true;
 }
 
 void StoragePageCache::insert(const CacheKey& key, const Slice& data, PageCacheHandle* handle,
-                              bool in_memory) {
+                              segment_v2::PageTypePB page_type, bool in_memory) {
     auto deleter = [](const doris::CacheKey& key, void* value) { delete[](uint8_t*) value; };
 
     CachePriority priority = CachePriority::NORMAL;
     if (in_memory) {
         priority = CachePriority::DURABLE;
     }
 
-    auto lru_handle = _cache->insert(key.encode(), data.data, data.size, deleter, priority);
-    *handle = PageCacheHandle(_cache.get(), lru_handle);
+    Cache::Handle* lru_handle = nullptr;
+    switch (page_type) {
+    case segment_v2::DATA_PAGE:
+        lru_handle = _data_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_data_page_cache.get(), lru_handle);
+        break;
+    case segment_v2::INDEX_PAGE:
+        lru_handle = _index_page_cache->insert(key.encode(), data.data, data.size, deleter, priority);
+        *handle = PageCacheHandle(_index_page_cache.get(), lru_handle);
+        break;
+    default:
+        break;
+    }
+}
+
+bool StoragePageCache::is_cache_available(segment_v2::PageTypePB page_type) {
+    if (page_type == segment_v2::DATA_PAGE && _index_cache_ratio == 1) {
+        return false;
+    }
+    else if (page_type == segment_v2::INDEX_PAGE && _index_cache_ratio == 0) {
+        return false;
+    }

Review comment:
       If you do the refactor as mentioned above, you can simply judge whether the return `lru_handle ` is nullptr or not




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#issuecomment-749320964


   @morningman @acelyc111 Sorry I just made a mistake to push my code to this branch, which dismissed your approving review, please add again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Skysheepwang commented on a change in pull request #5008: [Optimize][Cache]Implementation of Separated Page Cache

Posted by GitBox <gi...@apache.org>.
Skysheepwang commented on a change in pull request #5008:
URL: https://github.com/apache/incubator-doris/pull/5008#discussion_r542993176



##########
File path: be/src/olap/rowset/segment_v2/page_io.cpp
##########
@@ -112,7 +112,7 @@ Status PageIO::read_and_decompress_page(const PageReadOptions& opts, PageHandle*
     auto cache = StoragePageCache::instance();
     PageCacheHandle cache_handle;
     StoragePageCache::CacheKey cache_key(opts.rblock->path(), opts.page_pointer.offset);
-    if (opts.use_page_cache && cache->lookup(cache_key, &cache_handle)) {
+    if (opts.use_page_cache && cache->is_cache_available(opts.type) && cache->lookup(cache_key, &cache_handle, opts.type)) {

Review comment:
       I choose to make the `is_cache_available()` check as an interface basically for two reasons:
   
   1. If the check is failed (which means that the percentage is set to 0 or 100), the `lookup()` and `insert()` method will not be called, which could save some of the cost of parameter passing.
   2. It will do minimal changes to the original `read_and_decompress_page()` method in page_io.cpp and remain its design on whether call the `lookup()` and `insert()` methods.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org