You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/12 06:28:36 UTC

[GitHub] [incubator-doris] gaodayue opened a new pull request, #9521: [improvement][performance] improve lru cache resize performance and memory usage

gaodayue opened a new pull request, #9521:
URL: https://github.com/apache/incubator-doris/pull/9521

   # Proposed changes
   
   ## Problem Summary:
   
   Previously we use doubly linked list for LRUCache's HandleTable, in order to support a lookup-free `remove(const LRUHandle* h)` function. However it 1) slows down the resize operation because we need to malloc/free the dummy LRUHandle node for each bucket 2) increases the overall memory usage.
   
   It turns out that we can achieve fast `remove(const LRUHandle* h)` using just singly linked list.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (Yes)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876832379


##########
be/src/olap/lru_cache.cpp:
##########
@@ -73,9 +73,6 @@ uint32_t CacheKey::hash(const char* data, size_t n, uint32_t seed) const {
 Cache::~Cache() {}
 
 HandleTable::~HandleTable() {
-    for (uint32_t i = 0; i < _length; i++) {
-        delete _list[i];

Review Comment:
   we don't need dummy node for each bucket anymore



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1128600341

   > A bit of context for this PR: we observed several latency spikes after restarting BE and found that it's caused by the resize operation (see the graph below), so we'd like to improve the resize operation in order to avoid any query timeout.
   > 
   > The cluster uses 200G page cache per BE and resizing single shard to 8 million entries took >2 seconds.
   > 
   > I wrote a simple program to test the resize speed. From the results, we can see that for large resize, this PR reduces the resize duration by 60%. This PR also reduces the memory usage, but it's not our main concern.
   
   Convincing test results, nice job~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r873618153


##########
be/src/olap/lru_cache.cpp:
##########
@@ -106,40 +103,38 @@ LRUHandle* HandleTable::remove(const CacheKey& key, uint32_t hash) {
     LRUHandle** ptr = _find_pointer(key, hash);
     LRUHandle* result = *ptr;
 
-    remove(result);
+    if (result != nullptr) {
+        *ptr = result->next_hash;
+        _elems--;
+    }
 
     return result;
 }
 
-void HandleTable::remove(const LRUHandle* h) {
-    if (h != nullptr) {
-        if (h->next_hash != nullptr) {
-            h->next_hash->prev_hash = h->prev_hash;
-        }
-        DCHECK(h->prev_hash != nullptr);
-        h->prev_hash->next_hash = h->next_hash;
-        --_elems;
+bool HandleTable::remove(const LRUHandle* h) {
+    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
+    while (*ptr != nullptr && *ptr != h) {
+        ptr = &(*ptr)->next_hash;
     }
+
+    LRUHandle* result = *ptr;

Review Comment:
   This PR will not slow down `remove` because
   1. the expected length for the linked list is 1 given a good hash function, so we won't spend too much time traversing the list
   2. only pointer comparison is used for each iteration, no time-consuming key comparison
   
   As a result, CacheTest.SimpleBenchmark changed from 1576ms to 1570ms on 160000 iteration, and from 161392ms to 158165ms on 16000000 iteration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r871019174


##########
be/src/olap/lru_cache.cpp:
##########
@@ -106,40 +103,38 @@ LRUHandle* HandleTable::remove(const CacheKey& key, uint32_t hash) {
     LRUHandle** ptr = _find_pointer(key, hash);
     LRUHandle* result = *ptr;
 
-    remove(result);
+    if (result != nullptr) {
+        *ptr = result->next_hash;
+        _elems--;
+    }
 
     return result;
 }
 
-void HandleTable::remove(const LRUHandle* h) {
-    if (h != nullptr) {
-        if (h->next_hash != nullptr) {
-            h->next_hash->prev_hash = h->prev_hash;
-        }
-        DCHECK(h->prev_hash != nullptr);
-        h->prev_hash->next_hash = h->next_hash;
-        --_elems;
+bool HandleTable::remove(const LRUHandle* h) {
+    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
+    while (*ptr != nullptr && *ptr != h) {
+        ptr = &(*ptr)->next_hash;
     }
+

Review Comment:
   So, I think faster remove speed is more important than faster resize.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r874730688


##########
be/src/olap/lru_cache.cpp:
##########
@@ -240,7 +228,7 @@ void LRUCache::release(Cache::Handle* handle) {
             // only exists in cache
             if (_usage > _capacity) {
                 // take this opportunity and remove the item
-                _table.remove(e);
+                DCHECK(_table.remove(e));

Review Comment:
   Why remove table only when DCHECK.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

yiguolei commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876820043


##########
be/src/olap/lru_cache.cpp:
##########
@@ -148,29 +143,22 @@ void HandleTable::_resize() {
 
     LRUHandle** new_list = new (std::nothrow) LRUHandle*[new_length];
     memset(new_list, 0, sizeof(new_list[0]) * new_length);
-    for (uint32_t i = 0; i < new_length; i++) {

Review Comment:
   If you removed this code, the first node in linked list is not a dummy node any more. Have you check other logics?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yiguolei merged pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

yiguolei merged PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1131256420

   Error from the Code Quality Analysis check, not related to this PR
   
   ```
   [INFO] Doris FE Project Parent POM ........................ FAILURE [11:44 min]
   [INFO] fe-common .......................................... SUCCESS [ 21.655 s]
   [INFO] spark-dpp .......................................... SUCCESS [ 11.115 s]
   [INFO] fe-core ............................................ SUCCESS [01:07 min]
   [INFO] hive-udf ........................................... SUCCESS [ 19.401 s]
   [INFO] java-udf ........................................... SUCCESS [ 46.357 s]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  14:33 min
   [INFO] Finished at: 2022-05-18T04:06:40Z
   [INFO] ------------------------------------------------------------------------
   Error:  Failed to execute goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.9.1.2184:sonar (default-cli) on project fe: Project not found. Please check the ‘sonar.projectKey’ and ‘sonar.organization’ properties, the ‘SONAR_TOKEN’ environment variable, or contact the project administrator -> [Help 1]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

yiguolei commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876790065


##########
be/src/olap/lru_cache.cpp:
##########
@@ -73,9 +73,6 @@ uint32_t CacheKey::hash(const char* data, size_t n, uint32_t seed) const {
 Cache::~Cache() {}
 
 HandleTable::~HandleTable() {
-    for (uint32_t i = 0; i < _length; i++) {
-        delete _list[i];

Review Comment:
   should not delete this code， maybe memory leak here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r871020043


##########
be/src/olap/lru_cache.cpp:
##########
@@ -106,40 +103,38 @@ LRUHandle* HandleTable::remove(const CacheKey& key, uint32_t hash) {
     LRUHandle** ptr = _find_pointer(key, hash);
     LRUHandle* result = *ptr;
 
-    remove(result);
+    if (result != nullptr) {
+        *ptr = result->next_hash;
+        _elems--;
+    }
 
     return result;
 }
 
-void HandleTable::remove(const LRUHandle* h) {
-    if (h != nullptr) {
-        if (h->next_hash != nullptr) {
-            h->next_hash->prev_hash = h->prev_hash;
-        }
-        DCHECK(h->prev_hash != nullptr);
-        h->prev_hash->next_hash = h->next_hash;
-        --_elems;
+bool HandleTable::remove(const LRUHandle* h) {
+    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
+    while (*ptr != nullptr && *ptr != h) {
+        ptr = &(*ptr)->next_hash;
     }
+
+    LRUHandle* result = *ptr;

Review Comment:
   So, I think faster remove speed is more important than faster resize.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1127505782

   @xinyiZzz thanks for your reply, I will answer inline
   
   > The lru cache is used globally in Doris, and the resize is very low frequency. It will only be resized frequently when the BE is cold started.
   
   A bit of context for this PR: we observed several latency spikes after restarting BE and found that it's caused by the resize operation (see the graph below), so we'd like to improve the resize operation in order to avoid any query timeout.
   
   ![image](https://user-images.githubusercontent.com/1198446/168571937-1e01780a-bc8e-44ca-af62-712978711751.png)
   
   The cluster uses 200G page cache per BE and resizing single shard to 8 million entries took >2 seconds.
   
   > Please provide test results to demonstrate the performance and memory usage benefits of singly linked list, especially at high concurrency.
   
   I wrote a simple program to test the resize speed. From the results, we can see that for large resize, this PR reduces the resize duration by 60%. This PR also reduces the memory usage, but it's not our main concern.
   
   ```
   -- before the PR
   resize to 16 takes 5 us
   resize to 32 takes 6 us
   resize to 64 takes 6 us
   resize to 128 takes 14 us
   resize to 256 takes 23 us
   resize to 512 takes 49 us
   resize to 1024 takes 102 us
   resize to 2048 takes 193 us
   resize to 4096 takes 444 us
   resize to 8192 takes 813 us
   resize to 16384 takes 1708 us
   resize to 32768 takes 3391 us
   resize to 65536 takes 7104 us
   resize to 131072 takes 16235 us
   resize to 262144 takes 38272 us
   resize to 524288 takes 88278 us
   resize to 1048576 takes 184775 us
   resize to 2097152 takes 380625 us
   resize to 4194304 takes 774974 us
   resize to 8388608 takes 1573816 us
   resize to 16777216 takes 3170542 us
   
   -- after the PR
   resize to 16 takes 5 us
   resize to 32 takes 2 us
   resize to 64 takes 2 us
   resize to 128 takes 2 us
   resize to 256 takes 3 us
   resize to 512 takes 4 us
   resize to 1024 takes 8 us
   resize to 2048 takes 18 us
   resize to 4096 takes 36 us
   resize to 8192 takes 82 us
   resize to 16384 takes 170 us
   resize to 32768 takes 406 us
   resize to 65536 takes 753 us
   resize to 131072 takes 1754 us
   resize to 262144 takes 6720 us
   resize to 524288 takes 24822 us
   resize to 1048576 takes 58967 us
   resize to 2097152 takes 132347 us
   resize to 4194304 takes 278078 us
   resize to 8388608 takes 587134 us
   resize to 16777216 takes 1212146 us
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1124604826

   Hi @gaodayue 
   Please provide test results to demonstrate the performance and memory usage benefits of singly linked list, especially at high concurrency.
   
   The lru cache is used globally in Doris, and the resize is very low frequency. It will only be resized frequently when the BE is cold started.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1127565162

   The failed checks seem unrelated to this PR. Can anyone help to trigger the rerun?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r874550846


##########
be/src/olap/lru_cache.cpp:
##########
@@ -106,40 +103,38 @@ LRUHandle* HandleTable::remove(const CacheKey& key, uint32_t hash) {
     LRUHandle** ptr = _find_pointer(key, hash);
     LRUHandle* result = *ptr;
 
-    remove(result);
+    if (result != nullptr) {
+        *ptr = result->next_hash;
+        _elems--;
+    }
 
     return result;
 }
 
-void HandleTable::remove(const LRUHandle* h) {
-    if (h != nullptr) {
-        if (h->next_hash != nullptr) {
-            h->next_hash->prev_hash = h->prev_hash;
-        }
-        DCHECK(h->prev_hash != nullptr);
-        h->prev_hash->next_hash = h->next_hash;
-        --_elems;
+bool HandleTable::remove(const LRUHandle* h) {
+    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
+    while (*ptr != nullptr && *ptr != h) {
+        ptr = &(*ptr)->next_hash;
     }
+
+    LRUHandle* result = *ptr;

Review Comment:
   You're right~



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876839145


##########
be/src/olap/lru_cache.cpp:
##########
@@ -148,29 +143,22 @@ void HandleTable::_resize() {
 
     LRUHandle** new_list = new (std::nothrow) LRUHandle*[new_length];
     memset(new_list, 0, sizeof(new_list[0]) * new_length);
-    for (uint32_t i = 0; i < new_length; i++) {

Review Comment:
   > Why the first node in the linked list is a dummy node before?
   
   The dummy node was introduced in https://github.com/apache/incubator-doris/pull/4781 , in order to convert the chaining list from singly list to doubly list. But it's not necessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1131516256

   PR approved by anyone and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

yiguolei commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876813350


##########
be/src/olap/lru_cache.cpp:
##########
@@ -148,29 +143,22 @@ void HandleTable::_resize() {
 
     LRUHandle** new_list = new (std::nothrow) LRUHandle*[new_length];
     memset(new_list, 0, sizeof(new_list[0]) * new_length);
-    for (uint32_t i = 0; i < new_length; i++) {
-        // The first node in the linked-list is a dummy node used for
-        // inserting new node mainly.
-        new_list[i] = new LRUHandle();
-    }
 
     uint32_t count = 0;
     for (uint32_t i = 0; i < _length; i++) {
-        LRUHandle* h = _list[i]->next_hash;
+        LRUHandle* h = _list[i];
         while (h != nullptr) {
             LRUHandle* next = h->next_hash;
             uint32_t hash = h->hash;
-            LRUHandle* head = new_list[hash & (new_length - 1)];
-            _head_insert(head, h);
+            LRUHandle** ptr = &new_list[hash & (new_length - 1)];
+            h->next_hash = *ptr;
+            *ptr = h;
             h = next;
             count++;
         }
     }
 
     DCHECK_EQ(_elems, count);
-    for (uint32_t i = 0; i < _length; i++) {
-        delete _list[i];
-    }

Review Comment:
   not memory leak ? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1129590047

   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#issuecomment-1128605389

   > The failed checks seem unrelated to this PR. Can anyone help to trigger the rerun?
   
   Try git rebase and then push again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

xinyiZzz commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r874550846


##########
be/src/olap/lru_cache.cpp:
##########
@@ -106,40 +103,38 @@ LRUHandle* HandleTable::remove(const CacheKey& key, uint32_t hash) {
     LRUHandle** ptr = _find_pointer(key, hash);
     LRUHandle* result = *ptr;
 
-    remove(result);
+    if (result != nullptr) {
+        *ptr = result->next_hash;
+        _elems--;
+    }
 
     return result;
 }
 
-void HandleTable::remove(const LRUHandle* h) {
-    if (h != nullptr) {
-        if (h->next_hash != nullptr) {
-            h->next_hash->prev_hash = h->prev_hash;
-        }
-        DCHECK(h->prev_hash != nullptr);
-        h->prev_hash->next_hash = h->next_hash;
-        --_elems;
+bool HandleTable::remove(const LRUHandle* h) {
+    LRUHandle** ptr = &(_list[h->hash & (_length - 1)]);
+    while (*ptr != nullptr && *ptr != h) {
+        ptr = &(*ptr)->next_hash;
     }
+
+    LRUHandle* result = *ptr;

Review Comment:
   You're right~, I see you optimize the code logic at the same time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] gaodayue commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

gaodayue commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r875443166


##########
be/src/olap/lru_cache.cpp:
##########
@@ -240,7 +228,7 @@ void LRUCache::release(Cache::Handle* handle) {
             // only exists in cache
             if (_usage > _capacity) {
                 // take this opportunity and remove the item
-                _table.remove(e);
+                DCHECK(_table.remove(e));

Review Comment:
   good catch, it's a mistake.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org

[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9521: [improvement][performance] improve lru cache resize performance and memory usage

Posted by GitBox <gi...@apache.org>.

yiguolei commented on code in PR #9521:
URL: https://github.com/apache/incubator-doris/pull/9521#discussion_r876820856


##########
be/src/olap/lru_cache.cpp:
##########
@@ -148,29 +143,22 @@ void HandleTable::_resize() {
 
     LRUHandle** new_list = new (std::nothrow) LRUHandle*[new_length];
     memset(new_list, 0, sizeof(new_list[0]) * new_length);
-    for (uint32_t i = 0; i < new_length; i++) {

Review Comment:
   Why the first node in the linked list is a dummy node before?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org