Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/07/02 10:09:39 UTC

[GitHub] [incubator-doris] marising opened a new pull request #4005: LRU cache for sql/partition cache #2581

marising opened a new pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005


   ## Features
   1. Look up the cache node by SQL Key, then find the corresponding partition data by Partition Key, and decide whether it is a cache hit by comparing LastVersion and LastVersionTime.
   2. The design follows the classic LRU (least recently used) cache algorithm and is implemented with a three-layer data structure.
   3. Cache eviction keeps each node's partition range contiguous as far as possible, because gaps in the partition range reduce the hit rate of the partition cache.
   4. Two thresholds, maximum memory and elastic memory, control pruning so that data is not evicted too frequently.
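
As a rough illustration of feature 4, the two thresholds can be modelled as a small helper. This is only a sketch: `PruneThresholds`, its member names, and `kMB` are hypothetical and not part of the PR, and the rule that pruning continues until usage drops back to the maximum is an assumption rather than something the description states.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMB = 1024 * 1024;

// Hypothetical helper, not taken from the PR: models the two-threshold rule.
struct PruneThresholds {
    std::size_t max_bytes;         // maximum memory, in bytes
    std::size_t elasticity_bytes;  // elastic memory, in bytes

    // Pruning starts only once the cache grows past max + elasticity,
    // so small overshoots above max do not trigger eviction.
    bool should_start_prune(std::size_t cache_bytes) const {
        return cache_bytes > max_bytes + elasticity_bytes;
    }

    // Assumption: once started, pruning continues until usage is
    // back at or below the maximum.
    bool should_keep_pruning(std::size_t cache_bytes) const {
        return cache_bytes > max_bytes;
    }
};
```

With a maximum of 256 MB and an elastic margin of 128 MB, a cache holding 300 MB would not trigger pruning under this model, while one holding 385 MB would.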
   
   ## Cache fetch
   1. A HashMap provides fast lookup of cache nodes.
   2. A doubly linked list keeps the most recently visited node at the bottom and the least recently visited node at the top.
   3. The partition data under each node is stored in a linked list sorted by partition key. Because the number of requested partitions is not expected to be large, the two sorted sequences (requested partitions and cached partitions) are walked together in a single loop to find the matching partitions.
   4. Every access updates the access time of the partition.
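
The combination of a hash map and a doubly linked list described above can be sketched with standard containers. This is a simplified illustration, not the PR's implementation: `LruIndex` and `SqlKey` are hypothetical names, and the sketch tracks only the node order, not the partition data.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical key type; the PR uses its own SQL key type.
using SqlKey = std::string;

class LruIndex {
public:
    // Insert a new key at the tail (most recently used),
    // or move an existing key to the tail.
    void touch(const SqlKey& key) {
        auto it = _pos.find(key);
        if (it != _pos.end()) {
            // splice relinks the element in O(1); the stored iterator stays valid.
            _order.splice(_order.end(), _order, it->second);
            return;
        }
        _order.push_back(key);
        _pos[key] = std::prev(_order.end());
    }

    bool contains(const SqlKey& key) const { return _pos.count(key) != 0; }

    // Precondition for both accessors: the index is not empty.
    const SqlKey& least_recent() const { return _order.front(); }
    const SqlKey& most_recent() const { return _order.back(); }

    std::size_t size() const { return _order.size(); }

private:
    std::list<SqlKey> _order;  // head = least recent, tail = most recent
    std::unordered_map<SqlKey, std::list<SqlKey>::iterator> _pos;
};
```

The hash map gives constant-time lookup on average, and the list makes "move to the most recent end" a constant-time relink instead of a shift.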
   
   ## Cache update
   1. Because the volume of updates is expected to be relatively small, a hash table is used to find the corresponding partition.
   2. The partition is updated only if the incoming version is higher than the existing version; otherwise the update is skipped.
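
A minimal sketch of this update rule, under simplifying assumptions: `CachedPartition` and `PartitionTable` are hypothetical names, and only the version number is compared here, whereas the code in the PR may also take the version time into account.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for one cached partition.
struct CachedPartition {
    std::int64_t last_version;
    std::int64_t last_version_time;
    std::string data;
};

class PartitionTable {
public:
    // Returns true if the entry was inserted or replaced,
    // false if the incoming data was not newer and was ignored.
    bool update(std::int64_t partition_key, const CachedPartition& incoming) {
        auto it = _parts.find(partition_key);
        if (it == _parts.end()) {
            _parts.emplace(partition_key, incoming);
            return true;
        }
        if (incoming.last_version <= it->second.last_version) {
            return false;  // stale or identical version: keep the cached data
        }
        it->second = incoming;
        return true;
    }

    // Returns nullptr when the partition is not cached.
    const CachedPartition* find(std::int64_t partition_key) const {
        auto it = _parts.find(partition_key);
        return it == _parts.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::int64_t, CachedPartition> _parts;
};
```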
   
   ## Cache pruning
   1. The number below each Part in the figure below is a timestamp: the time of the most recent visit.
   2. The algorithm walks the doubly linked list to find the nodes that have not been accessed recently and evicts their partitions, repeating until memory usage is back within the limit.
   3. As shown in the figure below, it finds Node1 at the top and evicts its Part1, then finds Part1 of Node2 and evicts it, thereby evicting the partitions with timestamps 1-3.
   4. Node1 is then removed because it has no partitions left.
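
One possible reading of the pruning steps, as a simplified sketch. `Part`, `Node`, and `prune` are hypothetical names. The sketch always evicts the lowest-key partition of the least recently used node and drops the node once it is empty; the description suggests the real algorithm also consults per-partition timestamps across nodes, which this sketch does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <list>

// Hypothetical stand-ins for a cached partition and a cache node.
struct Part {
    long key;
    std::size_t bytes;
};

struct Node {
    std::deque<Part> parts;  // kept sorted by partition key
};

// Evicts partitions until total_bytes <= max_bytes or the cache is empty.
// The list is ordered with the least recently used node at the front.
// Returns the number of partitions evicted.
std::size_t prune(std::list<Node>& lru, std::size_t& total_bytes, std::size_t max_bytes) {
    std::size_t evicted = 0;
    while (total_bytes > max_bytes && !lru.empty()) {
        Node& head = lru.front();
        if (!head.parts.empty()) {
            // Removing from the front keeps the remaining range contiguous.
            total_bytes -= head.parts.front().bytes;
            head.parts.pop_front();
            ++evicted;
        }
        if (head.parts.empty()) {
            lru.pop_front();  // a node with no partitions left is removed
        }
    }
    return evicted;
}
```

Evicting from one end of a node's sorted partitions, rather than from the middle, is what keeps the cached range free of gaps.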
   
   ![image](https://user-images.githubusercontent.com/8611398/86345693-730f9380-bc8e-11ea-909c-5cf8702f75bc.png)
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] wuyunfeng commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
wuyunfeng commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r475302348



##########
File path: be/src/common/config.h
##########
@@ -570,6 +570,16 @@ namespace config {
 
     // Soft memory limit as a fraction of hard memory limit.
     CONF_Double(soft_mem_limit_frac, "0.9");
+    
+    // Set max cache's size of query results, the unit is M byte
+    CONF_Int32(query_cache_max_size_mb, "256"); 
+
+    // Cache memory is pruned when reach query_cache_max_size_mb + query_cache_elasticity_size_mb
+    CONF_Int32(query_cache_elasticity_size_mb, "128");

Review comment:
       What does `elasticity` mean?






[GitHub] [incubator-doris] morningman commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r449851922



##########
File path: be/src/runtime/cache/cache_utils.h
##########
@@ -0,0 +1,87 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef DORIS_BE_SRC_RUNTIME_CACHE_UTILS_H
+#define DORIS_BE_SRC_RUNTIME_CACHE_UTILS_H
+
+#include <gutil/integral_types.h>
+#include <sys/time.h>
+
+#include <algorithm>
+#include <boost/thread.hpp>
+#include <cassert>
+#include <cstdio>
+#include <cstdlib>
+#include <exception>
+#include <iostream>
+#include <list>
+#include <map>
+#include <shared_mutex>
+
+namespace doris {
+
+typedef boost::shared_lock<boost::shared_mutex> CacheReadLock;
+typedef boost::unique_lock<boost::shared_mutex> CacheWriteLock;
+
+//#ifndef PARTITION_CACHE_DEV

Review comment:
       Remove unused code

##########
File path: fe/fe-core/src/main/java/org/apache/doris/common/Config.java
##########
@@ -1,1221 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-package org.apache.doris.common;

Review comment:
       This file has been deleted?






[GitHub] [incubator-doris] morningman commented on pull request #4005: [Cache][BE] LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
morningman commented on pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#issuecomment-694944905


   There is still a memory leak in the unit tests (UT).




[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r477094463



##########
File path: be/src/runtime/cache/cache_utils.h
##########
@@ -0,0 +1,87 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef DORIS_BE_SRC_RUNTIME_CACHE_UTILS_H
+#define DORIS_BE_SRC_RUNTIME_CACHE_UTILS_H
+
+#include <gutil/integral_types.h>
+#include <sys/time.h>
+
+#include <algorithm>
+#include <boost/thread.hpp>
+#include <cassert>
+#include <cstdio>
+#include <cstdlib>
+#include <exception>
+#include <iostream>
+#include <list>
+#include <map>
+#include <shared_mutex>
+
+namespace doris {
+
+typedef boost::shared_lock<boost::shared_mutex> CacheReadLock;
+typedef boost::unique_lock<boost::shared_mutex> CacheWriteLock;
+
+//#ifndef PARTITION_CACHE_DEV

Review comment:
       This is a debug macro that I use when debugging.






[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465448762



##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+// Store the new value if it is newer; _data_size counts only the size of the PRowBatch data
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();

Review comment:
       _data_size holds the total data size of all partitions. To avoid recomputing the total every time, each operation adds or subtracts only the amount that changed.
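
The incremental accounting described in this comment can be illustrated with a small sketch. `SizeTracker` is a hypothetical class, not code from the PR; it only shows the idea of subtracting the old size and adding the new one instead of re-summing every entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical illustration of a running total kept in step with each change.
class SizeTracker {
public:
    // Insert or replace the size recorded for a key and adjust the total.
    void set(std::int64_t key, std::size_t new_bytes) {
        auto it = _bytes.find(key);
        if (it != _bytes.end()) {
            _total -= it->second;  // remove the old contribution first
            it->second = new_bytes;
        } else {
            _bytes.emplace(key, new_bytes);
        }
        _total += new_bytes;
    }

    // Remove a key, if present, and subtract its size from the total.
    void erase(std::int64_t key) {
        auto it = _bytes.find(key);
        if (it == _bytes.end()) return;
        _total -= it->second;
        _bytes.erase(it);
    }

    std::size_t total() const { return _total; }

private:
    std::unordered_map<std::int64_t, std::size_t> _bytes;
    std::size_t _total = 0;
};
```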






[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465427004



##########
File path: be/src/runtime/cache/result_cache.cpp
##########
@@ -0,0 +1,257 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_cache.h"
+#include "util/doris_metrics.h"
+
+namespace doris {
+
+/**
+* Remove the head node of the list and return it
+*/
+ResultNode* ResultNodeList::pop() {
+    ResultNode* node = _head;
+    remove(node);
+    return node;
+}
+
+void ResultNodeList::remove(ResultNode* node) {
+    if (!node) return;
+    if (node == _head) _head = node->get_next();
+    if (node == _tail) _tail = node->get_prev();
+    node->unlink();
+    _node_count--;
+}
+
+void ResultNodeList::push(ResultNode* node) {
+    if (!node) return;
+    if (!_head) _head = node;
+    node->append(_tail);
+    _tail = node;
+    _node_count++;
+}
+
+void ResultNodeList::move_tail(ResultNode* node) {
+    if (!node || node == _tail) return;
+    if (!_head)
+        _head = node;
+    else if (node == _head)
+        _head = node->get_next();
+    node->unlink();
+    node->append(_tail);
+    _tail = node;
+}
+
+void ResultNodeList::clear() {
+    LOG(INFO) << "clear result node list.";
+    while (_head) {
+        ResultNode* tmp_node = _head->get_next();
+        _head->clear();
+        SAFE_DELETE(_head);
+        _head = tmp_node;
+    }
+    _node_count = 0;
+}
+/**
+ * Find the node and update its partition data.
+ * A new node, or a node whose first partition was updated, is moved to the tail of the list.
+ */
+void ResultCache::update(const PUpdateCacheRequest* request, PCacheResponse* response) {
+    ResultNode* node;
+    PCacheStatus status;
+    bool update_first = false;
+    UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "update cache, sql key:" << sql_key;
+    
+    CacheWriteLock write_lock(_cache_mtx);
+    auto it = _node_map.find(sql_key);
+    if (it != _node_map.end()) {
+        node = it->second;
+        _cache_size -= node->get_data_size();
+        _partition_count -= node->get_partition_count();
+        status = node->update_partition(request, update_first);
+    } else {
+        node = _node_list.new_node(sql_key);
+        status = node->update_partition(request, update_first);
+        _node_list.push(node);
+        _node_map[sql_key] = node;
+        _node_count += 1;
+    }
+    if (update_first) {
+        _node_list.move_tail(node);
+    }
+    _cache_size += node->get_data_size();
+    _partition_count += node->get_partition_count();
+    response->set_status(status);
+
+    prune();
+    update_monitor();
+}
+
+/**
+ * Fetch cache through sql key, partition key, version and time
+ */
+void ResultCache::fetch(const PFetchCacheRequest* request, PFetchCacheResult* result) {
+    bool hit_first = false;
+    ResultNodeMap::iterator node_it;
+    const UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "fetch cache, sql key:" << sql_key;
+    {
+        CacheReadLock read_lock(_cache_mtx);    
+        node_it = _node_map.find(sql_key);
+        if (node_it == _node_map.end()) {
+            result->set_status(PCacheStatus::NO_SQL_KEY);
+            LOG(INFO) << "no such sql key:" << sql_key;
+            return;
+        }
+        ResultNode* node = node_it->second;
+        PartitionRowBatchList part_rowbatch_list;
+        PCacheStatus status = node->fetch_partition(request, part_rowbatch_list, hit_first);

Review comment:
       It does not need to be checked; the status is sent to the FE, which records it in the log. In addition, under normal circumstances, if an exception occurs while getting the list, the list is empty.






[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465446439



##########
File path: be/src/runtime/cache/result_node.h
##########
@@ -0,0 +1,197 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef DORIS_BE_SRC_RUNTIME_RESULT_NODE_H
+#define DORIS_BE_SRC_RUNTIME_RESULT_NODE_H
+
+#include <sys/time.h>
+
+#include <algorithm>
+#include <cassert>
+#include <cstdio>
+#include <cstdlib>
+#include <exception>
+#include <iostream>
+#include <list>
+#include <map>
+#include <string>
+
+#include "common/config.h"
+#include "olap/olap_define.h"
+#include "runtime/cache/cache_utils.h"
+#include "runtime/mem_pool.h"
+#include "runtime/row_batch.h"
+#include "runtime/tuple_row.h"
+#include "util/uid_util.h"
+
+namespace doris {
+
+enum PCacheStatus;
+class PCacheParam;
+class PCacheValue;
+class PCacheResponse;
+class PFetchCacheRequest;
+class PFetchCacheResult;
+class PUpdateCacheRequest;
+class PClearCacheRequest;
+
+/**
+* Cache one partition data, request param must match version and time of cache
+*/
+class PartitionRowBatch {
+public:
+    PartitionRowBatch(int64 partition_key)
+            : _partition_key(partition_key), _cache_value(NULL), _data_size(0) {}
+
+    ~PartitionRowBatch() {}
+
+    void set_row_batch(const PCacheValue& value);
+    bool is_hit_cache(const PCacheParam& param);
+    void clear();
+
+    int64 get_partition_key() const { return _partition_key; }
+
+    PCacheValue* get_value() { return _cache_value; }
+
+    size_t get_data_size() { return _data_size; }
+
+    const CacheStat* get_stat() const { return &_cache_stat; }
+
+private:
+    bool check_match(const PCacheParam& req_param) {
+        if (req_param.last_version() > _cache_value->param().last_version()) {
+            return false;
+        }
+        if (req_param.last_version_time() > _cache_value->param().last_version_time()) {
+            return false;
+        }
+        return true;
+    }
+
+    bool check_newer(const PCacheParam& up_param) {
+        //for init data of sql cache
+        if (up_param.last_version() == 0 || up_param.last_version_time() == 0) {
+            return true;
+        }
+        if (up_param.last_version_time() > _cache_value->param().last_version_time()) {
+            return true;
+        }
+        if (up_param.last_version() > _cache_value->param().last_version()) {
+            return true;
+        }
+        return false;
+    }
+
+private:
+    int64 _partition_key;
+    PCacheValue* _cache_value;
+    size_t _data_size;
+    CacheStat _cache_stat;
+};
+
+typedef std::list<PartitionRowBatch*> PartitionRowBatchList;
+typedef boost::unordered_map<int64, PartitionRowBatch*> PartitionRowBatchMap;
+
+/**
+* Cache the result of one SQL, include many partition rowsets.
+* Sql Cache: The partition ID comes from the most recently updated partition.
+* Partition Cache: The partition ID comes from the partition scanned by query.
+* The above two modes use the same cache structure.
+*/
+class ResultNode {
+public:
+    ResultNode() : _sql_key(0, 0), _prev(NULL), _next(NULL), _data_size(0) {}
+
+    ResultNode(const UniqueId& sql_key)
+            : _sql_key(sql_key), _prev(NULL), _next(NULL), _data_size(0) {}
+
+    virtual ~ResultNode() {}
+
+    PCacheStatus update_partition(const PUpdateCacheRequest* request, bool& update_first);
+    PCacheStatus fetch_partition(const PFetchCacheRequest* request,
+                                 PartitionRowBatchList& rowBatchList, bool& hit_first);
+
+    size_t prune_first();
+    void clear();
+
+    bool operator()(const ResultNode* left_node, const ResultNode* right_node) {

Review comment:
       yes






[GitHub] [incubator-doris] wuyunfeng commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
wuyunfeng commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r475302348



##########
File path: be/src/common/config.h
##########
@@ -570,6 +570,16 @@ namespace config {
 
     // Soft memory limit as a fraction of hard memory limit.
     CONF_Double(soft_mem_limit_frac, "0.9");
+    
+    // Set max cache's size of query results, the unit is M byte
+    CONF_Int32(query_cache_max_size_mb, "256"); 
+
+    // Cache memory is pruned when reach query_cache_max_size_mb + query_cache_elasticity_size_mb
+    CONF_Int32(query_cache_elasticity_size_mb, "128");

Review comment:
       I do not think `elasticity` means what you intend here.






[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r449336062



##########
File path: be/src/util/doris_metrics.cpp
##########
@@ -161,6 +158,10 @@ DorisMetrics::DorisMetrics() : _name("doris_be"), _hook_name("doris_metrics"), _
     REGISTER_DORIS_METRIC(tablet_cumulative_max_compaction_score);
     REGISTER_DORIS_METRIC(tablet_base_max_compaction_score);
 
+    REGISTER_DORIS_METRIC(cache_memory_total);

Review comment:
       The unit of query_cache_memory_total is bytes: REGISTER_DORIS_METRIC(query_cache_memory_total);






[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465437803



##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+// Store the new value if it is newer; _data_size counts only the size of the PRowBatch data
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();
+    _cache_stat.update();
+    LOG(INFO) << "finish set row batch, row num:" << _cache_value->row_size()
+              << ", data size:" << _data_size;
+}
+
+bool PartitionRowBatch::is_hit_cache(const PCacheParam& param) {
+    if (param.partition_key() != _partition_key) {
+        return false;
+    }
+    if (!check_match(param)) {
+        return false;
+    }
+    _cache_stat.query();
+    return true;
+}
+
+void PartitionRowBatch::clear() {
+    LOG(INFO) << "clear partition rowbatch.";
+    SAFE_DELETE(_cache_value);
+    _partition_key = 0;
+    _data_size = 0;
+    _cache_stat.init();
+}
+
+/**
+ * Update partition cache data, find RowBatch from partition map by partition key,
+ * the partition rowbatch are stored in the order of partition keys
+ */
+PCacheStatus ResultNode::update_partition(const PUpdateCacheRequest* request, bool& update_first) {
+    update_first = false;
+    if (_sql_key != request->sql_key()) {
+        LOG(INFO) << "no match sql_key " << request->sql_key().hi() << request->sql_key().lo();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    if (request->value_size() > config::query_cache_max_partition_count) {
+        LOG(WARNING) << "too many partitions size:" << request->value_size();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    //Only one thread per SQL key can update the cache
+    CacheWriteLock write_lock(_node_mtx);
+
+    int64 first_key = kint64max;
+    if (_partition_list.size() == 0) {
+        update_first = true;
+    } else {
+        first_key = (*(_partition_list.begin()))->get_partition_key();
+    }
+    PartitionRowBatch* partition = NULL;
+    for (int i = 0; i < request->value_size(); i++) {
+        const PCacheValue& value = request->value(i);
+        int64 partition_key = value.param().partition_key();
+        if (!update_first && partition_key <= first_key) {
+            update_first = true;
+        }
+        auto it = _partition_map.find(partition_key);
+        if (it == _partition_map.end()) {
+            partition = new PartitionRowBatch(partition_key);
+            partition->set_row_batch(value);
+            _partition_map[partition_key] = partition;
+            _partition_list.push_back(partition);
+#ifdef PARTITION_CACHE_DEV
+            LOG(INFO) << "add index:" << i << ", pkey:" << partition->get_partition_key()
+                      << ", list size:" << _partition_list.size()
+                      << ", map size:" << _partition_map.size();
+#endif
+        } else {
+            partition = it->second;
+            _data_size -= partition->get_data_size();
+            partition->set_row_batch(value);
+#ifdef PARTITION_CACHE_DEV
+            LOG(INFO) << "update index:" << i << ", pkey:" << partition->get_partition_key()
+                      << ", list size:" << _partition_list.size()
+                      << ", map size:" << _partition_map.size();
+#endif
+        }
+        _data_size += partition->get_data_size();
+    }
+    _partition_list.sort(compare_partition);
+    LOG(INFO) << "finish update batches:" << _partition_list.size();
+    while (config::query_cache_max_partition_count > 0 &&
+           _partition_list.size() > config::query_cache_max_partition_count) {
+        if (prune_first() == 0) {
+            break;
+        }
+    }
+    return PCacheStatus::CACHE_OK;
+}
+
+/**
+* Only the range query of the key of the partition is supported, and the separated partition key query is not supported.
+* Because a query can only be divided into two parts, part1 get data from cache, part2 fetch_data by scan node from BE.
+* Partion cache : 20191211-20191215
+* Hit cache parameter : [20191211 - 20191215], [20191212 - 20191214], [20191212 - 20191216],[20191210 - 20191215]
+* Miss cache parameter: [20191210 - 20191216]
+*/
+PCacheStatus ResultNode::fetch_partition(const PFetchCacheRequest* request,
+                                         PartitionRowBatchList& row_batch_list, bool& hit_first) {
+    hit_first = false;
+    if (request->param_size() == 0) {
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    CacheReadLock read_lock(_node_mtx);
+
+    if (_partition_list.size() == 0) {
+        return PCacheStatus::NO_PARTITION_KEY;
+    }
+    
+    if (request->param(0).partition_key() > (*_partition_list.rbegin())->get_partition_key() ||
+        request->param(request->param_size() - 1).partition_key() <
+                (*_partition_list.begin())->get_partition_key()) {
+        return PCacheStatus::NO_PARTITION_KEY;
+    }
+
+    bool find = false;
+    int begin_idx = -1, end_idx = -1, param_idx = 0;
+    auto begin_it = _partition_list.end();
+    auto end_it = _partition_list.end();
+    auto part_it = _partition_list.begin();
+
+    PCacheStatus status = PCacheStatus::CACHE_OK;
+    while (param_idx < request->param_size() && part_it != _partition_list.end()) {
+#ifdef PARTITION_CACHE_DEV
+        LOG(INFO) << "Param index : " << param_idx
+                  << ", param part Key : " << request->param(param_idx).partition_key()
+                  << ", batch part key : " << (*part_it)->get_partition_key();
+#endif
+        if (!find) {
+            while (part_it != _partition_list.end() &&
+                   request->param(param_idx).partition_key() > (*part_it)->get_partition_key()) {
+                part_it++;
+            }
+            while (param_idx < request->param_size() &&

Review comment:
       If there is no intersection, we have already checked for that earlier and returned directly:
   ``` 
      if (request->param(0).partition_key() > (*_partition_list.rbegin())->get_partition_key() ||
           request->param(request->param_size() - 1).partition_key() <
                   (*_partition_list.begin())->get_partition_key()) {
           return PCacheStatus::NO_PARTITION_KEY;
       }
   ```
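
As a rough illustration of that pre-check (a sketch only: `ranges_intersect` and the plain vectors of keys are hypothetical stand-ins, not code from the PR), the test amounts to comparing the two ends of a pair of sorted key ranges:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: both vectors are assumed non-empty and sorted ascending,
// mirroring the request params and the cached partition list.
bool ranges_intersect(const std::vector<std::int64_t>& request_keys,
                      const std::vector<std::int64_t>& cached_keys) {
    // No overlap when the request starts after the last cached key
    // or ends before the first cached key.
    return !(request_keys.front() > cached_keys.back() ||
             request_keys.back() < cached_keys.front());
}
```

With a cached range of 20191211 to 20191215, a request for 20191212 to 20191216 overlaps, while a request that starts at 20191216 does not.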




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] marising commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
marising commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465446382



##########
File path: be/src/runtime/exec_env_init.cpp
##########
@@ -208,6 +215,10 @@ void ExecEnv::_init_buffer_pool(int64_t min_page_size,
 }
 
 void ExecEnv::_destory() {
+    //Only destroy once after init
+    if (!_is_init) {

Review comment:
       If _is_init is not checked and the initialization failed, the later destroy will core dump.
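
A minimal sketch of the guard pattern being described, assuming a hypothetical `Env` class (the class, its `_buffer` member, and the `succeed` flag are illustrative only and are not the Doris `ExecEnv` API):

```cpp
// Hypothetical environment object; only the _is_init guard mirrors the PR.
class Env {
public:
    // `succeed` simulates whether resource allocation works.
    bool init(bool succeed) {
        if (_is_init) return true;
        if (!succeed) return false;  // nothing was allocated
        _buffer = new int[16];
        _is_init = true;
        return true;
    }

    // Safe to call after a failed init and safe to call twice.
    void destroy() {
        if (!_is_init) return;  // only destroy once after a successful init
        delete[] _buffer;
        _buffer = nullptr;
        _is_init = false;
    }

    bool is_init() const { return _is_init; }

private:
    int* _buffer = nullptr;
    bool _is_init = false;
};
```

The point of the flag is that teardown becomes a no-op whenever setup never completed, so callers do not need to track the init result themselves.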






[GitHub] [incubator-doris] morningman merged pull request #4005: [Cache][BE] LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005


   




[GitHub] [incubator-doris] morningman commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r449113739



##########
File path: be/src/common/config.h
##########
@@ -536,6 +536,15 @@ namespace config {
     // Whether to continue to start be when load tablet from header failed.
     CONF_Bool(ignore_load_tablet_failure, "false");
 
+    // Set max cache's size of query results, the unit is M byte
+    CONF_Int32(cache_max_size, "256"); 

Review comment:
       ```suggestion
       CONF_Int32(query_cache_max_size_mb, "256"); 
   ```
   
   Same suggestion for the other 2 configs.

##########
File path: be/src/util/doris_metrics.cpp
##########
@@ -161,6 +158,10 @@ DorisMetrics::DorisMetrics() : _name("doris_be"), _hook_name("doris_metrics"), _
     REGISTER_DORIS_METRIC(tablet_cumulative_max_compaction_score);
     REGISTER_DORIS_METRIC(tablet_base_max_compaction_score);
 
+    REGISTER_DORIS_METRIC(cache_memory_total);

Review comment:
       ```suggestion
       REGISTER_DORIS_METRIC(query_cache_memory_total_mb);
   ```






[GitHub] [incubator-doris] wutiangan commented on a change in pull request #4005: LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
wutiangan commented on a change in pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005#discussion_r465010938



##########
File path: be/src/runtime/cache/result_cache.cpp
##########
@@ -0,0 +1,257 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_cache.h"
+#include "util/doris_metrics.h"
+
+namespace doris {
+
+/**
+* Remove the tail node of link
+*/
+ResultNode* ResultNodeList::pop() {
+    remove(_head);
+    return _head;
+}
+
+void ResultNodeList::remove(ResultNode* node) {
+    if (!node) return;
+    if (node == _head) _head = node->get_next();
+    if (node == _tail) _tail = node->get_prev();
+    node->unlink();
+    _node_count--;
+}
+
+void ResultNodeList::push(ResultNode* node) {
+    if (!node) return;
+    if (!_head) _head = node;
+    node->append(_tail);
+    _tail = node;
+    _node_count++;
+}
+
+void ResultNodeList::move_tail(ResultNode* node) {
+    if (!node || node == _tail) return;
+    if (!_head)
+        _head = node;
+    else if (node == _head)
+        _head = node->get_next();
+    node->unlink();
+    node->append(_tail);
+    _tail = node;
+}
+
+void ResultNodeList::clear() {
+    LOG(INFO) << "clear result node list.";
+    while (_head) {
+        ResultNode* tmp_node = _head->get_next();
+        _head->clear();
+        SAFE_DELETE(_head);
+        _head = tmp_node;
+    }
+    _node_count = 0;
+}
+/**
+ * Find the node and update partition data
+ * New node, the node updated in the first partition will move to the tail of the list
+ */
+void ResultCache::update(const PUpdateCacheRequest* request, PCacheResponse* response) {
+    ResultNode* node;
+    PCacheStatus status;
+    bool update_first = false;
+    UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "update cache, sql key:" << sql_key;
+    
+    CacheWriteLock write_lock(_cache_mtx);
+    auto it = _node_map.find(sql_key);
+    if (it != _node_map.end()) {
+        node = it->second;
+        _cache_size -= node->get_data_size();
+        _partition_count -= node->get_partition_count();
+        status = node->update_partition(request, update_first);
+    } else {
+        node = _node_list.new_node(sql_key);
+        status = node->update_partition(request, update_first);
+        _node_list.push(node);
+        _node_map[sql_key] = node;
+        _node_count += 1;
+    }
+    if (update_first) {
+        _node_list.move_tail(node);
+    }
+    _cache_size += node->get_data_size();
+    _partition_count += node->get_partition_count();
+    response->set_status(status);
+
+    prune();
+    update_monitor();
+}
+
+/**
+ * Fetch cache through sql key, partition key, version and time
+ */
+void ResultCache::fetch(const PFetchCacheRequest* request, PFetchCacheResult* result) {
+    bool hit_first = false;
+    ResultNodeMap::iterator node_it;
+    const UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "fetch cache, sql key:" << sql_key;
+    {
+        CacheReadLock read_lock(_cache_mtx);    
+        node_it = _node_map.find(sql_key);
+        if (node_it == _node_map.end()) {
+            result->set_status(PCacheStatus::NO_SQL_KEY);
+            LOG(INFO) << "no such sql key:" << sql_key;
+            return;
+        }
+        ResultNode* node = node_it->second;
+        PartitionRowBatchList part_rowbatch_list;
+        PCacheStatus status = node->fetch_partition(request, part_rowbatch_list, hit_first);

Review comment:
       Do you need to check the status here?
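
One way the check could look, as a sketch under assumptions: `Status`, `FetchResult`, and both functions below are hypothetical stand-ins for `PCacheStatus`, `PFetchCacheResult`, and the PR's fetch path, written only to show an early return on a non-OK status.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for PCacheStatus.
enum class Status { OK, NO_PARTITION_KEY };

// Hypothetical stand-in for PFetchCacheResult.
struct FetchResult {
    Status status = Status::OK;
    std::vector<std::string> values;
};

// Simulated node lookup: fills `out` only when the key is present.
Status fetch_partition(bool has_key, std::vector<std::string>& out) {
    if (!has_key) return Status::NO_PARTITION_KEY;
    out.push_back("row-batch");
    return Status::OK;
}

FetchResult fetch(bool has_key) {
    FetchResult result;
    std::vector<std::string> batches;
    Status status = fetch_partition(has_key, batches);
    if (status != Status::OK) {
        // Report the failure and skip the copy loop entirely.
        result.status = status;
        return result;
    }
    result.values = batches;
    return result;
}
```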

##########
File path: be/src/common/config.h
##########
@@ -546,6 +546,16 @@ namespace config {
 
     // Soft memory limit as a fraction of hard memory limit.
     CONF_Double(soft_mem_limit_frac, "0.9");
+    
+    // Set max cache's size of query results, the unit is M byte
+    CONF_Int32(query_cache_max_size_mb, "256"); 
+
+    //Cache memory is pruened when reach cache_max_size + cache_elasticity_size

Review comment:
       ```suggestion
       //Cache memory is pruned when reach cache_max_size + cache_elasticity_size
   ```

##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+//return new batch size,only include the size of PRowBatch
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();
+    _cache_stat.update();
+    LOG(INFO) << "finish set row batch, row num:" << _cache_value->row_size()
+              << ", data size:" << _data_size;
+}
+
+bool PartitionRowBatch::is_hit_cache(const PCacheParam& param) {
+    if (param.partition_key() != _partition_key) {
+        return false;
+    }
+    if (!check_match(param)) {

Review comment:
       Maybe lines 45-47 can also be moved into the check_match function, since param already carries all three parameters.

##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+//return new batch size,only include the size of PRowBatch
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();

Review comment:
       Why not `_data_size = _cache_value->data_size()`?
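
To make the difference concrete, here is a small sketch that tracks both variants side by side. `BatchSizes` is a hypothetical struct, not the PR's `PartitionRowBatch`, and it assumes the size is not reset anywhere else (which is not visible in this hunk):

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of replacing a cached value twice.
struct BatchSizes {
    std::size_t accumulated = 0;  // updated with +=, as in the hunk
    std::size_t assigned = 0;     // updated with =, as the comment proposes

    void replace_value(const std::string& payload) {
        accumulated += payload.size();  // keeps the old value's size as well
        assigned = payload.size();      // reflects only the current value
    }
};
```

After replacing a 4-byte value with a 2-byte one, the accumulated counter reports 6 while the assigned counter reports 2.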

##########
File path: be/src/runtime/cache/result_cache.cpp
##########
@@ -0,0 +1,257 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_cache.h"
+#include "util/doris_metrics.h"
+
+namespace doris {
+
+/**
+* Remove the tail node of link
+*/
+ResultNode* ResultNodeList::pop() {
+    remove(_head);
+    return _head;
+}
+
+void ResultNodeList::remove(ResultNode* node) {
+    if (!node) return;
+    if (node == _head) _head = node->get_next();
+    if (node == _tail) _tail = node->get_prev();
+    node->unlink();
+    _node_count--;
+}
+
+void ResultNodeList::push(ResultNode* node) {
+    if (!node) return;
+    if (!_head) _head = node;
+    node->append(_tail);
+    _tail = node;
+    _node_count++;
+}
+
+void ResultNodeList::move_tail(ResultNode* node) {
+    if (!node || node == _tail) return;
+    if (!_head)
+        _head = node;
+    else if (node == _head)
+        _head = node->get_next();
+    node->unlink();
+    node->append(_tail);
+    _tail = node;
+}
+
+void ResultNodeList::clear() {
+    LOG(INFO) << "clear result node list.";
+    while (_head) {
+        ResultNode* tmp_node = _head->get_next();
+        _head->clear();
+        SAFE_DELETE(_head);
+        _head = tmp_node;
+    }
+    _node_count = 0;
+}
+/**
+ * Find the node and update partition data
+ * New node, the node updated in the first partition will move to the tail of the list
+ */
+void ResultCache::update(const PUpdateCacheRequest* request, PCacheResponse* response) {
+    ResultNode* node;
+    PCacheStatus status;
+    bool update_first = false;
+    UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "update cache, sql key:" << sql_key;
+    
+    CacheWriteLock write_lock(_cache_mtx);
+    auto it = _node_map.find(sql_key);
+    if (it != _node_map.end()) {
+        node = it->second;
+        _cache_size -= node->get_data_size();
+        _partition_count -= node->get_partition_count();
+        status = node->update_partition(request, update_first);
+    } else {
+        node = _node_list.new_node(sql_key);
+        status = node->update_partition(request, update_first);
+        _node_list.push(node);
+        _node_map[sql_key] = node;
+        _node_count += 1;
+    }
+    if (update_first) {
+        _node_list.move_tail(node);
+    }
+    _cache_size += node->get_data_size();
+    _partition_count += node->get_partition_count();
+    response->set_status(status);
+
+    prune();
+    update_monitor();
+}
+
+/**
+ * Fetch cache through sql key, partition key, version and time
+ */
+void ResultCache::fetch(const PFetchCacheRequest* request, PFetchCacheResult* result) {
+    bool hit_first = false;
+    ResultNodeMap::iterator node_it;
+    const UniqueId sql_key = request->sql_key();
+    LOG(INFO) << "fetch cache, sql key:" << sql_key;
+    {
+        CacheReadLock read_lock(_cache_mtx);    
+        node_it = _node_map.find(sql_key);
+        if (node_it == _node_map.end()) {
+            result->set_status(PCacheStatus::NO_SQL_KEY);
+            LOG(INFO) << "no such sql key:" << sql_key;
+            return;
+        }
+        ResultNode* node = node_it->second;
+        PartitionRowBatchList part_rowbatch_list;
+        PCacheStatus status = node->fetch_partition(request, part_rowbatch_list, hit_first);
+
+        for (auto part_it = part_rowbatch_list.begin(); part_it != part_rowbatch_list.end(); part_it++) {
+            PCacheValue* srcValue = (*part_it)->get_value();
+            if (srcValue != NULL) {
+                PCacheValue* value = result->add_value();
+                value->CopyFrom(*srcValue);
+                LOG(INFO) << "fetch cache partition key:" << srcValue->param().partition_key();
+            } else {
+                LOG(WARNING) << "prowbatch of cache is null";
+                status = PCacheStatus::EMPTY_DATA;
+                break;
+            }
+        }
+        result->set_status(status);
+    }
+
+    if (hit_first) {
+        {

Review comment:
       Lines 140 and 143 can probably be removed.

##########
File path: docs/zh-CN/administrator-guide/partition_cache.md
##########
@@ -0,0 +1,205 @@
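+# Partition Cache
+
+## Use Cases
+Most data analysis workloads write rarely and read often: data is written once and read many times. For example, the dimensions and metrics behind a report are computed once in the early morning, but the page is visited hundreds or even thousands of times a day, so the result set is well suited to caching. Data analysis and BI applications typically face the following scenarios:
+* **High concurrency**: Doris supports high concurrency well, but a single server cannot sustain a very high QPS
+* **Dashboards with complex charts**: in complex dashboards or large-screen applications the data comes from many tables and each page issues dozens of queries; although each query takes only tens of milliseconds, the total query time adds up to several seconds
+* **Trend analysis**: queries over a given date range with metrics shown per day, such as the trend of the number of users over the last 7 days; such queries touch a large amount of data over a wide range and often take tens of seconds
+* **Repeated user queries**: if the product has no protection against repeated refreshes, users who refresh the page repeatedly, by mistake or otherwise, submit a large number of duplicate SQL statements
+
+For these four scenarios, the application-layer solution is to put query results in Redis and refresh the cache periodically or manually, but this approach has the following problems:
+* **Inconsistent data**: the cache is unaware of data updates, so users often see stale data
+* **Low hit rate**: the whole query result is cached, so if data is written in real time the cache is invalidated frequently, the hit rate is low, and the system load is heavy
+* **Extra cost**: introducing an external cache component adds system complexity and extra cost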
+
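+## Solution
+This partition cache strategy addresses the problems above. It gives priority to data consistency and, on that basis, refines the cache granularity to improve the hit rate. It therefore has the following characteristics:
+* Users do not need to worry about data consistency: cache invalidation is controlled by version, so cached data is consistent with data queried from BE
+* No extra components or cost: cached results are stored in BE memory, and users can adjust the cache memory size as needed
+* Two cache strategies are implemented, SQLCache and PartitionCache; the latter has a finer cache granularity
+* Consistent hashing handles BE nodes going online and offline; the cache algorithm in BE is an improved LRU
+
+## SQLCache
+SQLCache stores and fetches cache entries by the SQL signature, the partition ID of the queried table, and the latest version of the partition. The combination of the three identifies a cached data set. If any of them changes, for example the SQL changes (different query fields or conditions) or the version changes after a data update, the cache is missed.
+
+If multiple tables are joined, the most recently updated partition ID and the latest version number are used. If one of the tables is updated, the partition ID or version number differs and the cache is likewise missed.
+
+SQLCache is better suited to T+1 update scenarios: data is updated in the early morning, the first query fetches the result from BE and puts it into the cache, and subsequent identical queries are served from the cache. It can also be used with data updated in real time, but the hit rate may be low; see PartitionCache below.
+
+## PartitionCache
+
+### Design Principles
+1. SQL can be split in parallel: Q = Q1 ∪ Q2 ... ∪ Qn, R = R1 ∪ R2 ... ∪ Rn, where Q is the query statement and R is the result set
+2. Partitions are split into read-only and updatable partitions; read-only partitions are cached, updatable partitions are not
+
+As described above, consider querying the daily number of users for the last 7 days. If the table is partitioned by date and data is only written to the current day's partition, the data in all other partitions is fixed, so under the same query SQL the metrics of a partition that is no longer updated are fixed. In the example below, the number of users for the previous 7 days is queried on 2020-03-09: the data for 2020-03-03 to 2020-03-07 comes from the cache, the data for 2020-03-08 comes from the partition on the first query and from the cache on later queries, and the data for 2020-03-09 comes from the partition because that day is still being written.
+
+Therefore, when querying N days of data where only the most recent D days are updated, similar daily queries that differ only in date range need to scan just D partitions; everything else comes from the cache, which effectively reduces cluster load and query time.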
+
+```
+MySQL [(none)]> SELECT eventdate,count(userid) FROM testdb.appevent WHERE eventdate>="2020-03-03" AND eventdate<="2020-03-09" GROUP BY eventdate ORDER BY eventdate;
++------------+-----------------+
+| eventdate  | count(`userid`) |
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
+| 2020-03-08 |              40 | //from the partition on the first query, from the cache afterwards
+| 2020-03-09 |              25 | //from the partition
++------------+-----------------+
+7 rows in set (0.02 sec)
+```
+
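+In PartitionCache, the first-level cache key is the 128-bit MD5 signature of the SQL with the partition conditions removed. Below is the rewritten SQL to be signed: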
+```
+SELECT eventdate,count(userid) FROM testdb.appevent GROUP BY eventdate ORDER BY eventdate;
+```
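+The second-level cache key is the value of the partition field in the query result set, for example the contents of the eventdate column in the result above. The information attached to the second-level key is the partition's version number and version update time.
+
+The following shows the flow when the SQL above is executed for the first time on 2020-03-09:
+1. Fetch data from the cache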
+```
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
++------------+-----------------+
+```
+2. The SQL used to fetch data from BE, and the data returned
+```
+SELECT eventdate,count(userid) FROM testdb.appevent WHERE eventdate>="2020-03-08" AND eventdate<="2020-03-09" GROUP BY eventdate ORDER BY eventdate;
+
++------------+-----------------+
+| 2020-03-08 |              40 |
++------------+-----------------+
+| 2020-03-09 |              25 | 
++------------+-----------------+
+```
+3. The data finally sent to the client
+```
++------------+-----------------+
+| eventdate  | count(`userid`) |
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
+| 2020-03-08 |              40 |
+| 2020-03-09 |              25 |
++------------+-----------------+
+```
+4. The data sent to the cache
+```
++------------+-----------------+
+| 2020-03-08 |              40 |
++------------+-----------------+
+```
+
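+The partition cache suits tables partitioned by date where some partitions are updated in real time and the query SQL is fairly fixed.
+
+The partition field can also be another field, but only a small number of partitions should be updated.
+
+### Limitations
+* Only OlapTable is supported; tables in other storage such as MySQL have no version information, so it is impossible to tell whether the data has been updated
+* Only grouping by the partition field is supported; when grouping by other fields, any group's data may be updated, which would invalidate the whole cache
+* Only cache hits on the leading part, the trailing part, or all of the result set are supported; a result set split into several parts by cached data is not supported
+
+## Usage
+### Enable SQLCache
+Add enable_sql_cache=true to fe.conf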
+```
+vim fe/conf/fe.conf
+enable_sql_cache=true
+```
+Set the variable in the MySQL command line
+```
+MySQL [(none)]> set [global] enable_sql_cache=true;
+```
+Note: global sets the global variable; without it, the variable applies to the current session
+
+### Enable PartitionCache
+Add enable_partition_cache=true to fe.conf
+```
+vim fe/conf/fe.conf
+enable_partition_cache=true
+```
+Set the variable in the MySQL command line
+```
+MySQL [(none)]> set [global] enable_partition_cache=true;
+```
+
+If both cache strategies are enabled, pay attention to the following parameter:
+```
+last_version_interval_second=3600
+```
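+If the time elapsed since the partition's latest version is greater than last_version_interval_second, caching the whole query result takes priority. If it is less than this interval and the query meets the PartitionCache conditions, the data is cached by PartitionCache.
+
+### Monitoring
+FE metrics: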
+```
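+query_table            //Number of queries that involve tables
+query_olap_table       //Number of queries that involve Olap tables
+cache_mode_sql         //Number of queries identified as cache mode sql
+cache_hit_sql          //Number of queries in sql mode that hit the cache
+query_mode_partition   //Number of queries identified as cache mode Partition
+cache_hit_partition    //Number of queries that hit through Partition
+partition_all          //All partitions scanned by queries
+partition_hit          //Number of partitions hit through the cache
+
+Cache hit rate     = (cache_hit_sql + cache_hit_partition) / query_olap_table
+Partition hit rate = partition_hit / partition_all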
+```
+
+BE metrics:
+```
+cache_memory_total     //Cache memory size
+cache_sql_total        //Number of SQL statements in the cache

Review comment:
       The names below may be better:
   cache_memory_total__byte
   cache_sql_count
   cache_partition_total_count
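
For reference, the two FE hit-rate formulas in the quoted monitoring section can be written out as a small sketch. The `FeCacheCounters` struct and both functions are hypothetical; only the counter names come from the doc, and returning 0.0 for an empty denominator is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical snapshot of the FE counters named in the doc.
struct FeCacheCounters {
    std::int64_t query_olap_table = 0;
    std::int64_t cache_hit_sql = 0;
    std::int64_t cache_hit_partition = 0;
    std::int64_t partition_all = 0;
    std::int64_t partition_hit = 0;
};

// Cache hit rate = (cache_hit_sql + cache_hit_partition) / query_olap_table
double cache_hit_rate(const FeCacheCounters& c) {
    if (c.query_olap_table == 0) return 0.0;  // assumption: no queries means 0
    return static_cast<double>(c.cache_hit_sql + c.cache_hit_partition) /
           static_cast<double>(c.query_olap_table);
}

// Partition hit rate = partition_hit / partition_all
double partition_hit_rate(const FeCacheCounters& c) {
    if (c.partition_all == 0) return 0.0;  // assumption: no partitions means 0
    return static_cast<double>(c.partition_hit) /
           static_cast<double>(c.partition_all);
}
```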
   

##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+//return new batch size,only include the size of PRowBatch
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();
+    _cache_stat.update();
+    LOG(INFO) << "finish set row batch, row num:" << _cache_value->row_size()
+              << ", data size:" << _data_size;
+}
+
+bool PartitionRowBatch::is_hit_cache(const PCacheParam& param) {
+    if (param.partition_key() != _partition_key) {
+        return false;
+    }
+    if (!check_match(param)) {
+        return false;
+    }
+    _cache_stat.query();
+    return true;
+}
+
+void PartitionRowBatch::clear() {
+    LOG(INFO) << "clear partition rowbatch.";
+    SAFE_DELETE(_cache_value);
+    _partition_key = 0;
+    _data_size = 0;
+    _cache_stat.init();
+}
+
+/**
+ * Update partition cache data, find RowBatch from partition map by partition key,
+ * the partition rowbatch are stored in the order of partition keys
+ */
+PCacheStatus ResultNode::update_partition(const PUpdateCacheRequest* request, bool& update_first) {
+    update_first = false;
+    if (_sql_key != request->sql_key()) {
+        LOG(INFO) << "no match sql_key " << request->sql_key().hi() << request->sql_key().lo();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    if (request->value_size() > config::query_cache_max_partition_count) {
+        LOG(WARNING) << "too many partitions size:" << request->value_size();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    //Only one thread per SQL key can update the cache
+    CacheWriteLock write_lock(_node_mtx);
+
+    int64 first_key = kint64max;
+    if (_partition_list.size() == 0) {
+        update_first = true;
+    } else {
+        first_key = (*(_partition_list.begin()))->get_partition_key();
+    }
+    PartitionRowBatch* partition = NULL;
+    for (int i = 0; i < request->value_size(); i++) {
+        const PCacheValue& value = request->value(i);
+        int64 partition_key = value.param().partition_key();
+        if (!update_first && partition_key <= first_key) {
+            update_first = true;
+        }
+        auto it = _partition_map.find(partition_key);
+        if (it == _partition_map.end()) {
+            partition = new PartitionRowBatch(partition_key);
+            partition->set_row_batch(value);
+            _partition_map[partition_key] = partition;
+            _partition_list.push_back(partition);
+#ifdef PARTITION_CACHE_DEV
+            LOG(INFO) << "add index:" << i << ", pkey:" << partition->get_partition_key()
+                      << ", list size:" << _partition_list.size()
+                      << ", map size:" << _partition_map.size();
+#endif
+        } else {
+            partition = it->second;
+            _data_size -= partition->get_data_size();
+            partition->set_row_batch(value);
+#ifdef PARTITION_CACHE_DEV
+            LOG(INFO) << "update index:" << i << ", pkey:" << partition->get_partition_key()
+                      << ", list size:" << _partition_list.size()
+                      << ", map size:" << _partition_map.size();
+#endif
+        }
+        _data_size += partition->get_data_size();
+    }
+    _partition_list.sort(compare_partition);
+    LOG(INFO) << "finish update batches:" << _partition_list.size();
+    while (config::query_cache_max_partition_count > 0 &&
+           _partition_list.size() > config::query_cache_max_partition_count) {
+        if (prune_first() == 0) {
+            break;
+        }
+    }
+    return PCacheStatus::CACHE_OK;
+}
+
+/**
+* Only range queries on the partition key are supported; queries on separated (non-contiguous) partition keys are not.
+* This is because a query can be divided into only two parts: part1 gets data from the cache, part2 fetches data from the BE through the scan node.
+* Partition cache     : 20191211-20191215
+* Hit cache parameter : [20191211 - 20191215], [20191212 - 20191214], [20191212 - 20191216], [20191210 - 20191215]
+* Miss cache parameter: [20191210 - 20191216]
+*/
+PCacheStatus ResultNode::fetch_partition(const PFetchCacheRequest* request,
+                                         PartitionRowBatchList& row_batch_list, bool& hit_first) {
+    hit_first = false;
+    if (request->param_size() == 0) {
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    CacheReadLock read_lock(_node_mtx);
+
+    if (_partition_list.size() == 0) {
+        return PCacheStatus::NO_PARTITION_KEY;
+    }
+    
+    if (request->param(0).partition_key() > (*_partition_list.rbegin())->get_partition_key() ||
+        request->param(request->param_size() - 1).partition_key() <
+                (*_partition_list.begin())->get_partition_key()) {
+        return PCacheStatus::NO_PARTITION_KEY;
+    }
+
+    bool find = false;
+    int begin_idx = -1, end_idx = -1, param_idx = 0;
+    auto begin_it = _partition_list.end();
+    auto end_it = _partition_list.end();
+    auto part_it = _partition_list.begin();
+
+    PCacheStatus status = PCacheStatus::CACHE_OK;
+    while (param_idx < request->param_size() && part_it != _partition_list.end()) {
+#ifdef PARTITION_CACHE_DEV
+        LOG(INFO) << "Param index : " << param_idx
+                  << ", param part Key : " << request->param(param_idx).partition_key()
+                  << ", batch part key : " << (*part_it)->get_partition_key();
+#endif
+        if (!find) {
+            while (part_it != _partition_list.end() &&
+                   request->param(param_idx).partition_key() > (*part_it)->get_partition_key()) {
+                part_it++;
+            }
+            while (param_idx < request->param_size() &&

Review comment:
       What happens if part_it is equal to _partition_list.end()?
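
One way to read the question above, together with the range rule in the fetch_partition comment, is as a two-pointer walk over two sorted key lists in which every element access is preceded by a bounds check. The sketch below is not the PR's code: `HitRange` and `match_range` are hypothetical names, and plain `std::vector<int64_t>` keys stand in for `_partition_list` and the request params. It assumes both inputs are sorted ascending without duplicates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: [begin, end) indexes into the requested keys.
struct HitRange {
    bool hit;
    size_t begin;
    size_t end;
};

// Finds the contiguous run of requested keys that is present in the cache.
// A run counts as a hit only if it touches the first or the last requested
// key, so the remaining keys form a single range to fetch from the BE.
HitRange match_range(const std::vector<int64_t>& cached,
                     const std::vector<int64_t>& requested) {
    const HitRange miss{false, 0, 0};
    if (cached.empty() || requested.empty()) return miss;

    size_t c = 0, r = 0;
    // Skip cached keys below the first requested key; check bounds first.
    while (c < cached.size() && cached[c] < requested[0]) ++c;
    if (c == cached.size()) return miss;  // ran off the end: nothing to match

    // Skip requested keys below the first remaining cached key.
    while (r < requested.size() && requested[r] < cached[c]) ++r;
    if (r == requested.size()) return miss;

    const size_t begin = r;
    while (c < cached.size() && r < requested.size() && cached[c] == requested[r]) {
        ++c;
        ++r;
    }
    const size_t end = r;

    if (begin == end) return miss;
    // A run strictly inside the request would split it into three parts.
    if (begin != 0 && end != requested.size()) return miss;
    return {true, begin, end};
}
```

With cached keys 20191211 to 20191215, this accepts the four "hit" ranges listed in the comment and rejects [20191210 - 20191216], because the cached run would sit in the middle of the request.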

##########
File path: be/src/runtime/cache/result_cache.cpp
##########
@@ -0,0 +1,257 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_cache.h"
+#include "util/doris_metrics.h"
+
+namespace doris {
+
+/**
+* Remove the head node of the list and return it
+*/
+ResultNode* ResultNodeList::pop() {
+    ResultNode* node = _head;  // remove() advances _head, so keep the old head
+    remove(node);
+    return node;
+}
+
+void ResultNodeList::remove(ResultNode* node) {
+    if (!node) return;
+    if (node == _head) _head = node->get_next();
+    if (node == _tail) _tail = node->get_prev();
+    node->unlink();
+    _node_count--;
+}
+
+void ResultNodeList::push(ResultNode* node) {

Review comment:
       Maybe push_back is a better name.

##########
File path: be/src/runtime/exec_env_init.cpp
##########
@@ -208,6 +215,10 @@ void ExecEnv::_init_buffer_pool(int64_t min_page_size,
 }
 
 void ExecEnv::_destory() {
+    //Only destroy once after init
+    if (!_is_init) {

Review comment:
       If ExecEnv::_init returns a failed status, this will leak memory. For example, line 124 in this file returns an error.
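
A minimal sketch of the concern, assuming a much simplified environment class: if cleanup is skipped whenever the init flag is false, anything allocated before an early error return is never released. `Env`, `Resource`, and `fail_midway` are hypothetical and do not appear in exec_env_init.cpp. The sketch only shows that a destroy routine can release whatever exists and remain safe to call twice without depending on the flag.

```cpp
#include <cassert>

// Hypothetical stand-in for an object owned by the environment.
struct Resource {
    int id;
};

class Env {
public:
    Env() = default;
    Env(const Env&) = delete;
    Env& operator=(const Env&) = delete;
    ~Env() { destroy(); }

    // Returns false when a step fails; earlier allocations stay owned by Env.
    bool init(bool fail_midway) {
        _first = new Resource{1};
        if (fail_midway) return false;  // simulates an early error return
        _second = new Resource{2};
        _is_init = true;
        return true;
    }

    // Releases whatever was allocated, whether or not init completed.
    // Resetting the pointers to nullptr makes a second call a no-op.
    void destroy() {
        delete _second;
        _second = nullptr;
        delete _first;
        _first = nullptr;
        _is_init = false;
    }

    bool holds_anything() const { return _first != nullptr || _second != nullptr; }

private:
    Resource* _first = nullptr;
    Resource* _second = nullptr;
    bool _is_init = false;
};
```

An early `if (!_is_init) return;` in `destroy()` would leave `_first` allocated after a failed `init(true)`, which is the situation the comment describes.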

##########
File path: be/src/runtime/cache/result_node.cpp
##########
@@ -0,0 +1,274 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#include "gen_cpp/internal_service.pb.h"
+#include "runtime/cache/result_node.h"
+#include "runtime/cache/cache_utils.h"
+
+namespace doris {
+
+bool compare_partition(const PartitionRowBatch* left_node, const PartitionRowBatch* right_node) {
+    return left_node->get_partition_key() < right_node->get_partition_key();
+}
+
+//return new batch size,only include the size of PRowBatch
+void PartitionRowBatch::set_row_batch(const PCacheValue& value) {
+    if (_cache_value != NULL && !check_newer(value.param())) {
+        LOG(WARNING) << "set old version data, cache ver:" << _cache_value->param().last_version()
+                     << ",cache time:" << _cache_value->param().last_version_time()
+                     << ", setdata ver:" << value.param().last_version()
+                     << ",setdata time:" << value.param().last_version_time();
+        return;
+    }
+    SAFE_DELETE(_cache_value);
+    _cache_value = new PCacheValue(value);
+    _data_size += _cache_value->data_size();
+    _cache_stat.update();
+    LOG(INFO) << "finish set row batch, row num:" << _cache_value->row_size()
+              << ", data size:" << _data_size;
+}
+
+bool PartitionRowBatch::is_hit_cache(const PCacheParam& param) {
+    if (param.partition_key() != _partition_key) {
+        return false;
+    }
+    if (!check_match(param)) {
+        return false;
+    }
+    _cache_stat.query();
+    return true;
+}
+
+void PartitionRowBatch::clear() {
+    LOG(INFO) << "clear partition rowbatch.";
+    SAFE_DELETE(_cache_value);
+    _partition_key = 0;
+    _data_size = 0;
+    _cache_stat.init();
+}
+
+/**
+ * Update partition cache data, find RowBatch from partition map by partition key,
+ * the partition rowbatch are stored in the order of partition keys
+ */
+PCacheStatus ResultNode::update_partition(const PUpdateCacheRequest* request, bool& update_first) {
+    update_first = false;
+    if (_sql_key != request->sql_key()) {
+        LOG(INFO) << "no match sql_key " << request->sql_key().hi() << request->sql_key().lo();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    if (request->value_size() > config::query_cache_max_partition_count) {
+        LOG(WARNING) << "too many partitions size:" << request->value_size();
+        return PCacheStatus::PARAM_ERROR;
+    }
+
+    //Only one thread per SQL key can update the cache
+    CacheWriteLock write_lock(_node_mtx);
+
+    int64 first_key = kint64max;
+    if (_partition_list.size() == 0) {
+        update_first = true;

Review comment:
       The meaning of 'update_first' is not clear. Maybe call it 'is_first_key_updated'.

##########
File path: be/src/runtime/cache/result_node.h
##########
@@ -0,0 +1,197 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#ifndef DORIS_BE_SRC_RUNTIME_RESULT_NODE_H
+#define DORIS_BE_SRC_RUNTIME_RESULT_NODE_H
+
+#include <sys/time.h>
+
+#include <algorithm>
+#include <cassert>
+#include <cstdio>
+#include <cstdlib>
+#include <exception>
+#include <iostream>
+#include <list>
+#include <map>
+#include <string>
+
+#include "common/config.h"
+#include "olap/olap_define.h"
+#include "runtime/cache/cache_utils.h"
+#include "runtime/mem_pool.h"
+#include "runtime/row_batch.h"
+#include "runtime/tuple_row.h"
+#include "util/uid_util.h"
+
+namespace doris {
+
+enum PCacheStatus;
+class PCacheParam;
+class PCacheValue;
+class PCacheResponse;
+class PFetchCacheRequest;
+class PFetchCacheResult;
+class PUpdateCacheRequest;
+class PClearCacheRequest;
+
+/**
+* Cache one partition data, request param must match version and time of cache
+*/
+class PartitionRowBatch {
+public:
+    PartitionRowBatch(int64 partition_key)
+            : _partition_key(partition_key), _cache_value(NULL), _data_size(0) {}
+
+    ~PartitionRowBatch() {}
+
+    void set_row_batch(const PCacheValue& value);
+    bool is_hit_cache(const PCacheParam& param);
+    void clear();
+
+    int64 get_partition_key() const { return _partition_key; }
+
+    PCacheValue* get_value() { return _cache_value; }
+
+    size_t get_data_size() { return _data_size; }
+
+    const CacheStat* get_stat() const { return &_cache_stat; }
+
+private:
+    bool check_match(const PCacheParam& req_param) {
+        if (req_param.last_version() > _cache_value->param().last_version()) {
+            return false;
+        }
+        if (req_param.last_version_time() > _cache_value->param().last_version_time()) {
+            return false;
+        }
+        return true;
+    }
+
+    bool check_newer(const PCacheParam& up_param) {
+        //for init data of sql cache
+        if (up_param.last_version() == 0 || up_param.last_version_time() == 0) {
+            return true;
+        }
+        if (up_param.last_version_time() > _cache_value->param().last_version_time()) {
+            return true;
+        }
+        if (up_param.last_version() > _cache_value->param().last_version()) {
+            return true;
+        }
+        return false;
+    }
+
+private:
+    int64 _partition_key;
+    PCacheValue* _cache_value;
+    size_t _data_size;
+    CacheStat _cache_stat;
+};
+
+typedef std::list<PartitionRowBatch*> PartitionRowBatchList;
+typedef boost::unordered_map<int64, PartitionRowBatch*> PartitionRowBatchMap;
+
+/**
+* Cache the result of one SQL, including many partition rowsets.
+* Sql Cache: The partition ID comes from the most recently updated partition.
+* Partition Cache: The partition ID comes from the partitions scanned by the query.
+* The above two modes use the same cache structure.
+*/
+class ResultNode {
+public:
+    ResultNode() : _sql_key(0, 0), _prev(NULL), _next(NULL), _data_size(0) {}
+
+    ResultNode(const UniqueId& sql_key)
+            : _sql_key(sql_key), _prev(NULL), _next(NULL), _data_size(0) {}
+
+    virtual ~ResultNode() {}
+
+    PCacheStatus update_partition(const PUpdateCacheRequest* request, bool& update_first);
+    PCacheStatus fetch_partition(const PFetchCacheRequest* request,
+                                 PartitionRowBatchList& rowBatchList, bool& hit_first);
+
+    size_t prune_first();
+    void clear();
+
+    bool operator()(const ResultNode* left_node, const ResultNode* right_node) {

Review comment:
       Is this function needed? It looks unused.

##########
File path: docs/zh-CN/administrator-guide/partition_cache.md
##########
@@ -0,0 +1,205 @@
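+# Partition Cache
+
+## Use Cases
+Most data analysis workloads are write-light and read-heavy: data is written once and read many times. For example, the dimensions and metrics behind a report are computed once in the early morning, yet the page is viewed hundreds or even thousands of times a day, so the result set is well suited to caching. Data analysis and BI applications commonly face the following scenarios:
+* **High concurrency**: Doris handles high concurrency well, but a single server cannot sustain a very high QPS
+* **Complex dashboards**: complex dashboards and large-screen applications draw data from many tables, with dozens of queries per page; each query takes only tens of milliseconds, but the total query time adds up to several seconds
+* **Trend analysis**: queries over a given date range with metrics shown per day, such as the trend in user count over the last 7 days; such queries touch a lot of data over a wide range and often take tens of seconds
+* **Repeated user queries**: if the product has no protection against repeated refreshes, users who refresh the page repeatedly, by mistake or otherwise, submit large numbers of duplicate SQL statements
+
+For these four scenarios, the application-layer solution is to put query results in Redis and refresh the cache periodically or have users refresh it manually, but this approach has the following problems:
+* **Data inconsistency**: it cannot detect data updates, so users often see stale data
+* **Low hit rate**: it caches the whole query result; if data is written in real time, the cache is invalidated frequently, so the hit rate is low and system load is high
+* **Extra cost**: introducing an external cache component adds system complexity and extra cost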
+
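+## Solution
+This partition cache strategy addresses the problems above. It puts data consistency first and, on that basis, refines cache granularity to raise the hit rate, which gives it the following characteristics:
+* Users need not worry about data consistency: cache invalidation is controlled by version, so cached data is consistent with data queried from the BE
+* No extra components or cost: cached results are stored in BE memory, and users can adjust the cache memory size as needed
+* Two cache strategies are implemented, SQLCache and PartitionCache; the latter has finer cache granularity
+* Consistent hashing handles BE nodes going online and offline; the cache algorithm in the BE is an improved LRU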
+
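+## SQLCache
+SQLCache stores and fetches cache entries by the SQL signature, the partition ID of the queried table, and the latest version of that partition. The three together identify one cached data set. If any of them changes, for example the SQL changes (different query columns or conditions) or the version changes after a data update, the cache is not hit.
+
+For a join over multiple tables, the most recently updated partition ID and the latest version number are used. If one of the tables is updated, the partition ID or version number differs, and the cache is likewise not hit.
+
+SQLCache is better suited to T+1 update scenarios: data is updated in the early morning, the first query fetches the result from the BE and puts it in the cache, and subsequent identical queries read from the cache. It can also be used with real-time updates, but the hit rate may be low; see PartitionCache below.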
+
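+## PartitionCache
+
+### Design Principles
+1. SQL can be split for parallel execution: Q = Q1 ∪ Q2 ... ∪ Qn, R = R1 ∪ R2 ... ∪ Rn, where Q is the query statement and R is the result set
+2. Partitions are split into read-only partitions and updatable partitions; read-only partitions are cached, updatable partitions are not
+
+As above, take a query for the daily user count over the last 7 days. With partitioning by date, data is written only to the current day's partition, and the data in all other partitions is fixed; under the same query SQL, the metrics for a partition that is not being updated are also fixed. As shown below, when the user count for the previous 7 days is queried on 2020-03-09, the data for 2020-03-03 through 2020-03-07 comes from the cache, the data for 2020-03-08 comes from the partition on the first query and from the cache on subsequent queries, and the data for 2020-03-09 comes from the partition because it is being written continuously that day.
+
+Therefore, for a query over N days of data where only the most recent D days are updated, similar daily queries that differ only in date range need to query just D partitions; the rest comes from the cache, which can effectively reduce cluster load and shorten query time.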
+
+```
+MySQL [(none)]> SELECT eventdate,count(userid) FROM testdb.appevent WHERE eventdate>="2020-03-03" AND eventdate<="2020-03-09" GROUP BY eventdate ORDER BY eventdate;
++------------+-----------------+
+| eventdate  | count(`userid`) |
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
+| 2020-03-08 |              40 | // from the partition on the first query, from the cache afterwards
+| 2020-03-09 |              25 | // from the partition
++------------+-----------------+
+7 rows in set (0.02 sec)
+```
+
+In PartitionCache, the first-level cache key is the 128-bit MD5 signature of the SQL with the partition condition removed. Below is the rewritten SQL to be signed:
+```
+SELECT eventdate,count(userid) FROM testdb.appevent GROUP BY eventdate ORDER BY eventdate;
+```
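+The second-level cache key is the content of the partition column in the query result set, such as the content of the eventdate column in the result above. The information attached to the second-level key is the partition's version number and version update time.
+
+The following shows the flow of the first execution of the SQL above on 2020-03-09:
+1. Fetch data from the cache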
+```
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
++------------+-----------------+
+```
+2. SQL and data fetched from the BE
+```
+SELECT eventdate,count(userid) FROM testdb.appevent WHERE eventdate>="2020-03-08" AND eventdate<="2020-03-09" GROUP BY eventdate ORDER BY eventdate;
+
++------------+-----------------+
+| 2020-03-08 |              40 |
++------------+-----------------+
+| 2020-03-09 |              25 | 
++------------+-----------------+
+```
+3. Data finally sent to the client
+```
++------------+-----------------+
+| eventdate  | count(`userid`) |
++------------+-----------------+
+| 2020-03-03 |              15 |
+| 2020-03-04 |              20 |
+| 2020-03-05 |              25 |
+| 2020-03-06 |              30 |
+| 2020-03-07 |              35 |
+| 2020-03-08 |              40 |
+| 2020-03-09 |              25 |
++------------+-----------------+
+```
+4. Data sent to the cache
+```
++------------+-----------------+
+| 2020-03-08 |              40 |
++------------+-----------------+
+```
+
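+Partition cache is suited to tables partitioned by date where some partitions are updated in real time and the query SQL is relatively fixed.
+
+The partition column can also be another column, but only a small number of partitions should be updated.
+
+### Limitations
+* Only OlapTable is supported; other storage such as MySQL tables has no version information, so data updates cannot be detected
+* Only grouping by the partition column is supported, not grouping by other columns; with other columns, any group's data may be updated, which would invalidate the whole cache
+* Only hits on the first part, the last part, or all of the result set are supported; a result set split into several parts by cached data is not supported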
+
+## Usage
+### Enable SQLCache
+Add enable_sql_cache=true to fe.conf
+```
+vim fe/conf/fe.conf
+enable_sql_cache=true
+```
+Set the variable in the MySQL command line
+```
+MySQL [(none)]> set [global] enable_sql_cache=true;
+```
+Note: global sets the global variable; without it, the variable is set for the current session
+
+### Enable PartitionCache
+Add enable_partition_cache=true to fe.conf
+```
+vim fe/conf/fe.conf
+enable_partition_cache=true
+```
+Set the variable in the MySQL command line
+```
+MySQL [(none)]> set [global] enable_partition_cache=true;
+```
+
+If both cache strategies are enabled, pay attention to the following parameter:
+```
+last_version_interval_second=3600
+```
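+If the interval between the time of the partition's latest version and now is greater than last_version_interval_second, the whole query result is cached in preference. If it is less than this interval and the query meets the PartitionCache conditions, data is cached by PartitionCache.
+
+### Monitoring
+FE metrics: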
+```
+query_table            //Number of tables in Query
+query_olap_table       //Number of Olap tables in Query
+cache_mode_sql         //Number of Queries identified with cache mode sql
+cache_hit_sql          //Number of Queries in sql mode that hit the Cache
+query_mode_partition   //Number of Queries identified with cache mode Partition
+cache_hit_partition    //Number of Queries hit through Partition
+partition_all          //All partitions scanned in Query
+partition_hit          //Number of partitions hit through Cache
+
+Cache hit rate     = (cache_hit_sql + cache_hit_partition) / query_olap_table
+Partition hit rate = partition_hit / partition_all
+```
+
+BE metrics:
+```
+cache_memory_total     //Cache memory size
+cache_sql_total        //Number of SQL statements in the Cache
+cache_partition_total  //Number of partitions in the Cache
+
+Average data size per SQL       = cache_memory_total / cache_sql_total
+Average data size per Partition = cache_memory_total / cache_partition_total
+```
+
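+Other monitoring:
+You can view BE node CPU and memory metrics in Grafana, along with metrics such as Query Percentile in the Query statistics, and tune the Cache parameters accordingly to meet business goals.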
+
+
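+### Tuning Parameters
+The FE configuration item cache_result_max_row_count is the maximum number of rows of a query result set that can be put in the cache. Adjust it to your situation, but avoid setting it too large so that it does not use too much memory; result sets larger than this are not cached.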
+```
+vim fe/conf/fe.conf
+cache_result_max_row_count=1000
+```
+
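+The BE maximum partition count cache_max_partition_count is the maximum number of partitions per SQL. With partitioning by date, it can cache a little over 2 years of data; to keep the cache for longer, set this parameter higher and adjust cache_result_max_row_count as well.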
+```
+vim be/conf/be.conf
+cache_max_partition_count=1024
+```
+
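+The BE cache memory setting consists of two parameters, cache_max_size and cache_elasticity_size (in MB). When memory exceeds cache_max_size+cache_elasticity_size, cleanup starts and brings memory down to cache_max_size or below. Set these two parameters according to the number of BE nodes, the memory size of each node, and the cache hit rate.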

Review comment:
       Change cache_max_size to query_cache_max_size_mb.
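
The two-threshold rule in the quoted paragraph (start cleanup once usage exceeds the maximum size plus the elasticity size, then stop once usage is back at or below the maximum size) can be sketched as follows. This is an illustration under stated assumptions, not the PR's pruning code: `SizeBoundedCache` is a hypothetical class, entries are reduced to their sizes, and eviction simply drops the oldest entry rather than preserving partition ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical cache that tracks entry sizes only; oldest entry at the front.
class SizeBoundedCache {
public:
    SizeBoundedCache(size_t max_size, size_t elasticity_size)
            : _max_size(max_size), _elasticity_size(elasticity_size), _total(0) {}

    void add(size_t entry_size) {
        _entries.push_back(entry_size);
        _total += entry_size;
        // Cleanup starts only after the elastic margin is exceeded as well,
        // so small overshoots of max_size do not trigger eviction every time.
        if (_total > _max_size + _elasticity_size) {
            prune();
        }
    }

    size_t total() const { return _total; }
    size_t count() const { return _entries.size(); }

private:
    // Evict from the front until usage is back at or below max_size.
    void prune() {
        while (!_entries.empty() && _total > _max_size) {
            _total -= _entries.front();
            _entries.pop_front();
        }
    }

    size_t _max_size;
    size_t _elasticity_size;
    size_t _total;
    std::list<size_t> _entries;
};
```

The gap between the two thresholds is what avoids frequent eviction: after one cleanup pass the cache has at least the elasticity size of headroom before the next pass starts.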




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman merged pull request #4005: [Cache][BE] LRU cache for sql/partition cache #2581

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #4005:
URL: https://github.com/apache/incubator-doris/pull/4005


   

