You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2020/09/07 09:47:59 UTC

[GitHub] [incubator-pegasus] Smityz opened a new pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Smityz opened a new pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597


   Remove the strategy mode, making the overall structure more streamlined.
   Rename the file, because the passed name is not clear.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484742902



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    avg = total / sample_count;
+    for (auto data_sample : data_samples) {
+        sd += pow((data_sample - avg), 2);

Review comment:
       It's not the refactor of the algorithm, just modify some name of variables which are not very suitable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484730883



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {

Review comment:
       Same to https://github.com/apache/incubator-pegasus/pull/597#discussion_r484730609

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    avg = total / sample_count;
+    for (auto data_sample : data_samples) {

Review comment:
       Same to https://github.com/apache/incubator-pegasus/pull/597#discussion_r484730609




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485458733



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)
+        : _app_name(app_name), _hot_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void data_analyse();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void _data_analyse(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,

Review comment:
       ```suggestion
       void data_analyse(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
   ```
   
   We don't name a function with prefixed  `_`. Naming is important. Try make names clear and understandable. If a name starts with '_', I can definitely confirm that this function needs to be refactored.

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, standard_deviation = 0, average_number = 0;
+    int sample_count = 0;
+    while (!temp_data.empty()) {
+        for (const auto &partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    average_number = total / sample_count;
+    for (const auto &data_sample : data_samples) {
+        standard_deviation += pow((data_sample - average_number), 2);
+    }
+    standard_deviation = sqrt(standard_deviation / sample_count);
+    const auto &anly_data = hotspot_app_data.back();
+    for (int i = 0; i < perf_counters.size(); i++) {
+        double hot_point = (anly_data[i].total_qps - average_number) / standard_deviation;
+        // perf_counter->set can only be unsigned __int64
+        // use ceil to guarantee conversion results
+        hot_point = ceil(std::max(hot_point, double(0)));
+        perf_counters[i]->set(hot_point);
+    }
+}
+
+void hotspot_partition_calculator::data_analyse() { _data_analyse(_historical_data, _hot_points); }

Review comment:
       ```suggestion
   void hotspot_partition_calculator::data_analyse() { }
   ```
   
   Delete `_data_analyse`. Do not pass private member to a class method. What are you supposed to do?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486221602



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+typedef std::list<std::vector<hotspot_partition_data>> partition_data_list;
+typedef std::vector<std::vector<std::unique_ptr<dsn::perf_counter_wrapper>>> hot_partition_counters;
+
+// hotspot_partition_calculator is used to find the hot partition in a table.
+class hotspot_partition_calculator

Review comment:
       hotspot_partition_calculator stores the statistical data of the correspond table, `_app_name` is for that table, not itself.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484629792



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    avg = total / sample_count;
+    for (auto data_sample : data_samples) {

Review comment:
        `for (const auto &data_sample : data_samples) {`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485494902



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");

Review comment:
       I want to put all the parameters related to the hotspot together




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484629380



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    avg = total / sample_count;
+    for (auto data_sample : data_samples) {
+        sd += pow((data_sample - avg), 2);

Review comment:
       rename it to standard_deviation, so you can delete the comment above.
   Same with `avg`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485317025



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>

Review comment:
       I think `dsn/utility/flags.h` is useless in this file




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Shuo-Jia commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Shuo-Jia commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485480375



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");

Review comment:
       `in order to same Mem`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486048240



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+typedef std::list<std::vector<hotspot_partition_data>> partition_data_list;
+typedef std::vector<std::vector<std::unique_ptr<dsn::perf_counter_wrapper>>> hot_partition_counters;
+
+// hotspot_partition_calculator is used to find the hot partition in a table.
+class hotspot_partition_calculator

Review comment:
       ```suggestion
   class hotspot_partition_calculator : public replica_base
   ```
   
   `replica_base` has a `app_name()` method. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484806342



##########
File path: src/server/info_collector.cpp
##########
@@ -302,25 +291,16 @@ void info_collector::on_storage_size_stat(int remaining_retry_count)
     _result_writer->set_result(st_stat.timestamp, "ss", st_stat.dump_to_json());
 }
 
-hotspot_calculator *info_collector::get_hotspot_calculator(const std::string &app_name,
-                                                           const int partition_num)
+std::shared_ptr<hotspot_partition_calculator>
+info_collector::get_hotspot_calculator(const std::string &app_name, const int partition_count)
 {
-    // use appname+partition_num as a key can prevent the impact of dynamic partition changes
-    std::string app_name_pcount = fmt::format("{}.{}", app_name, partition_num);
+    // use appname+partition_count as a key can prevent the impact of dynamic partition changes

Review comment:
       ```suggestion
       // use app_name+partition_count as a key can prevent the impact of dynamic partition changes
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484628910



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));

Review comment:
       could you push the data into _app_data directly? so you can delete `temp`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Shuo-Jia commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Shuo-Jia commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485484247



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(

Review comment:
       ->data_analyse()




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484728891



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));

Review comment:
       `row_data` is too large, we should simplify its information to store.
   ```c++
   hotspot_partition_data(const row_data &row)
       : total_qps(row.get_total_qps()),
         partition_name(row.row_name){};
   ```
   The constructor of `hotspot_partition_data` can help to do this thing.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485465068



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)
+        : _app_name(app_name), _hot_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void data_analyse();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void _data_analyse(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                       std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+
+    // usually _hot_points >= 3 can be considered as a hotspot partition
+    std::vector<dsn::perf_counter_wrapper> _hot_points;
+    // save historical data can improve accuracy
+    std::queue<std::vector<hotspot_partition_data>> _historical_data;

Review comment:
       ```suggestion
       // saving historical data can improve accuracy
       std::queue<std::vector<hotspot_partition_data>> _historical_data;
   ```
   
   

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");

Review comment:
       ```suggestion
   DSN_DEFINE_int64("pegasus.collector",
                    max_hotspot_store_size,
                    100,
                    "the max count of historical data stored in calculator, in order to same Mem");
   ```

##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus

Review comment:
       ```suggestion
   // hotspot_partition_calculator is used to find the hot partition in a table.
   ```

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)

Review comment:
       ```suggestion
   void hotspot_partition_calculator::init_perf_counter(int partition_count)
   ```

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);

Review comment:
       Where is `_historical_data` used?

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);

Review comment:
       ```suggestion
           string partition_desc = _app_name + '.' + std::to_string(i);
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje merged pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje merged pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484809655



##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,16 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
-
-    hotspot_calculator *get_hotspot_calculator(const std::string &app_name,
-                                               const int partition_num);
+    // _hotspot_calculator_store is to save hotspot_partition_calculator for each table, a
+    // hotspot_partition_calculator saves historical hotspot data and alert perf_counters of
+    // corresponding table
+    std::map<std::string, std::shared_ptr<hotspot_partition_calculator>> _hotspot_calculator_store;

Review comment:
       Yes, I see it. But only `_hotspot_calculator_store` holds it, which means has the right to determine its life cycle.
   you can return a `hotspot_partition_calculator &` in the interface of  `_hotspot_calculator_store` for  the using of `info_collector` 

##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,16 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
-
-    hotspot_calculator *get_hotspot_calculator(const std::string &app_name,
-                                               const int partition_num);
+    // _hotspot_calculator_store is to save hotspot_partition_calculator for each table, a
+    // hotspot_partition_calculator saves historical hotspot data and alert perf_counters of
+    // corresponding table
+    std::map<std::string, std::shared_ptr<hotspot_partition_calculator>> _hotspot_calculator_store;

Review comment:
       Yes, I saw it. But only `_hotspot_calculator_store` holds it, which means has the right to determine its life cycle.
   you can return a `hotspot_partition_calculator &` in the interface of  `_hotspot_calculator_store` for  the using of `info_collector` 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485463322



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, standard_deviation = 0, average_number = 0;
+    int sample_count = 0;
+    while (!temp_data.empty()) {
+        for (const auto &partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    average_number = total / sample_count;
+    for (const auto &data_sample : data_samples) {
+        standard_deviation += pow((data_sample - average_number), 2);
+    }
+    standard_deviation = sqrt(standard_deviation / sample_count);
+    const auto &anly_data = hotspot_app_data.back();
+    for (int i = 0; i < perf_counters.size(); i++) {
+        double hot_point = (anly_data[i].total_qps - average_number) / standard_deviation;
+        // perf_counter->set can only be unsigned __int64
+        // use ceil to guarantee conversion results
+        hot_point = ceil(std::max(hot_point, double(0)));
+        perf_counters[i]->set(hot_point);
+    }
+}
+
+void hotspot_partition_calculator::data_analyse() { _data_analyse(_historical_data, _hot_points); }

Review comment:
       ```suggestion
   void hotspot_partition_calculator::data_analyse() { }
   ```
   
   Delete `_data_analyse`. Do not pass private member to a class method. What are you intended to do?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485582877



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.collector",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data "
+                 "stored in calculator, The FIFO "
+                 "queue design is used to "
+                 "eliminate outdated historical "
+                 "data");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(int partition_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < partition_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", partition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", partition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::data_analyse()
+{
+    dassert(_historical_data.back().size() == _hot_points.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(_historical_data.size() * _hot_points.size());
+    auto temp_data = _historical_data;
+    double total = 0, standard_deviation = 0, average_number = 0;
+    int sample_count = 0;
+    while (!temp_data.empty()) {
+        for (const auto &partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {

Review comment:
       In order to prevent too many 0 elements in the queue from interfering with the accuracy of the historical data, we choose to ignore some small values




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
hycdong commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484667532



##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,14 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
+    std::map<std::string, hotspot_partition_calculator *> _hotspot_calculator_store;

Review comment:
       I think raw pointer is dangerous, I suggest using smart pointer instead.
   Besides, adding comment for `_hotspot_calculator_store`.

##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_num)

Review comment:
       I suggest adding comments for `_points`, `_app_data`, function `start_alg` and `aggregate`.
   Besides, adding comments to explain why `kMaxQueueSize = 100`.

##########
File path: src/server/info_collector.cpp
##########
@@ -150,15 +146,12 @@ void info_collector::on_app_stat()
         // get row data statistics for all of the apps
         all_stats.merge(app_stats);
 
-        // hotspot_calculator is to detect hotspots
-        hotspot_calculator *hotspot_calculator =
+        // hotspot_partition_calculator is to detect hotspots
+        hotspot_partition_calculator *hotspot_partition_calculator =
             get_hotspot_calculator(app_rows.first, app_rows.second.size());
-        if (!hotspot_calculator) {
-            continue;
-        }
-        hotspot_calculator->aggregate(app_rows.second);
-        // new policy can be designed by strategy pattern in hotspot_partition_data.h
-        hotspot_calculator->start_alg();
+        dassert(hotspot_partition_calculator != nullptr, "hotspot_partition_calculator is NULL");

Review comment:
       I suggest printing more information such as `app_name`, `partition_count` in assert.
   Besides, is it have to assert here? If the hotspot_partition_calculator is not found, `on_app_stat` can also work.

##########
File path: src/server/info_collector.cpp
##########
@@ -302,25 +295,17 @@ void info_collector::on_storage_size_stat(int remaining_retry_count)
     _result_writer->set_result(st_stat.timestamp, "ss", st_stat.dump_to_json());
 }
 
-hotspot_calculator *info_collector::get_hotspot_calculator(const std::string &app_name,
-                                                           const int partition_num)
+hotspot_partition_calculator *info_collector::get_hotspot_calculator(const std::string &app_name,
+                                                                     const int partition_num)

Review comment:
       `partition_count` may be better




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485314326



##########
File path: src/server/test/hotspot_partition_test.cpp
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "server/hotspot_partition_calculator.h"
+
+#include <gtest/gtest.h>
+
+namespace pegasus {
+namespace server {
+
+TEST(hotspot_partition_calculator, hotspot_partition_policy)
+{
+    std::vector<row_data> test_rows(8);
+    test_rows[0].get_qps = 1000.0;

Review comment:
       use `for()`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484630892



##########
File path: src/server/info_collector.cpp
##########
@@ -150,15 +146,12 @@ void info_collector::on_app_stat()
         // get row data statistics for all of the apps
         all_stats.merge(app_stats);
 
-        // hotspot_calculator is to detect hotspots
-        hotspot_calculator *hotspot_calculator =
+        // hotspot_partition_calculator is to detect hotspots

Review comment:
       hotspot_partition_calculator is used for detecting hotspots




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485526877



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;

Review comment:
       It will be refactored in https://github.com/apache/incubator-pegasus/pull/592




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Shuo-Jia commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Shuo-Jia commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486042744



##########
File path: src/server/hotspot_partition_data.h
##########
@@ -4,21 +4,26 @@
 
 #pragma once
 
-#include "shell/commands.h"
+#include "shell/command_helper.h"
 
 namespace pegasus {
 namespace server {
 
+enum partition_qps_type
+{
+    READ_HOTSPOT_DATA = 0,
+    WRITE_HOTSPOT_DATA

Review comment:
       why `READ_HOTSPOT_DATA` init explictly, but `WRITE_HOTSPOT_DATA` no?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484789381



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_num)
+        : _app_name(app_name), _points(partition_num)
+    {
+        init_perf_counter(partition_num);
+    }
+    void aggregate(const std::vector<row_data> &partitions);
+    void start_alg();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void analysis(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                  std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+    std::vector<::dsn::perf_counter_wrapper> _points;
+    std::queue<std::vector<hotspot_partition_data>> _app_data;
+    static const int kMaxQueueSize = 100;

Review comment:
       I `FLAGS_max_hotspot_store_size` instead




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] acelyc111 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484485169



##########
File path: src/server/hotspot_partition_policy.h
##########
@@ -0,0 +1,39 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+
+#include <algorithm>
+#include <gtest/gtest_prod.h>
+#include <math.h>

Review comment:
       Move to cpp file.

##########
File path: src/server/info_collector.cpp
##########
@@ -150,15 +146,15 @@ void info_collector::on_app_stat()
         // get row data statistics for all of the apps
         all_stats.merge(app_stats);
 
-        // hotspot_calculator is to detect hotspots
-        hotspot_calculator *hotspot_calculator =
+        // hotspot_partition_calculator is to detect hotspots
+        hotspot_partition_calculator *hotspot_partition_calculator =
             get_hotspot_calculator(app_rows.first, app_rows.second.size());
-        if (!hotspot_calculator) {
+        if (!hotspot_partition_calculator) {

Review comment:
       A nullptr `hotspot_partition_calculator` is an invalid case, better to use `dassert` here.

##########
File path: src/server/hotspot_partition_policy.cpp
##########
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_policy.h"
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_policy::analysis(

Review comment:
       Will this class or function be reused any where? If not, make it as member of `hotspot_partition_calculator` would be better.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485319676



##########
File path: src/server/test/hotspot_partition_test.cpp
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "server/hotspot_partition_calculator.h"
+
+#include <gtest/gtest.h>
+
+namespace pegasus {
+namespace server {
+
+TEST(hotspot_partition_calculator, hotspot_partition_policy)
+{
+    std::vector<row_data> test_rows(8);
+    test_rows[0].get_qps = 1000.0;

Review comment:
       Refactor in https://github.com/apache/incubator-pegasus/pull/592




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484807742



##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,16 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
-
-    hotspot_calculator *get_hotspot_calculator(const std::string &app_name,
-                                               const int partition_num);
+    // _hotspot_calculator_store is to save hotspot_partition_calculator for each table, a
+    // hotspot_partition_calculator saves historical hotspot data and alert perf_counters of
+    // corresponding table
+    std::map<std::string, std::shared_ptr<hotspot_partition_calculator>> _hotspot_calculator_store;

Review comment:
       `hotspot_partition_calculator` both used in _hotspot_calculator_store and info_collector




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485457585



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)
+        : _app_name(app_name), _hot_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void data_analyse();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void _data_analyse(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                       std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+
+    // usually _hot_points >= 3 can be considered as a hotspot partition

Review comment:
       ```suggestion
       // usually a partition with "hot-point value" >= 3 can be considered as a hotspot partition.
   ```
   
   This comment is misleading. I guess you intend to say "the value of _hot_points[i]" rather than `_hot_points`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484730609



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }
+    if (sample_count == 0) {
+        ddebug("hotspot_app_data size == 0");
+        return;
+    }
+    avg = total / sample_count;
+    for (auto data_sample : data_samples) {
+        sd += pow((data_sample - avg), 2);

Review comment:
       This PR doesn't involve the refactor of the algorithm, which is contained in https://github.com/apache/incubator-pegasus/pull/592, thanks for staying tuned




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486045836



##########
File path: src/server/hotspot_partition_data.h
##########
@@ -4,21 +4,26 @@
 
 #pragma once
 
-#include "shell/commands.h"
+#include "shell/command_helper.h"
 
 namespace pegasus {
 namespace server {
 
+enum partition_qps_type
+{
+    READ_HOTSPOT_DATA = 0,
+    WRITE_HOTSPOT_DATA

Review comment:
       in cpp, the enum is increased by one orderly. so only need to init the first one




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485314326



##########
File path: src/server/test/hotspot_partition_test.cpp
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "server/hotspot_partition_calculator.h"
+
+#include <gtest/gtest.h>
+
+namespace pegasus {
+namespace server {
+
+TEST(hotspot_partition_calculator, hotspot_partition_policy)
+{
+    std::vector<row_data> test_rows(8);
+    test_rows[0].get_qps = 1000.0;

Review comment:
       use for()




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485546841



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.collector",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data "
+                 "stored in calculator, The FIFO "
+                 "queue design is used to "
+                 "eliminate outdated historical "
+                 "data");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(int partition_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < partition_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", partition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", partition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::data_analyse()
+{
+    dassert(_historical_data.back().size() == _hot_points.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(_historical_data.size() * _hot_points.size());
+    auto temp_data = _historical_data;
+    double total = 0, standard_deviation = 0, average_number = 0;
+    int sample_count = 0;
+    while (!temp_data.empty()) {
+        for (const auto &partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {

Review comment:
       Why need such prerequiste? What if a partition qps is 0?

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.collector",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data "
+                 "stored in calculator, The FIFO "
+                 "queue design is used to "
+                 "eliminate outdated historical "
+                 "data");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(int partition_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < partition_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", partition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", partition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::data_analyse()
+{
+    dassert(_historical_data.back().size() == _hot_points.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(_historical_data.size() * _hot_points.size());
+    auto temp_data = _historical_data;
+    double total = 0, standard_deviation = 0, average_number = 0;

Review comment:
       ```suggestion
       double table_qps_sum = 0, standard_deviation = 0, table_qps_avg = 0;
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484630892



##########
File path: src/server/info_collector.cpp
##########
@@ -150,15 +146,12 @@ void info_collector::on_app_stat()
         // get row data statistics for all of the apps
         all_stats.merge(app_stats);
 
-        // hotspot_calculator is to detect hotspots
-        hotspot_calculator *hotspot_calculator =
+        // hotspot_partition_calculator is to detect hotspots

Review comment:
       hotspot_partition_calculator is used for detect hotspots




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] hycdong commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
hycdong commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484805216



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)
+        : _app_name(app_name), _hotspot_partition_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void hotspot_partition_data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void hotspot_partition_data_analyse();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void analysis(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                  std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+
+    // usually _hotspot_partition_points >= 3 can be considered as a hotspot partition
+    std::vector<dsn::perf_counter_wrapper> _hotspot_partition_points;
+    // save historical data can improve accuracy
+    std::queue<std::vector<hotspot_partition_data>> _hotspot_partition_historical_data;

Review comment:
       You can simplfy `hotspot_partition_data_aggregate` as `data_aggregate`, because the class `hotspot_partition_calculator` includes the namespace, same as other function names and local varieties.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484804533



##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,16 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
-
-    hotspot_calculator *get_hotspot_calculator(const std::string &app_name,
-                                               const int partition_num);
+    // _hotspot_calculator_store is to save hotspot_partition_calculator for each table, a
+    // hotspot_partition_calculator saves historical hotspot data and alert perf_counters of
+    // corresponding table
+    std::map<std::string, std::shared_ptr<hotspot_partition_calculator>> _hotspot_calculator_store;

Review comment:
       Why didn't you use `std::unique_ptr` instead of `std::shared_ptr`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485455252



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)
+        : _app_name(app_name), _hot_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void data_analyse();
+    void init_perf_counter(const int perf_counter_count);

Review comment:
       do not pass value-typed constant.
   ```suggestion
       void init_perf_counter(int perf_counter_count);
   ```
   BTW, is `init_perf_counter` used in somewhere else? If not, declare it private.

##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,57 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_count)

Review comment:
       ```suggestion
       hotspot_partition_calculator(const std::string &app_name, int partition_count)
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486045836



##########
File path: src/server/hotspot_partition_data.h
##########
@@ -4,21 +4,26 @@
 
 #pragma once
 
-#include "shell/commands.h"
+#include "shell/command_helper.h"
 
 namespace pegasus {
 namespace server {
 
+enum partition_qps_type
+{
+    READ_HOTSPOT_DATA = 0,
+    WRITE_HOTSPOT_DATA

Review comment:
       in cpp, the enum is increased by one orderly. so we only need to init the first one




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] neverchanje commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
neverchanje commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485488156



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,

Review comment:
       Serious? Why call it `hotspot_app_data`? 

##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,54 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hot partition in a table.
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, int partition_count)
+        : _app_name(app_name), _hot_points(partition_count)
+    {
+        init_perf_counter(partition_count);
+    }
+    // aggregate related data of hotspot detection
+    void data_aggregate(const std::vector<row_data> &partitions);
+    // analyse the saved data to find hotspot partition
+    void data_analyse();
+
+private:
+    const std::string _app_name;
+
+    void init_perf_counter(int perf_counter_count);
+    // usually a partition with "hot-point value" >= 3 can be considered as a hotspot partition.
+    std::vector<dsn::perf_counter_wrapper> _hot_points;
+    // saving historical data can improve accuracy
+    std::queue<std::vector<hotspot_partition_data>> _historical_data;

Review comment:
       ```suggestion
       std::queue<std::vector<hotspot_partition_data>> _partition_stat_histories;
   ```
   
   I can't understand even after a deep look at it. "historical_data" for what? I think you mean the statistics histories of every partition.

##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,101 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+#include <dsn/utility/flags.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.collector",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data "
+                 "stored in calculator, The FIFO "
+                 "queue design is used to "
+                 "eliminate outdated historical "
+                 "data");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(int partition_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < partition_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", partition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", partition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::data_analyse()
+{
+    dassert(_historical_data.back().size() == _hot_points.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(_historical_data.size() * _hot_points.size());
+    auto temp_data = _historical_data;
+    double total = 0, standard_deviation = 0, average_number = 0;
+    int sample_count = 0;
+    while (!temp_data.empty()) {
+        for (const auto &partition_data : temp_data.front()) {
+            if (partition_data.total_qps - 1.00 > 0) {
+                data_samples.push_back(partition_data.total_qps);
+                total += partition_data.total_qps;
+                sample_count++;
+            }
+        }
+        temp_data.pop();
+    }

Review comment:
       Use std::deque instead in order to iterating over the queue. std::queue does not support for loop. Your code using std::queue requires a data copy from `_historical_data` to `temp_data`. You should at your best effort prevent such large copy.
   
   ```
   for (const auto& history : _historical_data) {
     for (const auto &partition_data : temp_data.front()) {
       if (partition_data.total_qps - 1.00 > 0) {
           ...
       }
     }
   }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484630771



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_num)
+        : _app_name(app_name), _points(partition_num)
+    {
+        init_perf_counter(partition_num);
+    }
+    void aggregate(const std::vector<row_data> &partitions);
+    void start_alg();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void analysis(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                  std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+    std::vector<::dsn::perf_counter_wrapper> _points;

Review comment:
       `std::vector<dsn::perf_counter_wrapper> _points;`
   don't use `::dsn`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Shuo-Jia commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Shuo-Jia commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r485504577



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+DSN_DEFINE_int64("pegasus.hotspot",
+                 max_hotspot_store_size,
+                 100,
+                 "the max count of historical data stored in calculator, in order to same Mem");
+
+void hotspot_partition_calculator::data_aggregate(const std::vector<row_data> &partitions)
+{
+    while (_historical_data.size() > FLAGS_max_hotspot_store_size - 1) {
+        _historical_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _historical_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string partition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _hot_points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::_data_analyse(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;

Review comment:
       is `temp_data` necessary?may you use `std::vector<<std::vector>>` directly and no need `pop` but use `for` to read the data?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Smityz commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Smityz commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484980343



##########
File path: src/server/info_collector.h
##########
@@ -177,15 +177,16 @@ class info_collector
     uint32_t _storage_size_fetch_interval_seconds;
     uint32_t _storage_size_retry_wait_seconds;
     uint32_t _storage_size_retry_max_count;
-    std::string _hotspot_detect_algorithm;
     ::dsn::task_ptr _storage_size_stat_timer_task;
     ::dsn::utils::ex_lock_nr _capacity_unit_update_info_lock;
     // mapping 'node address' --> 'last updated timestamp'
     std::map<std::string, string> _capacity_unit_update_info;
-    std::map<std::string, hotspot_calculator *> _hotspot_calculator_store;
-
-    hotspot_calculator *get_hotspot_calculator(const std::string &app_name,
-                                               const int partition_num);
+    // _hotspot_calculator_store is to save hotspot_partition_calculator for each table, a
+    // hotspot_partition_calculator saves historical hotspot data and alert perf_counters of
+    // corresponding table
+    std::map<std::string, std::shared_ptr<hotspot_partition_calculator>> _hotspot_calculator_store;

Review comment:
       `share_ptr` is easier to achieve I think, it's not sensitive to performance here
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484630526



##########
File path: src/server/hotspot_partition_calculator.h
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "hotspot_partition_data.h"
+#include <gtest/gtest_prod.h>
+#include <dsn/perf_counter/perf_counter.h>
+
+namespace pegasus {
+namespace server {
+
+// hotspot_partition_calculator is used to find the hotspot in Pegasus
+class hotspot_partition_calculator
+{
+public:
+    hotspot_partition_calculator(const std::string &app_name, const int partition_num)
+        : _app_name(app_name), _points(partition_num)
+    {
+        init_perf_counter(partition_num);
+    }
+    void aggregate(const std::vector<row_data> &partitions);
+    void start_alg();
+    void init_perf_counter(const int perf_counter_count);
+
+private:
+    void analysis(const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+                  std::vector<::dsn::perf_counter_wrapper> &perf_counters);
+    const std::string _app_name;
+    std::vector<::dsn::perf_counter_wrapper> _points;
+    std::queue<std::vector<hotspot_partition_data>> _app_data;
+    static const int kMaxQueueSize = 100;

Review comment:
       We don't use camel case in `rdsn`, maybe you can rename it with `MAX_QUEUE_SIZE`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] levy5307 commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
levy5307 commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r484629536



##########
File path: src/server/hotspot_partition_calculator.cpp
##########
@@ -0,0 +1,98 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "hotspot_partition_calculator.h"
+
+#include <algorithm>
+#include <math.h>
+#include <dsn/dist/fmt_logging.h>
+
+namespace pegasus {
+namespace server {
+
+void hotspot_partition_calculator::aggregate(const std::vector<row_data> &partitions)
+{
+    while (_app_data.size() > kMaxQueueSize - 1) {
+        _app_data.pop();
+    }
+    std::vector<hotspot_partition_data> temp(partitions.size());
+    for (int i = 0; i < partitions.size(); i++) {
+        temp[i] = std::move(hotspot_partition_data(partitions[i]));
+    }
+    _app_data.emplace(temp);
+}
+
+void hotspot_partition_calculator::init_perf_counter(const int perf_counter_count)
+{
+    std::string counter_name;
+    std::string counter_desc;
+    for (int i = 0; i < perf_counter_count; i++) {
+        string paritition_desc = _app_name + '.' + std::to_string(i);
+        counter_name = fmt::format("app.stat.hotspots@{}", paritition_desc);
+        counter_desc = fmt::format("statistic the hotspots of app {}", paritition_desc);
+        _points[i].init_app_counter(
+            "app.pegasus", counter_name.c_str(), COUNTER_TYPE_NUMBER, counter_desc.c_str());
+    }
+}
+
+void hotspot_partition_calculator::analysis(
+    const std::queue<std::vector<hotspot_partition_data>> &hotspot_app_data,
+    std::vector<::dsn::perf_counter_wrapper> &perf_counters)
+{
+    dassert(hotspot_app_data.back().size() == perf_counters.size(),
+            "partition counts error, please check");
+    std::vector<double> data_samples;
+    data_samples.reserve(hotspot_app_data.size() * perf_counters.size());
+    auto temp_data = hotspot_app_data;
+    double total = 0, sd = 0, avg = 0;
+    int sample_count = 0;
+    // avg: Average number
+    // sd: Standard deviation
+    // sample_count: Number of samples
+    while (!temp_data.empty()) {
+        for (auto partition_data : temp_data.front()) {

Review comment:
       `for (const auto &partition_data : temp_data.front()) {`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] Shuo-Jia commented on a change in pull request #597: refactor(collector): sort out the structure of partition hotspot detection

Posted by GitBox <gi...@apache.org>.
Shuo-Jia commented on a change in pull request #597:
URL: https://github.com/apache/incubator-pegasus/pull/597#discussion_r486047471



##########
File path: src/server/hotspot_partition_data.h
##########
@@ -4,21 +4,26 @@
 
 #pragma once
 
-#include "shell/commands.h"
+#include "shell/command_helper.h"
 
 namespace pegasus {
 namespace server {
 
+enum partition_qps_type
+{
+    READ_HOTSPOT_DATA = 0,
+    WRITE_HOTSPOT_DATA

Review comment:
       The first default 0, so also can not be inited?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org