Posted to notifications@james.apache.org by GitBox <gi...@apache.org> on 2022/03/25 04:24:19 UTC

[GitHub] [james-project] Arsnael commented on a change in pull request #937: JAMES-3734 Document database benchmark methodologies and base performances

Arsnael commented on a change in pull request #937:
URL: https://github.com/apache/james-project/pull/937#discussion_r834925854



##########
File path: server/apps/distributed-app/docs/modules/ROOT/pages/operate/db-benchmark.adoc
##########
@@ -0,0 +1,1261 @@
+= Distributed James Server -- Database benchmarks
+:navtitle: Database benchmarks
+
+This document provides basic performance of Distributed James' databases, benchmark methodologies as a basis for a James administrator

Review comment:
       ```suggestion
   This document provides basic performance of Distributed James' databases, benchmark methodologies as a basis for a James administrator who
   ```

##########
File path: server/apps/distributed-app/docs/modules/ROOT/pages/operate/db-benchmark.adoc
##########
@@ -0,0 +1,850 @@
+= Distributed James Server -- Database benchmarks
+:navtitle: Database benchmarks
+
+This document provides basic performance of Distributed James' databases, benchmark methodologies as a basis for a James administrator
+can test and evaluate if his Distributed James databases are performing well.
+
+It includes:
+
+* A sample deployment topology
+* A proposed benchmark methodology and baseline performance for each database. This aims to help operators quickly identify
+performance issues and check the compliance of their databases.
+
+== Sample deployment topology
+
+We deploy a sample topology of Distributed James with the following databases:
+
+- Apache Cassandra 4 as main database
+- OpenDistro 1.13.1 as search engine
+- RabbitMQ 3.8.17 as message queue
+- OVH Swift S3 as an object storage
+
+With the above system, our email service operates stably and performs well.
+More precisely, it can handle a throughput of up to about 1000 JMAP requests per second with a 99th percentile latency of 400ms.
+
+== Benchmark methodologies and base performances
+We share our benchmark methodologies and results as a reference for evaluating your Distributed James' performance.
+Other evaluation methods are welcome, as long as your databases exhibit similar or even better performance than ours.
+What counts as acceptable depends on your business needs. If your databases show results that fall far from our baseline performance, there's a good chance that
+there are problems with your system, and you need to check it out thoroughly.
+
+=== Benchmark Cassandra
+
+==== Benchmark methodology
+===== Benchmark tool
+
+We use the https://cassandra.apache.org/doc/latest/cassandra/tools/cassandra_stress.html[cassandra-stress tool], an official
+Cassandra tool for stress and load testing.
+
+The cassandra-stress tool is a Java-based stress testing utility for basic benchmarking and load testing a Cassandra cluster.
+Data modeling choices can greatly affect application performance. Significant load testing over several trials is the best method for discovering issues with a particular data model. The cassandra-stress tool is an effective tool for populating a cluster and stress testing CQL tables and queries. Use cassandra-stress to:
+
+- Quickly determine how a schema performs.
+- Understand how your database scales.
+- Optimize your data model and settings.
+- Determine production capacity.
+
+There are several operation types:
+
+- write-only, read-only, and mixed workloads of standard data
+- write-only and read-only workloads for counter columns
+- user configured workloads, running custom queries on custom schemas
+
+===== How to benchmark
+
+Here we are using a simple case to test and compare Cassandra performance between different setup environments.
+
+[source,yaml]
+----
+keyspace: stresscql
+
+keyspace_definition: |
+  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
+
+table: mixed_workload
+
+table_definition: |
+  CREATE TABLE mixed_workload (
+    key uuid PRIMARY KEY,
+    a blob,
+    b blob
+  ) WITH COMPACT STORAGE
+
+columnspec:
+  - name: a
+    size: uniform(1..10000)
+  - name: b
+    size: uniform(1..100000)
+
+insert:
+  partitions: fixed(1)
+
+queries:
+   read:
+      cql: select * from mixed_workload where key = ?
+      fields: samerow
+----
+
+Create the YAML file as above and copy it to a Cassandra node.
+
+Insert some sample data:
+
+[source,bash]
+----
+cassandra-stress user profile=mixed_workload.yml n=100000 "ops(insert=1)" cl=ONE -mode native cql3 user=<user> password=<password> -node <IP> -rate threads=8 -graph file=./graph_insert.xml title=Benchmark revision=insert_ONE
+----
+
+Read intensive scenario:
+
+[source,bash]
+----
+cassandra-stress user profile=mixed_workload.yml n=100000 "ops(insert=1,read=4)" cl=ONE -mode native cql3 user=<user> password=<password> -node <IP> -rate threads=8 -graph file=./graph_mixed.xml title=Benchmark revision=mixed_ONE
+----
+
+Where:
+
+- n=100000: The number of insert batches, not the number of individual insert operations.
+- rate threads=8: The number of concurrent threads. If not specified, it starts with 4 threads and increases until the server reaches a limit.
+- ops(insert=1,read=4): Executes insert and read queries in a 1:4 ratio.
+- graph: Exports the results as a graph in HTML format.
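The two commands above differ only in the operation ratio and the labels used for the graph. A minimal sketch of a helper that prints them from those parameters is shown here; `build_stress_cmd` is a hypothetical name, not part of cassandra-stress, and the function only prints the command, it does not run it. The `<user>`, `<password>` and `<IP>` placeholders are left as in the document.

```shell
# Hypothetical helper (not part of cassandra-stress): print the benchmark
# command for a given ops ratio ($1), consistency level ($2) and label ($3).
# It only prints the command so it can be reviewed before being run.
build_stress_cmd() {
  printf 'cassandra-stress user profile=mixed_workload.yml n=100000 "ops(%s)" cl=%s -mode native cql3 user=<user> password=<password> -node <IP> -rate threads=8 -graph file=./graph_%s.xml title=Benchmark revision=%s_%s\n' \
    "$1" "$2" "$3" "$3" "$2"
}

# Insert-only run, then the read-intensive run (1 insert for 4 reads).
build_stress_cmd "insert=1" ONE insert
build_stress_cmd "insert=1,read=4" ONE mixed
```

Printing the command first makes it easy to check the ratio and labels, then paste the line on the Cassandra node once the placeholders are filled in.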
+
+==== Sample benchmark result
+image::cassandra_stress_test_result_1.png[]
+
+image::cassandra_stress_test_result_2.png[]
+
+==== References
+https://www.datastax.com/blog/improved-cassandra-21-stress-tool-benchmark-any-schema-part-1[Datastax - Cassandra stress tool]
+
+https://www.instaclustr.com/deep-diving-cassandra-stress-part-3-using-yaml-profiles/[Deep Diving cassandra-stress – Part 3 (Using YAML Profiles)]
+
+=== Benchmark Elasticsearch
+
+==== Benchmark methodology
+
+===== Benchmark tool
+We use https://github.com/elastic/rally[EsRally] - an official Elasticsearch benchmarking tool. EsRally provides the following features:
+
+- Automatically create Elasticsearch clusters, stress test them, and delete them.
+- Manage stress testing data and solutions by Elasticsearch version.
+- Present stress testing data in a comprehensive way, allowing you to compare and analyze the data of different stress tests and store the data on a particular Elasticsearch instance for secondary analysis.
+- Collect Java Virtual Machine (JVM) details, such as memory and garbage collection (GC) data, to locate performance problems.
+
+You can have a look at https://elasticsearch-benchmarks.elastic.co/ where Elasticsearch itself officially uses EsRally to test its performance and publishes the results in real time.
+
+===== How to benchmark
+Please follow the https://esrally.readthedocs.io/en/latest/quickstart.html?spm=a2c65.11461447.0.0.e26a498c3KJZNe[Esrally quickstart documentation]
+to set it up first.
+
+List the tracks (simulation profiles) that EsRally provides: ```esrally list tracks```.
+For our James use case, we are interested in the ```pmc``` track: ```Full-text benchmark with academic papers from PMC```.
+
+Run the command below to benchmark against your Elasticsearch cluster:
+
+[source,bash]
+----
+esrally race --pipeline=benchmark-only --track=[track-name] --target-host=[ip_node1:port_node1],[ip_node2:port_node2],[ip_node3:port_node3] --client-options="use_ssl:false,verify_certs:false,basic_auth_user:'[user]',basic_auth_password:'[password]'"
+----
+
+Where:
+
+* --pipeline=benchmark-only: benchmark against an already running cluster
+* track-name: the track you want to benchmark
+* ip:port: the socket address of each Elasticsearch node
+* --client-options: change to your Elasticsearch authentication credentials
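The `--target-host` value is a comma-separated list of node addresses. A minimal sketch of assembling the command from a track name and a list of nodes is shown here; `join_hosts` and `build_rally_cmd` are hypothetical helper names, the `10.0.0.x:9200` addresses are made-up examples, and `--client-options` is left out on the assumption that you append your own credentials as in the command above.

```shell
# Hypothetical helper: join all arguments with commas, as --target-host expects.
join_hosts() {
  # The subshell keeps the IFS change local to this call.
  ( IFS=','; printf '%s\n' "$*" )
}

# Hypothetical helper: print (not run) the esrally command for a track ($1)
# and one or more node addresses (remaining arguments).
build_rally_cmd() {
  track="$1"
  shift
  printf 'esrally race --pipeline=benchmark-only --track=%s --target-host=%s\n' \
    "$track" "$(join_hosts "$@")"
}

# Example with made-up node addresses; replace them with your own.
build_rally_cmd pmc 10.0.0.1:9200 10.0.0.2:9200 10.0.0.3:9200
```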
+
+==== Sample benchmark result
+===== PMC track
+
+[source]
+----
+------------------------------------------------------
+    _______             __   _____
+   / ____(_)___  ____ _/ /  / ___/_________  ________
+  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
+ / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
+/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
+------------------------------------------------------
+
+|                                                         Metric |                          Task |       Value |    Unit |
+|---------------------------------------------------------------:|------------------------------:|------------:|--------:|
+|                     Cumulative indexing time of primary shards |                               |     563.427 |     min |
+|             Min cumulative indexing time across primary shards |                               |           0 |     min |
+|          Median cumulative indexing time across primary shards |                               |  0.00293333 |     min |
+|             Max cumulative indexing time across primary shards |                               |      112.04 |     min |
+|            Cumulative indexing throttle time of primary shards |                               |           0 |     min |
+|    Min cumulative indexing throttle time across primary shards |                               |           0 |     min |
+| Median cumulative indexing throttle time across primary shards |                               |           0 |     min |
+|    Max cumulative indexing throttle time across primary shards |                               |           0 |     min |
+|                        Cumulative merge time of primary shards |                               |     1134.99 |     min |
+|                       Cumulative merge count of primary shards |                               |      165181 |         |
+|                Min cumulative merge time across primary shards |                               |           0 |     min |
+|             Median cumulative merge time across primary shards |                               |  0.00188333 |     min |
+|                Max cumulative merge time across primary shards |                               |     248.347 |     min |
+|               Cumulative merge throttle time of primary shards |                               |     620.683 |     min |
+|       Min cumulative merge throttle time across primary shards |                               |           0 |     min |
+|    Median cumulative merge throttle time across primary shards |                               |           0 |     min |
+|       Max cumulative merge throttle time across primary shards |                               |     138.621 |     min |
+|                      Cumulative refresh time of primary shards |                               |      644.67 |     min |
+|                     Cumulative refresh count of primary shards |                               | 1.37405e+06 |         |
+|              Min cumulative refresh time across primary shards |                               |           0 |     min |
+|           Median cumulative refresh time across primary shards |                               |   0.0101667 |     min |
+|              Max cumulative refresh time across primary shards |                               |     147.427 |     min |
+|                        Cumulative flush time of primary shards |                               |     45.1533 |     min |
+|                       Cumulative flush count of primary shards |                               |        4084 |         |
+|                Min cumulative flush time across primary shards |                               |           0 |     min |
+|             Median cumulative flush time across primary shards |                               |      0.0005 |     min |
+|                Max cumulative flush time across primary shards |                               |     7.92482 |     min |
+|                                        Total Young Gen GC time |                               |       5.593 |       s |
+|                                       Total Young Gen GC count |                               |         320 |         |
+|                                          Total Old Gen GC time |                               |           0 |       s |
+|                                         Total Old Gen GC count |                               |           0 |         |
+|                                                     Store size |                               |     359.984 |      GB |
+|                                                  Translog size |                               | 1.33691e-05 |      GB |
+|                                         Heap used for segments |                               |     8.39256 |      MB |
+|                                       Heap used for doc values |                               |    0.444857 |      MB |
+|                                            Heap used for terms |                               |     6.57648 |      MB |
+|                                            Heap used for norms |                               |    0.882629 |      MB |
+|                                           Heap used for points |                               |           0 |      MB |
+|                                    Heap used for stored fields |                               |    0.488602 |      MB |
+|                                                  Segment count |                               |         964 |         |
+|                                                 Min Throughput |                  index-append |      734.63 |  docs/s |
+|                                                Mean Throughput |                  index-append |      763.16 |  docs/s |
+|                                              Median Throughput |                  index-append |       746.5 |  docs/s |
+|                                                 Max Throughput |                  index-append |      833.51 |  docs/s |
+|                                        50th percentile latency |                  index-append |     4738.57 |      ms |
+|                                        90th percentile latency |                  index-append |      8129.1 |      ms |
+|                                        99th percentile latency |                  index-append |     11734.5 |      ms |
+|                                       100th percentile latency |                  index-append |     14662.9 |      ms |
+|                                   50th percentile service time |                  index-append |     4738.57 |      ms |
+|                                   90th percentile service time |                  index-append |      8129.1 |      ms |
+|                                   99th percentile service time |                  index-append |     11734.5 |      ms |
+|                                  100th percentile service time |                  index-append |     14662.9 |      ms |
+|                                                     error rate |                  index-append |           0 |       % |
+|                                                 Min Throughput |                       default |       19.94 |   ops/s |
+|                                                Mean Throughput |                       default |       19.95 |   ops/s |
+|                                              Median Throughput |                       default |       19.95 |   ops/s |
+|                                                 Max Throughput |                       default |       19.96 |   ops/s |
+|                                        50th percentile latency |                       default |     23.1322 |      ms |
+|                                        90th percentile latency |                       default |     25.4129 |      ms |
+|                                        99th percentile latency |                       default |     29.1382 |      ms |
+|                                       100th percentile latency |                       default |     29.4762 |      ms |
+|                                   50th percentile service time |                       default |     21.4895 |      ms |
+|                                   90th percentile service time |                       default |      23.589 |      ms |
+|                                   99th percentile service time |                       default |     26.6134 |      ms |
+|                                  100th percentile service time |                       default |     27.9068 |      ms |
+|                                                     error rate |                       default |           0 |       % |
+|                                                 Min Throughput |                          term |       19.93 |   ops/s |
+|                                                Mean Throughput |                          term |       19.94 |   ops/s |
+|                                              Median Throughput |                          term |       19.94 |   ops/s |
+|                                                 Max Throughput |                          term |       19.95 |   ops/s |
+|                                        50th percentile latency |                          term |     31.0684 |      ms |
+|                                        90th percentile latency |                          term |     34.1419 |      ms |
+|                                        99th percentile latency |                          term |     74.7904 |      ms |
+|                                       100th percentile latency |                          term |     103.663 |      ms |
+|                                   50th percentile service time |                          term |     29.6775 |      ms |
+|                                   90th percentile service time |                          term |     32.4288 |      ms |
+|                                   99th percentile service time |                          term |      36.013 |      ms |
+|                                  100th percentile service time |                          term |     102.193 |      ms |
+|                                                     error rate |                          term |           0 |       % |
+|                                                 Min Throughput |                        phrase |       19.94 |   ops/s |
+|                                                Mean Throughput |                        phrase |       19.95 |   ops/s |
+|                                              Median Throughput |                        phrase |       19.95 |   ops/s |
+|                                                 Max Throughput |                        phrase |       19.95 |   ops/s |
+|                                        50th percentile latency |                        phrase |     23.0255 |      ms |
+|                                        90th percentile latency |                        phrase |     26.1607 |      ms |
+|                                        99th percentile latency |                        phrase |     31.2094 |      ms |
+|                                       100th percentile latency |                        phrase |     45.5012 |      ms |
+|                                   50th percentile service time |                        phrase |     21.5109 |      ms |
+|                                   90th percentile service time |                        phrase |     24.4144 |      ms |
+|                                   99th percentile service time |                        phrase |     26.1865 |      ms |
+|                                  100th percentile service time |                        phrase |     43.5122 |      ms |
+|                                                     error rate |                        phrase |           0 |       % |
+|                                                 Min Throughput | articles_monthly_agg_uncached |       19.95 |   ops/s |
+|                                                Mean Throughput | articles_monthly_agg_uncached |       19.96 |   ops/s |
+|                                              Median Throughput | articles_monthly_agg_uncached |       19.96 |   ops/s |
+|                                                 Max Throughput | articles_monthly_agg_uncached |       19.96 |   ops/s |
+|                                        50th percentile latency | articles_monthly_agg_uncached |     26.7918 |      ms |
+|                                        90th percentile latency | articles_monthly_agg_uncached |     34.1708 |      ms |
+|                                        99th percentile latency | articles_monthly_agg_uncached |     42.3661 |      ms |
+|                                       100th percentile latency | articles_monthly_agg_uncached |     43.0024 |      ms |
+|                                   50th percentile service time | articles_monthly_agg_uncached |     25.3893 |      ms |
+|                                   90th percentile service time | articles_monthly_agg_uncached |     32.3418 |      ms |
+|                                   99th percentile service time | articles_monthly_agg_uncached |     41.3612 |      ms |
+|                                  100th percentile service time | articles_monthly_agg_uncached |     42.0802 |      ms |
+|                                                     error rate | articles_monthly_agg_uncached |           0 |       % |
+|                                                 Min Throughput |   articles_monthly_agg_cached |       19.94 |   ops/s |
+|                                                Mean Throughput |   articles_monthly_agg_cached |       19.95 |   ops/s |
+|                                              Median Throughput |   articles_monthly_agg_cached |       19.95 |   ops/s |
+|                                                 Max Throughput |   articles_monthly_agg_cached |       19.96 |   ops/s |
+|                                        50th percentile latency |   articles_monthly_agg_cached |     9.63666 |      ms |
+|                                        90th percentile latency |   articles_monthly_agg_cached |      10.973 |      ms |
+|                                        99th percentile latency |   articles_monthly_agg_cached |     27.1236 |      ms |
+|                                       100th percentile latency |   articles_monthly_agg_cached |     28.7119 |      ms |
+|                                   50th percentile service time |   articles_monthly_agg_cached |     7.99763 |      ms |
+|                                   90th percentile service time |   articles_monthly_agg_cached |       8.979 |      ms |
+|                                   99th percentile service time |   articles_monthly_agg_cached |     25.7034 |      ms |
+|                                  100th percentile service time |   articles_monthly_agg_cached |     27.1026 |      ms |
+|                                                     error rate |   articles_monthly_agg_cached |           0 |       % |
+|                                                 Min Throughput |                        scroll |        5.85 | pages/s |
+|                                                Mean Throughput |                        scroll |        5.86 | pages/s |
+|                                              Median Throughput |                        scroll |        5.86 | pages/s |
+|                                                 Max Throughput |                        scroll |        5.87 | pages/s |
+|                                        50th percentile latency |                        scroll |      229970 |      ms |
+|                                        90th percentile latency |                        scroll |      319870 |      ms |
+|                                        99th percentile latency |                        scroll |      340138 |      ms |
+|                                       100th percentile latency |                        scroll |      342421 |      ms |
+|                                   50th percentile service time |                        scroll |     4269.07 |      ms |
+|                                   90th percentile service time |                        scroll |     4308.67 |      ms |
+|                                   99th percentile service time |                        scroll |     4445.16 |      ms |
+|                                  100th percentile service time |                        scroll |     4605.69 |      ms |
+|                                                     error rate |                        scroll |           0 |       % |
+
+
+----------------------------------
+[INFO] SUCCESS (took 1772 seconds)
+----------------------------------
+----
+
+===== PMC custom track
+We customized the PMC track by increase search throughput target to figure out our Elasticsearch cluster limit.

Review comment:
       ```suggestion
   We customized the PMC track by increasing search throughput target to figure out our Elasticsearch cluster limit.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@james.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


