
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/11 12:12:12 UTC

[GitHub] [incubator-druid] sascha-coenen commented on a change in pull request #7629: Add basic tuning guide, getting started page, updated clustering docs

sascha-coenen commented on a change in pull request #7629: Add basic tuning guide, getting started page, updated clustering docs
URL: https://github.com/apache/incubator-druid/pull/7629#discussion_r283094105

##########
File path: docs/content/operations/basic-cluster-tuning.md
##########
@@ -0,0 +1,342 @@
+---
+layout: doc_page
+title: "Basic Cluster Tuning"
+---
+
+# Basic Cluster Tuning
+
+This document provides basic guidelines for configuration properties and cluster architecture considerations related to performance tuning of an Apache Druid (incubating) deployment.
+
+Please note that this document provides general guidelines and rules-of-thumb: these are not absolute, universal rules for cluster tuning, and this introductory guide is not an exhaustive description of all Druid tuning properties, which are described in the [configuration reference](../configuration/index.html).
+
+If you have questions on tuning Druid for specific use cases, or questions on configuration properties not covered in this guide, please ask the [Druid user mailing list or other community channels](https://druid.apache.org/community/).
+
+## Process-specific guidelines
+
+### Historical
+
+#### Heap sizing
+
+The biggest contributions to heap usage on Historicals are:
+- Partial unmerged query results from segments
+- The stored maps for [lookups](../querying/lookups.html).
+
+A general rule-of-thumb for sizing the Historical heap is `(0.5GB * number of CPU cores)`, with an upper limit of ~24GB.
+
+This rule-of-thumb scales using the number of CPU cores as a convenient proxy for hardware size and level of concurrency (note: this formula is not a hard rule for sizing Historical heaps).
+
+Having a heap that is too large can result in excessively long GC pauses; the ~24GB upper limit is imposed to avoid this.
+
+Running out of heap on the Historicals can indicate misconfiguration or usage patterns that are overloading the cluster.
+
+##### Lookups
+
+If you are using lookups, calculate the total size of the lookup maps being loaded.
+
+Druid performs an atomic swap when updating lookup maps (both the old map and the new map will exist in heap during the swap), so the maximum potential heap usage from lookup maps will be `(2 * total size of all loaded lookups)`.
+
+Be sure to add `(2 * total size of all loaded lookups)` to your heap size in addition to the `(0.5GB * number of CPU cores)` guideline.
+
+#### Processing Threads and Buffers
+
+Please see the [General Guidelines for Processing Threads and Buffers](#general-guidelines-for-processing-threads-and-buffers) section for an overview of processing thread/buffer configuration.
+
+On Historicals:
+- `druid.processing.numThreads` should generally be set to `(number of cores - 1)`: a smaller value can result in CPU underutilization, while going over the number of cores can result in unnecessary CPU contention.
+- `druid.processing.buffer.sizeBytes` can be set to 500MB.
+- For `druid.processing.numMergeBuffers`, a 1:4 ratio of merge buffers to processing threads is a reasonable choice for general use.
+
+#### Direct Memory Sizing
+
+The processing and merge buffers described above are direct memory buffers.
+
+When a Historical processes a query, it must open a set of segments for reading. This also requires some direct memory space, described in [segment decompression buffers](#segment-decompression).
+
+A formula for estimating direct memory usage follows:
+
+(`druid.processing.numThreads` + `druid.processing.numMergeBuffers` + 1) * `druid.processing.buffer.sizeBytes`
+
+The `+ 1` factor is a fuzzy estimate meant to account for the segment decompression buffers.
+
+#### Connection Pool Sizing
+
+Please see the [General Connection Pool Guidelines](#general-connection-pool-guidelines) section for an overview of connection pool configuration.
+
+For Historicals, `druid.server.http.numThreads` should be set to a value slightly higher than the sum of `druid.broker.http.numConnections` across all the Brokers in the cluster.
+
+Tuning the cluster so that each Historical can accept 50 queries and 10 non-queries is a reasonable starting point.

Review comment:
To me, this statement is not clear. What is the definition of queries vs. non-queries, and how does one configure a number for each?
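
As a side note on the sizing formulas quoted in the hunk above, here is a minimal Python sketch that works through the Historical heap, processing thread, merge buffer, and direct memory rules of thumb. It is only an illustration, not part of the pull request. The function names are hypothetical, and the sketch makes several assumptions the document does not state: the ~24GB cap is applied to the core-based term only, the 1:4 merge buffer ratio is rounded up, "GB" is read as GiB, and "500MB" is read as 500,000,000 bytes.

```python
import math

GIB = 1024 ** 3
DEFAULT_BUFFER_BYTES = 500_000_000  # "500MB", read here as decimal bytes (assumption)


def historical_heap_bytes(cpu_cores, lookup_bytes=0, cap_gib=24):
    """Heap rule of thumb: 0.5GB per core, capped at ~24GB, plus 2x lookup size.

    Assumption: the cap applies to the core-based term only; the lookup
    allowance is added on top of it.
    """
    core_term = min((GIB // 2) * cpu_cores, cap_gib * GIB)
    return core_term + 2 * lookup_bytes


def processing_threads(cpu_cores):
    """druid.processing.numThreads guideline: (number of cores - 1), at least 1."""
    return max(cpu_cores - 1, 1)


def merge_buffers(num_threads):
    """druid.processing.numMergeBuffers guideline: 1:4 ratio to processing threads.

    Assumption: round up, with a minimum of one merge buffer.
    """
    return max(math.ceil(num_threads / 4), 1)


def direct_memory_bytes(num_threads, num_merge_buffers,
                        buffer_bytes=DEFAULT_BUFFER_BYTES):
    """(numThreads + numMergeBuffers + 1) * buffer.sizeBytes, as quoted above."""
    return (num_threads + num_merge_buffers + 1) * buffer_bytes


if __name__ == "__main__":
    cores = 16  # hypothetical host
    threads = processing_threads(cores)
    buffers = merge_buffers(threads)
    print("heap bytes:         ", historical_heap_bytes(cores, lookup_bytes=GIB))
    print("numThreads:         ", threads)
    print("numMergeBuffers:    ", buffers)
    print("direct memory bytes:", direct_memory_bytes(threads, buffers))
```

For the hypothetical 16-core host with 1 GiB of lookups, this gives 15 processing threads, 4 merge buffers, a 10 GiB heap, and 10,000,000,000 bytes of direct memory. These are starting points to adjust, not hard limits.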
