Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/09 03:21:00 UTC
[jira] [Created] (KUDU-3147) Balance tablets based on range hash buckets
Grant Henke created KUDU-3147:
---------------------------------
Summary: Balance tablets based on range hash buckets
Key: KUDU-3147
URL: https://issues.apache.org/jira/browse/KUDU-3147
Project: Kudu
Issue Type: Improvement
Components: master, perf
Affects Versions: 1.12.0
Reporter: Grant Henke
When a user defines a schema that uses range + hash partitioning, it is often the case that the tablets in the latest range (based on time or any other semi-sequential data) are the only tablets that receive writes. Even when it is not the latest range, it is common for a single range to receive a burst of writes when backloading data.
This is so common that the default Kudu balancing scheme should consider placing/rebalancing the tablets for the hash buckets within each range on as many tservers as possible in order to support the maximum write throughput. In that case, `min(#buckets, #total-cluster-tservers)` tservers would handle the writes if the cluster is perfectly balanced. Today, even in a perfectly balanced cluster, it is possible for all the hash buckets of a range to be on a single tserver.
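To make the proposal concrete, here is a minimal sketch of one way to spread the hash buckets of a single range across tservers. It is not Kudu's rebalancer: the function name `place_range_buckets`, the `load` counter, and the tserver names are all hypothetical, and the sketch ignores replication (one tablet per bucket) for simplicity.

```python
from collections import Counter


def place_range_buckets(num_buckets, tservers, load=None):
    """Assign each hash bucket of one range to a tserver.

    Hypothetical helper for illustration only. `load` maps a tserver to
    the number of tablets it already hosts and is updated in place.
    """
    load = Counter() if load is None else load
    per_range = Counter()  # buckets of *this* range hosted by each tserver
    placement = {}
    for bucket in range(num_buckets):
        # Prefer the tserver holding the fewest buckets of this range,
        # then the one with the fewest tablets overall, then by name.
        target = min(tservers, key=lambda ts: (per_range[ts], load[ts], ts))
        placement[bucket] = target
        per_range[target] += 1
        load[target] += 1
    return placement


# Four buckets over three tservers: every tserver takes part in the writes.
placement = place_range_buckets(4, ["ts-a", "ts-b", "ts-c"])
distinct_tservers = len(set(placement.values()))
```

With four buckets and three tservers, `distinct_tservers` equals `min(4, 3)`, matching the expression above. Ordering by per-range count before overall load is the design choice that matters here: it keeps a range's buckets apart even when doing so leaves the total tablet counts slightly less even.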
--
This message was sent by Atlassian Jira
(v8.3.4#803005)