Posted to commits@doris.apache.org by ji...@apache.org on 2022/10/12 04:34:50 UTC

[doris-website] branch master updated: [typo](docs)fix some problem (#140)

This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new f105bfb7d96 [typo](docs)fix some problem (#140)
f105bfb7d96 is described below

commit f105bfb7d96b1ba3c0ad4f4af4d06b4dd396a8a7
Author: Liqf <10...@users.noreply.github.com>
AuthorDate: Wed Oct 12 12:34:46 2022 +0800

    [typo](docs)fix some problem (#140)
---
 docs/data-table/data-partition.md                                       | 2 +-
 .../docusaurus-plugin-content-docs/current/data-table/data-partition.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/data-table/data-partition.md b/docs/data-table/data-partition.md
index 441e43cee8e..dc733214e05 100644
--- a/docs/data-table/data-partition.md
+++ b/docs/data-table/data-partition.md
@@ -314,7 +314,7 @@ It is also possible to use only one layer of partitioning. When using a layer pa
 2. Bucket
 
     * If a Partition is used, the `DISTRIBUTED ...` statement describes the division rules for the data in each partition. If you do not use Partition, it describes the rules for dividing the data of the entire table.
-    * The bucket column can be multiple columns, but it must be a Key column. The bucket column can be the same or different from the Partition column.
+    * The bucket column can be multiple columns,Aggregate and Unique models must be key columns, and Duplicate models can be key columns and value columns. The bucket column can be the same or different from the Partition column.
     * The choice of bucket column is a trade-off between **query throughput** and **query concurrency**:
 
         1. If you select multiple bucket columns, the data is more evenly distributed. However, if the query condition does not include the equivalent condition for all bucket columns, a query will scan all buckets. The throughput of such queries will increase, and the latency of a single query will decrease. This method is suitable for large throughput and low concurrent query scenarios.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-table/data-partition.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-table/data-partition.md
index 1e3be4b31b0..f166ed543f5 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-table/data-partition.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-table/data-partition.md
@@ -321,7 +321,7 @@ Doris 支持两层的数据划分。第一层是 Partition,支持 Range 和 Li
 2. **Bucket**
 
    - 如果使用了 Partition,则 `DISTRIBUTED ...` 语句描述的是数据在**各个分区内**的划分规则。如果不使用 Partition,则描述的是对整个表的数据的划分规则。
-   - 分桶列可以是多列,但必须为 Key 列。分桶列可以和 Partition 列相同或不同。
+   - 分桶列可以是多列,Aggregate 和 Unique 模型必须为 Key 列,Duplicate 模型可以是 key 列和 value 列。分桶列可以和 Partition 列相同或不同。
    - 分桶列的选择,是在 **查询吞吐** 和 **查询并发** 之间的一种权衡:
      1. 如果选择多个分桶列,则数据分布更均匀。如果一个查询条件不包含所有分桶列的等值条件,那么该查询会触发所有分桶同时扫描,这样查询的吞吐会增加,单个查询的延迟随之降低。这个方式适合大吞吐低并发的查询场景。
      2. 如果仅选择一个或少数分桶列,则对应的点查询可以仅触发一个分桶扫描。此时,当多个点查询并发时,这些查询有较大的概率分别触发不同的分桶扫描,各个查询之间的IO影响较小(尤其当不同桶分布在不同磁盘上时),所以这种方式适合高并发的点查询场景。
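
Note on the change above: both hunks state the same rule. For Aggregate and Unique tables, bucket columns must be Key columns; for Duplicate tables, bucket columns may be Key or Value columns. The sketch below only restates that rule in Python so it can be checked mechanically. The function name `invalid_bucket_columns` and its arguments are hypothetical and not part of Doris; actual validation happens inside Doris when the table is created.

```python
# Hypothetical helper, NOT a Doris API: it only models the rule described
# in this commit's documentation change.
KEY_ONLY_MODELS = {"AGGREGATE", "UNIQUE"}


def invalid_bucket_columns(model, key_columns, bucket_columns):
    """Return the bucket columns that break the documented rule.

    model          -- "AGGREGATE", "UNIQUE" or "DUPLICATE" (case-insensitive)
    key_columns    -- names of the table's Key columns
    bucket_columns -- names listed in the DISTRIBUTED clause
    """
    model = model.upper()
    if model == "DUPLICATE":
        # Duplicate model: Key and Value columns are both allowed.
        return []
    if model in KEY_ONLY_MODELS:
        # Aggregate / Unique models: every bucket column must be a Key column.
        keys = set(key_columns)
        return [col for col in bucket_columns if col not in keys]
    raise ValueError("unknown table model: " + model)
```

For example, with Key columns `user_id` and `dt`, bucketing on a Value column `city` would be reported as invalid for a Unique table but accepted for a Duplicate table.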

