Posted to commits@doris.apache.org by yi...@apache.org on 2022/06/01 00:10:00 UTC

[incubator-doris] branch master updated: [tools][doc]add ssb flat tools, and update performance doc (#9827)

This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 7498233601 [tools][doc]add ssb flat tools, and update performance doc (#9827)
7498233601 is described below

commit 749823360157650f066e1bef9dc6aea0fe3ac3d6
Author: Dongyang Li <he...@qq.com>
AuthorDate: Wed Jun 1 08:09:55 2022 +0800

    [tools][doc]add ssb flat tools, and update performance doc (#9827)
---
 .../public/images/ssb_v11_v015_compare.png         | Bin 0 -> 41608 bytes
 docs/en/benchmark/ssb.md                           | 456 ++++++++++++++------
 docs/zh-CN/benchmark/ssb.md                        | 465 +++++++++++++++------
 tools/ssb-tools/ddl/create-ssb-flat-table.sql      | 103 ++---
 tools/ssb-tools/ddl/create-ssb-tables.sql          | 115 +++--
 tools/ssb-tools/load-ssb-dimension-data.sh         |   2 +-
 tools/ssb-tools/load-ssb-fact-data.sh              |   4 +
 tools/ssb-tools/load-ssb-flat-data.sh              |  79 ++--
 tools/ssb-tools/run-ssb-flat-queries.sh            |  17 +-
 tools/ssb-tools/run-ssb-queries.sh                 |  15 +-
 tools/ssb-tools/ssb-queries/q1.1.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q1.2.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q1.3.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q2.1.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q2.2.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q2.3.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q3.1.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q3.2.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q3.3.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q3.4.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q4.1.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q4.2.sql               |   2 +-
 tools/ssb-tools/ssb-queries/q4.3.sql               |   2 +-
 23 files changed, 843 insertions(+), 439 deletions(-)

diff --git a/docs/.vuepress/public/images/ssb_v11_v015_compare.png b/docs/.vuepress/public/images/ssb_v11_v015_compare.png
new file mode 100644
index 0000000000..7d127a14c4
Binary files /dev/null and b/docs/.vuepress/public/images/ssb_v11_v015_compare.png differ
diff --git a/docs/en/benchmark/ssb.md b/docs/en/benchmark/ssb.md
index 9b4b1dbc30..2464396ac9 100644
--- a/docs/en/benchmark/ssb.md
+++ b/docs/en/benchmark/ssb.md
@@ -26,173 +26,363 @@ under the License.
 
 # Star Schema Benchmark
 
-[Star Schema Benchmark(SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight data warehouse scenario performance test set. Based on [TPC-H](http://www.tpc.org/tpch/), SSB provides a simplified version of the star model data set, which is mainly used to test the performance of multi-table association queries under the star model.
+[Star Schema Benchmark (SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight data warehouse performance test set. Based on [TPC-H](http://www.tpc.org/tpch/), SSB provides a simplified star schema dataset, mainly used to test the performance of multi-table join queries under the star schema. In addition, the industry usually flattens SSB into a wide-table model (hereinafter: SSB flat) to test query engine performance; see the [Clickhouse](https://clickhouse.com/docs/zh/getting-started/example-datasets/star-schema) example.
 
-This document mainly introduces how to pass the preliminary performance test of the SSB process in Doris.
+This document mainly introduces the performance of Doris on the SSB test set.
 
-> Note 1: The standard test set including SSB is usually far from the actual business scenario, and some tests will perform parameter tuning for the test set. Therefore, the test results of the standard test set can only reflect the performance of the database in a specific scenario. It is recommended that users use actual business data for further testing.
+> Note 1: The standard test set including SSB is usually far from the actual business scenario, and some tests will perform parameter tuning for the test set. Therefore, the test results of the standard test set can only reflect the performance of the database in specific scenarios. Users are advised to conduct further testing with actual business data.
 >
-> Note 2: The operations involved in this document are all performed in the CentOS 7 environment.
+> Note 2: All operations in this document are performed in an Ubuntu Server 20.04 environment; CentOS 7 can also be used.
 
-## Environmental preparation
+On the 13 queries of the SSB standard test dataset, we ran a side-by-side comparison of the upcoming Doris 1.1 and Doris 0.15.3; overall performance improved by 2-3x.
 
-Please refer to the [official document](../install/install-deploy.md) to install and deploy Doris to obtain a normal running Doris cluster ( Contain at least 1 FE, 1 BE).
+![ssb_v11_v015_compare](/images/ssb_v11_v015_compare.png)
 
-The scripts involved in the following documents are all stored under `tools/ssb-tools/` in the Doris code base.
+## 1. Hardware Environment
 
-## data preparation
+| Number of machines | 4 Tencent Cloud hosts (1 FE, 3 BE)        |
+| ------------------ | ----------------------------------------- |
+| CPU                | AMD EPYC™ Milan (2.55GHz/3.5GHz) 16 cores |
+| Memory             | 64G                                       |
+| Network Bandwidth  | 7Gbps                                     |
+| Disk               | High-performance cloud disk               |
 
-### 1. Download and install the SSB data generation tool.
+## 2. Software Environment
 
-Execute the following script to download and compile the [ssb-dbgen](https://github.com/electrum/ssb-dbgen.git) tool.
-
-```
-sh build-ssb-dbgen.sh
-```
-
-After the installation is successful, the `dbgen` binary file will be generated in the `ssb-dbgen/` directory.
-
-### 2. Generate SSB test set
-
-Execute the following script to generate the SSB data set:
-
-```
-sh gen-ssb-data.sh -s 100 -c 100
-```
+- Doris deployment: 3 BEs and 1 FE
+- Kernel version: Linux version 5.4.0-96-generic (buildd@lgw01-amd64-051)
+- OS version: Ubuntu Server 20.04 LTS 64 bit
+- Doris software version: Apache Doris 1.1, Apache Doris 0.15
+- JDK: openjdk version "11.0.14" 2022-01-18
 
-> Note 1: Run `sh gen-ssb-data.sh -h` for help.
->
-> Note 2: The data will be generated under the directory `ssb-data/` with a suffix of `.tbl`. The total file size is about 60GB. The generation time may vary from a few minutes to an hour.
->
-> Note 3: `-s 100` means that the test set size factor is 100, `-c 100` means that 100 threads concurrently generate data in the lineorder table. The `-c` parameter also determines the number of files in the final lineorder table. The larger the parameter, the more files and the smaller each file.
+## 3. Test data volume
 
-Under the `-s 100` parameter, the generated data set size is:
+| SSB table name | Rows      | Remarks                       |
+| :------------- | :-------- | :---------------------------- |
+| lineorder      | 600037902 | Commodity order details table |
+| customer       | 3000000   | Customer information table    |
+| part           | 1400000   | Parts information table       |
+| supplier       | 200000    | Supplier information table    |
+| date           | 2556      | Date table                    |
+| lineorder_flat | 600037902 | Wide table after flattening   |
 
-|Table |Rows |Size | File Number |
-|---|---|---|---|
-|lineorder| 600 million (600037902) | 60GB | 100|
-|customer|30 million (3000000) |277M |1|
-|part|1.4 million (1400000) | 116M|1|
-|supplier|200,000 (200,000) |17M |1|
-|date| 2556|228K |1|
+## 4. Test Results
 
-### 3. Build a table
+Here we use the upcoming Doris-1.1 and Doris-0.15.3 for a comparative test. The test results are as follows:
 
-    0. Prepare the 'doris-cluster.conf' file.
+| Query | Doris-1.1(ms) | Doris-0.15.3(ms) |
+| ----- | ------------- | ---------------- |
+| Q1.1  | 90            | 250              |
+| Q1.2  | 10            | 30               |
+| Q1.3  | 70            | 120              |
+| Q2.1  | 360           | 900              |
+| Q2.2  | 340           | 1020             |
+| Q2.3  | 260           | 770              |
+| Q3.1  | 550           | 1710             |
+| Q3.2  | 290           | 670              |
+| Q3.3  | 240           | 550              |
+| Q3.4  | 20            | 30               |
+| Q4.1  | 480           | 1250             |
+| Q4.2  | 240           | 400              |
+| Q4.3  | 200           | 330              |
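As a quick arithmetic check of the headline 2-3x claim, the per-query speedup follows directly from the table; e.g. for Q3.1 (values in ms, taken from the rows above):

```shell
# speedup of Doris-1.1 over Doris-0.15.3 on Q3.1: 1710 ms -> 550 ms
awk 'BEGIN { printf "%.1fx\n", 1710 / 550 }'
```

This prints `3.1x`; the same one-liner works for any other row of the table.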
 
-         Before calling the load script, you need to write the FE's ip port and other information in the `doris-cluster.conf` file.
+**Interpretation of results**
 
-         'doris-cluster.conf' in the same directory as `load-ssb-dimension-data.sh`.
+- The dataset for these results is scale 100, about 600 million rows.
+- The test environment uses a configuration common among users: 4 cloud servers (16 cores, 64 GB, SSD) with 1 FE and 3 BEs deployed.
+- A common user configuration was chosen to lower selection and evaluation costs; the test does not actually consume all of these hardware resources.
+- Each result is the average of 3 executions, and the data has been fully compacted (if queries are run immediately after import, latency may be higher than these results; compaction speed is being continuously optimized and will improve significantly in the future).
 
-         The contents of the file include FE's ip, HTTP port, user name, password and the DB name of the data to be loaded:
+## 5. Environment Preparation
 
-         ````
-         export FE_HOST="xxx"
-         export FE_HTTP_PORT="8030"
-         export FE_QUERY_PORT="9030"
-         export USER="root"
-         export PASSWORD='xxx'
-         export DB="ssb"
-         ````
+Please refer to the [official document](../install/install-deploy.md) to install and deploy Doris and obtain a normally running Doris cluster (at least 1 FE and 1 BE; 1 FE and 3 BEs recommended).
 
-    1. Create SSB tables with the following command:
-    ```
-    sh create-ssb-tables.sh
-    ```
-    Or, mannually copy the table creation statement in [create-ssb-tables.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-tables.sql) and execute it in Doris.
+You can modify BE's configuration file be.conf to add the following configuration items and restart BE for better query performance.
 
-### 4. Import data
+```shell
+enable_storage_vectorization=true
+enable_low_cardinality_optimize=true
+```
 
-    1. Load 4 dimension table data (customer, part, supplier and date)
+The scripts used in the rest of this document are stored under `tools/ssb-tools/` in the Doris codebase.
 
-        Because the data volume of these 4 dimension tables is small, and the load is simpler, we use the following command to load the data of these 4 tables first:
+> **Notice:**
+>
+> The above two parameters do not exist in version 0.15.3 and do not need to be configured there.
 
-        `sh load-ssb-dimension-data.sh`
+## 6. Data Preparation
 
-    2. Load the fact table lineorder.
+### 6.1 Download and install the SSB data generation tool.
 
-        Load the lineorder table data with the following command:
+Execute the following script to download and compile the [ssb-dbgen](https://github.com/electrum/ssb-dbgen.git) tool.
 
-        `sh load-ssb-fact-data.sh -c 5`
+```shell
+sh build-ssb-dbgen.sh
+```
 
-        `-c 5` means to start 5 concurrent threads to import (the default is 3). In the case of a single BE node, the load time of lineorder data generated by `sh gen-ssb-data.sh -s 100 -c 100` using `sh load-ssb-fact-data.sh -c 3` is about 10 minutes. The memory overhead is about 5-6GB. If you turn on more threads, you can speed up the load speed, but it will increase additional memory overhead.
+After successful installation, the `dbgen` binary will be generated in the `ssb-dbgen/` directory.
 
-        > Note: To get a faster import speed, you can add `flush_thread_num_per_store=5` in be.conf and restart BE. This configuration indicates the number of disk write threads for each data directory, and the default is 2. Larger data can increase write data throughput, but may increase IO Util. (Reference value: 1 mechanical disk, when the default is 2, the IO Util during the import process is about 12%, when it is set to 5, the IO Util is about 26%. If it is an SSD disk, it is almost 0) .
+### 6.2 Generate SSB test set
 
-### 5. Check the loaded data
+Execute the following script to generate the SSB dataset:
 
-    ```
-    select count(*) from part;
-    select count(*) from customer;
-    select count(*) from supplier;
-    select count(*) from date;
-    select count(*) from lineorder;
-    ```
+```shell
+sh gen-ssb-data.sh -s 100 -c 100
+```
 
-    The amount of data should be the same as the number of rows of generated data.
+> Note 1: See script help with `sh gen-ssb-data.sh -h`.
+>
+> Note 2: The data will be generated in the `ssb-data/` directory with the suffix `.tbl`. The total file size is about 60GB. The generation time may vary from a few minutes to an hour.
+>
+> Note 3: `-s 100` indicates that the test set size factor is 100, `-c 100` indicates that 100 concurrent threads generate data for the lineorder table. The `-c` parameter also determines the number of files in the final lineorder table. The larger the parameter, the larger the number of files and the smaller each file.
+
+With the `-s 100` parameter, the resulting dataset size is:
+
+| Table     | Rows             | Size | File Number |
+| --------- | ---------------- | ---- | ----------- |
+| lineorder | 600 million (600037902) | 60GB | 100         |
+| customer  | 3 million (3000000)     | 277M | 1           |
+| part      | 1.4 million (1400000)   | 116M | 1           |
+| supplier  | 200 thousand (200000)   | 17M  | 1           |
+| date      | 2556             | 228K | 1           |
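Since `-c 100` splits lineorder across 100 `.tbl` chunk files, a simple way to confirm the generated row count is to sum the lines of all chunks. A minimal sketch (the helper name is ours; the `ssb-data/` path follows the note above):

```shell
# Sum the row counts of generated .tbl chunk files, e.g.:
#   total_rows ssb-data/lineorder.tbl.*
total_rows() {
    cat "$@" | wc -l | tr -d ' '
}
```

For the scale-100 dataset this should report 600037902 for the lineorder chunks.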
+
+### 6.3 Create tables
+
+#### 6.3.1 Prepare the `doris-cluster.conf` file.
+
+Before calling the import scripts, write the FE's IP, ports, and other connection information into the `doris-cluster.conf` file.
+
+The file is located in the same directory as `load-ssb-dimension-data.sh`.
+
+The file contains the FE's IP, HTTP port, username, password, and the name of the DB into which data will be imported:
+
+```shell
+export FE_HOST="xxx"
+export FE_HTTP_PORT="8030"
+export FE_QUERY_PORT="9030"
+export USER="root"
+export PASSWORD='xxx'
+export DB="ssb"
+```
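For reference, the load scripts are assumed to use these variables to drive Doris's Stream Load HTTP interface; a sketch of how the endpoint would be assembled (the helper function and the example values are ours, not part of the scripts):

```shell
# Example values standing in for the real cluster settings in doris-cluster.conf
export FE_HOST="127.0.0.1"
export FE_HTTP_PORT="8030"
export DB="ssb"

# Build the Stream Load endpoint for a table; an actual load would then run
# something like:
#   curl --location-trusted -u "$USER:$PASSWORD" \
#        -H "column_separator:|" -T ssb-data/customer.tbl "$(stream_load_url customer)"
stream_load_url() {
    echo "http://${FE_HOST}:${FE_HTTP_PORT}/api/${DB}/$1/_stream_load"
}

stream_load_url customer
```

With the example values this prints `http://127.0.0.1:8030/api/ssb/customer/_stream_load`.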
+
+#### 6.3.2 Execute the following script to create the SSB tables:
+
+```shell
+sh create-ssb-tables.sh
+```
+
+Or copy the table creation statements in [create-ssb-tables.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-tables.sql) and execute them in Doris.
+
+#### 6.3.3 Execute the following script to create the SSB flat table:
+
+```shell
+sh create-ssb-flat-table.sh
+```
+
+Or copy the table creation statement in [create-ssb-flat-table.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-flat-table.sql) and execute it in Doris.
+
+Below is the `lineorder_flat` table creation statement. The `lineorder_flat` table is created by the `create-ssb-flat-table.sh` script above with the default number of buckets (48). You can drop this table and adjust the bucket count to your cluster scale and node configuration for better test results.
+
+```sql
+CREATE TABLE `lineorder_flat` (
+  `LO_ORDERDATE` date NOT NULL COMMENT "",
+  `LO_ORDERKEY` int(11) NOT NULL COMMENT "",
+  `LO_LINENUMBER` tinyint(4) NOT NULL COMMENT "",
+  `LO_CUSTKEY` int(11) NOT NULL COMMENT "",
+  `LO_PARTKEY` int(11) NOT NULL COMMENT "",
+  `LO_SUPPKEY` int(11) NOT NULL COMMENT "",
+  `LO_ORDERPRIORITY` varchar(100) NOT NULL COMMENT "",
+  `LO_SHIPPRIORITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_QUANTITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_EXTENDEDPRICE` int(11) NOT NULL COMMENT "",
+  `LO_ORDTOTALPRICE` int(11) NOT NULL COMMENT "",
+  `LO_DISCOUNT` tinyint(4) NOT NULL COMMENT "",
+  `LO_REVENUE` int(11) NOT NULL COMMENT "",
+  `LO_SUPPLYCOST` int(11) NOT NULL COMMENT "",
+  `LO_TAX` tinyint(4) NOT NULL COMMENT "",
+  `LO_COMMITDATE` date NOT NULL COMMENT "",
+  `LO_SHIPMODE` varchar(100) NOT NULL COMMENT "",
+  `C_NAME` varchar(100) NOT NULL COMMENT "",
+  `C_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `C_CITY` varchar(100) NOT NULL COMMENT "",
+  `C_NATION` varchar(100) NOT NULL COMMENT "",
+  `C_REGION` varchar(100) NOT NULL COMMENT "",
+  `C_PHONE` varchar(100) NOT NULL COMMENT "",
+  `C_MKTSEGMENT` varchar(100) NOT NULL COMMENT "",
+  `S_NAME` varchar(100) NOT NULL COMMENT "",
+  `S_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `S_CITY` varchar(100) NOT NULL COMMENT "",
+  `S_NATION` varchar(100) NOT NULL COMMENT "",
+  `S_REGION` varchar(100) NOT NULL COMMENT "",
+  `S_PHONE` varchar(100) NOT NULL COMMENT "",
+  `P_NAME` varchar(100) NOT NULL COMMENT "",
+  `P_MFGR` varchar(100) NOT NULL COMMENT "",
+  `P_CATEGORY` varchar(100) NOT NULL COMMENT "",
+  `P_BRAND` varchar(100) NOT NULL COMMENT "",
+  `P_COLOR` varchar(100) NOT NULL COMMENT "",
+  `P_TYPE` varchar(100) NOT NULL COMMENT "",
+  `P_SIZE` tinyint(4) NOT NULL COMMENT "",
+  `P_CONTAINER` varchar(100) NOT NULL COMMENT ""
+) ENGINE=OLAP
+DUPLICATE KEY(`LO_ORDERDATE`, `LO_ORDERKEY`)
+COMMENT "OLAP"
+PARTITION BY RANGE(`LO_ORDERDATE`)
+(PARTITION p1 VALUES [('0000-01-01'), ('1993-01-01')),
+PARTITION p2 VALUES [('1993-01-01'), ('1994-01-01')),
+PARTITION p3 VALUES [('1994-01-01'), ('1995-01-01')),
+PARTITION p4 VALUES [('1995-01-01'), ('1996-01-01')),
+PARTITION p5 VALUES [('1996-01-01'), ('1997-01-01')),
+PARTITION p6 VALUES [('1997-01-01'), ('1998-01-01')),
+PARTITION p7 VALUES [('1998-01-01'), ('1999-01-01')))
+DISTRIBUTED BY HASH(`LO_ORDERKEY`) BUCKETS 48
+PROPERTIES (
+"replication_num" = "1",
+"colocate_with" = "groupxx1",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
+);
+```
 
-### 6. Run queries
+### 6.4 Import data
+
+#### 6.4.1 Import the 4 dimension tables
+
+Because the four dimension tables (customer, part, supplier, and date) have little data and the import is relatively simple, we use the following command to import them first:
+
+```shell
+sh load-ssb-dimension-data.sh
+```
+
+#### 6.4.2 Import the fact table lineorder
+
+Import the lineorder table data with the following command:
+
+```shell
+sh load-ssb-fact-data.sh -c 5
+```
+
+`-c 5` means starting 5 concurrent threads for the import (the default is 3). On a single BE node, importing the lineorder data generated by `sh gen-ssb-data.sh -s 100 -c 100` with `sh load-ssb-fact-data.sh -c 3` takes about 10 minutes, with a memory overhead of about 5-6GB. Starting more threads can speed up the import but adds extra memory overhead.
+
+> Note: For faster import speed, you can add `flush_thread_num_per_store=5` to be.conf and restart BE. This configuration sets the number of disk write threads per data directory (default 2). A larger value can improve write throughput but may increase IO Util. (Reference: with 1 mechanical disk, IO Util during import is about 12% at the default of 2 and about 26% when set to 5; on an SSD it is close to 0.)
+
+#### 6.4.3 Import the flat table
+
+Import the lineorder_flat table data with the following command:
+
+```shell
+sh load-ssb-flat-data.sh
+```
+
+> Note: The flat table data is imported via `INSERT INTO ... SELECT ...`.
 
-Execute the following script to run SSB queries:
+### 6.5 Check imported data
 
+```sql
+select count(*) from part;
+select count(*) from customer;
+select count(*) from supplier;
+select count(*) from date;
+select count(*) from lineorder;
+select count(*) from lineorder_flat;
 ```
-sh run-ssb-queries.sh
+
+The amount of data should match the number of generated rows.
+
+| Table          | Rows             | Origin Size | Compacted Size(1 Replica) |
+| -------------- | ---------------- | ----------- | ------------------------- |
+| lineorder_flat | 600 million (600037902) |             | 59.709 GB                 |
+| lineorder      | 600 million (600037902) | 60 GB       | 14.514 GB                 |
+| customer       | 3 million (3000000)     | 277 MB      | 138.247 MB                |
+| part           | 1.4 million (1400000)   | 116 MB      | 12.759 MB                 |
+| supplier       | 200 thousand (200000)   | 17 MB       | 9.143 MB                  |
+| date           | 2556             | 228 KB      | 34.276 KB                 |
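The table shows that columnar storage compresses lineorder substantially; the implied ratio (sizes in GB, taken from the rows above):

```shell
# lineorder: origin 60 GB vs compacted 14.514 GB (1 replica)
awk 'BEGIN { printf "%.1fx\n", 60 / 14.514 }'
```

This prints `4.1x`; note lineorder_flat barely compresses further, since flattening duplicates the dimension columns into every row.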
+
+### 6.6 Query test
+
+#### 6.6.1 Test SQL
+
+```sql
+--Q1.1
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE  LO_ORDERDATE >= 19930101  AND LO_ORDERDATE <= 19931231 AND LO_DISCOUNT BETWEEN 1 AND 3  AND LO_QUANTITY < 25;
+--Q1.2
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19940131  AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
+
+--Q1.3
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE  weekofyear(LO_ORDERDATE) = 6 AND LO_ORDERDATE >= 19940101  AND LO_ORDERDATE <= 19941231 AND LO_DISCOUNT BETWEEN 5 AND 7  AND LO_QUANTITY BETWEEN 26 AND 35;
+
+--Q2.1
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q2.2
+SELECT  SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat
+WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228'  AND S_REGION = 'ASIA'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q2.3
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat
+WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q3.1
+SELECT C_NATION, S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND LO_ORDERDATE >= 19920101  AND LO_ORDERDATE <= 19971231
+GROUP BY C_NATION, S_NATION, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.2
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.3
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.4
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19971201  AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q4.1
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, C_NATION, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND P_MFGR IN ('MFGR#1', 'MFGR#2')
+GROUP BY YEAR, C_NATION
+ORDER BY YEAR ASC, C_NATION ASC;
+
+--Q4.2
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR,S_NATION, P_CATEGORY, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_MFGR IN ('MFGR#1', 'MFGR#2')
+GROUP BY YEAR, S_NATION, P_CATEGORY
+ORDER BY YEAR ASC, S_NATION ASC, P_CATEGORY ASC;
+
+--Q4.3
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_CITY, P_BRAND, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_CATEGORY = 'MFGR#14'
+GROUP BY YEAR, S_CITY, P_BRAND
+ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;
 ```
 
-> Note 1: You can try to modify the serssion variables in the script to see the change in query latency.
->
-> Note 2: Run each query three times in a row, and take the average time as each query time.
-
-## Query test
-
-There are 4 groups of 14 SQL in the SSB test set. The query statement is in the [queries/](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ssb-queries) directory.
-
-## testing report
-
-The following test report is based on Doris [branch-0.15](https://github.com/apache/incubator-doris/tree/branch-0.15) branch code test, for reference only. (Update time: October 25, 2021)
-
-1. Hardware environment
-
-    * 1 FE + 1-3 BE mixed
-    * CPU: 96core, Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
-    * Memory: 384GB
-    * Hard disk: 1 HDD
-    * Network card: 10 Gigabit network card
-
-2. Data set
-
-    |Table |Rows |Origin Size | Compacted Size(1 Replica) |
-    |---|---|---|---|
-    |lineorder| 600 million (600037902) | 60 GB | 14.846 GB |
-    |customer|30 million (3000000) |277 MB | 414.741 MB |
-    |part|1.4 million (1.400000) | 116 MB | 38.277 MB |
-    |supplier|200,000 (200,000) |17 MB | 27.428 MB |
-    |date| 2556|228 KB | 275.804 KB |
-
-3. Test results
-
-    |Query |Time(ms) (1 BE) | Time(ms) (3 BE) | Parallelism | Runtime Filter Mode |
-    |---|---|---|---|---|
-    | q1.1 | 200 | 140 | 8 | IN |
-    | q1.2 | 90 | 80 | 8 | IN |
-    | q1.3 | 90 | 80 | 8 | IN |
-    | q2.1 | 1100 | 400 | 8 | BLOOM_FILTER |
-    | q2.2 | 900 | 330 | 8 | BLOOM_FILTER |
-    | q2.3 | 790 | 320 | 8 | BLOOM_FILTER |
-    | q3.1 | 3100 | 1280 | 8 | BLOOM_FILTER |
-    | q3.2 | 700 | 270 | 8 | BLOOM_FILTER |
-    | q3.3 | 540 | 270 | 8 | BLOOM_FILTER |
-    | q3.4 | 560 | 240 | 8 | BLOOM_FILTER |
-    | q4.1 | 2820 | 1150 | 8 | BLOOM_FILTER |
-    | q4.2 | 1430 | 670 | 8 | BLOOM_FILTER |
-    | q4.2 | 1750 | 1030 | 8 | BLOOM_FILTER |
-
-    > Note 1: "This test set is far from your production environment, please be skeptical!"
-    >
-    > Note 2: The test result is the average value of multiple executions (Page Cache will play a certain acceleration role). And the data has undergone sufficient compaction (if you test immediately after importing the data, the query delay may be higher than the test result)
-    >
-    > Note 3: Due to environmental constraints, the hardware specifications used in this test are relatively high, but so many hardware resources will not be consumed during the entire test. The memory consumption is within 10GB, and the CPU usage is within 10%.
-    >
-    > Note 4: Parallelism means query concurrency, which is set by `set parallel_fragment_exec_instance_num=8`.
-    >
-    > Note 5: Runtime Filter Mode is the type of Runtime Filter, set by `set runtime_filter_type="BLOOM_FILTER"`. ([Runtime Filter](../../advanced/join-optimization/runtime-filter.md) function has a significant effect on the SSB test set. Because in this test set, The data from the right table of Join can filter the left table very well. You can try to turn off this function through `set runtime_filter_mode=off` to see the change in query latency.)
diff --git a/docs/zh-CN/benchmark/ssb.md b/docs/zh-CN/benchmark/ssb.md
index d0f140ef12..88311d04e7 100644
--- a/docs/zh-CN/benchmark/ssb.md
+++ b/docs/zh-CN/benchmark/ssb.md
@@ -26,175 +26,364 @@ under the License.
 
 # Star Schema Benchmark
 
-[Star Schema Benchmark (SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight data warehouse performance test set. Based on [TPC-H](http://www.tpc.org/tpch/), SSB provides a simplified star schema dataset, mainly used to test the performance of multi-table join queries under the star schema.
+[Star Schema Benchmark (SSB)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF) is a lightweight data warehouse performance test set. Based on [TPC-H](http://www.tpc.org/tpch/), SSB provides a simplified star schema dataset, mainly used to test the performance of multi-table join queries under the star schema. In addition, the industry usually flattens SSB into a wide-table model (hereinafter: SSB flat) to test query engine performance; see the [Clickhouse](https://clickhouse.com/docs/zh/getting-started/example-datasets/star-schema) example.
 
-This document mainly introduces how to run a preliminary performance test with SSB in Doris.
+This document mainly introduces the performance of Doris on the SSB test set.
 
 > Note 1: Standard test sets, including SSB, are usually far removed from actual business scenarios, and some tests tune parameters for the test set, so results on standard test sets can only reflect database performance in specific scenarios. Users are advised to conduct further testing with actual business data.
-> 
+>
+> Note 2: All operations in this document are performed in an Ubuntu Server 20.04 environment; CentOS 7 can also be used.
+
+On the 13 queries of the SSB standard test dataset, we ran a side-by-side comparison of the upcoming Doris 1.1 and Doris 0.15.3; overall performance improved by 2-3x.
+
+![ssb_v11_v015_compare](/images/ssb_v11_v015_compare.png)
+
+## 1. Hardware Environment
+
+| Number of machines | 4 Tencent Cloud hosts (1 FE, 3 BE)        |
+| ------------------ | ----------------------------------------- |
+| CPU                | AMD EPYC™ Milan (2.55GHz/3.5GHz) 16 cores |
+| Memory             | 64G                                       |
+| Network bandwidth  | 7Gbps                                     |
+| Disk               | High-performance cloud disk               |
+
+## 2. Software Environment
+
+- Doris deployment: 3 BEs and 1 FE
+- Kernel version: Linux version 5.4.0-96-generic (buildd@lgw01-amd64-051)
+- OS version: Ubuntu Server 20.04 LTS 64-bit
+- Doris version: Apache Doris 1.1, Apache Doris 0.15
+- JDK: openjdk version "11.0.14" 2022-01-18
+
+## 3. Test Data Volume
+
+| SSB table name | Rows      | Remarks                       |
+| :------------- | :-------- | :---------------------------- |
+| lineorder      | 600037902 | Commodity order details table |
+| customer       | 3000000   | Customer information table    |
+| part           | 1400000   | Parts information table       |
+| supplier       | 200000    | Supplier information table    |
+| date           | 2556      | Date table                    |
+| lineorder_flat | 600037902 | Wide table after flattening   |
+
+## 4. Test Results
+
+Here we use the upcoming Doris-1.1 and Doris-0.15.3 for a comparative test. The results are as follows:
+
+| Query | Doris-1.1(ms) | Doris-0.15.3(ms) |
+| ----- | ------------- | ---------------- |
+| Q1.1  | 90            | 250              |
+| Q1.2  | 10            | 30               |
+| Q1.3  | 70            | 120              |
+| Q2.1  | 360           | 900              |
+| Q2.2  | 340           | 1020             |
+| Q2.3  | 260           | 770              |
+| Q3.1  | 550           | 1710             |
+| Q3.2  | 290           | 670              |
+| Q3.3  | 240           | 550              |
+| Q3.4  | 20            | 30               |
+| Q4.1  | 480           | 1250             |
+| Q4.2  | 240           | 400              |
+| Q4.3  | 200           | 330              |
+
+**Interpretation of results**
+
+- The dataset for these results is scale 100, about 600 million rows.
+- The test environment uses a configuration common among users: 4 cloud servers (16 cores, 64 GB, SSD) with 1 FE and 3 BEs deployed.
+- A common user configuration was chosen to lower selection and evaluation costs; the test does not actually consume all of these hardware resources.
+- Each result is the average of 3 executions, and the data has been fully compacted (if queries are run immediately after import, latency may be higher than these results; compaction speed is being continuously optimized and will improve significantly in the future).
+
+## 5. Environment Preparation
 
-## Environment Preparation
+Please refer to the [official document](../install/install-deploy.md) to install and deploy Doris and obtain a normally running Doris cluster (at least 1 FE and 1 BE; 1 FE and 3 BEs recommended).
 
-Please refer to the [official document](../install/install-deploy.md) to install and deploy Doris and obtain a normally running Doris cluster (at least 1 FE and 1 BE).
+You can modify the BE configuration file be.conf to add the following configuration items and restart BE for better query performance.
+
+```shell
+enable_storage_vectorization=true
+enable_low_cardinality_optimize=true
+```
 
 The scripts involved in this document are stored under `tools/ssb-tools/` in the Doris codebase.
 
-## Data Preparation
+> **Notice:**
+>
+> The above two parameters do not exist in version 0.15.3 and do not need to be configured there.
+
+## 6. Data Preparation
 
-### 1. Download and install the SSB data generation tool.
+### 6.1 Download and install the SSB data generation tool.
 
 Execute the following script to download and compile the [ssb-dbgen](https://github.com/electrum/ssb-dbgen.git) tool.
 
-```
+```shell
 sh build-ssb-dbgen.sh
 ```
 
 After successful installation, the `dbgen` binary will be generated in the `ssb-dbgen/` directory.
 
-### 2. Generate the SSB test set
+### 6.2 Generate the SSB test set
 
 Execute the following script to generate the SSB dataset:
 
-```
+```shell
 sh gen-ssb-data.sh -s 100 -c 100
 ```
 
 > Note 1: See the script help with `sh gen-ssb-data.sh -h`.
-> 
+>
 > Note 2: The data will be generated in the `ssb-data/` directory with the suffix `.tbl`. The total file size is about 60GB. Generation may take anywhere from a few minutes to an hour.
-> 
+>
 > Note 3: `-s 100` means the test set scale factor is 100; `-c 100` means 100 concurrent threads generate the lineorder table data. The `-c` parameter also determines the number of files of the final lineorder table: the larger the parameter, the more files and the smaller each file.
 
 With the `-s 100` parameter, the generated dataset size is:
 
-|Table |Rows |Size | File Number |
-|---|---|---|---|
-|lineorder| 600 million (600037902) | 60GB | 100|
-|customer| 3 million (3000000) |277M |1|
-|part| 1.4 million (1400000) | 116M|1|
-|supplier| 200 thousand (200000) |17M |1|
-|date| 2556|228K |1|
-
-### 3. Create tables
-
-    0. Prepare the 'doris-cluster.conf' file.
-    
-        Before invoking the load scripts, write the FE ip, port, and related information into the `doris-cluster.conf` file.
-        
-        The file sits in the same directory as `load-ssb-dimension-data.sh`.
-      
-        It contains the FE ip, HTTP port, username, password, and the name of the DB the data will be loaded into:
-      
-        ```
-        export FE_HOST="xxx"
-        export FE_HTTP_PORT="8030"
-        export FE_QUERY_PORT="9030"
-        export USER="root"
-        export PASSWORD='xxx'
-        export DB="ssb"
-        ```
-
-    1. Run the following script to create the SSB tables:
-
-    ```
-    sh create-ssb-tables.sh
-    ```
-    Or copy the table creation statements from [create-ssb-tables.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-tables.sql) and execute them in Doris.
-
-### 4. Load data
-
-    1. Load the 4 dimension tables (customer, part, supplier and date)
-    
-        Since these 4 dimension tables are small and simple to load, we use the following command to load them first:
-        
-        `sh load-ssb-dimension-data.sh`
-        
-    2. Load the fact table lineorder.
-
-        Load the lineorder table data with the following command:
-        
-        `sh load-ssb-fact-data.sh -c 5`
-        
-        `-c 5` starts 5 concurrent load threads (default: 3). On a single BE node, loading the lineorder data generated by `sh gen-ssb-data.sh -s 100 -c 100` with `sh load-ssb-fact-data.sh -c 3` takes about 10 minutes and roughly 5-6GB of memory. More threads speed up the load but add extra memory overhead.
-
-    > Note: For a faster load you can add `flush_thread_num_per_store=5` to be.conf and restart the BE. This setting is the number of disk-flush threads per data directory (default: 2). A larger value improves write throughput but may raise IO Util. (Reference: on 1 HDD, IO Util during loading is about 12% at the default of 2 and about 26% at 5; on an SSD it is nearly 0.)
-
-### 5. Verify the loaded data
-
-    ```
-    select count(*) from part;
-    select count(*) from customer;
-    select count(*) from supplier;
-    select count(*) from date;
-    select count(*) from lineorder;
-    ```
-    
-    The row counts should match the number of generated rows.
-
-### 6. Run queries
-
-Run the following script to execute the SSB queries:
+| Table     | Rows             | Size | File Number |
+| --------- | ---------------- | ---- | ----------- |
+| lineorder | 600 million (600037902) | 60GB | 100         |
+| customer  | 3 million (3000000)     | 277M | 1           |
+| part      | 1.4 million (1400000)   | 116M | 1           |
+| supplier  | 200 thousand (200000)   | 17M  | 1           |
+| date      | 2556                    | 228K | 1           |
+
+### 6.3 Create Tables
+
+#### 6.3.1 Prepare the `doris-cluster.conf` file
+
+Before invoking the load scripts, write the FE ip, port, and related information into the `doris-cluster.conf` file.

+The file sits in the same directory as `load-ssb-dimension-data.sh`.
+
+It contains the FE ip, HTTP port, username, password, and the name of the DB the data will be loaded into:
+
+```shell
+export FE_HOST="xxx"
+export FE_HTTP_PORT="8030"
+export FE_QUERY_PORT="9030"
+export USER="root"
+export PASSWORD='xxx'
+export DB="ssb"
 ```
-sh run-ssb-queries.sh
+
+#### 6.3.2 Run the following script to create the SSB tables:
+
+```shell
+sh create-ssb-tables.sh
 ```
+Or copy the table creation statements from [create-ssb-tables.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-tables.sql) and execute them in Doris.
 
-> Note 1: You can modify the session variables set in the script to observe the effect on results.
->
-> Note 2: The script runs each query three times in a row and reports the average time as the query's latency.
-    
-## Query Tests
-
-The SSB test suite has 4 groups with a total of 13 SQL queries. The query statements are in the [queries/](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ssb-queries) directory.
-
-## Test Report
-
-The following test report is based on the Doris [branch-0.15](https://github.com/apache/incubator-doris/tree/branch-0.15) branch code and is for reference only. (Updated: October 25, 2021)
-
-1. Hardware environment
-
-    * 1 FE + 1-3 BEs, co-located
-    * CPU: 96 cores, Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
-    * Memory: 384GB
-    * Disk: 1 HDD
-    * NIC: 10 Gigabit Ethernet
-
-2. Dataset
-
-    |Table |Rows |Origin Size | Compacted Size(1 Replica) |
-    |---|---|---|---|
-    |lineorder| 600 million (600037902) | 60 GB | 14.846 GB |
-    |customer| 3 million (3000000) |277 MB | 414.741 MB |
-    |part| 1.4 million (1400000) | 116 MB | 38.277 MB |
-    |supplier| 200 thousand (200000) |17 MB | 27.428 MB |
-    |date| 2556|228 KB | 275.804 KB |
-
-3. Test results
-
-    |Query |Time(ms) (1 BE) | Time(ms) (3 BE) | Parallelism | Runtime Filter Mode |
-    |---|---|---|---|---|
-    | q1.1 | 200 | 140 | 8 | IN |
-    | q1.2 | 90 | 80 | 8 | IN |
-    | q1.3 | 90 | 80 | 8 | IN |
-    | q2.1 | 1100 | 400 |  8 | BLOOM_FILTER |
-    | q2.2 | 900 | 330 | 8 | BLOOM_FILTER |
-    | q2.3 | 790 | 320 | 8 | BLOOM_FILTER |
-    | q3.1 | 3100 | 1280 | 8 | BLOOM_FILTER |
-    | q3.2 | 700 | 270 | 8 | BLOOM_FILTER |
-    | q3.3 | 540 | 270 | 8 | BLOOM_FILTER |
-    | q3.4 | 560 | 240 | 8 | BLOOM_FILTER |
-    | q4.1 | 2820 | 1150 | 8 | BLOOM_FILTER |
-    | q4.2 | 1430 | 670 | 8 | BLOOM_FILTER |
-    | q4.3 | 1750 | 1030 | 8 | BLOOM_FILTER |
-
-    > Note 1: "This test set is far removed from your production environment; treat it with skepticism!"
-    > 
-    > Note 2: Results are the average of multiple runs (the page cache provides some acceleration), and the data was fully compacted (if you test right after loading, query latency may be higher than these results).
-    >
-    > Note 3: Due to environment constraints this test used fairly high hardware specs, but the whole test run does not consume that many resources: memory usage stays under 10GB and CPU usage under 10%.
-    >
-    > Note 4: Parallelism is the query concurrency, set via `set parallel_fragment_exec_instance_num=8`.
-    >
-    > Note 5: Runtime Filter Mode is the Runtime Filter type, set via `set runtime_filter_type="BLOOM_FILTER"`. (The [Runtime Filter](../advanced/join-optimization/runtime-filter.md) feature is very effective on the SSB test set, because the right table of each join filters the left table well. You can try `set runtime_filter_mode=off` to disable it and observe the change in query latency.)
+#### 6.3.3 Run the following script to create the SSB flat table:
+
+```shell
+sh create-ssb-flat-table.sh
+```
+Or copy the table creation statements from [create-ssb-flat-table.sql](https://github.com/apache/incubator-doris/tree/master/tools/ssb-tools/ddl/create-ssb-flat-table.sql) and execute them in Doris.
+
+Below is the creation statement for the `lineorder_flat` table. The `create-ssb-flat-table.sh` script above creates the `lineorder_flat` table with the default bucket count (48 buckets). You can drop the table and adjust the bucket count to your cluster size and node configuration to get better test results.
+
+```sql
+CREATE TABLE `lineorder_flat` (
+  `LO_ORDERDATE` date NOT NULL COMMENT "",
+  `LO_ORDERKEY` int(11) NOT NULL COMMENT "",
+  `LO_LINENUMBER` tinyint(4) NOT NULL COMMENT "",
+  `LO_CUSTKEY` int(11) NOT NULL COMMENT "",
+  `LO_PARTKEY` int(11) NOT NULL COMMENT "",
+  `LO_SUPPKEY` int(11) NOT NULL COMMENT "",
+  `LO_ORDERPRIORITY` varchar(100) NOT NULL COMMENT "",
+  `LO_SHIPPRIORITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_QUANTITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_EXTENDEDPRICE` int(11) NOT NULL COMMENT "",
+  `LO_ORDTOTALPRICE` int(11) NOT NULL COMMENT "",
+  `LO_DISCOUNT` tinyint(4) NOT NULL COMMENT "",
+  `LO_REVENUE` int(11) NOT NULL COMMENT "",
+  `LO_SUPPLYCOST` int(11) NOT NULL COMMENT "",
+  `LO_TAX` tinyint(4) NOT NULL COMMENT "",
+  `LO_COMMITDATE` date NOT NULL COMMENT "",
+  `LO_SHIPMODE` varchar(100) NOT NULL COMMENT "",
+  `C_NAME` varchar(100) NOT NULL COMMENT "",
+  `C_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `C_CITY` varchar(100) NOT NULL COMMENT "",
+  `C_NATION` varchar(100) NOT NULL COMMENT "",
+  `C_REGION` varchar(100) NOT NULL COMMENT "",
+  `C_PHONE` varchar(100) NOT NULL COMMENT "",
+  `C_MKTSEGMENT` varchar(100) NOT NULL COMMENT "",
+  `S_NAME` varchar(100) NOT NULL COMMENT "",
+  `S_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `S_CITY` varchar(100) NOT NULL COMMENT "",
+  `S_NATION` varchar(100) NOT NULL COMMENT "",
+  `S_REGION` varchar(100) NOT NULL COMMENT "",
+  `S_PHONE` varchar(100) NOT NULL COMMENT "",
+  `P_NAME` varchar(100) NOT NULL COMMENT "",
+  `P_MFGR` varchar(100) NOT NULL COMMENT "",
+  `P_CATEGORY` varchar(100) NOT NULL COMMENT "",
+  `P_BRAND` varchar(100) NOT NULL COMMENT "",
+  `P_COLOR` varchar(100) NOT NULL COMMENT "",
+  `P_TYPE` varchar(100) NOT NULL COMMENT "",
+  `P_SIZE` tinyint(4) NOT NULL COMMENT "",
+  `P_CONTAINER` varchar(100) NOT NULL COMMENT ""
+) ENGINE=OLAP
+DUPLICATE KEY(`LO_ORDERDATE`, `LO_ORDERKEY`)
+COMMENT "OLAP"
+PARTITION BY RANGE(`LO_ORDERDATE`)
+(PARTITION p1 VALUES [('0000-01-01'), ('1993-01-01')),
+PARTITION p2 VALUES [('1993-01-01'), ('1994-01-01')),
+PARTITION p3 VALUES [('1994-01-01'), ('1995-01-01')),
+PARTITION p4 VALUES [('1995-01-01'), ('1996-01-01')),
+PARTITION p5 VALUES [('1996-01-01'), ('1997-01-01')),
+PARTITION p6 VALUES [('1997-01-01'), ('1998-01-01')),
+PARTITION p7 VALUES [('1998-01-01'), ('1999-01-01')))
+DISTRIBUTED BY HASH(`LO_ORDERKEY`) BUCKETS 48
+PROPERTIES (
+"replication_num" = "1",
+"colocate_with" = "groupxx1",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
+);
+```
+
+
+### 6.4 Load Data
+
+#### 6.4.1 Load the 4 dimension tables
+
+Since these 4 dimension tables (customer, part, supplier and date) are small and simple to load, we use the following command to load them first:
+
+```shell
+sh load-ssb-dimension-data.sh
+```
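+
+Each of these loads is a single HTTP Stream Load request. As a dry-run sketch, the call the script issues for one dimension table looks like the following (connection values and file path are placeholder assumptions, and the password is omitted):

```shell
# Dry run: print the Stream Load command for one dimension table.
# All connection values here are placeholder assumptions.
FE_HOST=127.0.0.1
FE_HTTP_PORT=8030
USER=root
DB=ssb
TBL=customer
FILE=ssb-data/customer.tbl
echo curl --location-trusted -u "$USER:" \
    -H 'column_separator:|' \
    -T "$FILE" "http://$FE_HOST:$FE_HTTP_PORT/api/$DB/$TBL/_stream_load"
```

Replace the placeholders with the values from your `doris-cluster.conf` and drop the leading `echo` to actually run the load.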
+
+#### 6.4.2 Load the fact table lineorder
+
+Load the lineorder table data with the following command:
+
+```shell
+sh load-ssb-fact-data.sh -c 5
+```
+
+`-c 5` starts 5 concurrent load threads (default: 3). On a single BE node, loading the lineorder data generated by `sh gen-ssb-data.sh -s 100 -c 100` with `sh load-ssb-fact-data.sh -c 3` takes about 10 minutes and roughly 5-6GB of memory. More threads speed up the load but add extra memory overhead.
+
+> Note: For a faster load you can add `flush_thread_num_per_store=5` to be.conf and restart the BE. This setting is the number of disk-flush threads per data directory (default: 2). A larger value improves write throughput but may raise IO Util. (Reference: on 1 HDD, IO Util during loading is about 12% at the default of 2 and about 26% at 5; on an SSD it is nearly 0.)
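+
+As a sketch, the setting can be added or updated idempotently; `BE_CONF` below defaults to a local `be.conf` and is an assumption, so point it at your BE's conf directory in a real deployment:

```shell
# Set flush_thread_num_per_store=5 in be.conf, updating the line if it
# already exists. BE_CONF defaults to ./be.conf here (an assumption).
BE_CONF="${BE_CONF:-be.conf}"
touch "$BE_CONF"
if grep -q '^flush_thread_num_per_store=' "$BE_CONF"; then
    sed -i 's/^flush_thread_num_per_store=.*/flush_thread_num_per_store=5/' "$BE_CONF"
else
    echo 'flush_thread_num_per_store=5' >>"$BE_CONF"
fi
# Restart the BE afterwards for the change to take effect.
```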
+
+#### 6.4.3 Load the flat table
+
+Load the lineorder_flat table data with the following command:
+
+```shell
+sh load-ssb-flat-data.sh
+```
+
+> Note: The flat table data is loaded with `INSERT INTO ... SELECT ...`.
+
+### 6.5 Verify the Loaded Data
+
+```sql
+select count(*) from part;
+select count(*) from customer;
+select count(*) from supplier;
+select count(*) from date;
+select count(*) from lineorder;
+select count(*) from lineorder_flat;
+```
+
+The row counts should match the number of generated rows.
+
+| Table          | Rows             | Origin Size | Compacted Size(1 Replica) |
+| -------------- | ---------------- | ----------- | ------------------------- |
+| lineorder_flat | 600 million (600037902) |             | 59.709 GB                 |
+| lineorder      | 600 million (600037902) | 60 GB       | 14.514 GB                 |
+| customer       | 3 million (3000000)     | 277 MB      | 138.247 MB                |
+| part           | 1.4 million (1400000)   | 116 MB      | 12.759 MB                 |
+| supplier       | 200 thousand (200000)   | 17 MB       | 9.143 MB                  |
+| date           | 2556                    | 228 KB      | 34.276 KB                 |
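+
+As a quick cross-check on the source side, the expected dimension-table counts can be read straight off the generated `.tbl` files (one row per line); a small sketch assuming the `ssb-data/` directory from the generation step:

```shell
# Print the row count of each generated dimension file; files that are not
# present (e.g. when run outside the ssb-tools directory) are reported as such.
for t in customer part supplier date; do
    f="ssb-data/${t}.tbl"
    if [ -f "$f" ]; then
        echo "$t: $(wc -l <"$f") rows"
    else
        echo "$t: $f not found"
    fi
done
```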
+
+### 6.6 Query Tests
+
+#### 6.6.1 Test SQL
+
+```sql
+--Q1.1
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE  LO_ORDERDATE >= 19930101  AND LO_ORDERDATE <= 19931231 AND LO_DISCOUNT BETWEEN 1 AND 3  AND LO_QUANTITY < 25;
+--Q1.2
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19940131  AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
+
+--Q1.3
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
+FROM lineorder_flat
+WHERE  weekofyear(LO_ORDERDATE) = 6 AND LO_ORDERDATE >= 19940101  AND LO_ORDERDATE <= 19941231 AND LO_DISCOUNT BETWEEN 5 AND 7  AND LO_QUANTITY BETWEEN 26 AND 35;
+
+--Q2.1
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q2.2
+SELECT  SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat
+WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228'  AND S_REGION = 'ASIA'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q2.3
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND
+FROM lineorder_flat
+WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE'
+GROUP BY YEAR, P_BRAND
+ORDER BY YEAR, P_BRAND;
+
+--Q3.1
+SELECT C_NATION, S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND LO_ORDERDATE >= 19920101  AND LO_ORDERDATE <= 19971231
+GROUP BY C_NATION, S_NATION, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.2
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.3
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q3.4
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue
+FROM lineorder_flat
+WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19971201  AND LO_ORDERDATE <= 19971231
+GROUP BY C_CITY, S_CITY, YEAR
+ORDER BY YEAR ASC, revenue DESC;
+
+--Q4.1
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, C_NATION, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND P_MFGR IN ('MFGR#1', 'MFGR#2')
+GROUP BY YEAR, C_NATION
+ORDER BY YEAR ASC, C_NATION ASC;
+
+--Q4.2
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR,S_NATION, P_CATEGORY, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_MFGR IN ('MFGR#1', 'MFGR#2')
+GROUP BY YEAR, S_NATION, P_CATEGORY
+ORDER BY YEAR ASC, S_NATION ASC, P_CATEGORY ASC;
+
+--Q4.3
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_CITY, P_BRAND, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit
+FROM lineorder_flat
+WHERE S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_CATEGORY = 'MFGR#14'
+GROUP BY YEAR, S_CITY, P_BRAND
+ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;
+```
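+
+The run scripts execute every query 3 times and report the average wall time (via mysqlslap in the updated scripts). That measurement pattern can be sketched standalone in plain shell; `run_timed` is a hypothetical helper, not part of the tools, and millisecond resolution requires GNU `date`:

```shell
# Time a command in milliseconds, the same pattern the old run scripts
# used around each mysql invocation. Requires GNU date for %N.
run_timed() {
    local start end
    start=$(date +%s%3N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%3N)
    echo "$((end - start))"
}
elapsed_ms=$(run_timed sleep 0.2)
echo "command took ${elapsed_ms}ms"
```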
 
diff --git a/tools/ssb-tools/ddl/create-ssb-flat-table.sql b/tools/ssb-tools/ddl/create-ssb-flat-table.sql
index 723cf2cdcf..d80e8894cc 100644
--- a/tools/ssb-tools/ddl/create-ssb-flat-table.sql
+++ b/tools/ssb-tools/ddl/create-ssb-flat-table.sql
@@ -16,54 +16,59 @@
 -- under the License.
 
 CREATE TABLE `lineorder_flat` (
-  `lo_orderkey` bigint(20) NOT NULL COMMENT "",
-  `lo_linenumber` bigint(20) NOT NULL COMMENT "",
-  `lo_custkey` int(11) NOT NULL COMMENT "",
-  `lo_partkey` int(11) NOT NULL COMMENT "",
-  `lo_suppkey` int(11) NOT NULL COMMENT "",
-  `lo_orderdate` int(11) NOT NULL COMMENT "",
-  `lo_orderpriority` varchar(16) NOT NULL COMMENT "",
-  `lo_shippriority` int(11) NOT NULL COMMENT "",
-  `lo_quantity` bigint(20) NOT NULL COMMENT "",
-  `lo_extendedprice` bigint(20) NOT NULL COMMENT "",
-  `lo_ordtotalprice` bigint(20) NOT NULL COMMENT "",
-  `lo_discount` bigint(20) NOT NULL COMMENT "",
-  `lo_revenue` bigint(20) NOT NULL COMMENT "",
-  `lo_supplycost` bigint(20) NOT NULL COMMENT "",
-  `lo_tax` bigint(20) NOT NULL COMMENT "",
-  `lo_commitdate` bigint(20) NOT NULL COMMENT "",
-  `lo_shipmode` varchar(11) NOT NULL COMMENT "",
-  `c_name` varchar(26) NOT NULL COMMENT "",
-  `c_address` varchar(41) NOT NULL COMMENT "",
-  `c_city` varchar(11) NOT NULL COMMENT "",
-  `c_nation` varchar(16) NOT NULL COMMENT "",
-  `c_region` varchar(13) NOT NULL COMMENT "",
-  `c_phone` varchar(16) NOT NULL COMMENT "",
-  `c_mktsegment` varchar(11) NOT NULL COMMENT "",
-  `s_name` varchar(26) NOT NULL COMMENT "",
-  `s_address` varchar(26) NOT NULL COMMENT "",
-  `s_city` varchar(11) NOT NULL COMMENT "",
-  `s_nation` varchar(16) NOT NULL COMMENT "",
-  `s_region` varchar(13) NOT NULL COMMENT "",
-  `s_phone` varchar(16) NOT NULL COMMENT "",
-  `p_name` varchar(23) NOT NULL COMMENT "",
-  `p_mfgr` varchar(7) NOT NULL COMMENT "",
-  `p_category` varchar(8) NOT NULL COMMENT "",
-  `p_brand` varchar(10) NOT NULL COMMENT "",
-  `p_color` varchar(12) NOT NULL COMMENT "",
-  `p_type` varchar(26) NOT NULL COMMENT "",
-  `p_size` int(11) NOT NULL COMMENT "",
-  `p_container` varchar(11) NOT NULL COMMENT ""
-)
-PARTITION BY RANGE(`lo_orderdate`)
-(PARTITION p1992 VALUES [("-2147483648"), ("19930101")),
-PARTITION p1993 VALUES [("19930101"), ("19940101")),
-PARTITION p1994 VALUES [("19940101"), ("19950101")),
-PARTITION p1995 VALUES [("19950101"), ("19960101")),
-PARTITION p1996 VALUES [("19960101"), ("19970101")),
-PARTITION p1997 VALUES [("19970101"), ("19980101")),
-PARTITION p1998 VALUES [("19980101"), ("19990101")))
-DISTRIBUTED BY HASH(`lo_orderkey`) BUCKETS 48
+  `LO_ORDERDATE` date NOT NULL COMMENT "",
+  `LO_ORDERKEY` int(11) NOT NULL COMMENT "",
+  `LO_LINENUMBER` tinyint(4) NOT NULL COMMENT "",
+  `LO_CUSTKEY` int(11) NOT NULL COMMENT "",
+  `LO_PARTKEY` int(11) NOT NULL COMMENT "",
+  `LO_SUPPKEY` int(11) NOT NULL COMMENT "",
+  `LO_ORDERPRIORITY` varchar(100) NOT NULL COMMENT "",
+  `LO_SHIPPRIORITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_QUANTITY` tinyint(4) NOT NULL COMMENT "",
+  `LO_EXTENDEDPRICE` int(11) NOT NULL COMMENT "",
+  `LO_ORDTOTALPRICE` int(11) NOT NULL COMMENT "",
+  `LO_DISCOUNT` tinyint(4) NOT NULL COMMENT "",
+  `LO_REVENUE` int(11) NOT NULL COMMENT "",
+  `LO_SUPPLYCOST` int(11) NOT NULL COMMENT "",
+  `LO_TAX` tinyint(4) NOT NULL COMMENT "",
+  `LO_COMMITDATE` date NOT NULL COMMENT "",
+  `LO_SHIPMODE` varchar(100) NOT NULL COMMENT "",
+  `C_NAME` varchar(100) NOT NULL COMMENT "",
+  `C_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `C_CITY` varchar(100) NOT NULL COMMENT "",
+  `C_NATION` varchar(100) NOT NULL COMMENT "",
+  `C_REGION` varchar(100) NOT NULL COMMENT "",
+  `C_PHONE` varchar(100) NOT NULL COMMENT "",
+  `C_MKTSEGMENT` varchar(100) NOT NULL COMMENT "",
+  `S_NAME` varchar(100) NOT NULL COMMENT "",
+  `S_ADDRESS` varchar(100) NOT NULL COMMENT "",
+  `S_CITY` varchar(100) NOT NULL COMMENT "",
+  `S_NATION` varchar(100) NOT NULL COMMENT "",
+  `S_REGION` varchar(100) NOT NULL COMMENT "",
+  `S_PHONE` varchar(100) NOT NULL COMMENT "",
+  `P_NAME` varchar(100) NOT NULL COMMENT "",
+  `P_MFGR` varchar(100) NOT NULL COMMENT "",
+  `P_CATEGORY` varchar(100) NOT NULL COMMENT "",
+  `P_BRAND` varchar(100) NOT NULL COMMENT "",
+  `P_COLOR` varchar(100) NOT NULL COMMENT "",
+  `P_TYPE` varchar(100) NOT NULL COMMENT "",
+  `P_SIZE` tinyint(4) NOT NULL COMMENT "",
+  `P_CONTAINER` varchar(100) NOT NULL COMMENT ""
+) ENGINE=OLAP
+DUPLICATE KEY(`LO_ORDERDATE`, `LO_ORDERKEY`)
+COMMENT "OLAP"
+PARTITION BY RANGE(`LO_ORDERDATE`)
+(PARTITION p1 VALUES [('0000-01-01'), ('1993-01-01')),
+PARTITION p2 VALUES [('1993-01-01'), ('1994-01-01')),
+PARTITION p3 VALUES [('1994-01-01'), ('1995-01-01')),
+PARTITION p4 VALUES [('1995-01-01'), ('1996-01-01')),
+PARTITION p5 VALUES [('1996-01-01'), ('1997-01-01')),
+PARTITION p6 VALUES [('1997-01-01'), ('1998-01-01')),
+PARTITION p7 VALUES [('1998-01-01'), ('1999-01-01')))
+DISTRIBUTED BY HASH(`LO_ORDERKEY`) BUCKETS 48
 PROPERTIES (
-"replication_num" = "1"
+"replication_num" = "1",
+"colocate_with" = "groupxx1",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
 );
\ No newline at end of file
diff --git a/tools/ssb-tools/ddl/create-ssb-tables.sql b/tools/ssb-tools/ddl/create-ssb-tables.sql
index 5e99f89146..273d2d90b9 100644
--- a/tools/ssb-tools/ddl/create-ssb-tables.sql
+++ b/tools/ssb-tools/ddl/create-ssb-tables.sql
@@ -15,39 +15,44 @@
 -- specific language governing permissions and limitations
 -- under the License.
 
-CREATE TABLE `lineorder` (
-  `lo_orderkey` bigint(20) NOT NULL COMMENT "",
-  `lo_linenumber` bigint(20) NOT NULL COMMENT "",
+CREATE TABLE IF NOT EXISTS `lineorder` (
+  `lo_orderkey` int(11) NOT NULL COMMENT "",
+  `lo_linenumber` int(11) NOT NULL COMMENT "",
   `lo_custkey` int(11) NOT NULL COMMENT "",
   `lo_partkey` int(11) NOT NULL COMMENT "",
   `lo_suppkey` int(11) NOT NULL COMMENT "",
   `lo_orderdate` int(11) NOT NULL COMMENT "",
   `lo_orderpriority` varchar(16) NOT NULL COMMENT "",
   `lo_shippriority` int(11) NOT NULL COMMENT "",
-  `lo_quantity` bigint(20) NOT NULL COMMENT "",
-  `lo_extendedprice` bigint(20) NOT NULL COMMENT "",
-  `lo_ordtotalprice` bigint(20) NOT NULL COMMENT "",
-  `lo_discount` bigint(20) NOT NULL COMMENT "",
-  `lo_revenue` bigint(20) NOT NULL COMMENT "",
-  `lo_supplycost` bigint(20) NOT NULL COMMENT "",
-  `lo_tax` bigint(20) NOT NULL COMMENT "",
-  `lo_commitdate` bigint(20) NOT NULL COMMENT "",
+  `lo_quantity` int(11) NOT NULL COMMENT "",
+  `lo_extendedprice` int(11) NOT NULL COMMENT "",
+  `lo_ordtotalprice` int(11) NOT NULL COMMENT "",
+  `lo_discount` int(11) NOT NULL COMMENT "",
+  `lo_revenue` int(11) NOT NULL COMMENT "",
+  `lo_supplycost` int(11) NOT NULL COMMENT "",
+  `lo_tax` int(11) NOT NULL COMMENT "",
+  `lo_commitdate` int(11) NOT NULL COMMENT "",
   `lo_shipmode` varchar(11) NOT NULL COMMENT ""
-)
+) ENGINE=OLAP
+DUPLICATE KEY(`lo_orderkey`)
+COMMENT "OLAP"
 PARTITION BY RANGE(`lo_orderdate`)
-(PARTITION p1992 VALUES [("-2147483648"), ("19930101")),
-PARTITION p1993 VALUES [("19930101"), ("19940101")),
-PARTITION p1994 VALUES [("19940101"), ("19950101")),
-PARTITION p1995 VALUES [("19950101"), ("19960101")),
-PARTITION p1996 VALUES [("19960101"), ("19970101")),
-PARTITION p1997 VALUES [("19970101"), ("19980101")),
-PARTITION p1998 VALUES [("19980101"), ("19990101")))
+(PARTITION p1 VALUES [("-2147483648"), ("19930101")),
+PARTITION p2 VALUES [("19930101"), ("19940101")),
+PARTITION p3 VALUES [("19940101"), ("19950101")),
+PARTITION p4 VALUES [("19950101"), ("19960101")),
+PARTITION p5 VALUES [("19960101"), ("19970101")),
+PARTITION p6 VALUES [("19970101"), ("19980101")),
+PARTITION p7 VALUES [("19980101"), ("19990101")))
 DISTRIBUTED BY HASH(`lo_orderkey`) BUCKETS 48
 PROPERTIES (
-"replication_num" = "1"
+"replication_num" = "1",
+"colocate_with" = "groupa1",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
 );
 
-CREATE TABLE `customer` (
+CREATE TABLE IF NOT EXISTS `customer` (
   `c_custkey` int(11) NOT NULL COMMENT "",
   `c_name` varchar(26) NOT NULL COMMENT "",
   `c_address` varchar(41) NOT NULL COMMENT "",
@@ -56,13 +61,18 @@ CREATE TABLE `customer` (
   `c_region` varchar(13) NOT NULL COMMENT "",
   `c_phone` varchar(16) NOT NULL COMMENT "",
   `c_mktsegment` varchar(11) NOT NULL COMMENT ""
-)
-DISTRIBUTED BY HASH(`c_custkey`) BUCKETS 10
+) ENGINE=OLAP
+DUPLICATE KEY(`c_custkey`)
+COMMENT "OLAP"
+DISTRIBUTED BY HASH(`c_custkey`) BUCKETS 12
 PROPERTIES (
-"replication_num" = "1"
+"replication_num" = "1",
+"colocate_with" = "groupa2",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
 );
 
-CREATE TABLE `date` (
+CREATE TABLE IF NOT EXISTS `dates` (
   `d_datekey` int(11) NOT NULL COMMENT "",
   `d_date` varchar(20) NOT NULL COMMENT "",
   `d_dayofweek` varchar(10) NOT NULL COMMENT "",
@@ -80,13 +90,37 @@ CREATE TABLE `date` (
   `d_lastdayinmonthfl` int(11) NOT NULL COMMENT "",
   `d_holidayfl` int(11) NOT NULL COMMENT "",
   `d_weekdayfl` int(11) NOT NULL COMMENT ""
-)
+) ENGINE=OLAP
+DUPLICATE KEY(`d_datekey`)
+COMMENT "OLAP"
 DISTRIBUTED BY HASH(`d_datekey`) BUCKETS 1
 PROPERTIES (
-"replication_num" = "1"
+"replication_num" = "1",
+"in_memory" = "false",
+"colocate_with" = "groupa3",
+"storage_format" = "DEFAULT"
 );
 
-CREATE TABLE `part` (
+CREATE TABLE IF NOT EXISTS `supplier` (
+  `s_suppkey` int(11) NOT NULL COMMENT "",
+  `s_name` varchar(26) NOT NULL COMMENT "",
+  `s_address` varchar(26) NOT NULL COMMENT "",
+  `s_city` varchar(11) NOT NULL COMMENT "",
+  `s_nation` varchar(16) NOT NULL COMMENT "",
+  `s_region` varchar(13) NOT NULL COMMENT "",
+  `s_phone` varchar(16) NOT NULL COMMENT ""
+) ENGINE=OLAP
+DUPLICATE KEY(`s_suppkey`)
+COMMENT "OLAP"
+DISTRIBUTED BY HASH(`s_suppkey`) BUCKETS 12
+PROPERTIES (
+"replication_num" = "1",
+"colocate_with" = "groupa4",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
+);
+
+CREATE TABLE IF NOT EXISTS `part` (
   `p_partkey` int(11) NOT NULL COMMENT "",
   `p_name` varchar(23) NOT NULL COMMENT "",
   `p_mfgr` varchar(7) NOT NULL COMMENT "",
@@ -96,22 +130,13 @@ CREATE TABLE `part` (
   `p_type` varchar(26) NOT NULL COMMENT "",
   `p_size` int(11) NOT NULL COMMENT "",
   `p_container` varchar(11) NOT NULL COMMENT ""
-)
-DISTRIBUTED BY HASH(`p_partkey`) BUCKETS 10
-PROPERTIES (
-"replication_num" = "1"
-);
-
-CREATE TABLE `supplier` (
-  `s_suppkey` int(11) NOT NULL COMMENT "",
-  `s_name` varchar(26) NOT NULL COMMENT "",
-  `s_address` varchar(26) NOT NULL COMMENT "",
-  `s_city` varchar(11) NOT NULL COMMENT "",
-  `s_nation` varchar(16) NOT NULL COMMENT "",
-  `s_region` varchar(13) NOT NULL COMMENT "",
-  `s_phone` varchar(16) NOT NULL COMMENT ""
-)
-DISTRIBUTED BY HASH(`s_suppkey`) BUCKETS 10
+) ENGINE=OLAP
+DUPLICATE KEY(`p_partkey`)
+COMMENT "OLAP"
+DISTRIBUTED BY HASH(`p_partkey`) BUCKETS 12
 PROPERTIES (
-"replication_num" = "1"
+"replication_num" = "1",
+"colocate_with" = "groupa5",
+"in_memory" = "false",
+"storage_format" = "DEFAULT"
 );
diff --git a/tools/ssb-tools/load-ssb-dimension-data.sh b/tools/ssb-tools/load-ssb-dimension-data.sh
index 9e438ee213..81d3f900ed 100755
--- a/tools/ssb-tools/load-ssb-dimension-data.sh
+++ b/tools/ssb-tools/load-ssb-dimension-data.sh
@@ -71,7 +71,7 @@ echo 'Loading data for table: date'
 curl --location-trusted -u $USER:$PASSWORD \
     -H "column_separator:|" \
     -H "columns:d_datekey,d_date,d_dayofweek,d_month,d_year,d_yearmonthnum,d_yearmonth,d_daynuminweek,d_daynuminmonth,d_daynuminyear,d_monthnuminyear,d_weeknuminyear,d_sellingseason,d_lastdayinweekfl,d_lastdayinmonthfl,d_holidayfl,d_weekdayfl,d_dummy" \
-    -T $SSB_DATA_DIR/date.tbl http://$FE_HOST:$FE_HTTP_PORT/api/$DB/date/_stream_load
+    -T $SSB_DATA_DIR/date.tbl http://$FE_HOST:$FE_HTTP_PORT/api/$DB/dates/_stream_load
 
 echo 'Loading data for table: supplier'
 curl --location-trusted -u $USER:$PASSWORD \
diff --git a/tools/ssb-tools/load-ssb-fact-data.sh b/tools/ssb-tools/load-ssb-fact-data.sh
index 17aa3b6252..e6cd55e2aa 100755
--- a/tools/ssb-tools/load-ssb-fact-data.sh
+++ b/tools/ssb-tools/load-ssb-fact-data.sh
@@ -131,6 +131,7 @@ for ((i = 1; i <= $PARALLEL; i++)); do
     echo >&3
 done
 
+echo $(date)
 for file in $(ls $SSB_DATA_DIR/lineorder.tbl.*); do
     read -u3
     {
@@ -141,3 +142,6 @@ done
 
 # wait for child thread finished
 wait
+
+echo $(date)
+echo "DONE."
diff --git a/tools/ssb-tools/load-ssb-flat-data.sh b/tools/ssb-tools/load-ssb-flat-data.sh
index 268c1c48b6..42c9882884 100755
--- a/tools/ssb-tools/load-ssb-flat-data.sh
+++ b/tools/ssb-tools/load-ssb-flat-data.sh
@@ -30,7 +30,6 @@ ROOT=$(
 )
 
 CURDIR=${ROOT}
-SSB_DATA_DIR=$CURDIR/ssb-data/
 
 usage() {
     echo "
@@ -93,44 +92,44 @@ load_lineitem_flat() {
         run_sql "
 INSERT INTO lineorder_flat
 SELECT
-    lo_orderkey,
-    lo_linenumber,
-    lo_custkey,
-    lo_partkey,
-    lo_suppkey,
-    lo_orderdate,
-    lo_orderpriority,
-    lo_shippriority,
-    lo_quantity,
-    lo_extendedprice,
-    lo_ordtotalprice,
-    lo_discount,
-    lo_revenue,
-    lo_supplycost,
-    lo_tax,
-    lo_commitdate,
-    lo_shipmode,
-    c_name,
-    c_address,
-    c_city,
-    c_nation,
-    c_region,
-    c_phone,
-    c_mktsegment,
-    s_name,
-    s_address,
-    s_city,
-    s_nation,
-    s_region,
-    s_phone,
-    p_name,
-    p_mfgr,
-    p_category,
-    p_brand,
-    p_color,
-    p_type,
-    p_size,
-    p_container
+    LO_ORDERDATE,
+    LO_ORDERKEY,
+    LO_LINENUMBER,
+    LO_CUSTKEY,
+    LO_PARTKEY,
+    LO_SUPPKEY,
+    LO_ORDERPRIORITY,
+    LO_SHIPPRIORITY,
+    LO_QUANTITY,
+    LO_EXTENDEDPRICE,
+    LO_ORDTOTALPRICE,
+    LO_DISCOUNT,
+    LO_REVENUE,
+    LO_SUPPLYCOST,
+    LO_TAX,
+    LO_COMMITDATE,
+    LO_SHIPMODE,
+    C_NAME,
+    C_ADDRESS,
+    C_CITY,
+    C_NATION,
+    C_REGION,
+    C_PHONE,
+    C_MKTSEGMENT,
+    S_NAME,
+    S_ADDRESS,
+    S_CITY,
+    S_NATION,
+    S_REGION,
+    S_PHONE,
+    P_NAME,
+    P_MFGR,
+    P_CATEGORY,
+    P_BRAND,
+    P_COLOR,
+    P_TYPE,
+    P_SIZE,
+    P_CONTAINER
 FROM (
     SELECT
         lo_orderkey,
@@ -185,6 +184,7 @@ run_sql "set global query_timeout=7200;"
 run_sql "set global parallel_fragment_exec_instance_num=1;"
 
 echo '============================================'
+echo $(date)
 load_lineitem_flat
 
 echo '============================================'
@@ -193,4 +193,5 @@ run_sql "set global query_timeout=${origin_query_timeout};"
 run_sql "set global parallel_fragment_exec_instance_num=${origin_parallel};"
 
 echo '============================================'
+echo $(date)
 echo "DONE."
diff --git a/tools/ssb-tools/run-ssb-flat-queries.sh b/tools/ssb-tools/run-ssb-flat-queries.sh
index bf88072d8b..8a423dedcd 100755
--- a/tools/ssb-tools/run-ssb-flat-queries.sh
+++ b/tools/ssb-tools/run-ssb-flat-queries.sh
@@ -34,7 +34,7 @@ QUERIES_DIR=$CURDIR/ssb-flat-queries
 usage() {
     echo "
 This script is used to run the 13 SSB flat queries,
-will use mysql client to connect Doris server which parameter is specified in doris-cluster.conf file.
+will use mysql client to connect Doris server which parameter is specified in 'doris-cluster.conf' file.
 Usage: $0 
   "
     exit 1
@@ -84,7 +84,7 @@ check_prerequest() {
     fi
 }
 
-check_prerequest "mysql --version" "mysql"
+check_prerequest "mysqlslap --version" "mysqlslap"
 
 source $CURDIR/doris-cluster.conf
 export MYSQL_PWD=$PASSWORD
@@ -109,13 +109,8 @@ pre_set "show variables"
 echo '============================================'
 
 for i in '1.1' '1.2' '1.3' '2.1' '2.2' '2.3' '3.1' '3.2' '3.3' '3.4' '4.1' '4.2' '4.3'; do
-    total=0
-    # Each query is executed three times and takes the average time
-    for j in $(seq 1 3); do
-        start=$(date +%s%3N)
-        mysql -h$FE_HOST -u $USER -P$FE_QUERY_PORT -D$DB <$QUERIES_DIR/q${i}.sql >/dev/null
-        end=$(date +%s%3N)
-        total=$((total + end - start))
-    done
-    echo "q$i: $((total / 3))ms"
+    # Each query is executed 3 times and takes the average time
+    res=$(mysqlslap -h$FE_HOST -u$USER -P$FE_QUERY_PORT --create-schema=$DB --query=$QUERIES_DIR/q${i}.sql -F '\r' -i 3 | sed -n '2p' | cut -d ' ' -f 9,10)
+    echo "q$i: $res"
+    sleep 1
 done
diff --git a/tools/ssb-tools/run-ssb-queries.sh b/tools/ssb-tools/run-ssb-queries.sh
index 75cbbb089d..6729186b26 100755
--- a/tools/ssb-tools/run-ssb-queries.sh
+++ b/tools/ssb-tools/run-ssb-queries.sh
@@ -84,7 +84,7 @@ check_prerequest() {
     fi
 }
 
-check_prerequest "mysql --version" "mysql"
+check_prerequest "mysqlslap --version" "mysqlslap"
 
 source $CURDIR/doris-cluster.conf
 export MYSQL_PWD=$PASSWORD
@@ -106,18 +106,13 @@ pre_set "set global exec_mem_limit=48G;"
 pre_set "set global batch_size=4096;"
 pre_set "set global enable_projection=true;"
 pre_set "set global runtime_filter_mode=global;"
+# pre_set "set global enable_cost_based_join_reorder=1"
 echo '============================================'
 pre_set "show variables"
 echo '============================================'
 
 for i in '1.1' '1.2' '1.3' '2.1' '2.2' '2.3' '3.1' '3.2' '3.3' '3.4' '4.1' '4.2' '4.3'; do
-    total=0
-    # Each query is executed three times and takes the average time
-    for j in $(seq 1 3); do
-        start=$(date +%s%3N)
-        mysql -h$FE_HOST -u $USER -P$FE_QUERY_PORT -D$DB <$QUERIES_DIR/q${i}.sql >/dev/null
-        end=$(date +%s%3N)
-        total=$((total + end - start))
-    done
-    echo "q$i: $((total / 3))ms"
+    # Each query is executed 3 times and takes the average time
+    res=$(mysqlslap -h$FE_HOST -u$USER -P$FE_QUERY_PORT --create-schema=$DB --query=$QUERIES_DIR/q${i}.sql -F '\r' -i 3 | sed -n '2p' | cut -d ' ' -f 9,10)
+    echo "q$i: $res"
 done
diff --git a/tools/ssb-tools/ssb-queries/q1.1.sql b/tools/ssb-tools/ssb-queries/q1.1.sql
index d9496e0d28..d8a2840ca7 100644
--- a/tools/ssb-tools/ssb-queries/q1.1.sql
+++ b/tools/ssb-tools/ssb-queries/q1.1.sql
@@ -15,7 +15,7 @@
 -- specific language governing permissions and limitations
 -- under the License.
 SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE
-FROM lineorder, date
+FROM lineorder, dates
 WHERE
     lo_orderdate = d_datekey
     AND d_year = 1993
diff --git a/tools/ssb-tools/ssb-queries/q1.2.sql b/tools/ssb-tools/ssb-queries/q1.2.sql
index f86cb7e71c..db6eb0c613 100644
--- a/tools/ssb-tools/ssb-queries/q1.2.sql
+++ b/tools/ssb-tools/ssb-queries/q1.2.sql
@@ -15,7 +15,7 @@
 -- specific language governing permissions and limitations
 -- under the License.
 SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE
-FROM lineorder, date
+FROM lineorder, dates
 WHERE
     lo_orderdate = d_datekey
     AND d_yearmonth = 'Jan1994'
diff --git a/tools/ssb-tools/ssb-queries/q1.3.sql b/tools/ssb-tools/ssb-queries/q1.3.sql
index 94c191a01b..dbb91b0c46 100644
--- a/tools/ssb-tools/ssb-queries/q1.3.sql
+++ b/tools/ssb-tools/ssb-queries/q1.3.sql
@@ -16,7 +16,7 @@
 -- under the License.
 SELECT
     SUM(lo_extendedprice * lo_discount) AS REVENUE
-FROM lineorder, date
+FROM lineorder, dates
 WHERE
     lo_orderdate = d_datekey
     AND d_weeknuminyear = 6
diff --git a/tools/ssb-tools/ssb-queries/q2.1.sql b/tools/ssb-tools/ssb-queries/q2.1.sql
index 0089ef61bb..9707016e03 100644
--- a/tools/ssb-tools/ssb-queries/q2.1.sql
+++ b/tools/ssb-tools/ssb-queries/q2.1.sql
@@ -15,7 +15,7 @@
 -- specific language governing permissions and limitations
 -- under the License.
 SELECT SUM(lo_revenue), d_year, p_brand
-FROM lineorder, date, part, supplier
+FROM lineorder, dates, part, supplier
 WHERE
     lo_orderdate = d_datekey
     AND lo_partkey = p_partkey
diff --git a/tools/ssb-tools/ssb-queries/q2.2.sql b/tools/ssb-tools/ssb-queries/q2.2.sql
index 95c5c67f48..e283dbdb05 100644
--- a/tools/ssb-tools/ssb-queries/q2.2.sql
+++ b/tools/ssb-tools/ssb-queries/q2.2.sql
@@ -15,7 +15,7 @@
 -- specific language governing permissions and limitations
 -- under the License.
 SELECT SUM(lo_revenue), d_year, p_brand
-FROM lineorder, date, part, supplier
+FROM lineorder, dates, part, supplier
 WHERE
     lo_orderdate = d_datekey
     AND lo_partkey = p_partkey
diff --git a/tools/ssb-tools/ssb-queries/q2.3.sql b/tools/ssb-tools/ssb-queries/q2.3.sql
index b747ee12e5..22d2419621 100644
--- a/tools/ssb-tools/ssb-queries/q2.3.sql
+++ b/tools/ssb-tools/ssb-queries/q2.3.sql
@@ -15,7 +15,7 @@
 -- specific language governing permissions and limitations
 -- under the License.
 SELECT SUM(lo_revenue), d_year, p_brand
-FROM lineorder, date, part, supplier
+FROM lineorder, dates, part, supplier
 WHERE
     lo_orderdate = d_datekey
     AND lo_partkey = p_partkey
diff --git a/tools/ssb-tools/ssb-queries/q3.1.sql b/tools/ssb-tools/ssb-queries/q3.1.sql
index ae8b6ea6b3..d674337995 100644
--- a/tools/ssb-tools/ssb-queries/q3.1.sql
+++ b/tools/ssb-tools/ssb-queries/q3.1.sql
@@ -19,7 +19,7 @@ SELECT
     s_nation,
     d_year,
     SUM(lo_revenue) AS REVENUE
-FROM customer, lineorder, supplier, date
+FROM customer, lineorder, supplier, dates
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q3.2.sql b/tools/ssb-tools/ssb-queries/q3.2.sql
index 2a8ed7ab0a..2969efb1a2 100644
--- a/tools/ssb-tools/ssb-queries/q3.2.sql
+++ b/tools/ssb-tools/ssb-queries/q3.2.sql
@@ -19,7 +19,7 @@ SELECT
     s_city,
     d_year,
     SUM(lo_revenue) AS REVENUE
-FROM customer, lineorder, supplier, date
+FROM customer, lineorder, supplier, dates
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q3.3.sql b/tools/ssb-tools/ssb-queries/q3.3.sql
index 5a526e9eaa..ac1cb324d0 100644
--- a/tools/ssb-tools/ssb-queries/q3.3.sql
+++ b/tools/ssb-tools/ssb-queries/q3.3.sql
@@ -19,7 +19,7 @@ SELECT
     s_city,
     d_year,
     SUM(lo_revenue) AS REVENUE
-FROM customer, lineorder, supplier, date
+FROM customer, lineorder, supplier, dates
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q3.4.sql b/tools/ssb-tools/ssb-queries/q3.4.sql
index fe205af5c2..2be6a5cd70 100644
--- a/tools/ssb-tools/ssb-queries/q3.4.sql
+++ b/tools/ssb-tools/ssb-queries/q3.4.sql
@@ -19,7 +19,7 @@ SELECT
     s_city,
     d_year,
     SUM(lo_revenue) AS REVENUE
-FROM customer, lineorder, supplier, date
+FROM customer, lineorder, supplier, dates
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q4.1.sql b/tools/ssb-tools/ssb-queries/q4.1.sql
index beffb53092..f0cfcdd403 100644
--- a/tools/ssb-tools/ssb-queries/q4.1.sql
+++ b/tools/ssb-tools/ssb-queries/q4.1.sql
@@ -18,7 +18,7 @@ SELECT
     d_year,
     c_nation,
     SUM(lo_revenue - lo_supplycost) AS PROFIT
-FROM date, customer, supplier, part, lineorder
+FROM dates, customer, supplier, part, lineorder
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q4.2.sql b/tools/ssb-tools/ssb-queries/q4.2.sql
index c763b7d71c..fbbaef00e8 100644
--- a/tools/ssb-tools/ssb-queries/q4.2.sql
+++ b/tools/ssb-tools/ssb-queries/q4.2.sql
@@ -19,7 +19,7 @@ SELECT
     s_nation,
     p_category,
     SUM(lo_revenue - lo_supplycost) AS PROFIT
-FROM date, customer, supplier, part, lineorder
+FROM dates, customer, supplier, part, lineorder
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey
diff --git a/tools/ssb-tools/ssb-queries/q4.3.sql b/tools/ssb-tools/ssb-queries/q4.3.sql
index 92859cdb92..64582cc6ac 100644
--- a/tools/ssb-tools/ssb-queries/q4.3.sql
+++ b/tools/ssb-tools/ssb-queries/q4.3.sql
@@ -19,7 +19,7 @@ SELECT
     s_city,
     p_brand,
     SUM(lo_revenue - lo_supplycost) AS PROFIT
-FROM date, customer, supplier, part, lineorder
+FROM dates, customer, supplier, part, lineorder
 WHERE
     lo_custkey = c_custkey
     AND lo_suppkey = s_suppkey

