Posted to commits@doris.apache.org by mo...@apache.org on 2023/06/26 03:36:51 UTC

[doris] branch master updated: [doc](catalog) update and improve doc of multi catalog (#21105)

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 1138ed1d70 [doc](catalog) update and improve doc of multi catalog (#21105)
1138ed1d70 is described below

commit 1138ed1d70cf8c7dc331c0d4bbc608c196ca6aea
Author: Mingyu Chen <mo...@163.com>
AuthorDate: Mon Jun 26 11:36:44 2023 +0800

    [doc](catalog) update and improve doc of multi catalog (#21105)
    
    Update the document of multi catalog feature.
---
 docs/en/docs/lakehouse/file.md                     | 63 ++++++--------
 docs/en/docs/lakehouse/filecache.md                |  9 +-
 docs/en/docs/lakehouse/multi-catalog/hive.md       | 20 ++---
 docs/en/docs/lakehouse/multi-catalog/hudi.md       | 34 +++++++-
 docs/en/docs/lakehouse/multi-catalog/iceberg.md    | 96 +++++++++++-----------
 .../docs/lakehouse/multi-catalog/multi-catalog.md  | 58 ++++---------
 docs/en/docs/lakehouse/multi-catalog/paimon.md     |  1 +
 docs/sidebars.json                                 |  2 +-
 docs/zh-CN/docs/lakehouse/file.md                  | 61 ++++++--------
 docs/zh-CN/docs/lakehouse/filecache.md             | 13 +--
 docs/zh-CN/docs/lakehouse/multi-catalog/hive.md    | 20 ++---
 docs/zh-CN/docs/lakehouse/multi-catalog/hudi.md    | 35 +++++++-
 docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md | 96 +++++++++++-----------
 .../docs/lakehouse/multi-catalog/multi-catalog.md  | 51 ++++--------
 docs/zh-CN/docs/lakehouse/multi-catalog/paimon.md  |  1 +
 15 files changed, 275 insertions(+), 285 deletions(-)

diff --git a/docs/en/docs/lakehouse/file.md b/docs/en/docs/lakehouse/file.md
index 9aeed3963d..d7f4620496 100644
--- a/docs/en/docs/lakehouse/file.md
+++ b/docs/en/docs/lakehouse/file.md
@@ -27,12 +27,8 @@ under the License.
 
 # File Analysis
 
-<version since="1.2.0">
-
 With the Table Value Function feature, Doris is able to query files in object storage or HDFS as simply as querying Tables. In addition, it supports automatic column type inference.
 
-</version>
-
 ## Usage
 
 For more usage details, please see the documentation:
@@ -45,12 +41,13 @@ The followings illustrate how file analysis is conducted with the example of S3
 ### Automatic Column Type Inference
 
 ```
-MySQL [(none)]> DESC FUNCTION s3(
+> DESC FUNCTION s3 (
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
-    "use_path_style"="true");
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
+    "use_path_style"="true"
+);
 +---------------+--------------+------+-------+---------+-------+
 | Field         | Type         | Null | Key   | Default | Extra |
 +---------------+--------------+------+-------+---------+-------+
@@ -71,8 +68,8 @@ An S3 Table Value Function is defined as follows:
 ```
 s3(
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
     "Format" = "parquet",
     "use_path_style"="true")
 ```
@@ -87,8 +84,6 @@ Besides Parquet, Doris supports analysis and auto column type inference of ORC,
 
 **CSV Schema**
 
-<version since="dev"></version>
-
 By default, for CSV format files, all columns are of type String. Column names and column types can be specified individually via the `csv_schema` attribute. Doris will use the specified column type for file reading. The format is as follows:
 
 `name1:type1;name2:type2;...`
@@ -118,13 +113,13 @@ Example:
 
 ```
 s3 (
-    'URI' = 'https://bucket1/inventory.dat',
-    'ACCESS_KEY'= 'ak',
-    'SECRET_KEY' = 'sk',
-    'FORMAT' = 'csv',
-    'column_separator' = '|',
-    'csv_schema' = 'k1:int;k2:int;k3:int;k4:decimal(38,10)',
-    'use_path_style'='true'
+    "uri" = "https://bucket1/inventory.dat",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "csv",
+    "column_separator" = "|",
+    "csv_schema" = "k1:int;k2:int;k3:int;k4:decimal(38,10)",
+    "use_path_style"="true"
 )
 ```
 
@@ -134,10 +129,10 @@ You can conduct queries and analysis on this Parquet file using any SQL statemen
 
 ```
 SELECT * FROM s3(
-    "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
+    "uri" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style"="true")
 LIMIT 5;
 +-----------+------------------------------------------+----------------+----------+-------------------------+--------+-------------+---------------+---------------------+
@@ -153,18 +148,16 @@ LIMIT 5;
 
 You can put a Table Value Function anywhere a table can appear in SQL, such as in the WITH clause of a CTE or in the FROM clause. In this way, you can treat the file as a normal table and conduct analysis conveniently.
 
-<version since="dev"></version>
-
 You can also create a logical view for a Table Value Function by using the `CREATE VIEW` statement. In this way, you can query the view, manage privileges on it, and allow other users to access the Table Value Function.
 
 ```
 CREATE VIEW v1 AS 
 SELECT * FROM s3(
-    "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
+    "uri" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style"="true");
 
 DESC v1;
@@ -193,12 +186,10 @@ PROPERTIES("replication_num" = "1");
 INSERT INTO test_table (id,name,age)
 SELECT cast(id as INT) as id, name, cast (age as INT) as age
 FROM s3(
-    "uri" = "${uri}",
-    "ACCESS_KEY"= "${ak}",
-    "SECRET_KEY" = "${sk}",
-    "format" = "${format}",
-    "strip_outer_array" = "true",
-    "read_json_by_line" = "true",
+    "uri" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style" = "true");
 ```
 
diff --git a/docs/en/docs/lakehouse/filecache.md b/docs/en/docs/lakehouse/filecache.md
index 18312e1a24..fa9272e708 100644
--- a/docs/en/docs/lakehouse/filecache.md
+++ b/docs/en/docs/lakehouse/filecache.md
@@ -26,12 +26,8 @@ under the License.
 
 # File Cache
 
-<version since="dev">
-
 File Cache accelerates queries that read the same data by caching recently accessed data files from the remote storage system (HDFS or object storage). In ad-hoc scenarios where the same data is accessed frequently, File Cache avoids repeated remote data access costs and improves the query analysis performance and stability for hot data.
 
-</version>
-
 ## How it works
 
 File Cache caches the accessed remote data in the local BE node. The original data file is divided into blocks according to the read IO size, a block is stored in the local file `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset`, and the block meta information is kept in the BE node. When accessing the same remote file, Doris checks whether cached data for that file exists locally and, based on the offset and size of each block, determines which data is read from the local blocks and which data is pulled from remote storage, then caches the newly pulled data. When a BE node restarts, it scans the `cache_path` directory to restore the block meta information. When the cache size reaches the upper limit, blocks that have not been accessed for a long time are evicted according to the LRU policy.
@@ -54,14 +50,17 @@ Enable File Cache globally:
 SET GLOBAL enable_file_cache = true;
 ```
 
+> File Cache only applies to external queries over files (such as Hive or Hudi). It has no effect on internal table queries or on non-file external queries (such as JDBC or Elasticsearch).
+
 ### Configurations for BE
 Add settings to the BE node's configuration file `conf/be.conf`, and restart the BE node for the configuration to take effect.
 
 |  Parameter   | Description  |
 |  ---  | ---  |
 | `enable_file_cache`  | Whether to enable File Cache, default false |
-| `file_cache_max_file_segment_size` | Max size of a single cached block, default 4MB, should greater than 4096 |
 | `file_cache_path` | Parameters about cache path, json format, for example: `[{"path": "/path/to/file_cache1", "total_size":53687091200,"query_limit": "10737418240"},{"path": "/path/to/file_cache2", "total_size":53687091200,"query_limit": "10737418240"},{"path": "/path/to/file_cache3", "total_size":53687091200,"query_limit": "10737418240"}]`. `path` is the path to save cached data; `total_size` is the max size of cached data; `query_limit` is the max size of cached data for a single query. |
+| `file_cache_min_file_segment_size` | Min size of a single cached block, default 1MB, should be greater than 4096 |
+| `file_cache_max_file_segment_size` | Max size of a single cached block, default 4MB, should be greater than 4096 |
 | `enable_file_cache_query_limit` | Whether to limit the cache size used by a single query, default false |
 | `clear_file_cache` | Whether to delete the previous cache data when the BE restarts, default false |
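
For illustration, a minimal `conf/be.conf` sketch of the entries described above, with a single cache directory; the path and sizes are placeholder assumptions, not recommended values:

```
# placeholder path and sizes; adjust to the local environment
enable_file_cache=true
file_cache_path=[{"path": "/path/to/file_cache1", "total_size":53687091200, "query_limit": "10737418240"}]
```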
 
diff --git a/docs/en/docs/lakehouse/multi-catalog/hive.md b/docs/en/docs/lakehouse/multi-catalog/hive.md
index 3193e2bacd..04ef0de50e 100644
--- a/docs/en/docs/lakehouse/multi-catalog/hive.md
+++ b/docs/en/docs/lakehouse/multi-catalog/hive.md
@@ -30,15 +30,17 @@ By connecting to Hive Metastore, or a metadata service compatible with Hive Meta
 
 In addition to Hive, many other systems also use the Hive Metastore to store metadata. So through the Hive Catalog, we can access not only Hive, but also systems that use the Hive Metastore as metadata storage, such as Iceberg and Hudi.
 
-## Limitations
+## Usage Notes
 
-1. Need to put core-site.xml, hdfs-site.xml in the conf directory of FE and BE.
+1. Place core-site.xml, hdfs-site.xml and hive-site.xml in the conf directory of both FE and BE.
 2. hive supports version 1/2/3.
-3. Support Managed Table and External Table.
+3. Supports Managed Table and External Table, and partially supports Hive View.
 4. Can identify hive, iceberg, hudi metadata stored in Hive Metastore.
 
 ## Create Catalog
 
+### Hive On HDFS
+
 ```sql
 CREATE CATALOG hive PROPERTIES (
     'type'='hms',
@@ -93,16 +95,6 @@ Please place the `krb5.conf` file and `keytab` authentication file under all `BE
 
 The value of `hive.metastore.kerberos.principal` needs to be consistent with the property of the same name of the connected hive metastore, which can be obtained from `hive-site.xml`.
 
-Provide Hadoop KMS encrypted transmission information, examples are as follows:
-
-```sql
-CREATE CATALOG hive PROPERTIES (
-    'type'='hms',
-    'hive.metastore.uris' = 'thrift://172.0.0.1:9083',
-    'dfs.encryption.key.provider.uri' = 'kms://http@kms_host:kms_port/kms'
-);
-```
-
 ### Hive On JuiceFS
 
 Data is stored in JuiceFS, examples are as follows:
@@ -189,7 +181,7 @@ CREATE CATALOG hive PROPERTIES (
 
 ## Metadata cache settings
 
-When creating a Catalog, you can use the parameter `file.meta.cache.ttl-second` to set the metadata File Cache automatic expiration time, or set this value to 0 to disable File Cache. The time unit is: second. Examples are as follows:
+When creating a Catalog, you can use the parameter `file.meta.cache.ttl-second` to set the automatic expiration time of the Hive partition file cache, or set this value to 0 to disable the partition file cache. The time unit is seconds. Examples are as follows:
 
 ```sql
 CREATE CATALOG hive PROPERTIES (
diff --git a/docs/en/docs/lakehouse/multi-catalog/hudi.md b/docs/en/docs/lakehouse/multi-catalog/hudi.md
index c87ac48d9c..890e7fd92c 100644
--- a/docs/en/docs/lakehouse/multi-catalog/hudi.md
+++ b/docs/en/docs/lakehouse/multi-catalog/hudi.md
@@ -29,8 +29,14 @@ under the License.
 
 ## Usage
 
-1. Currently, Doris supports Snapshot Query on Copy-on-Write Hudi tables and Read Optimized Query on Merge-on-Read tables. In the future, it will support Snapshot Query on Merge-on-Read tables and Incremental Query.
-2. Doris only supports Hive Metastore Catalogs currently. The usage is basically the same as that of Hive Catalogs. More types of Catalogs will be supported in future versions.
+1. Doris supports Snapshot Query on Copy-on-Write Hudi tables, and Snapshot Query and Read Optimized Query on Merge-on-Read tables, as summarized below. Incremental Query and Time Travel will be supported in the future.
+
+|  Table Type   | Supported Query types  |
+|  ----  | ----  |
+| Copy On Write  | Snapshot Query |
+| Merge On Read  | Snapshot Queries + Read Optimized Queries |
+
+2. Doris supports Hive Metastore Catalogs, including catalogs compatible with Hive Metastore such as [AWS Glue](./hive.md) and [Alibaba DLF](./dlf.md).
 
 ## Create Catalog
 
@@ -52,3 +58,27 @@ CREATE CATALOG hudi PROPERTIES (
 ## Column Type Mapping
 
 Same as that in Hive Catalogs. See the relevant section in [Hive](./hive.md).
+
+## Query Optimization
+Doris uses the parquet native reader to read the data files of COW tables, and uses the Java SDK (by calling hudi-bundle through JNI) to read the data files of MOR tables. In the `upsert` scenario, the MOR table may still contain base files that have not been updated, and these can be read through the parquet native reader. Users can view the execution plan of a hudi scan through the [explain](../../advanced/best-practice/query-analysis.md) command, where `hudiNativeReadSplits` indicates how many split files are read through the parquet native reader.
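
For instance, a hedged sketch of a statement that yields a plan like the one below, assuming a Hudi catalog named `hudi` and a database named `db` (both hypothetical names; the table and predicate come from the sample plan):

```sql
-- hypothetical catalog and database names; replace with your own
EXPLAIN SELECT * FROM hudi.db.minbatch_mor_rt WHERE o_orderkey = 100030752;
```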
+```
+|0:VHUDI_SCAN_NODE                                                             |
+|      table: minbatch_mor_rt                                                  |
+|      predicates: `o_orderkey` = 100030752                                    |
+|      inputSplitNum=810, totalFileSize=5645053056, scanRanges=810             |
+|      partition=80/80                                                         |
+|      numNodes=6                                                              |
+|      hudiNativeReadSplits=717/810                                            |
+```
+Users can view the performance of the Java SDK through the [profile](../../admin-manual/http-actions/fe/profile-action.md), for example:
+```
+-  HudiJniScanner:  0ns
+  -  FillBlockTime:  31.29ms
+  -  GetRecordReaderTime:  1m5s
+  -  JavaScanTime:  35s991ms
+  -  OpenScannerTime:  1m6s
+```
+1. `OpenScannerTime`: Time to create and initialize JNI reader
+2. `JavaScanTime`: Time to read data by Java SDK
+3. `FillBlockTime`: Time to convert Java column data into C++ column data
+4. `GetRecordReaderTime`: Time to create and initialize Hudi Record Reader
diff --git a/docs/en/docs/lakehouse/multi-catalog/iceberg.md b/docs/en/docs/lakehouse/multi-catalog/iceberg.md
index f316ee6bfe..6f063ecf0b 100644
--- a/docs/en/docs/lakehouse/multi-catalog/iceberg.md
+++ b/docs/en/docs/lakehouse/multi-catalog/iceberg.md
@@ -55,58 +55,49 @@ CREATE CATALOG iceberg PROPERTIES (
 
 Use the Iceberg API to access metadata, and support services such as Hive, REST, and Glue as Iceberg's Catalog.
 
-- Hive Metastore
-
-    ```sql
-    CREATE CATALOG iceberg PROPERTIES (
-        'type'='iceberg',
-        'iceberg.catalog.type'='hms',
-        'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
-        'hadoop.username' = 'hive',
-        'dfs.nameservices'='your-nameservice',
-        'dfs.ha.namenodes.your-nameservice'='nn1,nn2',
-        'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:4007',
-        'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:4007',
-        'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
-    );
-    ```
-
-- Glue Catalog
-
-    ```sql
-    CREATE CATALOG glue PROPERTIES (
-        "type"="iceberg",
-        "iceberg.catalog.type" = "glue",
-        "glue.endpoint" = "https://glue.us-east-1.amazonaws.com",
-        "glue.access_key" = "ak",
-        "glue.secret_key" = "sk"
-    );
-    ```
-
-    For Iceberg properties, see [Iceberg Glue Catalog](https://iceberg.apache.org/docs/latest/aws/#glue-catalog)
-
-- REST Catalog
-
-    This method needs to provide REST services in advance, and users need to implement the REST interface for obtaining Iceberg metadata.
-    
-    ```sql
-    CREATE CATALOG iceberg PROPERTIES (
-        'type'='iceberg',
-        'iceberg.catalog.type'='rest',
-        'uri' = 'http://172.21.0.1:8181',
-    );
-    ```
+#### Hive Metastore
 
-If the data is stored on S3, the following parameters can be used in properties:
+```sql
+CREATE CATALOG iceberg PROPERTIES (
+    'type'='iceberg',
+    'iceberg.catalog.type'='hms',
+    'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
+    'hadoop.username' = 'hive',
+    'dfs.nameservices'='your-nameservice',
+    'dfs.ha.namenodes.your-nameservice'='nn1,nn2',
+    'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:4007',
+    'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:4007',
+    'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
+);
+```
+
+#### AWS Glue
 
+```sql
+CREATE CATALOG glue PROPERTIES (
+    "type"="iceberg",
+    "iceberg.catalog.type" = "glue",
+    "glue.endpoint" = "https://glue.us-east-1.amazonaws.com",
+    "glue.access_key" = "ak",
+    "glue.secret_key" = "sk"
+);
 ```
-"s3.access_key" = "ak"
-"s3.secret_key" = "sk"
-"s3.endpoint" = "http://endpoint-uri"
-"s3.credentials.provider" = "provider-class-name" // 可选,默认凭证类基于BasicAWSCredentials实现。
+
+For Iceberg properties, see [Iceberg Glue Catalog](https://iceberg.apache.org/docs/latest/aws/#glue-catalog)
+
+#### REST Catalog
+
+This method requires a REST service to be available in advance, and users need to implement the REST interface for obtaining Iceberg metadata.
+
+```sql
+CREATE CATALOG iceberg PROPERTIES (
+    'type'='iceberg',
+    'iceberg.catalog.type'='rest',
+    'uri' = 'http://172.21.0.1:8181'
+);
 ```
 
-#### Google Dataproc Metastore 作为元数据服务
+#### Google Dataproc Metastore
 
 ```sql
 CREATE CATALOG iceberg PROPERTIES (
@@ -123,6 +114,17 @@ CREATE CATALOG iceberg PROPERTIES (
 
 `hive.metastore.uris`: the Dataproc Metastore URI. See the Metastore Services page: [Dataproc Metastore Services](https://console.cloud.google.com/dataproc/metastore).
 
+### Iceberg On S3
+
+If the data is stored on S3, the following parameters can be used in properties:
+
+```
+"s3.access_key" = "ak"
+"s3.secret_key" = "sk"
+"s3.endpoint" = "http://endpoint-uri"
+"s3.credentials.provider" = "provider-class-name" // 可选,默认凭证类基于BasicAWSCredentials实现。
+```
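
Putting the pieces together, a minimal sketch of an Iceberg catalog whose data sits on S3; the catalog name, metastore address, endpoint and keys below are placeholder assumptions:

```sql
CREATE CATALOG iceberg_s3 PROPERTIES (
    'type'='iceberg',
    'iceberg.catalog.type'='hms',
    'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
    's3.endpoint' = 'http://endpoint-uri',
    's3.access_key' = 'ak',
    's3.secret_key' = 'sk'
);
```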
+
 ## Column type mapping
 
 Consistent with Hive Catalog, please refer to the **column type mapping** section in [Hive Catalog](./hive.md).
diff --git a/docs/en/docs/lakehouse/multi-catalog/multi-catalog.md b/docs/en/docs/lakehouse/multi-catalog/multi-catalog.md
index 9f266f9fed..d9fc1ba480 100644
--- a/docs/en/docs/lakehouse/multi-catalog/multi-catalog.md
+++ b/docs/en/docs/lakehouse/multi-catalog/multi-catalog.md
@@ -27,24 +27,21 @@ under the License.
 
 # Multi Catalog
 
-<version since="1.2.0">
-
-Multi-Catalog is a newly added feature in Doris 1.2.0. It allows Doris to interface with external catalogs more conveniently and thus increases the data lake analysis and federated query capabilities of Doris.
+Multi-Catalog is designed to make it easier to connect to external data catalogs to enhance Doris's data lake analysis and federated data query capabilities.
 
 In older versions of Doris, user data is in a two-tiered structure: database and table. Thus, connections to external catalogs could only be done at the database or table level. For example, users could create a mapping to a table in an external catalog via `create external table`, or to a database via `create external database` . If there were large amounts of databases or tables in the external catalog, users would need to create mappings to them one by one, which could be a heavy workload.
 
 With the advent of Multi-Catalog, Doris now has a new three-tiered metadata hierarchy (catalog -> database -> table), which means users can connect to external data at the catalog level. The currently supported external catalogs include:
 
-1. Hive
-2. Iceberg
-3. Hudi
+1. Apache Hive
+2. Apache Iceberg
+3. Apache Hudi
 4. Elasticsearch
 5. JDBC
+6. Apache Paimon (Incubating)
 
 Multi-Catalog works as an additional and enhanced external table connection method. It helps users conduct multi-catalog federated queries quickly. 
 
-</version>
-
 ## Basic Concepts
 
 1. Internal Catalog
@@ -247,28 +244,10 @@ For more information about Hive, please see [Hive](./hive.md).
 	{'label':'insert_212f67420c6444d5_9bfc184bf2e7edb8', 'status':'VISIBLE', 'txnId':'4'}
 	```
 
-### Connect to Iceberg
-
-See [Iceberg](./iceberg.md)
-
-### Connect to Hudi
-
-See [Hudi](./hudi.md)
-
-### Connect to Elasticsearch
-
-See [Elasticsearch](./es.md)
-
-### Connect to JDBC
-
-See [JDBC](./jdbc.md)
-
 ## Column Type Mapping
 
 After you create a Catalog, Doris will automatically synchronize the databases and tables from the corresponding external catalog to it. The following shows how Doris maps different types of catalogs and tables.
 
-<version since="1.2.2">
-
 As for types that cannot be mapped to a Doris column type, such as `UNION` and `INTERVAL` , Doris will map them to an UNSUPPORTED type. Here are examples of queries in a table containing UNSUPPORTED types:
 
 Suppose the table is of the following schema:
@@ -287,39 +266,41 @@ select k1, k3 from table;           // Error: Unsupported type 'UNSUPPORTED_TYPE
 select k1, k4 from table;           // Query OK.
 ```
 
-</version>
-
 You can find more details of the mapping of various data sources (Hive, Iceberg, Hudi, Elasticsearch, and JDBC) in the corresponding pages.
 
 ## Privilege Management
 
-Access from Doris to databases and tables in an External Catalog is not under the privilege control of the external catalog itself, but is authorized by Doris.
+When using Doris to access data in an External Catalog, Doris relies on its own privilege management by default.
 
 Along with the new Multi-Catalog feature, we also added privilege management at the Catalog level (See [Privilege Management](https://doris.apache.org/docs/dev/admin-manual/privilege-ldap/user-privilege/) for details).
 
+Users can also specify a custom authorization class through the `access_controller.class` property, for example by setting:
+
+`"access_controller.class" = "org.apache.doris.catalog.authorizer.RangerHiveAccessControllerFactory"`
+
+Doris will then use Apache Ranger to perform authorization on the Hive Catalog. For more information, see [Hive Catalog](./hive.md).
+
 ## Database synchronizing management
 
 Setting `include_database_list` and `exclude_database_list` in Catalog properties to specify databases to synchronize.
 
-`include_database_list`: Only synchronize the specified databases. split with ',', default value is '', means no filter takes effect, synchronizes all databases. db name is case sensitive.
+`include_database_list`: Only synchronize the specified databases, separated by `,`. By default, all databases are synchronized. Database names are case sensitive.
 
-`exclude_database_list`: Specify databases that do not need to synchronize. split with ',', default value is '', means no filter takes effect, synchronizes all databases. db name is case sensitive.
+`exclude_database_list`: Specify databases that do not need to be synchronized, separated by `,`. By default, no filtering is applied and all databases are synchronized. Database names are case sensitive.
 
 > When `include_database_list` and `exclude_database_list` specify overlapping databases, `exclude_database_list` takes precedence over `include_database_list`.
 >
 > To connect JDBC, these two properties should work with `only_specified_database`, see [JDBC](./jdbc.md) for more detail.
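
For illustration, a minimal sketch of a Hive catalog that only synchronizes two databases; the catalog name, metastore address, and database names are placeholder assumptions:

```sql
CREATE CATALOG hive_filtered PROPERTIES (
    'type'='hms',
    'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
    'include_database_list' = 'db1,db2'
);
```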
 
-## Metadata Update
+## Metadata Refresh
 
-### Manual Update
+### Manual Refresh
 
 By default, changes in metadata of external data sources, including addition or deletion of tables and columns, will not be synchronized into Doris.
 
 Users need to manually update the metadata using the  [REFRESH CATALOG](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Utility-Statements/REFRESH/) command.
 
-### Automatic Update
-
-<version since="1.2.2"></version>
+### Automatic Refresh
 
 #### Hive Metastore
 
@@ -394,9 +375,7 @@ We need to change to
 
 > Note: To enable automatic update, whether for existing Catalogs or newly created Catalogs, all you need is to set `enable_hms_events_incremental_sync` to `true`, and then restart the FE node. You don't need to manually update the metadata before or after the restart.
 
-<version since="dev">
-
-#### Timing Refresh
+#### Timed Refresh
 
 When creating a catalog, specify the refresh time parameter `metadata_refresh_interval_sec` in the properties, in seconds. If this parameter is set when creating a catalog, the master node of FE will refresh the catalog regularly according to the parameter value. Three types are currently supported
 
@@ -415,4 +394,3 @@ CREATE CATALOG es PROPERTIES (
 );
 ```
 
-</version>
diff --git a/docs/en/docs/lakehouse/multi-catalog/paimon.md b/docs/en/docs/lakehouse/multi-catalog/paimon.md
index 0dcb0f2011..cd9253288f 100644
--- a/docs/en/docs/lakehouse/multi-catalog/paimon.md
+++ b/docs/en/docs/lakehouse/multi-catalog/paimon.md
@@ -26,6 +26,7 @@ under the License.
 
 
 # Paimon
+
 <version since="dev">
 </version>
 
diff --git a/docs/sidebars.json b/docs/sidebars.json
index 8f955967da..f055ab4b20 100644
--- a/docs/sidebars.json
+++ b/docs/sidebars.json
@@ -205,8 +205,8 @@
                         "lakehouse/multi-catalog/multi-catalog",
                         "lakehouse/multi-catalog/hive",
                         "lakehouse/multi-catalog/iceberg",
-                        "lakehouse/multi-catalog/paimon",
                         "lakehouse/multi-catalog/hudi",
+                        "lakehouse/multi-catalog/paimon",
                         "lakehouse/multi-catalog/es",
                         "lakehouse/multi-catalog/jdbc",
                         "lakehouse/multi-catalog/dlf",
diff --git a/docs/zh-CN/docs/lakehouse/file.md b/docs/zh-CN/docs/lakehouse/file.md
index 80f968a896..4a8996a10f 100644
--- a/docs/zh-CN/docs/lakehouse/file.md
+++ b/docs/zh-CN/docs/lakehouse/file.md
@@ -27,12 +27,8 @@ under the License.
 
 # 文件分析
 
-<version since="1.2.0">
-
 通过 Table Value Function 功能,Doris 可以直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析。并且支持自动的列类型推断。
 
-</version>
-
 ## 使用方式
 
 更多使用方式可参阅 Table Value Function 文档:
@@ -45,12 +41,13 @@ under the License.
 ### 自动推断文件列类型
 
 ```
-MySQL [(none)]> DESC FUNCTION s3(
+> DESC FUNCTION s3 (
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
-    "use_path_style"="true");
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
+    "use_path_style"="true"
+);
 +---------------+--------------+------+-------+---------+-------+
 | Field         | Type         | Null | Key   | Default | Extra |
 +---------------+--------------+------+-------+---------+-------+
@@ -71,9 +68,9 @@ MySQL [(none)]> DESC FUNCTION s3(
 ```
 s3(
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style"="true")
 ```
 
@@ -87,8 +84,6 @@ s3(
 
 **CSV Schema**
 
-<version since="dev"></version>
-
 在默认情况下,对 CSV 格式文件,所有列类型均为 String。可以通过 `csv_schema` 属性单独指定列名和列类型。Doris 会使用指定的列类型进行文件读取。格式如下:
 
 `name1:type1;name2:type2;...`
@@ -118,13 +113,13 @@ s3(
 
 ```
 s3 (
-    'URI' = 'https://bucket1/inventory.dat',
-    'ACCESS_KEY'= 'ak',
-    'SECRET_KEY' = 'sk',
-    'FORMAT' = 'csv',
-    'column_separator' = '|',
-    'csv_schema' = 'k1:int;k2:int;k3:int;k4:decimal(38,10)',
-    'use_path_style'='true'
+    "URI" = "https://bucket1/inventory.dat",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "csv",
+    "column_separator" = "|",
+    "csv_schema" = "k1:int;k2:int;k3:int;k4:decimal(38,10)",
+    "use_path_style"="true"
 )
 ```
 
@@ -135,9 +130,9 @@ s3 (
 ```
 SELECT * FROM s3(
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style"="true")
 LIMIT 5;
 +-----------+------------------------------------------+----------------+----------+-------------------------+--------+-------------+---------------+---------------------+
@@ -154,17 +149,15 @@ LIMIT 5;
 Table Value Function 可以出现在 SQL 中,Table 能出现的任意位置。如 CTE 的 WITH 子句中,FROM 子句中。
 这样,你可以把文件当做一张普通的表进行任意分析。
 
-<version since="dev"></version>
-
 你也可以通过 `CREATE VIEW` 语句为 Table Value Function 创建一个逻辑视图。这样,你可以像其他视图一样,对这个 Table Value Function 进行访问、权限管理等操作,也可以让其他用户访问这个 Table Value Function。
 
 ```
 CREATE VIEW v1 AS 
 SELECT * FROM s3(
     "URI" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
-    "ACCESS_KEY"= "minioadmin",
-    "SECRET_KEY" = "minioadmin",
-    "Format" = "parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style"="true");
 
 DESC v1;
@@ -193,12 +186,10 @@ PROPERTIES("replication_num" = "1");
 INSERT INTO test_table (id,name,age)
 SELECT cast(id as INT) as id, name, cast (age as INT) as age
 FROM s3(
-    "uri" = "${uri}",
-    "ACCESS_KEY"= "${ak}",
-    "SECRET_KEY" = "${sk}",
-    "format" = "${format}",
-    "strip_outer_array" = "true",
-    "read_json_by_line" = "true",
+    "uri" = "http://127.0.0.1:9312/test2/test.snappy.parquet",
+    "s3.access_key"= "ak",
+    "s3.secret_key" = "sk",
+    "format" = "parquet",
     "use_path_style" = "true");
 ```    
 
diff --git a/docs/zh-CN/docs/lakehouse/filecache.md b/docs/zh-CN/docs/lakehouse/filecache.md
index f8ad2b3fde..79035141e2 100644
--- a/docs/zh-CN/docs/lakehouse/filecache.md
+++ b/docs/zh-CN/docs/lakehouse/filecache.md
@@ -27,12 +27,8 @@ under the License.
 
 # 文件缓存
 
-<version since="dev">
-
 文件缓存(File Cache)通过缓存最近访问的远端存储系统(HDFS 或对象存储)的数据文件,加速后续访问相同数据的查询。在频繁访问相同数据的查询场景中,File Cache 可以避免重复的远端数据访问开销,提升热点数据的查询分析性能和稳定性。
 
-</version>
-
 ## 原理
 
 File Cache 将访问的远程数据缓存到本地的 BE 节点。原始的数据文件会根据访问的 IO 大小切分为 Block,Block 被存储到本地文件 `cache_path/hash(filepath).substr(0, 3)/hash(filepath)/offset` 中,并在 BE 节点中保存 Block 的元信息。当访问相同的远程文件时,doris 会检查本地缓存中是否存在该文件的缓存数据,并根据 Block 的 offset 和 size,确认哪些数据从本地 Block 读取,哪些数据从远程拉起,并缓存远程拉取的新数据。BE 节点重启的时候,扫描 `cache_path` 目录,恢复 Block 的元信息。当缓存大小达到阈值上限的时候,按照 LRU 原则清理长久未访问的 Block。
@@ -48,26 +44,32 @@ File Cache 默认关闭,需要在 FE 和 BE 中设置相关参数进行开启
 ```
 SET enable_file_cache = true;
 ```
+
 全局开启 File Cache:
 
 ```
 SET GLOBAL enable_file_cache = true;
 ```
 
+> File Cache 功能仅作用于针对文件的外表查询(如 Hive、Hudi )。对内表查询,或非文件的外表查询(如 JDBC、Elasticsearch)等无影响。
+
 ### BE 配置
+
 添加参数到 BE 节点的配置文件 conf/be.conf 中,并重启 BE 节点让配置生效。
 
 |  参数   | 说明  |
 |  ---  | ---  |
 | `enable_file_cache`  | 是否启用 File Cache,默认 false |
-| `file_cache_max_file_segment_size` | 单个 Block 的大小上限,默认 4MB,需要大于 4096 |
 | `file_cache_path` | 缓存目录的相关配置,json格式,例子: `[{"path": "/path/to/file_cache1", "total_size":53687091200,"query_limit": "10737418240"},{"path": "/path/to/file_cache2", "total_size":53687091200,"query_limit": "10737418240"},{"path": "/path/to/file_cache3", "total_size":53687091200,"query_limit": "10737418240"}]`。`path` 是缓存的保存路径,`total_size` 是缓存的大小上限,`query_limit` 是单个查询能够使用的最大缓存大小。 |
+| `file_cache_min_file_segment_size` | 单个 Block 的大小下限,默认 1MB,需要大于 4096 |
+| `file_cache_max_file_segment_size` | 单个 Block 的大小上限,默认 4MB,需要大于 4096 |
 | `enable_file_cache_query_limit` | 是否限制单个 query 使用的缓存大小,默认 false |
 | `clear_file_cache` | BE 重启时是否删除之前的缓存数据,默认 false |
 
 ### 查看 File Cache 命中情况
 
 执行 `set enable_profile=true` 打开会话变量,可以在 FE 的 web 页面的 Queris 标签中查看到作业的 Profile。File Cache 相关的指标如下:
+
 ```
 -  FileCache:
   -  IOHitCacheNum:  552
@@ -90,3 +92,4 @@ SET GLOBAL enable_file_cache = true;
 `IOHitCacheNum` / `IOTotalNum` 等于1,表示缓存完全命中
 
 `ReadFromFileCacheBytes` / `ReadTotalBytes` 等于1,表示缓存完全命中
+
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md b/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
index bd20f16846..0b8c4ac62a 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
@@ -30,15 +30,17 @@ under the License.
 
 除了 Hive 外,很多其他系统也会使用 Hive Metastore 存储元数据。所以通过 Hive Catalog,我们不仅能访问 Hive,也能访问使用 Hive Metastore 作为元数据存储的系统。如 Iceberg、Hudi 等。
 
-## 使用限制
+## 使用须知
 
-1. 需将 core-site.xml,hdfs-site.xml 放到 FE 和 BE 的 conf 目录下。
+1. 需将 core-site.xml,hdfs-site.xml 和 hive-site.xml  放到 FE 和 BE 的 conf 目录下。
 2. hive 支持 1/2/3 版本。
-3. 支持 Managed Table 和 External Table。
+3. 支持 Managed Table 和 External Table,支持部分 Hive View。
 4. 可以识别 Hive Metastore 中存储的 hive、iceberg、hudi 元数据。
 
 ## 创建 Catalog
 
+### Hive On HDFS
+
 ```sql
 CREATE CATALOG hive PROPERTIES (
     'type'='hms',
@@ -92,16 +94,6 @@ CREATE CATALOG hive PROPERTIES (
 请在所有的 `BE`、`FE` 节点下放置 `krb5.conf` 文件和 `keytab` 认证文件,`keytab` 认证文件路径和配置保持一致,`krb5.conf` 文件默认放置在 `/etc/krb5.conf` 路径。
 `hive.metastore.kerberos.principal` 的值需要和所连接的 hive metastore 的同名属性保持一致,可从 `hive-site.xml` 中获取。
 
-提供 Hadoop KMS 加密传输信息,示例如下:
-
-```sql
-CREATE CATALOG hive PROPERTIES (
-    'type'='hms',
-    'hive.metastore.uris' = 'thrift://172.0.0.1:9083',
-    'dfs.encryption.key.provider.uri' = 'kms://http@kms_host:kms_port/kms'
-);
-```
-
 ### Hive On JuiceFS
 
 数据存储在JuiceFS,示例如下:
@@ -188,7 +180,7 @@ CREATE CATALOG hive PROPERTIES (
 
 ## 元数据缓存设置
 
-创建 Catalog 时可以采用参数 `file.meta.cache.ttl-second` 来设置元数据 File Cache 自动失效时间,也可以将该值设置为 0 来禁用 File Cache。时间单位为:秒。示例如下:
+创建 Catalog 时可以采用参数 `file.meta.cache.ttl-second` 来设置 Hive 分区文件缓存自动失效时间,也可以将该值设置为 0 来禁用分区文件缓存。时间单位为:秒。示例如下:
 
 ```sql
 CREATE CATALOG hive PROPERTIES (
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/hudi.md b/docs/zh-CN/docs/lakehouse/multi-catalog/hudi.md
index a651bcfde6..d93c4268d8 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/hudi.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/hudi.md
@@ -29,8 +29,14 @@ under the License.
 
 ## 使用限制
 
-1. Hudi 目前仅支持 Copy On Write 表的 Snapshot Query,以及 Merge On Read 表的 Read Optimized Query。后续将支持 Incremental Query 和 Merge On Read 表的 Snapshot Query。
-2. 目前仅支持 Hive Metastore 类型的 Catalog。所以使用方式和 Hive Catalog 基本一致。后续版本将支持其他类型的 Catalog。
+1. 目前支持 Copy On Write 表的 Snapshot Query,以及 Merge On Read 表的 Snapshot Queries 和 Read Optimized Query。后续将支持 Incremental Query 和 Time Travel。
+
+|  表类型   | 支持的查询类型  |
+|  ----  | ----  |
+| Copy On Write  | Snapshot Query |
+| Merge On Read  | Snapshot Queries + Read Optimized Queries |
+
+2. 目前支持 Hive Metastore 和兼容 Hive Metastore 类型(例如[AWS Glue](./hive.md)/[Alibaba DLF](./dlf.md))的 Catalog。
 
 ## 创建 Catalog
 
@@ -52,3 +58,28 @@ CREATE CATALOG hudi PROPERTIES (
 ## 列类型映射
 
 和 Hive Catalog 一致,可参阅 [Hive Catalog](./hive.md) 中 **列类型映射** 一节。
+
+## 查询优化
+
+Doris 使用 parquet native reader 读取 COW 表的数据文件,使用 Java SDK(通过JNI调用hudi-bundle) 读取 MOR 表的数据文件。在 upsert 场景下,MOR 依然会有数据文件没有被更新,这部分文件可以通过 parquet native reader读取,用户可以通过 [explain](../../advanced/best-practice/query-analysis.md) 命令查看 hudi scan 的执行计划,`hudiNativeReadSplits` 表示有多少 split 文件通过 parquet native reader 读取。
+```
+|0:VHUDI_SCAN_NODE                                                             |
+|      table: minbatch_mor_rt                                                  |
+|      predicates: `o_orderkey` = 100030752                                    |
+|      inputSplitNum=810, totalFileSize=5645053056, scanRanges=810             |
+|      partition=80/80                                                         |
+|      numNodes=6                                                              |
+|      hudiNativeReadSplits=717/810                                            |
+```
+用户可以通过 [profile](../../admin-manual/http-actions/fe/profile-action.md) 查看 Java SDK 的性能,例如:
+```
+-  HudiJniScanner:  0ns
+  -  FillBlockTime:  31.29ms
+  -  GetRecordReaderTime:  1m5s
+  -  JavaScanTime:  35s991ms
+  -  OpenScannerTime:  1m6s
+```
+1. `OpenScannerTime`: 创建并初始化 JNI Reader 的时间
+2. `JavaScanTime`: Java SDK 读取数据的时间
+3. `FillBlockTime`: Java 数据拷贝为 C++ 数据的时间
+4. `GetRecordReaderTime`: 调用 Java SDK 并创建 Hudi Record Reader 的时间
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md b/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
index 7fbab6ba13..f72553b6d0 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
@@ -55,58 +55,49 @@ CREATE CATALOG iceberg PROPERTIES (
 
 使用Iceberg API访问元数据的方式,支持Hive、REST、Glue等服务作为Iceberg的Catalog。
 
-- Hive Metastore 作为元数据服务
-
-    ```sql
-    CREATE CATALOG iceberg PROPERTIES (
-        'type'='iceberg',
-        'iceberg.catalog.type'='hms',
-        'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
-        'hadoop.username' = 'hive',
-        'dfs.nameservices'='your-nameservice',
-        'dfs.ha.namenodes.your-nameservice'='nn1,nn2',
-        'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:4007',
-        'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:4007',
-        'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
-    );
-    ```
-
-- Glue Catalog 作为元数据服务
-
-    ```sql
-    CREATE CATALOG glue PROPERTIES (
-        "type"="iceberg",
-        "iceberg.catalog.type" = "glue",
-        "glue.endpoint" = "https://glue.us-east-1.amazonaws.com",
-        "glue.access_key" = "ak",
-        "glue.secret_key" = "sk"
-    );
-    ```
-
-    Iceberg 属性详情参见 [Iceberg Glue Catalog](https://iceberg.apache.org/docs/latest/aws/#glue-catalog)
-
-- REST Catalog 作为元数据服务
-
-    该方式需要预先提供REST服务,用户需实现获取Iceberg元数据的REST接口。
-    
-    ```sql
-    CREATE CATALOG iceberg PROPERTIES (
-        'type'='iceberg',
-        'iceberg.catalog.type'='rest',
-        'uri' = 'http://172.21.0.1:8181',
-    );
-    ```
+#### Hive Metastore
 
-若数据存放在S3上,properties中可以使用以下参数
+```sql
+CREATE CATALOG iceberg PROPERTIES (
+    'type'='iceberg',
+    'iceberg.catalog.type'='hms',
+    'hive.metastore.uris' = 'thrift://172.21.0.1:7004',
+    'hadoop.username' = 'hive',
+    'dfs.nameservices'='your-nameservice',
+    'dfs.ha.namenodes.your-nameservice'='nn1,nn2',
+    'dfs.namenode.rpc-address.your-nameservice.nn1'='172.21.0.2:4007',
+    'dfs.namenode.rpc-address.your-nameservice.nn2'='172.21.0.3:4007',
+    'dfs.client.failover.proxy.provider.your-nameservice'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
+);
+```
+
+#### AWS Glue
 
+```sql
+CREATE CATALOG glue PROPERTIES (
+    "type"="iceberg",
+    "iceberg.catalog.type" = "glue",
+    "glue.endpoint" = "https://glue.us-east-1.amazonaws.com",
+    "glue.access_key" = "ak",
+    "glue.secret_key" = "sk"
+);
 ```
-"s3.access_key" = "ak"
-"s3.secret_key" = "sk"
-"s3.endpoint" = "http://endpoint-uri"
-"s3.credentials.provider" = "provider-class-name" // 可选,默认凭证类基于BasicAWSCredentials实现。
+
+Iceberg 属性详情参见 [Iceberg Glue Catalog](https://iceberg.apache.org/docs/latest/aws/#glue-catalog)
+
+#### REST Catalog
+
+该方式需要预先提供REST服务,用户需实现获取Iceberg元数据的REST接口。
+
+```sql
+CREATE CATALOG iceberg PROPERTIES (
+    'type'='iceberg',
+    'iceberg.catalog.type'='rest',
+    'uri' = 'http://172.21.0.1:8181'
+);
 ```
 
-#### Google Dataproc Metastore 作为元数据服务
+#### Google Dataproc Metastore
 
 ```sql
 CREATE CATALOG iceberg PROPERTIES (
@@ -123,6 +114,17 @@ CREATE CATALOG iceberg PROPERTIES (
 
 `hive.metastore.uris`: Dataproc Metastore 服务开放的接口,在 Metastore 管理页面获取 :[Dataproc Metastore Services](https://console.cloud.google.com/dataproc/metastore).
 
+### Iceberg On S3
+
+若数据存放在S3上,properties中可以使用以下参数
+
+```
+"s3.access_key" = "ak"
+"s3.secret_key" = "sk"
+"s3.endpoint" = "http://endpoint-uri"
+"s3.credentials.provider" = "provider-class-name" // 可选,默认凭证类基于BasicAWSCredentials实现。
+```
+
 ## 列类型映射
 
 和 Hive Catalog 一致,可参阅 [Hive Catalog](./hive.md) 中 **列类型映射** 一节。
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/multi-catalog.md b/docs/zh-CN/docs/lakehouse/multi-catalog/multi-catalog.md
index 6a528d2302..f77634bcc7 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/multi-catalog.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/multi-catalog.md
@@ -27,24 +27,21 @@ under the License.
 
 # 多源数据目录
 
-<version since="1.2.0">
-
-多源数据目录(Multi-Catalog)是 Doris 1.2.0 版本中推出的功能,旨在能够更方便对接外部数据目录,以增强Doris的数据湖分析和联邦数据查询能力。
+多源数据目录(Multi-Catalog)功能,旨在能够更方便对接外部数据目录,以增强Doris的数据湖分析和联邦数据查询能力。
 
 在之前的 Doris 版本中,用户数据只有两个层级:Database 和 Table。当我们需要连接一个外部数据目录时,我们只能在Database 或 Table 层级进行对接。比如通过 `create external table` 的方式创建一个外部数据目录中的表的映射,或通过 `create external database` 的方式映射一个外部数据目录中的 Database。 如果外部数据目录中的 Database 或 Table 非常多,则需要用户手动进行一一映射,使用体验不佳。
 
 而新的 Multi-Catalog 功能在原有的元数据层级上,新增一层Catalog,构成 Catalog -> Database -> Table 的三层元数据层级。其中,Catalog 可以直接对应到外部数据目录。目前支持的外部数据目录包括:
 
-1. Hive
-2. Iceberg
-3. Hudi
+1. Apache Hive
+2. Apache Iceberg
+3. Apache Hudi
 4. Elasticsearch
 5. JDBC: 对接数据库访问的标准接口(JDBC)来访问各式数据库的数据。
+6. Apache Paimon (Incubating)
 
 该功能将作为之前外表连接方式(External Table)的补充和增强,帮助用户进行快速的多数据目录联邦查询。
 
-</version>
-
 ## 基础概念
 
 1. Internal Catalog
@@ -247,28 +244,10 @@ under the License.
 	{'label':'insert_212f67420c6444d5_9bfc184bf2e7edb8', 'status':'VISIBLE', 'txnId':'4'}
 	```
 
-### 连接 Iceberg
-
-详见 [Iceberg Catalog](./iceberg.md)
-
-### 连接 Hudi
-
-详见 [Hudi Catalog](./hudi.md)
-
-### 连接 Elasticsearch
-
-详见 [Elasticsearch Catalog](./es.md)
-
-### 连接 JDBC
-
-详见 [JDBC Catalog](./jdbc.md)
-
 ## 列类型映射
 
 用户创建 Catalog 后,Doris 会自动同步数据目录的数据库和表,针对不同的数据目录和数据表格式,Doris 会进行以下列映射关系。
 
-<version since="1.2.2">
-
 对于当前无法映射到 Doris 列类型的外表类型,如 `UNION`, `INTERVAL` 等。Doris 会将列类型映射为 UNSUPPORTED 类型。对于 UNSUPPORTED 类型的查询,示例如下:
 
 假设同步后的表 schema 为:
@@ -287,23 +266,27 @@ select k1, k3 from table;           // Error: Unsupported type 'UNSUPPORTED_TYPE
 select k1, k4 from table;           // Query OK.
 ```
 
-</version>
-
 不同的数据源的列映射规则,请参阅不同数据源的文档。
 
 ## 权限管理
 
-使用 Doris 对 External Catalog 中库表进行访问,并不受外部数据目录自身的权限控制,而是依赖 Doris 自身的权限访问管理功能。
+使用 Doris 对 External Catalog 中库表进行访问时,默认情况下,依赖 Doris 自身的权限访问管理功能。
 
 Doris 的权限管理功能提供了对 Catalog 层级的扩展,具体可参阅 [权限管理](../../admin-manual/privilege-ldap/user-privilege.md) 文档。
 
+用户也可以通过 `access_controller.class` 属性指定自定义的鉴权类。如通过指定:
+
+`"access_controller.class" = "org.apache.doris.catalog.authorizer.RangerHiveAccessControllerFactory"`
+
+则可以使用 Apache Ranger 对 Hive Catalog 进行鉴权管理。详细信息请参阅:[Hive Catalog](./hive.md)
+
 ## 指定需要同步的数据库
 
 通过在 Catalog 配置中设置 `include_database_list` 和 `exclude_database_list` 可以指定需要同步的数据库。
 
-`include_database_list`: 支持只同步指定的多个database,以','分隔。默认为'',同步所有database。db名称是大小写敏感的。
+`include_database_list`: 支持只同步指定的多个database,以 `,` 分隔。默认同步所有database。db名称是大小写敏感的。
 
-`exclude_database_list`: 支持指定不需要同步的多个database,以','分割。默认为'',即不做任何过滤,同步所有database。db名称是大小写敏感的。
+`exclude_database_list`: 支持指定不需要同步的多个database,以 `,` 分割。默认不做任何过滤,同步所有database。db名称是大小写敏感的。
 
 > 当 `include_database_list` 和 `exclude_database_list` 有重合的database配置时,`exclude_database_list`会优先生效。
 >
@@ -319,8 +302,6 @@ Doris 的权限管理功能提供了对 Catalog 层级的扩展,具体可参
 
 ### 自动刷新
 
-<version since="1.2.2"></version>
-
 #### Hive Metastore
 
 自动刷新目前仅支持 Hive Metastore 元数据服务。通过让 FE 节点定时读取 HMS 的 notification event 来感知 Hive 表元数据的变更情况,目前支持处理如下event:
@@ -394,8 +375,6 @@ Doris 的权限管理功能提供了对 Catalog 层级的扩展,具体可参
 
 > 使用建议: 无论是之前已经创建好的catalog现在想改为自动刷新,还是新创建的 catalog,都只需要把 `enable_hms_events_incremental_sync` 设置为true,重启fe节点,无需重启之前或之后再手动刷新元数据。
 
-<version since="dev">
-
 #### 定时刷新
 
 在创建catalog时,在properties 中指定刷新时间参数`metadata_refresh_interval_sec` ,以秒为单位,若在创建catalog时设置了该参数,FE 的master节点会根据参数值定时刷新该catalog。目前支持三种类型
@@ -415,5 +394,3 @@ CREATE CATALOG es PROPERTIES (
 );
 ```
 
-</version>
-
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/paimon.md b/docs/zh-CN/docs/lakehouse/multi-catalog/paimon.md
index 69f6ef16f9..0ed5a12caa 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/paimon.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/paimon.md
@@ -26,6 +26,7 @@ under the License.
 
 
 # Paimon
+
 <version since="dev">
 </version>
 

