Posted to commits@kylin.apache.org by xx...@apache.org on 2022/12/23 06:19:00 UTC

[kylin] branch doc5.0 updated (ac1bdf07c3 -> 1878e6cb00)

This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a change to branch doc5.0
in repository https://gitbox.apache.org/repos/asf/kylin.git


    from ac1bdf07c3 Update community activity page
     new 5c25707fe9 add mac troubleshooting content of how_to_write_doc
     new ef9338c278 fix some spelling mistakes of kylin doc
     new 1878e6cb00 add a document: introduction of metadata

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .gitignore                                         |   4 +-
 .../2022-12-18-Introduction_of_Metadata/index.md   | 309 +++++++++++++++++++++
 website/blog/authors.yml                           |   5 +
 website/docs/development/coding_convention.md      |  16 +-
 .../docs/development/how_to_debug_kylin_in_ide.md  |  30 +-
 .../development/how_to_debug_kylin_in_local.md     |  24 +-
 website/docs/development/how_to_package.md         |  38 +--
 .../development/how_to_subscribe_mailing_list.md   |  14 +-
 .../development/how_to_understand_kylin_design.md  |   6 +-
 website/docs/development/how_to_write_doc.md       |  88 +++---
 website/docs/development/intro.md                  |   6 +-
 website/docs/development/roadmap.md                |  12 +-
 12 files changed, 437 insertions(+), 115 deletions(-)
 create mode 100644 website/blog/2022-12-18-Introduction_of_Metadata/index.md


[kylin] 02/03: fix some spelling mistakes of kylin doc

Posted by xx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch doc5.0
in repository https://gitbox.apache.org/repos/asf/kylin.git

commit ef9338c27808c4b8fd0af7cda8995780bccd90d3
Author: pengfei.zhan <de...@gmail.com>
AuthorDate: Sun Dec 18 20:04:27 2022 +0800

    fix some spelling mistakes of kylin doc
---
 website/docs/development/coding_convention.md      | 16 ++++-----
 .../docs/development/how_to_debug_kylin_in_ide.md  | 30 ++++++++---------
 .../development/how_to_debug_kylin_in_local.md     | 24 +++++++-------
 website/docs/development/how_to_package.md         | 38 +++++++++++-----------
 .../development/how_to_subscribe_mailing_list.md   | 14 ++++----
 .../development/how_to_understand_kylin_design.md  |  6 ++--
 website/docs/development/intro.md                  |  6 ++--
 website/docs/development/roadmap.md                | 12 +++----
 8 files changed, 73 insertions(+), 73 deletions(-)

diff --git a/website/docs/development/coding_convention.md b/website/docs/development/coding_convention.md
index 68c64710d1..2be9bea478 100644
--- a/website/docs/development/coding_convention.md
+++ b/website/docs/development/coding_convention.md
@@ -16,15 +16,15 @@ last_update:
     author: Tengting Xu
 ---
 
-Coding convention is very important for teamwork. Not only it keeps code neat and tidy, it saves a lot of work too. Different coding convention (and auto formatter) will cause unnecessary code changes that requires more effort at code review and code merge.
+Coding convention is very important for teamwork. Not only it keeps code neat and tidy, but it also saves a lot of work too. Different coding convention (and auto formatter) will cause unnecessary code changes that require more effort at code review and code merge.
 
 ## Setup IDE code formatter
 
-For Java code, we use Eclipse default formatter setting, with one change that to allow long lines.
+For Java code, we use Eclipse's default formatter setting, with one change that allows long lines.
 
-- For Eclipse developers, no manual setting is required. Code formatter configurations `src/core-common/.settings/org.eclipse.jdt.core.prefs` is on git repo. Your IDE should be auto configured when the projects are imported.
+- For Eclipse developers, no manual setting is required. Code formatter configurations `src/core-common/.settings/org.eclipse.jdt.core.prefs` are on the git repo. Your IDE should be auto-configured when the projects are imported.
 
-- For intellij IDEA developers, you need to install `Eclipse Code Formatter` and load the Eclipse formatter settings into your IDE manually.
+- For IntelliJ IDEA developers, you need to install `Eclipse Code Formatter` and load the Eclipse formatter settings into your IDE manually.
 
   you have to do a few more steps:
 
@@ -36,7 +36,7 @@ For Java code, we use Eclipse default formatter setting, with one change that to
 
   ![](images/coding_convention/coding_convention_2.png)
 
-  3. Disable intellij IDEA’s `Optimize imports on the fly`
+  3. Disable IntelliJ IDEA’s `Optimize imports on the fly`
   
   ![](images/coding_convention/coding_convention_3.png)
 
@@ -64,7 +64,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ```
 
-The checkstyle plugin will check the header rule when packaging also. The license file locates under `dev-support/checkstyle-apache-header.txt`. To make it easy for developers, please add the header as Copyright Profile and set it as default for Kylin project.
+The checkstyle plugin will check the header rule when packaging also. The license file locates under `dev-support/checkstyle-apache-header.txt`. To make it easy for developers, please add the header as Copyright Profile and set it as default for the Kylin project.
 
 ![](images/coding_convention/coding_convention_4.png)
 
@@ -80,8 +80,8 @@ The checkstyle plugin will check the header rule when packaging also. The licens
 
 5. Add a new test or modified a test.
 
-    1) Please using the `junit5` instead of `junit4`. Example, Using the annotation of `org.junit.jupiter.api.Test` instead of `org.junit.Test`.
+    1) Please use the `junit5` instead of `junit4`. For example, Using the annotation of `org.junit.jupiter.api.Test` instead of `org.junit.Test`.
 
-    2) A test case which extends from `NLocalFileMetadataTestCase` need to change with annotation `@MetadataInfo` and remove the `extend`. 
+    2) A test case that extends from `NLocalFileMetadataTestCase` needs to change with annotation `@MetadataInfo` and remove the `extend`. 
     
     > Example: org.apache.kylin.junit.MetadataExtension, org.apache.kylin.metadata.epoch.EpochManagerTest
diff --git a/website/docs/development/how_to_debug_kylin_in_ide.md b/website/docs/development/how_to_debug_kylin_in_ide.md
index d13a0efb85..2849292178 100644
--- a/website/docs/development/how_to_debug_kylin_in_ide.md
+++ b/website/docs/development/how_to_debug_kylin_in_ide.md
@@ -18,55 +18,55 @@ last_update:
 
 ### Background
 #### Why debug Kylin in IDEA using docker
-This article aims to introduce a simple and useful way to develop and debug Kylin for developer, and provided similar deployment to user's real scenario.
+This article aims to introduce a simple and useful way to develop and debug Kylin for developers and provided a similar deployment to the user's real scenario.
 
 #### Deployment architecture
-Following is architecture of current deployment.
+Following is the architecture of the current deployment.
 
 ![debug_in_laptop](images/debug_kylin_by_docker_compose.png)
 
 This guide **assumes** you have prepared the following things:
 
-- [ ] A **laptop** with MacOS installed to do development work (Windows is not verified at the moment)
-- [ ] A **remote linux server** for testing and deployment purpose(if you do not prepare remote linux server, you will deploy Hadoop on your laptop)
-- [ ] kylin's source code is cloned into some directory in your laptop
+- [ ] A **laptop** with macOS installed to do development work (Windows is not verified at the moment)
+- [ ] A **remote Linux server** for testing and deployment purposes (if you do not prepare a remote Linux server, you will deploy Hadoop on your laptop)
+- [ ] Kaylin's source code is cloned into some directory on your laptop
 
 :::info For Windows Dev Machine
-For Windows dev machine, setup the Kylin dev env in [Windows Subsystem for Linux](https://learn.microsoft.com/en-us/windows/wsl/about) is the best option. Follow this guide on [how to install WSL with GUI](https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps), and install both Kylin code and your favorite IDE (but not the docker) in WSL for best performance.
+For Windows dev machine, setup the Kylin dev env in [Windows Subsystem for Linux](https://learn.microsoft.com/en-us/windows/wsl/about) is the best option. Follow this guide on [how to install WSL with GUI](https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps), and install both the Kylin code and your favorite IDE (but not the docker) in WSL for best performance.
 :::
 
 ### Prepare IDEA and build source code
 
 #### Step 1: Check Software Requirement
 
-Please visit [software_requirement](how_to_package#software_reqiurement), make sure your laptop has meet the requirement.
+Please visit [software_requirement](how_to_package#software_reqiurement), and make sure your laptop has met the requirement.
 
 #### Step 2: Build source code
-- Build back-end source code before your start debug.
+- Build backend source code before your start debugging.
 ```shell
 cd <path-to-kylin-source>
 mvn clean install -DskipTests
 ```
 
 - Build front-end source code. 
-(Please use node.js **v12.14.0**, for how to use specific version of node.js, please check [how to switch to specific node js](how_to_package#install_older_node) )
+(Please use node.js **v12.14.0**, for how to use a specific version of node.js, please check [how to switch to a specific node js](how_to_package#install_older_node) )
 ```shell
 cd kystudio
 npm install
 ```
 
 #### Step 3: Install IntelliJ IDEA and build the source
-1. Install IDEA Community edition (Ultimate edition is ok too).
-2. Import the source code into IDEA. Click the **Open**, and choose the directory of **kylin source code**.
-  ![](images/OPEN_KYLIN_PROJECT.png)
+1. Install the IDEA Community edition (the Ultimate edition is ok too).
+2. Import the source code into IDEA. Click the **Open**, and choose the directory of **Kylin source code**.
+    ![](images/OPEN_KYLIN_PROJECT.png)
 
-3. Install scala plugin and restart
+3. Install the scala plugin and restart
 ![](images/IDEA_Install_Scala_plugin.png)
 
-4. Configure SDK(JDK and Scala), make sure you use **JDK 1.8.X** and **Scala 2.12.X**.
+4. Configure SDK(JDK and Scala), and make sure you use **JDK 1.8.X** and **Scala 2.12.X**.
 ![](images/IDEA_Notify_Install_SDK.png)
 
-5. Reload maven projects, and directory `scala` will be marked as source root(in blue color).
+5. Reload maven projects, and the `scala` directory will be marked as source root(in blue color).
 ![](images/IDEA_RELOAD_ALL_MAVEN_PROJECT.png)
 
 6. Build the projects.(make sure you have executed `mvn clean package -DskipTests`, otherwise some source code is not generated by maven javacc plugin)
diff --git a/website/docs/development/how_to_debug_kylin_in_local.md b/website/docs/development/how_to_debug_kylin_in_local.md
index dcbfc7b823..57a1d694e9 100644
--- a/website/docs/development/how_to_debug_kylin_in_local.md
+++ b/website/docs/development/how_to_debug_kylin_in_local.md
@@ -19,32 +19,32 @@ last_update:
 ### Background
 
 #### Why debug Kylin in IDEA without Hadoop
-This article aims to introduce a simple and useful way to develop and debug Kylin for developer.
+This article aims to introduce a simple and useful way to develop and debug Kylin for developers.
 
 #### Deployment architecture
 
-Following is architecture of current deployment.
+Following is the architecture of the current deployment.
 
 ![debug_in_laptop](./images/how_to_debug_kylin_in_local/laptop.png)
 
 This guide **assumes** you have prepared the following things:
 
-- [X] A **laptop** with MacOS installed to do development work (Windows is not verified at the moment)
-- [X] kylin's source code is cloned into some directory in your laptop
+- [X] A **laptop** with macOS installed to do development work (Windows is not verified at the moment)
+- [X] Kylin's source code is cloned into some directory on your laptop
 
 :::info For Windows Dev Machine
-For Windows dev machine, setup the Kylin dev env in [Windows Subsystem for Linux](https://learn.microsoft.com/en-us/windows/wsl/about) is the best option. Follow this guide on [how to install WSL with GUI](https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps), and install both Kylin code and your favorite IDE (but not the docker) in WSL for best performance.
+For Windows dev machine, setup the Kylin dev env in [Windows Subsystem for Linux](https://learn.microsoft.com/en-us/windows/wsl/about) is the best option. Follow this guide on [how to install WSL with GUI](https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps), and install both the Kylin code and your favorite IDE (but not the docker) in WSL for best performance.
 :::
 
 ### Prepare IDEA and build source code
 
 #### Step 1: Check Software Requirement
 
-Please visit [Software Requirement](how_to_package#software_reqiurement), make sure your laptop has meet the requirement.
+Please visit [Software Requirement](how_to_package#software_reqiurement), and make sure your laptop has met the requirement.
 
 #### Step 2: Build source code
 
-- Build back-end source code before your start debug.
+- Build backend source code before your start debugging.
   
     ```shell
     cd <path-to-kylin-source>
@@ -53,7 +53,7 @@ Please visit [Software Requirement](how_to_package#software_reqiurement), make s
 
 - Build front-end source code.
   
-    (Please use node.js **v12.14.0**, for how to use specific version of node.js, please check [how to switch to specific node js](how_to_package#install_older_node) )
+    (Please use node.js **v12.14.0**, for how to use a specific version of node.js, please check [how to switch to a specific node js](how_to_package#install_older_node) )
     
     ```shell
     cd kystudio
@@ -61,18 +61,18 @@ Please visit [Software Requirement](how_to_package#software_reqiurement), make s
     ```
 #### Step 3: Install IntelliJ IDEA and build the source
 
-1. Install IDEA Community edition (Ultimate edition is ok too).
+1. Install the IDEA Community edition (the Ultimate edition is ok too).
 
-2. Import the source code into IDEA. Click the **Open**, and choose the directory of **kylin source code**.
+2. Import the source code into IDEA. Click the **Open**, and choose the directory of **Kylin source code**.
    ![](images/how_to_debug_kylin_in_local/OPEN_KYLIN_PROJECT.png)
 
-3. Install scala plugin and restart
+3. Install the scala plugin and restart
    ![](images/how_to_debug_kylin_in_local/IDEA_Install_Scala_plugin.png)
 
 4. Configure SDK(JDK and Scala), make sure you use **JDK 1.8.X** and **Scala 2.12.X**.
    ![](images/how_to_debug_kylin_in_local/IDEA_Notify_Install_SDK.png)
 
-5. Reload maven projects, and directory `scala` will be marked as source root(in blue color).
+5. Reload maven projects, and the `scala` directory will be marked as source root(in blue color).
    ![](images/how_to_debug_kylin_in_local/IDEA_RELOAD_ALL_MAVEN_PROJECT.png)
 
 6. Build the projects.(make sure you have executed `mvn clean package -DskipTests`, otherwise some source code is not generated by maven javacc plugin)
diff --git a/website/docs/development/how_to_package.md b/website/docs/development/how_to_package.md
index b8d29eb65e..8fdbc34373 100644
--- a/website/docs/development/how_to_package.md
+++ b/website/docs/development/how_to_package.md
@@ -22,11 +22,11 @@ last_update:
 | Software      | Comment                                      |    Version     |   Download Link    |
 |---------------| ---------------------------------------------|----------------|--------------------|
 | Git           |  Fetch branch name and hash of latest commit | latest         | https://git-scm.com/book/en/v2/Getting-Started-Installing-Git |
-| Apache Maven  |  Build Java and Scala source code            | 3.8.2 or latest| https://maven.apache.org/download.cgi |  
+| Apache Maven  |  Build Java and Scala source code            | 3.8.2 or latest| https://maven.apache.org/download.cgi |
 | Node.js       |  Build front end                             | 12.14.0 is recommended ( or 12.x ~ 14.x) | [How to switch to older node.js](development/how_to_package.md#install_older_node)|
 | JDK           |  Java Compiler and Development Tools         | JDK 1.8.x      | https://www.oracle.com/java/technologies/javase/javase8u211-later-archive-downloads.html |
 
-After installed above software, please do verify **software requirement** by following commands:
+After installing the above software, please verify **software requirements** by following commands:
 
 ```shell
 $ java -version
@@ -49,14 +49,14 @@ git version 2.30.1 (Apple Git-130)
 ```
 ### Options for Packaging Script
 
-|         Option       |     Comment                                        | 
-|--------------------  | ---------------------------------------------------|
-| -official            | If add this option, package name won't contain timestamp| 
-| -noThirdParty        | If add this option, third party binary won't be packaging into binary, current they are influxdb,grafana and postgresql |
-| -noSpark             | If add this option, spark won't packaging into Kylin binary |
-| -noHive1             | By default kylin 5.0 will support Hive 1.2, if add this option, this binary will support Hive 2.3+ |
-| -skipFront           | If add this option, front-end won't be build and packaging |
-| -skipCompile         | Add this option will assume java source code no need be compiled again |
+| Option        | Comment                                                      |
+| ------------- | ------------------------------------------------------------ |
+| -official     | If adding this option, the package name won't contain the timestamp |
+| -noThirdParty | If adding this option, third-party binary won't be packaged into binary, current they are influxdb,grafana and PostgreSQL |
+| -noSpark      | If adding this option, spark won't be packaged into the Kylin binary |
+| -noHive1      | By default Kylin 5.0 will support Hive 1.2, if add this option, this binary will support Hive 2.3+ |
+| -skipFront    | If add this option, the front-end won't be built and packaged |
+| -skipCompile  | Add this option will assume java source code no need to be compiled again |
 
 ### Other Options for Packaging Script
 |         Option       |     Comment                                        | 
@@ -79,11 +79,11 @@ For example, an unofficial package could be `apache-kylin-5.0.0-SNAPSHOT.2022081
 
 ```shell
 
-## Case 1: For developer who want to package for testing purpose
+## Case 1: For the developer who wants to package for testing purposes
 ./build/release/release.sh 
 
-## Case 2: Official apache release,  kylin binary for deploy on Hadoop3+ and Hive2.3+, 
-# and third party cannot be distributed because of apache distribution policy(size and license)
+## Case 2: Official apache release,  Kylin binary for deployment on Hadoop3+ and Hive2.3+, 
+# and the third party cannot be distributed because of apache distribution policy(size and license)
 ./build/release/release.sh -noSpark -official 
 
 ## Case 3: A package for Apache Hadoop 3 platform
@@ -92,31 +92,31 @@ For example, an unofficial package could be `apache-kylin-5.0.0-SNAPSHOT.2022081
 
 ### <span id="install_older_node">How to install older node.js</span>
 
-1. Please visit https://nodejs.org/en/download/ to download and install the latest node.js . After installed, you may use follow command to verify if the latest node.js is in use:
+1. Please visit https://nodejs.org/en/download/ to download and install the latest node.js. After installed, you may use the following command to verify if the latest node.js is in use:
 ```shell
 $ node -v
 v16.17.0
 ```
 
-2. Use some tools like https://github.com/nvm-sh/nvm to install specific version of node.js 
+2. Use some tools like https://github.com/nvm-sh/nvm to install a specific version of node.js 
 
 ```shell
 ## Switch to specific version using nvm
 curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash
 nvm install 12.14.0
 
-## Before packaging, please switch to specific version
+## Before packaging, please switch to a specific version
 nvm use 12.14.0
 ```
 
-You may use follow command to verify if older node.js is in use:
+You may use the following command to verify if older node.js is in use:
 ```shell
 $ node -v
 v12.14.0
 ```
 
-3. Switch to latest node.js
+3. Switch to the latest node.js
 ```shell
-## switch to original version
+## switch to the original version
 nvm use system
 ```
diff --git a/website/docs/development/how_to_subscribe_mailing_list.md b/website/docs/development/how_to_subscribe_mailing_list.md
index 6e37297afa..b8c8b8d79e 100644
--- a/website/docs/development/how_to_subscribe_mailing_list.md
+++ b/website/docs/development/how_to_subscribe_mailing_list.md
@@ -14,7 +14,7 @@ last_update:
 ---
 
 ### Mailing List Table
-These are the mailing lists that have been established for kylin project. For each list, there is a subscribe, unsubscribe, and an archive link.
+These are the mailing lists that have been established for the Kylin project. For each list, there is a subscribe, unsubscribe, and archive link.
 
 | Mailing List |                    Subscribe Link                    |                     Unsubscribe Link                     |                               Archive Link                                |
 |:------------:|:----------------------------------------------------:|:--------------------------------------------------------:|:-------------------------------------------------------------------------:|
@@ -25,23 +25,23 @@ These are the mailing lists that have been established for kylin project. For ea
 
 ### <span id="mailing_list">Subscribe mailing list</span>
 
-#### Step 1: Send subscribe request
-You click the _Subscribe Link_ in above table, and send the simple request(it is ok to leave the title and body with a very short sentence) to the appropriate address .
+#### Step 1: Send subscription request
+You click the _Subscribe Link_ in the above table and send the simple request(it is ok to leave the title and body with a very short sentence) to the appropriate address.
 ![](images/subscribe_mailing_list_1.jpg)
 
 #### Step 2: Receive confirmation reply from ezmlm
-The mailing list management program, [ezmlm](http://untroubled.org/ezmlm/), will send you a reply in 2-10 minutes, ask you to **confirm your subscription**.
+The mailing list management program, [ezmlm](http://untroubled.org/ezmlm/), will send you a reply in 2-10 minutes, asking you to **confirm your subscription**.
 
 Following is a successful case.
 
 ![](images/subscribe_mailing_list_2.jpg)
 
-#### Step 3: Send confirmation request by replying to previous email
+#### Step 3: Send a confirmation request by replying to the previous email
 Reply to the previous email(it is ok to leave the title and body with a short sentence)
 ![](images/subscribe_mailing_list_3.jpg)
 
-#### Step 4: ezmlm acknowledge your confirmation request
-You will receive "Welcome to user/dev@kylin.apache.org" in 2-10 minutes. From now, you have right to send and receive mails from all subscribers of current mailing list.
+#### Step 4: ezmlm acknowledges your confirmation request
+You will receive "Welcome to user/dev@kylin.apache.org" in 2-10 minutes. From now, you have the right to send and receive mail from all subscribers of the current mailing list.
 
 Following is a successful case.
 
diff --git a/website/docs/development/how_to_understand_kylin_design.md b/website/docs/development/how_to_understand_kylin_design.md
index 735c550ee5..e660d1ac8d 100644
--- a/website/docs/development/how_to_understand_kylin_design.md
+++ b/website/docs/development/how_to_understand_kylin_design.md
@@ -21,14 +21,14 @@ last_update:
 Unless more comments, all source code analysis are based on [this code snapshot](https://github.com/apache/kylin/tree/edab8698b6a9770ddc4cd00d9788d718d032b5e8) .
 :::
 
-### About Design of Kylin 5.0
+### About the Design of Kylin 5.0
 1. Metadata Store
    - [x] Metadata Store
    - [ ] Metadata Cache
    - [x] Transaction(CRUD of Metadata)
-   - [ ] Epoch, AuditLog etc.
+   - [ ] Epoch, AuditLog, etc.
 2. Metadata Format/Schema
-   - [ ] DataModel, IndexPlan and Dataflow
+   - [ ] DataModel, IndexPlan, and Dataflow
    - [ ] Index and Layout
    - [ ] Computed Column
 3. Query Engine
diff --git a/website/docs/development/intro.md b/website/docs/development/intro.md
index c33cd133d4..94e34d4109 100644
--- a/website/docs/development/intro.md
+++ b/website/docs/development/intro.md
@@ -19,7 +19,7 @@ last_update:
 Check out the [How to Contribute](how_to_contribute.md) document.
 
 ### Source Repository
-Apache Kylin™ source code is version controlled using Git version control:
+Apache Kylin™ source code is version-controlled using Git version control:
 
 | Repository        |                      Link                       | 
 |:------------------|:------------------------------------------------|
@@ -32,8 +32,8 @@ Apache Kylin™ source code is version controlled using Git version control:
 Track issues on the **Kylin Project** on the [Apache JIRA](http://issues.apache.org/jira/browse/KYLIN)
 
 ### Wiki
-Please check [How to contribute wiki](https://cwiki.apache.org/confluence/display/KYLIN/How+to+contribute+wiki) .
+Please check [How to contribute wiki](https://cwiki.apache.org/confluence/display/KYLIN/How+to+contribute+wiki).
 
 ### Roadmap
 
-Please check [Roadmap of Kylin 5.0](./roadmap.md)
\ No newline at end of file
+Please check the [Roadmap of Kylin 5.0](./roadmap.md)
\ No newline at end of file
diff --git a/website/docs/development/roadmap.md b/website/docs/development/roadmap.md
index 027ac80936..33fd279077 100644
--- a/website/docs/development/roadmap.md
+++ b/website/docs/development/roadmap.md
@@ -20,8 +20,8 @@ last_update:
 ### Kylin 5.0.0
 
 - More flexible and enhanced data model
-  - Allow adding new dimensions and measures to exiting data model
-  - Model adapts to table schema changes while retaining existing index at best effort
+  - Allow adding new dimensions and measures to the existing data model
+  - The model adapts to table schema changes while retaining the existing index at the best effort
   - Support last-mile data transformation using Computed Column
   - Support raw query (non-aggregation query) using Table Index
   - Support changing dimension table (SCD2)
@@ -32,9 +32,9 @@ last_update:
 - More flexible index management (was cuboid)
   - Add IndexPlan to support flexible index management
   - Add IndexEntity to support different index type
-  - Add LayoutEntity to support different storage layout of same Index
-- Towards a native and vectorized query engine
-  - Experiment: Integrate with native execution engine, leveraging Gluten
+  - Add LayoutEntity to support different storage layouts of the same Index
+- Toward a native and vectorized query engine
+  - Experiment: Integrate with a native execution engine, leveraging Gluten
   - Support async query
   - Enhance cost-based index optimizer
 - More
@@ -44,4 +44,4 @@ last_update:
 
 ### Kylin 5.1.0
 
-- Support deploy Kylin on K8S with micro-service architecture
+- Support deploying Kylin on K8S with micro-service architecture


[kylin] 03/03: add a document: introduction of metadata

Posted by xx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch doc5.0
in repository https://gitbox.apache.org/repos/asf/kylin.git

commit 1878e6cb00bb925aa70ad13635d5267fae31e715
Author: pengfei.zhan <de...@gmail.com>
AuthorDate: Sun Dec 18 18:13:20 2022 +0800

    add a document: introduction of metadata
---
 .gitignore                                         |   4 +-
 .../2022-12-18-Introduction_of_Metadata/index.md   | 309 +++++++++++++++++++++
 website/blog/authors.yml                           |   5 +
 3 files changed, 317 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 85e7c1dfcb..ed2304eeac 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,3 @@
-/.idea/
+.idea
+.DS_Store
+
diff --git a/website/blog/2022-12-18-Introduction_of_Metadata/index.md b/website/blog/2022-12-18-Introduction_of_Metadata/index.md
new file mode 100644
index 0000000000..4a5ef1d012
--- /dev/null
+++ b/website/blog/2022-12-18-Introduction_of_Metadata/index.md
@@ -0,0 +1,309 @@
+---
+title: Introduction of Metadata(CN)
+slug: introduction_of_metadata_cn
+authors: pfzhan
+tags: [metadata, kylin5]
+hide_table_of_contents: false
+date: 2022-12-18T17:00
+---
+
+# Introduction of Metadata
+
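+Kylin5 makes fairly large changes to how metadata is organized, and this article describes those changes in some detail. Compared with Kylin4, a notable feature of Kylin5 metadata is project-level isolation: each project is independent and does not interfere with the others. This article starts from the project and then covers Table, Model, IndexPlan, Segment, and Job in turn. More broadly, Kylin5 metadata also includes the metadata update audit log (AuditLog), the transaction table (Epoch), permissions (ACL), query history (Query History), and more. All of these matter, but as an introductory article this one leaves the broader topics aside and keeps to the most basic part of the metadata, to help more developers quickly understand and take part in Kylin5 development. As an introductory article, it should also be useful to readers who are not developers. Let's get to the topic.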
+
+## **Overview**
+
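+After starting a Kylin5 instance and performing some routine operations on it, such as creating projects, loading tables, building models, editing aggregate groups, and building indexes, you can run the script `{KYLIN_HOME}/metadata.sh backup` to get a metadata tree similar to the following.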
+
+```
+.
+├── UUID
+├── _global
+│   ├── project
+│   │   ├── def.json
+│   │   └── ssb.json
+│   ├── resource_group
+│   │   └── relation.json
+│   ├── sys_acl
+│   │   └── user
+│   │       └── ADMIN
+│   ├── user
+│   │   └── ADMIN
+│   └── user_group
+│       ├── ALL_USERS
+│       ├── ROLE_ADMIN
+│       ├── ROLE_ANALYST
+│       └── ROLE_MODELER
+├── def
+│   ├── dataflow
+│   │   └── de52affe-f280-dcd1-be78-7865ff149669.json
+│   ├── dataflow_details
+│   │   └── de52affe-f280-dcd1-be78-7865ff149669
+│   │       ├── 0b36355d-03df-8b02-aaf9-c2ab00e30456.json
+│   │       ├── 743c5345-a3cd-9acf-2c90-c5bcec61c600.json
+│   │       └── 7cbe4459-cbbd-8b8f-bb88-6a54c24d76ce.json
+│   ├── execute
+│   │   ├── 1c7dd6c1-6da5-72e6-a2fe-e87e5c764502-de52affe-f280-dcd1-be78-7865ff149669
+│   │   ├── 383edce9-fad4-eba9-a498-0f289f1f1a79-de52affe-f280-dcd1-be78-7865ff149669
+│   │   ├── 88400e7b-104a-9117-343d-cc3047b49983
+│   │   ├── 95300d93-da1e-6505-b710-2771c724fa63-de52affe-f280-dcd1-be78-7865ff149669
+│   │   ├── c11e4632-578b-1007-d3e6-65afca6f003c-de52affe-f280-dcd1-be78-7865ff149669
+│   │   └── e1731c9e-d70b-e2eb-0fca-6be24751fb72
+│   ├── index_plan
+│   │   └── de52affe-f280-dcd1-be78-7865ff149669.json
+│   ├── job_stats
+│   │   └── 1671292800000.json
+│   ├── model_desc
+│   │   └── de52affe-f280-dcd1-be78-7865ff149669.json
+│   ├── table
+│   │   ├── SSB.CUSTOMER.json
+│   │   └── SSB.LINEORDER.json
+│   └── table_exd
+│       ├── SSB.CUSTOMER.json
+│       └── SSB.LINEORDER.json
+├── ssb
+│   ├── dataflow
+│   │   ├── 407f1b4e-e5d7-6c1d-3697-a3deaffd0f6b.json
+│   │   └── 91b2007b-112f-f98e-b967-c2fe26c6761c.json
+│   ├── dataflow_details
+│   │   └── 91b2007b-112f-f98e-b967-c2fe26c6761c
+│   │       └── 19211ed7-9c05-cc8e-9d05-7b64d0b90cf8.json
+│   ├── execute
+│   │   ├── 5dde1d4b-f60b-7601-bdda-4d20493b324d-91b2007b-112f-f98e-b967-c2fe26c6761c
+│   │   ├── add2d7c6-4f51-1144-b05f-a28da0d42e1d
+│   │   └── c144cb8f-28c8-833e-3990-59e51bd3f7f8
+│   ├── index_plan
+│   │   ├── 407f1b4e-e5d7-6c1d-3697-a3deaffd0f6b.json
+│   │   └── 91b2007b-112f-f98e-b967-c2fe26c6761c.json
+│   ├── job_stats
+│   │   └── 1671292800000.json
+│   ├── model_desc
+│   │   ├── 407f1b4e-e5d7-6c1d-3697-a3deaffd0f6b.json
+│   │   └── 91b2007b-112f-f98e-b967-c2fe26c6761c.json
+│   ├── table
+│   │   ├── SSB.DATES.json
+│   │   └── SSB.LINEORDER.json
+│   └── table_exd
+│       ├── SSB.DATES.json
+│       └── SSB.LINEORDER.json
+```
+
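+From the tree above it is easy to see that Kylin5 metadata is isolated at the project level. The _global project is special: it stores system-level information such as the project metadata of the current instance, ACL-related metadata, and users and user groups. We skip the permission-related parts and focus on how the metadata of a single project is organized. This metadata set contains only two projects, ssb and def, whose structures are identical. Each part is covered in turn below; first, a brief description of what each directory is for.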
+
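+- table: metadata of all tables loaded into the project
+- table_exd: extended descriptive information for the tables under the table directory
+- model_desc: model metadata
+- index_plan: index-related metadata
+- dataflow: segment-related metadata
+- dataflow_details: a directory whose files each record descriptive information about the indexes that have been built
+- execute: metadata of build jobs
+- job_stats: metadata about the execution status of build jobs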
+
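+Setting aside execute and job_stats, the remaining parts are the core metadata that make up a project. The tree above is what a metadata backup looks like; in fact, everything lives in a single metadata table, and each piece of metadata is one record in that table. The absolute path of a record is its meta_key in the metadata table. For example, the absolute path /ssb/table/SSB.DATES.json identifies the table SSB.DATES loaded by the ssb project.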
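
To make the path convention concrete, here is a minimal Python sketch that splits a meta_key into its project, directory, and resource parts. The `MetaKey` type and the `parse_meta_key` function are hypothetical names introduced for illustration; they are not part of Kylin.

```python
from typing import NamedTuple, Optional


class MetaKey(NamedTuple):
    """Hypothetical holder for the three parts of a meta_key."""
    project: str             # e.g. "ssb" or "_global"
    category: str            # e.g. "table", "model_desc", "dataflow"
    resource: Optional[str]  # e.g. "SSB.DATES.json"; None for a bare directory


def parse_meta_key(meta_key: str) -> MetaKey:
    """Split an absolute metadata path into project, category and resource."""
    # Split at most twice so nested resources (dataflow_details) stay intact.
    parts = meta_key.strip("/").split("/", 2)
    if len(parts) < 2:
        raise ValueError(f"not a project-level meta_key: {meta_key!r}")
    resource = parts[2] if len(parts) == 3 else None
    return MetaKey(parts[0], parts[1], resource)


key = parse_meta_key("/ssb/table/SSB.DATES.json")
print(key.project, key.category, key.resource)
```

Entries under dataflow_details keep their model directory as part of the resource, because the path is split at most twice.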
+
+## **Table Description**
+
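+The descriptive information of a table consists of the contents of the table and table_exd folders. table/SSB.DATES.json holds the basic description generated when the table is loaded from the data source into Kylin5, while table_exd/SSB.DATES.json holds the table's extended information, which may grow richer as Kylin5 gains features. At present, the extended information covers table sampling results and query hit counts.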
+
+Most properties in the table description are self-explanatory; only a few are explained here.
+
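+- `source_type`: the source of the table; the ISourceAware class defines the common data source types.
+- `table_type`: the type of the table, i.e. whether it comes from a table or a view.
+- `transactional`: marks a transactional table.
+- `increment_loading`, `top`, `rangePartition`: deprecated fields.
+- `query_hit_count`: has been moved to the table's extended information.
+- The table description also retains some snapshot-related information. This is not ideal, because a snapshot may be refreshed automatically on every build.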
+
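+Note that the table description here is project-level. When a model references this metadata, it extends the tables with the computed columns defined on the model, so the tables gain extra columns. These extra columns exist only in memory and are not saved to the metadata store. The model section below explains this further.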
+
+## **Model Description**
+
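+The model is a core concept in Kylin5. It can be seen as an abstraction of a group of related business scenarios, containing one or more business patterns (which correspond to indexes). A model involves many concepts, such as dimensions, measures, computed columns, ordinary columns, table join relationships, fact tables, dimension tables, the star schema, the snowflake schema, and the constellation schema (not yet supported). Most of them are covered by dimensional modeling theory and are not explained here. This article only introduces the two concepts specific to Kylin5: **ordinary columns** and **computed columns**.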
+
+First, the **computed column** (ComputedColumn). The Kylin5 manual describes it as follows.
+
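+> Computed columns are columns predefined in a model to take full advantage of Kylin's precomputation. By turning relatively complex online calculations into precomputation based on computed columns, query performance improves significantly. In addition, computed columns allow data transformations, redefinitions, and similar operations to be predefined in the model, strengthening the semantic layer. By defining computed columns, users can reuse existing business logic.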
+> 
+
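+When a computed column is defined in a model, an ordinary column with the same name is added at the same time. Currently, computed columns can only be defined on the fact table. Once defined, a computed column can be used when defining dimensions and measures. As mentioned in the Table section, the tables used by a model are extended with computed columns; this happens through the model's getExtendedTables method before the tables are used, so the model works with tables whose semantics have been enriched.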
+
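+**Ordinary columns** (NamedColumn) come from two sources: first, as noted above, defining a computed column also adds an ordinary column; second, the fact table and all joined dimension tables added during modeling. An ordinary column has a status property that marks it as a dimension column (DIMENSION), a deleted column (TOMB), or just an ordinary column (EXIST). If the column with the same name as a computed column has status DIMENSION, the computed column has been defined as a dimension.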
+
+### **Significant Change**
+
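+Kylin5 defines a **significant change of a model** to decide whether a series of follow-up actions is needed. A significant change removes dimensions and measures that no longer have value and may trigger a Segment rebuild, and the model's semantic_version property is incremented after every significant change. Any one of the following counts as a significant change.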
+
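+- The partition column or multi-level partition columns change
+- The fact table changes
+- The join relationships between tables in the model change
+- The model's filter condition changes
+- The precompute setting of a dimension table joined in the model is changed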
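
The rule above can be sketched in Python as a comparison of two model snapshots. This is only an illustration under stated assumptions: `ModelSnapshot`, its field names, and `apply_edit` are hypothetical and do not mirror Kylin's actual classes; only the list of triggers and the semantic_version increment come from the text.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class ModelSnapshot:
    """Hypothetical, simplified view of the fields that matter here."""
    fact_table: str
    partition_column: Optional[str]
    multi_partition_columns: Tuple[str, ...]
    joins: FrozenSet[Tuple[str, str]]  # (dimension table, join condition)
    filter_condition: str
    precomputed_dims: FrozenSet[str]   # dimension tables flagged for precomputation
    semantic_version: int = 0


# Fields whose change counts as a significant change, per the list above.
SIGNIFICANT_FIELDS = (
    "fact_table",
    "partition_column",
    "multi_partition_columns",
    "joins",
    "filter_condition",
    "precomputed_dims",
)


def changed_fields(old: ModelSnapshot, new: ModelSnapshot) -> List[str]:
    """Return the names of significant fields that differ between snapshots."""
    return [name for name in SIGNIFICANT_FIELDS
            if getattr(old, name) != getattr(new, name)]


def apply_edit(old: ModelSnapshot, new: ModelSnapshot) -> ModelSnapshot:
    """Bump semantic_version only when the edit is a significant change."""
    bump = 1 if changed_fields(old, new) else 0
    return replace(new, semantic_version=old.semantic_version + bump)


before = ModelSnapshot(
    fact_table="SSB.LINEORDER",
    partition_column=None,
    multi_partition_columns=(),
    joins=frozenset({("SSB.DATES", "LO_ORDERDATE = D_DATEKEY")}),
    filter_condition="",
    precomputed_dims=frozenset({"SSB.DATES"}),
)
after = apply_edit(before, replace(before, filter_condition="LO_QUANTITY > 0"))
print(after.semantic_version)
```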
+
+### **Broken Model**
+
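+Because model_desc, index_plan, and dataflow correspond one to one, damage to any of the three causes the model to be shown in the broken state. A broken model requires user intervention to repair. When does a model break? Generally, it results from reloading a table: when a column is dropped from the source table, a column's data type changes incompatibly, or the table itself is dropped, reloading the table may break the model. **Therefore, remember to back up the metadata before reloading a table or performing other relatively risky operations.**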
+
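+What happens when a model breaks? All optimization recommendations are cleared, and the failure reason is recorded on the dataflow. There are three kinds of failure reasons: triggered by an EVENT, triggered by a SCHEMA change, and triggered by other unknown problems (NULL).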
+
+- Failure caused by EVENT
+- Failure caused by SCHEMA: all built segments are deleted.
+- Failure caused by NULL
+
+## **IndexPlan**
+
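+IndexPlan is the metadata that Kylin5 uses to organize Indexes and Layouts. Its two most important parts are RuleBasedIndex and Indexes: the former manages aggregation-group indexes, and the latter manages materialized indexes. Before introducing them, let us first look at the concepts of Index and Layout.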
+
+### **Index & Layout**
+
+An Index is a collection: it groups a class of Layouts that share the same set of dimensions and measures but differ in column order or in shardByColumn. A Layout is one concrete arrangement within that collection.
+
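+- The id of an Index is an integer multiple of 10000
+- The relationship between Layout and Index is best seen in the example below
+- Where there is no ambiguity, Index may be used to refer to a specific Layout
+- If a Layout is deleted and later generated again, it receives a brand-new ID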
+
+```
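+Suppose a model has 3 dimensions {1, 2, 3} and 2 measures {100000, 100001}.
+When all dimensions and measures are used, the Bitset of the Index contains {1, 2, 3, 100000, 100001};
+however, many Layouts can be generated from it, for example: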
+ {col_order = [1, 2, 3, 100000, 100001] , shard_by_column = []},
+ {col_order = [1, 2, 3, 100000, 100001] , shard_by_column = [1]},
+ {col_order = [2, 1, 3, 100000, 100001] , shard_by_column = []},
+ {col_order = [3, 1, 2, 100000, 100001] , shard_by_column = [2]}
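+Note: col_order is a permutation of the dimensions followed by the measures. shard_by_column affects query efficiency.
+    In most cases Kylin5 works with id values; this applies to dimensions, measures, and Layouts alike.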
+```
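
The grouping described above can be sketched in a few lines of Python. The helper names are hypothetical, and the mapping from a Layout id to its Index id (rounding down to a multiple of 10000) is an assumption inferred from the sample ids in this article, not a statement about Kylin's implementation.

```python
from collections import defaultdict

INDEX_ID_STEP = 10000  # Index ids are multiples of 10000, per the text above


def index_id_of(layout_id: int) -> int:
    """Assumed mapping: a Layout id is its Index id plus a small offset."""
    return layout_id - layout_id % INDEX_ID_STEP


def group_by_column_set(layouts):
    """Group layouts whose col_order contains the same set of ids."""
    groups = defaultdict(list)
    for layout in layouts:
        groups[frozenset(layout["col_order"])].append(layout)
    return groups


# The four Layouts from the example above.
layouts = [
    {"col_order": [1, 2, 3, 100000, 100001], "shard_by_column": []},
    {"col_order": [1, 2, 3, 100000, 100001], "shard_by_column": [1]},
    {"col_order": [2, 1, 3, 100000, 100001], "shard_by_column": []},
    {"col_order": [3, 1, 2, 100000, 100001], "shard_by_column": [2]},
]

groups = group_by_column_set(layouts)
print(len(groups))                         # number of distinct column sets
print(index_id_of(10001), index_id_of(1))  # Index ids under the assumed mapping
```

All four Layouts share one column set, so they fall into a single group, which is the sense in which they belong to the same Index.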
+
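+In the current Kylin5 design, a Layout has two properties, manual and auto. manual marks an index as user-defined, and auto marks an index as generated by an automated program or script. Having two separate properties improves extensibility: for example, an index may first be generated by a program, and later a user edits an aggregation group that produces the same index, and neither fact should overwrite the other. Based on these two properties, indexes can be divided into user-defined indexes and automated indexes. Generally speaking, automatically generated indexes are more flexible, and their composition is clearly visible because their metadata is materialized (Kylin5 may well materialize all indexes in the near future); developers can extend this mechanism themselves.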
+
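+Indexes are also divided into aggregate indexes and table indexes, depending on whether they precompute aggregations. Put informally, an aggregate index is an index with measures, and a table index is one without. In addition, a Layout has a base property that marks whether it is a base index. A base index does its best to keep queries from being pushed down to another compute engine, such as the Spark pushdown engine bundled with Kylin5; users can plug in other engines as well.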
+
+### **RuleBasedIndex**
+
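+A RuleBasedIndex contains multiple aggregation groups (NAggregationGroup). Each aggregation group defines its own generation rules: mandatory dimensions, joint dimensions, hierarchy dimensions, and the maximum number of dimension combinations. A RuleBasedIndex can also define a model-level maximum number of dimension combinations. When an index in the RuleBasedIndex is deleted, it is added to `layout_black_list`, so that editing aggregation groups does not scramble index IDs. After a significant model change, `layout_black_list` is cleared, the aggregation groups regenerate all their indexes, and Segment data must be refreshed.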
+
+```json
+"rule_based_index" : {
+  "dimensions" : [ 0, 10, 16, 26, 17, 19 ],
+  "measures" : [ 100000, 100001, 100002, 100003, 100004 ],
+  "global_dim_cap" : null,
+  "aggregation_groups" : [ {
+    "includes" : [ 0, 10, 16, 26, 17, 19 ],
+    "measures" : [ 100000, 100001, 100002, 100003, 100004 ],
+    "select_rule" : {
+      "hierarchy_dims" : [ ],
+      "mandatory_dims" : [ 10, 0, 16 ],
+      "joint_dims" : [ [ 17, 19 ] ]
+    },
+    "index_range" : "EMPTY"
+  } ],
+  "layout_id_mapping" : [ 10001, 20001, 30001, 40001 ],
+  "parent_forward" : 3,
+  "index_start_id" : 10000,
+  "last_modify_time" : 1671335454291,
+  "layout_black_list" : [ ],
+  "scheduler_version" : 2,
+  "index_update_enabled" : true, /* streaming related, can be ignored */
+  "base_layout_enabled" : true  /* whether to generate one large index containing all dimensions and measures of the aggregation groups */
+}
+```
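
To illustrate how mandatory and joint dimensions constrain the combinations, here is a deliberately simplified Python sketch. It assumes that mandatory dimensions appear in every combination and that joint dimensions appear together or not at all; it ignores hierarchy dimensions, the dimension cap, and the blacklist. The function name is hypothetical and this is not Kylin's generation algorithm.

```python
from itertools import combinations


def enumerate_dimension_sets(includes, mandatory, joint_groups):
    """Enumerate dimension sets under simplified aggregation-group rules."""
    joint_members = {d for group in joint_groups for d in group}
    # Units that can be switched on or off independently:
    # each joint group as a whole, plus every remaining free dimension.
    units = [tuple(group) for group in joint_groups]
    units += [(d,) for d in includes
              if d not in mandatory and d not in joint_members]
    results = []
    for size in range(len(units) + 1):
        for chosen in combinations(units, size):
            dims = set(mandatory)  # mandatory dimensions are always present
            for unit in chosen:
                dims.update(unit)
            if dims:
                results.append(sorted(dims))
    return results


# Values taken from the aggregation group in the sample above.
combos = enumerate_dimension_sets(
    includes=[0, 10, 16, 26, 17, 19],
    mandatory=[10, 0, 16],
    joint_groups=[[17, 19]],
)
for dims in combos:
    print(dims)
```

With three mandatory dimensions and one joint pair, only dimension 26 and the pair (17, 19) vary, so the sketch produces four dimension sets.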
+
+### **Indexes**
+
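+The Indexes property manages user-defined table indexes, the base table index, the base aggregate index, and indexes generated automatically through user extensions of Kylin. What distinguishes these indexes from those generated by aggregation groups is that all their information is visible at a glance, which is very convenient when developing Kylin features or troubleshooting certain query problems. Note that the base aggregate index and the base table index are regenerated when the model is edited to add dimensions or measures. The previous base index then becomes an ordinary index in the locked state, and it can still serve queries until the new base index is built. After the new one is built, the locked index is not deleted automatically; the user must delete it explicitly.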
+
+```json
+"indexes" : [ {
+  "id" : 0,
+  "dimensions" : [ 0, 10, 16, 17, 19, 22, 26 ],
+  "measures" : [ 100000, 100001, 100002, 100003, 100004 ],
+  "layouts" : [ {
+    "id" : 1,
+    "name" : null,
+    "owner" : null,
+    "col_order" : [ 0, 10, 16, 17, 19, 22, 26, 100000, 100001, 100002, 100003, 100004 ],
+    "shard_by_columns" : [ ],
+    "partition_by_columns" : [ ],
+    "sort_by_columns" : [ ],
+    "storage_type" : 20,
+    "update_time" : 1671335046320,
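+    "manual" : false, /* index defined by an aggregation group */
+    "auto" : false,  /* index generated by an automated program */
+    "base" : true,  /* base index */
+    "draft_version" : null, /* deprecated property */
+    "index_range" : null    /* streaming related, can be ignored for now */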
+  } ],
+  "next_layout_offset" : 2
+}]
+```
+
+We close this part with the other important properties of IndexPlan.
+
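+- `next_aggregation_index_id`: the next ID that can be assigned to a new aggregate index;
+- `next_table_index_id`: the next ID that can be assigned to a new table index;
+- `approved_additional_recs`, `approved_removal_recs`: not part of the open-source features; can be ignored;
+- `retention_range`, `engine_type`: deprecated properties.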
+
+## **Segment**
+
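+This part describes the Segment data storage and consists of dataflow and dataflow_details. dataflow stores the descriptive information of the segments themselves, while dataflow_details stores descriptive information about the indexes already built in each segment.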
+
+### **Dataflow**
+
+This metadata describes Segments as a whole. The important pieces are explained by category below.
+
+The status is one of ONLINE, OFFLINE, and WARNING. ONLINE means the model is online and can serve queries; OFFLINE means the opposite. A model is OFFLINE in the following cases:
+
+```
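+A newly created model without a partition column (it goes ONLINE automatically once built)
+A model without any segment goes OFFLINE automatically
+A cloned model is OFFLINE by default
+The user takes the model offline manually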
+```
+
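+The WARNING status deserves attention. It indicates that the model is heterogeneous, which can happen because Kylin5 allows concurrent builds on a model as well as online schema changes. Unlike the significant changes described in the model section, these changes have little impact on existing indexes and Segments: for example, deleting some indexes, adding dimensions, measures, or indexes, or adding computed columns. Index heterogeneity across Segments arises in several ways: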
+
+- Indexes were deleted from only some segments;
+- New indexes were added;
+- Indexes built in different segments depend on different flat tables.
+
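+The segments property describes how many segments the model has and, for each segment, whether it was built incrementally or fully, the time range it covers, and its min/max statistics. The dataflow also records some query statistics. This is a design flaw: query statistics should not be coupled with descriptive information.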
+
+### **DataflowDetails**
+
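+This part records which indexes each segment has built and some statistics about the data built for each index. When a query hits an index in a specific segment, the query result aggregates and displays a summary of these statistics. Data in the dataflow_details directory does not exist until indexes are built. The dataflow is different: even if no segment has been added, there is still a file corresponding to the model, in which the segments property is simply an empty array.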
+
+```json
+"layout_instances" : [{
+  "layout_id" : 1,
+  "build_job_id" : "xxxxx",
+  "rows" : 4942,
+  "byte_size" : 138432,
+  "file_count" : 2,
+  "source_rows" : 18416,
+  "source_byte_size" : 0,
+  "partition_num" : 1,
+  "partition_values" : [ ],
+  "is_ready" : false,
+  "create_time" : 1671335217776,
+  "multi_partition" : [ ]
+}]
+```
+
+## **Task & Job**
+
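+The Task part relates to build jobs and consists of execute and job_stats. Compared with the parts above, it is used mostly for locating problems after a build job fails. It is not discussed in detail here; interested readers can consult the code directly. One point worth highlighting: the uuid of a job under execute can be very long because it is suffixed with the model's uuid. This avoids contention for the project lock during builds, which improves overall system stability and concurrent build capacity.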
+
+## **Brief Summary**
+
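+Finally, a word on the descriptive information of a single project. Its properties relate to project-level settings; a few important ones are introduced here. default_database is the project's default database; once it is set, tables in that database can be queried without the database prefix. override_kylin_properties records the configuration overridden by the project; currently, almost all query-related configuration in Kylin5 can be set at the project level. The configuration in segment_config is used when Segments are merged.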
+
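+In summary, the new-generation metadata design of Kylin5 aims to improve the usability and extensibility of the system, for example through the IndexPlan abstraction, the Index & Layout design, and the manual & auto index properties. All of these leave developers ample room for further exploration. Beyond that, Kylin5 also offers many novel features, such as: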
+
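+- Computed columns, which enrich the precomputation capability
+- AuditLog & Epoch, which provide a brand-new metadata synchronization mechanism
+- Flexible semantic changes, which in most scenarios handle business changes without rebuilding the model
+- Table indexes, which make it possible to drill down and roll up from valuable data
+- More flexible Runtime Join, which better combines the strengths of precomputation and real-time computation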
+
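+This is mainly an introductory article, and we hope it helps you get started with Kylin5. More articles on the design principles of Kylin5's new features will follow, and we look forward to everyone joining in the development and trial of Kylin5.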
diff --git a/website/blog/authors.yml b/website/blog/authors.yml
index 26b4075478..02552d7a82 100644
--- a/website/blog/authors.yml
+++ b/website/blog/authors.yml
@@ -3,3 +3,8 @@ xxyu:
   title: PMC member and Release Manager of Apache Kylin
   url: https://github.com/hit-lacus
   image_url: https://github.com/hit-lacus.png
+
+pfzhan:
+  name: pfzhan
+  url: https://github.com/dethrive
+  image_url: https://github.com/dethrive.png


[kylin] 01/03: add mac troubleshooting content of how_to_write_doc

Posted by xx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

xxyu pushed a commit to branch doc5.0
in repository https://gitbox.apache.org/repos/asf/kylin.git

commit 5c25707fe9d2f846726ebf63821e0a6b594696f6
Author: pengfei.zhan <de...@gmail.com>
AuthorDate: Sun Dec 18 19:39:35 2022 +0800

    add mac troubleshooting content of how_to_write_doc
---
 website/docs/development/how_to_write_doc.md | 88 +++++++++++++++-------------
 1 file changed, 47 insertions(+), 41 deletions(-)

diff --git a/website/docs/development/how_to_write_doc.md b/website/docs/development/how_to_write_doc.md
index 3d12743dc0..87cfbf7a8c 100644
--- a/website/docs/development/how_to_write_doc.md
+++ b/website/docs/development/how_to_write_doc.md
@@ -1,8 +1,8 @@
 ---
-title: How to write document
+title: How to write a document
 language: en
-sidebar_label: How to write document
-pagination_label: How to write document
+sidebar_label: How to write a document
+pagination_label: How to write a document
 toc_min_heading_level: 2
 toc_max_heading_level: 6
 pagination_prev: development/how_to_contribute
@@ -16,19 +16,19 @@ last_update:
     author: Tengting Xu, Xiaoxiang Yu
 ---
 
-From Kylin 5.0, Kylin documents are written using [Docusaurus](https://docusaurus.io/). Please note multi-version and i18n (multi-language) is not supported right now, but is in the plan. Contributions are very much appreciated.
+From Kylin 5.0, Kylin documents are written using [Docusaurus](https://docusaurus.io/). Please note that multi-version and i18n (multi-language) are not supported right now but are in the plan. Contributions are very much appreciated.
 
 ### Shortcut: Edit a single existing page
 
 :::info Shortcut editing a single page
-1. This shortcut is extreme useful if you found some minor typos or mistakes on a single page, you can edit the document in browser right away in a few minutes without preparation.
-2. But if the change is more complex, like add/edit several pages, upload images, or change global config files, please jump to next paragraph: [**Before your work**](#Before_your_work).
+1. This shortcut is extremely useful if you find minor typos or mistakes on a single page: you can edit the document in the browser right away, in a few minutes and without any preparation.
+2. But if the change is more complex, like adding/editing several pages, uploading images, or changing global config files, please jump to the next paragraph: [**Before your work**](#Before_your_work).
 :::
 
 1. Just scroll down the page to the bottom and click the `Edit this page`.
 ![](images/how-to-write-doc-01.png)
 
-2. Edit this file in browser.
+2. Edit this file in the browser.
 ![](images/how-to-write-doc-03.png)
 
 3. Propose your changes by raising a pull request.
@@ -38,21 +38,21 @@ From Kylin 5.0, Kylin documents are written using [Docusaurus](https://docusauru
 
 ### <span id="Before_your_work">Before your work</span>
 
-Before adding new documentation, it is best to setup the preview environment first.
+Before adding new documentation, it is best to set up the preview environment first.
 
 1. Install Node.js
 
-   Make sure [Node.js](https://nodejs.org/en/download/) version is 16.14 or above by checking `node -v`. You can use [nvm](https://github.com/nvm-sh/nvm) for managing multiple Node versions on a single machine installed.
+   Make sure the [Node.js](https://nodejs.org/en/download/) version is 16.14 or above by checking `node -v`. You can use [nvm](https://github.com/nvm-sh/nvm) to manage multiple Node versions installed on a single machine.
 
    ```shell
    node -v
    ```
 
    :::note Tips
-   When installing Node.js via *Windows/macOS Installer*, recommend to check all checkboxes related to dependencies.
+   When installing Node.js via the *Windows/macOS Installer*, we recommend checking all checkboxes related to dependencies.
    :::
 
-2. Clone the kylin doc branch
+2. Clone the Kylin doc branch
 
    ```shell
    cd /path/you/prefer/
@@ -68,7 +68,8 @@ Before adding new documentation, it is best to setup the preview environment fir
    ```
 
    :::note Slow NPM in China?
-   Add following lines to `~/.npmrc` and npm shall become much faster in China.
+   Add the following lines to `~/.npmrc` and npm shall become much faster in China.
+
    ```
    sass_binary_site=https://npm.taobao.org/mirrors/node-sass/
    phantomjs_cdnurl=https://npm.taobao.org/mirrors/phantomjs/
@@ -78,13 +79,18 @@ Before adding new documentation, it is best to setup the preview environment fir
    :::
 
    :::note Troubleshooting
-   Depending on your OS environment, `npm install` can hit various issues at this stage and most of them are due to missing a certain library. Below are a few examples from a Ubuntu user.
-   - If hit error `../src/common.cc:24:10: fatal error: vips/vips8: No such file or directory`
-     - Try install glib2.0-dev, like `sudo apt-get install glib2.0-dev`
-   - If hit error `Error: Command failed: /bin/sh -c autoreconf -ivf`
-     - Try install autoconf, like `sudo apt-get install autoconf`
+   Depending on your OS environment, `npm install` may hit various issues at this stage, most of which are due to missing a certain library. Below are a few examples.
+
+   If you hit an error like the one below on Ubuntu, install the `glib2.0-dev` library with **sudo apt-get install glib2.0-dev**.
+
+   > ../src/common.cc:24:10: fatal error: vips/vips8: No such file or directory
+
+   If you hit an error like the one below, on Ubuntu install `autoconf` with **sudo apt-get install autoconf**; on macOS, try **brew install autoconf automake libtool**.
+
+   > Error: Command failed: /bin/sh -c autoreconf -ivf
+
    :::
-   
+
    For more information about [Docusaurus](https://docusaurus.io/), please refer to [Docusaurus Installation](https://docusaurus.io/docs/installation).
 
 4. Launch the doc website and preview it locally
@@ -93,13 +99,13 @@ Before adding new documentation, it is best to setup the preview environment fir
    npm run start
    ```
 
-   The homepage of this doc site `http://localhost:3000` shall automatically open in your default browser if no error occurs. Modify any MD or resource file in your local repository, the changes shall reflect immediately in the browser. Very convenient for doc development.
+   The homepage of this doc site `http://localhost:3000` shall automatically open in your default browser if no error occurs. Modify any MD or resource file in your local repository, and the changes shall reflect immediately in the browser. Very convenient for doc development.
 
-### How to create new document
+### How to create a new document
 
 #### Step 1: Create a new markdown file with metadata
 
-Create a new markdown file with any text editor, copy and paste following **Head Metadata Template** to the top your file. After that, replace the variables like `${TITLE OF NEW DOC}` with actual values.
+Create a new markdown file with any text editor, and copy and paste the following **Head Metadata Template** to the top of your file. After that, replace the variables like `${TITLE OF NEW DOC}` with actual values.
 
 ```
 ---
@@ -130,9 +136,9 @@ Add text in the [markdown format](https://docusaurus.io/docs/markdown-features).
 
 Pictures usually go into a subfolder called `images`.
 
-#### Step 3: Add new page to the sidebar
+#### Step 3: Add a new page to the sidebar
 
-Sidebar contains the menu and the navigation tree of the doc site structure. It is maintained in a JS file located at `website/sidebars.js`.
+The sidebar contains the menu and the navigation tree of the doc site structure. It is maintained in a JS file located at `website/sidebars.js`.
 
 For example, if you want to add a new doc `how_to_write_doc.md` to be the child of the `development` menu. Open the `sidebars.js` and modify the `DevelopmentSideBar` block. Add a new block at the tail of `items` of `DevelopmentSideBar`.
 
@@ -163,19 +169,19 @@ npm run start
 ```
 
 :::info Check your doc
-- [ ] Whether the **look and feel** meet expectation?
-- [ ] Whether the **link/pictures** work fine?
+- [ ] Whether the **look and feel** meet expectations?
+- [ ] Whether the **links/pictures** work fine?
 - [ ] Whether the important info is properly **highlighted**? [How to highlight?](#highlight_paragraph)
 - [ ] Whether the **title levels** follow the [heading guide](#heading_level)?
 :::
 
 #### Step 5: Create a pull request
 
-When everything looks fine, create a pull request to the [kylin doc5.0 branch](https://github.com/apache/kylin/tree/doc5.0).
+When everything looks fine, create a pull request to the [Kylin doc5.0 branch](https://github.com/apache/kylin/tree/doc5.0).
 
 :::note What, Pull Request?
 For those who are new to pull requests, here it is explained.
-- How to geek -- [What are git pull requests](https://www.howtogeek.com/devops/what-are-git-pull-requests-and-how-do-you-use-them/)
+- How to geek -- [What is git pull requests](https://www.howtogeek.com/devops/what-are-git-pull-requests-and-how-do-you-use-them/)
 - Github -- [About pull requests](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests)
 :::
 
@@ -187,17 +193,17 @@ For those who are new to pull requests, here it is explained.
 
 [Docusaurus](https://docusaurus.io/) is a static-site generator. It builds a single-page application with fast client-side navigation, leveraging the full power of React to make your site interactive. It provides out-of-the-box documentation features but can be used to create any kind of site (personal website, product, blog, marketing landing pages, etc).
 
-Apache Kylin's website and documentation is using [Docusaurus](https://docusaurus.io/) to manage and generate final content which avaliable at [http://kylin.apache.org](http://kylin.apache.org).
+Apache Kylin's website and documentation use [Docusaurus](https://docusaurus.io/) to manage and generate the final content, which is available at [http://kylin.apache.org](http://kylin.apache.org).
 
-#### Kylin document structure and navigation menu
+#### Kylin's document structure and the navigation menu
 
 The Kylin [website material](https://github.com/apache/kylin/tree/doc5.0) is maintained under the `doc5.0` branch.
 
 1. __Home Page__: Home page of Docs
 2. __Document__: General docs about Apache Kylin, including _Installation_, _Tutorial_, etc.
-3. __Development__: _"development"_ For developer to contribute, to develop, integration with other application and extend Apache Kylin
+3. __Development__: _"development"_ For the developer to contribute, develop, integrate with other applications and extend Apache Kylin
 4. __Download__: _"Download"_ Apache Kylin packages
-5. __Community__: Apache kylin Community information
+5. __Community__: Apache Kylin Community information
 6. __Blog__: Engineering blogs about Apache Kylin
 
 #### Full doc structure
@@ -257,16 +263,16 @@ doc5.0
 │     ├── ...
 ```
 
-More details about structure which managed by Docusaurus, please refer to [Project structure rundown](https://docusaurus.io/docs/installation#project-structure-rundown).
+For more details about the structure which is managed by Docusaurus, please refer to the [Project structure rundown](https://docusaurus.io/docs/installation#project-structure-rundown).
 
 #### <span id="heading_level">Title/Heading Level</span>
 
-Here is [official guide about heading](https://docusaurus.io/docs/markdown-features/toc#markdown-headings).  Please use level 3 title("###") and level 4 title("####") in most of the article.
+Here is the [official guide about heading](https://docusaurus.io/docs/markdown-features/toc#markdown-headings).  Please use the level 3 title("###") and level 4 title("####") in most of the article.
 
 Following is a general guide:
-- Use level 2 heading(aka "##") as **top level** title. The number of top level title should not more than two. 
-- Use level 3 heading(aka "###") as **middle level** title. 
-- Use level 4 heading(aka "####") as **the lowest level** title.
+- Use the level 2 heading(aka "##") as the **top-level** title. The number of top-level titles should not be more than two. 
+- Use the level 3 heading(aka "###") as a **middle-level** title. 
+- Use the level 4 heading(aka "####") as **the lowest level** title.
 
 
 We recommend you to check for [this article](how_to_contribute) for example. Following is toc of it.
@@ -288,22 +294,22 @@ We recommend you to check for [this article](how_to_contribute) for example. Fol
 #### Sidebar
 The Sidebar is managed by __sidebars.js__ , please refer to [Sidebar](https://docusaurus.io/docs/sidebar).
 
-#### How to add image in doc
-All image should be put under _images_ folder, in your document, please using below sample to include image:
+#### How to add an image in a doc
+All images should be put under the _images_ folder. In your document, use the sample below to include an image:
 
 ```
 ![](images/how-to-write-doc-01.png)
 ```
 
 #### How to link to another page
-Using relative path for site links, check this [Markdown links](https://docusaurus.io/docs/markdown-features/links)
+Use relative paths for site links; see [Markdown links](https://docusaurus.io/docs/markdown-features/links).
 
 
 #### How to add source code in doc
-We are using [Code Block](https://docusaurus.io/docs/markdown-features/code-blocks) to highlight code syntax, check this doc for more detail sample.
+We are using [Code Block](https://docusaurus.io/docs/markdown-features/code-blocks) to highlight code syntax, check this doc for a more detailed sample.
 
 #### <span id="highlight_paragraph">How to highlight a sentence/paragraph</span>
-We recommend you to use [admonitions feature](https://docusaurus.io/docs/markdown-features/admonitions) to highlight a sentence/paragraph, following is a example:
+We recommend you use the [admonitions feature](https://docusaurus.io/docs/markdown-features/admonitions) to highlight a sentence/paragraph. The following is an example:
 
 ```
 :::caution