Posted to commits@inlong.apache.org by he...@apache.org on 2022/01/25 11:18:57 UTC

[incubator-inlong-website] branch master updated: [INLONG-2241][Audit] Add Audit Introduction (#261)

This is an automated email from the ASF dual-hosted git repository.

healchow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-inlong-website.git


The following commit(s) were added to refs/heads/master by this push:
     new f5f538e  [INLONG-2241][Audit] Add Audit Introduction (#261)
f5f538e is described below

commit f5f538e3ee2006a8c56b9b08c427d6632a345f87
Author: doleyzi <43...@users.noreply.github.com>
AuthorDate: Tue Jan 25 19:18:26 2022 +0800

    [INLONG-2241][Audit] Add Audit Introduction (#261)
---
 docs/deployment/bare_metal.md                      |   2 +-
 docs/modules/audit/img/audit_api.png               | Bin 0 -> 31395 bytes
 docs/modules/audit/img/audit_architecture.png      | Bin 0 -> 47278 bytes
 docs/modules/audit/img/audit_mysql.png             | Bin 0 -> 19356 bytes
 docs/modules/audit/img/audit_proxy.png             | Bin 0 -> 29358 bytes
 docs/modules/audit/img/audit_sdk.png               | Bin 0 -> 81979 bytes
 .../audit/img/audit_sdk_disaster_recovery.png      | Bin 0 -> 35275 bytes
 docs/modules/audit/img/audit_ui.png                | Bin 0 -> 76860 bytes
 docs/modules/audit/img/elasticsearch_index.png     | Bin 0 -> 22117 bytes
 docs/modules/audit/img/elasticsearch_overview.png  | Bin 0 -> 47297 bytes
 docs/modules/audit/img/elasticsearch_write.png     | Bin 0 -> 26657 bytes
 docs/modules/audit/overview.md                     | 208 ++++++++++++++++++++
 docs/modules/audit/quick_start.md                  |  51 +++++
 .../current/deployment/bare_metal.md               |   2 +-
 .../current/modules/audit/img/audit_api.png        | Bin 0 -> 33174 bytes
 .../modules/audit/img/audit_architecture.png       | Bin 0 -> 33154 bytes
 .../current/modules/audit/img/audit_mysql.png      | Bin 0 -> 19165 bytes
 .../current/modules/audit/img/audit_proxy.png      | Bin 0 -> 31329 bytes
 .../current/modules/audit/img/audit_sdk.png        | Bin 0 -> 77771 bytes
 .../audit/img/audit_sdk_disaster_recovery.png      | Bin 0 -> 29207 bytes
 .../current/modules/audit/img/audit_ui.png         | Bin 0 -> 79285 bytes
 .../modules/audit/img/elasticsearch_index.png      | Bin 0 -> 21801 bytes
 .../modules/audit/img/elasticsearch_overview.png   | Bin 0 -> 46579 bytes
 .../modules/audit/img/elasticsearch_write.png      | Bin 0 -> 27577 bytes
 .../current/modules/audit/overview.md              | 211 +++++++++++++++++++++
 .../current/modules/audit/quick_start.md           |  50 +++++
 26 files changed, 522 insertions(+), 2 deletions(-)

diff --git a/docs/deployment/bare_metal.md b/docs/deployment/bare_metal.md
index 2d81001..f6106a9 100644
--- a/docs/deployment/bare_metal.md
+++ b/docs/deployment/bare_metal.md
@@ -19,6 +19,6 @@ sidebar_position: 4
 | 5 | inlong-dataproxy |  | [InLong DataProxy](modules/dataproxy/quick_start.md)                  |  |
 | 6 | inlong-sort | ZooKeeper, Flink | [InLong Sort](modules/sort/quick_start.md)                            |  |
 | 7 | inlong-agent |  | [InLong Agent](modules/agent/quick_start.md)                          |  |
-
+| 8 | inlong-audit | MySQL or Elasticsearch | [InLong Audit](modules/audit/quick_start.md)                          |  |
 ## Create Data Stream
 After the InLong cluster deployed successfully, you can create a data stream refer to the [user manual](user_guide/user_manual.md) to start using.
\ No newline at end of file
diff --git a/docs/modules/audit/img/audit_api.png b/docs/modules/audit/img/audit_api.png
new file mode 100644
index 0000000..4936797
Binary files /dev/null and b/docs/modules/audit/img/audit_api.png differ
diff --git a/docs/modules/audit/img/audit_architecture.png b/docs/modules/audit/img/audit_architecture.png
new file mode 100644
index 0000000..cdaf196
Binary files /dev/null and b/docs/modules/audit/img/audit_architecture.png differ
diff --git a/docs/modules/audit/img/audit_mysql.png b/docs/modules/audit/img/audit_mysql.png
new file mode 100644
index 0000000..cb21c52
Binary files /dev/null and b/docs/modules/audit/img/audit_mysql.png differ
diff --git a/docs/modules/audit/img/audit_proxy.png b/docs/modules/audit/img/audit_proxy.png
new file mode 100644
index 0000000..25342da
Binary files /dev/null and b/docs/modules/audit/img/audit_proxy.png differ
diff --git a/docs/modules/audit/img/audit_sdk.png b/docs/modules/audit/img/audit_sdk.png
new file mode 100644
index 0000000..a63e89c
Binary files /dev/null and b/docs/modules/audit/img/audit_sdk.png differ
diff --git a/docs/modules/audit/img/audit_sdk_disaster_recovery.png b/docs/modules/audit/img/audit_sdk_disaster_recovery.png
new file mode 100644
index 0000000..0f43f2c
Binary files /dev/null and b/docs/modules/audit/img/audit_sdk_disaster_recovery.png differ
diff --git a/docs/modules/audit/img/audit_ui.png b/docs/modules/audit/img/audit_ui.png
new file mode 100644
index 0000000..9b63bcf
Binary files /dev/null and b/docs/modules/audit/img/audit_ui.png differ
diff --git a/docs/modules/audit/img/elasticsearch_index.png b/docs/modules/audit/img/elasticsearch_index.png
new file mode 100644
index 0000000..41b654a
Binary files /dev/null and b/docs/modules/audit/img/elasticsearch_index.png differ
diff --git a/docs/modules/audit/img/elasticsearch_overview.png b/docs/modules/audit/img/elasticsearch_overview.png
new file mode 100644
index 0000000..82f1f6e
Binary files /dev/null and b/docs/modules/audit/img/elasticsearch_overview.png differ
diff --git a/docs/modules/audit/img/elasticsearch_write.png b/docs/modules/audit/img/elasticsearch_write.png
new file mode 100644
index 0000000..2f5a6ad
Binary files /dev/null and b/docs/modules/audit/img/elasticsearch_write.png differ
diff --git a/docs/modules/audit/overview.md b/docs/modules/audit/overview.md
new file mode 100644
index 0000000..d779577
--- /dev/null
+++ b/docs/modules/audit/overview.md
@@ -0,0 +1,208 @@
+---
+title: Audit Design
+sidebar_position: 1
+---
+
+## Overview
+
+InLong Audit is a subsystem that runs independently of InLong. It audits and reconciles, in real time, the incoming and outgoing traffic of the InLong Agent, DataProxy, and Sort modules.
+Reconciliation is available at three granularities: minute, hour, and day.
+
+Audit reconciliation is based on the log reporting time: every service that participates in the audit reconciles in real time against the same log time. Through audit reconciliation, we can clearly see
+the transmission status of each InLong module and whether any data in a stream was lost or duplicated.
+
+![](img/audit_architecture.png)
+1. The audit SDK is embedded in each service that needs to be audited. It audits the service and sends the audit results to the audit access layer.
+2. The audit access layer writes the audit data to MQ (Kafka or Pulsar).
+3. The distribution service consumes the audit data from MQ and writes it to MySQL and Elasticsearch.
+4. The interface layer encapsulates the data stored in MySQL and Elasticsearch.
+5. Application scenarios mainly include report display and audit reconciliation.
+
+## Audit Dimension
+| | | | | | | | | | |
+| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Machine IP | Container ID | Thread ID | Log time (minutes) | Audit ID | inlong_group_id | inlong_stream_id | Number of records | Size | Transmission delay (ms) |
+## Audit ID
+Each module's receive and send operations have separate audit item IDs.
+
+| InLong Service Module | Audit ID |
+|----|----|
+| InLong API Received Successfully | 1 |
+| InLong API Sent Successfully | 2 |
+| InLong Agent Received Successfully | 3 |
+| InLong Agent Sent Successfully | 4 |
+| InLong DataProxy Received Successfully | 5 |
+| InLong DataProxy Sent Successfully | 6 |
+| InLong Sort Received Successfully | 7 |
+| InLong Sort Sent Successfully | 8 |
+
+## Data Transfer Protocol
+The transmission protocol between the SDK, the access layer, and the distribution layer is Protocol Buffers.
+```protobuf
+syntax = "proto3";
+
+package org.apache.inlong.audit.protocol;
+
+message BaseCommand {
+    enum Type {
+        PING          = 0;
+        PONG          = 1;
+        AUDITREQUEST  = 2;
+        AUDITREPLY    = 3;
+    }
+    Type type                            = 1;
+    optional AuditRequest audit_request  = 2;
+    optional AuditReply audit_reply      = 3;
+    optional Ping ping                   = 4;
+    optional Pong pong                   = 5;
+}
+
+message Ping {
+}
+
+message Pong {
+}
+
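+message AuditRequest {
+  AuditMessageHeader msg_header = 1;        // packet header
+  repeated AuditMessageBody msg_body = 2;   // packet body
+}
+
+message AuditMessageHeader {
+  string ip = 1;            // IP of the SDK client
+  string docker_id = 2;     // ID of the container where the SDK runs
+  string thread_id = 3;     // ID of the thread where the SDK runs
+  uint64 sdk_ts = 4;        // SDK reporting time
+  uint64 packet_id = 5;     // ID of the packet reported by the SDK
+}
+
+message AuditMessageBody {
+  uint64 log_ts = 1;            // log time
+  string inlong_group_id = 2;   // inlong_group_id
+  string inlong_stream_id = 3;  // inlong_stream_id
+  string audit_id = 4;          // audit ID
+  uint64 count = 5;             // number of records
+  uint64 size = 6;              // size
+  int64  delay = 7;             // total transmission delay
+}
+
+message AuditReply {
+  enum RSP_CODE {
+    SUCCESS  = 0;  // success
+    FAILED   = 1;  // failure
+    DISASTER = 2;  // disaster recovery
+  }
+  RSP_CODE rsp_code = 1;   // response code returned by the server
+  optional string message = 2;
+}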
+```
+## Audit SDK Implementation Details
+### Target
+***1. Support local disaster recovery***  
+***2. Data uniqueness***  
+***3. Reduce data loss caused by abnormal restarts***  
+
+### Main Logic Diagram
+![](img/audit_sdk.png)
+
+1. The SDK exposes an `add` interface. Its parameters are: audit_id, inlong_group_id, inlong_stream_id, number of records, and size.
+2. The SDK aggregates statistics in real time, using log time + audit_id + inlong_group_id + inlong_stream_id as the key.
+3. When the sending cycle elapses, or when the business program actively triggers it, the SDK packages the statistics with the PB protocol and sends them to the audit access layer.
+4. If the send in step (3) fails, the packet is put into the failure queue and resent in the next cycle.
+5. When the failure queue grows beyond the threshold, the SDK falls back to local files for disaster recovery.
+
+### Service Discovery
+Name discovery between the audit SDK and the access layer is pluggable; supported mechanisms include domain names, VIPs, etc.
+
+### Disaster Recovery
+![](img/audit_sdk_disaster_recovery.png)
+1. When the SDK fails to send a packet to the access layer, the packet is placed in the failure queue.
+2. When the failure queue reaches its threshold, its contents are written to the local disaster recovery file.
+3. When the local disaster recovery file reaches its threshold, old data is evicted (oldest first, by time).
+
+## Access Layer Implementation
+### Target
+***1. High reliability***  
+***2. At-least-once delivery***  
+
+### Main Logic Diagram
+![](img/audit_proxy.png)
+1. After the access layer receives a packet from the SDK, it writes the packet to the message queue.
+2. Only after the write to the message queue succeeds does it return success to the SDK.
+3. The data protocol of the message queue is the PB protocol.
+4. The ack level for message queue writes is set to `-1` (equivalently, `all`).
+
+## Elasticsearch Distribution Implementation
+### Target
+***1. High real-time performance (minute level)***  
+***2. Can handle tens of billions of audit records per day***  
+***3. Supports deduplication***  
+
+### Main Logic Diagram
+![](img/elasticsearch_overview.png)
+1. The distribution service, AuditDds, consumes messages in real time.
+2. Based on the audit ID in the audit data, it routes the data to the corresponding Elasticsearch cluster.
+3. Each audit ID corresponds to one Elasticsearch index.
+
+### Elasticsearch Index Design
+#### Index Name
+The index name consists of date + audit item ID, such as 20211019_1, 20211019_2.
+#### Index Field Schema
+
+| Field | Type | Description |
+|----|----|----|
+| audit_id | keyword | Audit ID |
+| inlong_group_id | keyword | inlong_group_id |
+| inlong_stream_id | keyword | inlong_stream_id |
+| docker_id | keyword | ID of the container where the SDK runs |
+| thread_id | keyword | Thread ID |
+| packet_id | keyword | ID of the packet reported by the SDK |
+| ip | keyword | Machine IP |
+| log_ts | keyword | Log time |
+| sdk_ts | long | Audit SDK reporting time |
+| count | long | Number of log records |
+| size | long | Size of the logs |
+| delay | long | Log transfer time, equal to the current machine time minus the log time |
+
+#### Elasticsearch Index Storage Period
+Indexes are stored by day, and the storage period is dynamically configurable.
+
+## Elasticsearch Write Design
+### The relationship between inlong_group_id, inlong_stream_id, audit ID and Elasticsearch index
+![](img/elasticsearch_index.png)
+The relationship between inlong_group_id, inlong_stream_id, audit ID and Elasticsearch index is 1:N in system design and service implementation
+
+### Write Routing Policy
+![](img/elasticsearch_write.png)
+Writes are routed to Elasticsearch shards by inlong_group_id and inlong_stream_id, so that records with the same inlong_group_id and inlong_stream_id are stored in the same shard.
+Because of this, a query or aggregation for a given inlong_group_id and inlong_stream_id only needs to process one shard, which can greatly improve performance.
+
+### Optional Deduplication by doc_id
+Real-time deduplication in Elasticsearch is resource-intensive, so this feature can be switched on or off through configuration.
+
+### Use Bulk Writes
+Write with bulk requests in batches of 5000 records to improve the write performance of the Elasticsearch cluster.
+
+## MySQL Distribution Implementation
+### Target
+***1. High real-time performance (minute level)***  
+***2. Simple to deploy***  
+***3. Can be deduplicated***  
+
+### Main Logic Diagram
+![](img/audit_mysql.png)
+MySQL distribution supports distribution to different MySQL instances according to the audit ID, and supports horizontal expansion.
+
+### Usage Introduction
+  1. When the audit volume of the business is relatively small (fewer than ten million records per day), consider using MySQL as the audit storage. MySQL is much simpler to deploy than Elasticsearch, and its resource cost is much lower.
+  2. If the audit volume is too large for MySQL to handle, consider using Elasticsearch as the storage. A single Elasticsearch cluster can support tens of billions of audit records and can be scaled horizontally.
+  
+## Audit Usage Interface Design
+### Main Logic Diagram
+![](img/audit_api.png)
+The audit interface layer queries MySQL through SQL or Elasticsearch through its RESTful API. Which of the two it queries depends on which storage is deployed.
+
+## UI Interface Display
+### Main Logic Diagram
+![](img/audit_ui.png)
+The front-end page pulls the audit data of each module through the interface layer and displays it.
\ No newline at end of file
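The SDK statistics and disaster recovery steps described in `overview.md` above can be sketched roughly as follows. This is a minimal in-memory model, not the InLong SDK: the class `AuditAggregator`, the `send` callback, the `max_failed` threshold, and the `spilled` list (standing in for the local disaster recovery file) are all hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Stat:
    count: int = 0
    size: int = 0


class AuditAggregator:
    """Hypothetical model of the SDK statistics step (not the real InLong API)."""

    def __init__(self, send, max_failed=2):
        self._send = send              # callable(packet) -> bool; stands in for the access layer
        self._stats = defaultdict(Stat)
        self.failed = deque()          # packets waiting to be resent in the next cycle
        self.spilled = []              # stands in for the local disaster recovery file
        self._max_failed = max_failed  # assumed failure-queue threshold

    def add(self, audit_id, group_id, stream_id, log_ts_ms, count, size):
        # Key: log time (truncated to the minute) + audit_id + group + stream.
        minute = log_ts_ms // 60_000 * 60_000
        stat = self._stats[(minute, audit_id, group_id, stream_id)]
        stat.count += count
        stat.size += size

    def flush(self):
        # Retry earlier failures first, then send the current statistics.
        pending = list(self.failed)
        self.failed.clear()
        if self._stats:
            pending.append(dict(self._stats))
            self._stats = defaultdict(Stat)
        for packet in pending:
            if not self._send(packet):
                self.failed.append(packet)
        # Once the failure queue exceeds the threshold, spill the oldest packets.
        while len(self.failed) > self._max_failed:
            self.spilled.append(self.failed.popleft())
```

Calling `flush()` once per sending cycle mirrors steps 3 to 5 of the SDK logic; a real implementation would serialize each packet as an `AuditRequest` and write spilled packets to disk.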
diff --git a/docs/modules/audit/quick_start.md b/docs/modules/audit/quick_start.md
new file mode 100644
index 0000000..450bf44
--- /dev/null
+++ b/docs/modules/audit/quick_start.md
@@ -0,0 +1,51 @@
+---
+title: Deployment
+---
+
+## audit-source Deployment
+
+### Configure Message Queue
+The configuration file is `inlong-audit/audit-source/conf/audit.conf`.
+```properties
+agent1.sinks.pulsar-sink-msg1.pulsar_server_url = pulsar://PULSAR_BROKER_LIST
+agent1.sinks.pulsar-sink-msg1.topic = persistent://PULSAR_TOPIC
+```
+
+### Run
+The startup script is `inlong-audit/audit-source/bin/start.sh`.
+```shell
+sh bin/start.sh
+```
+
+### Stop
+The stop script is `inlong-audit/audit-source/bin/stop.sh`.
+```shell
+sh bin/stop.sh
+```
+
+## audit-store Deployment
+### Configure
+The configuration file is `inlong-audit/audit-store/conf/application.properties`.
+#### Configure Message Queue
+```properties
+audit.pulsar.server.url=pulsar://127.0.0.1:6650
+audit.pulsar.topic=persistent://public/default/audit
+```
+#### Configure MySQL
+```properties
+spring.datasource.druid.url=jdbc:mysql://127.0.0.1:3306/apache_inlong_audit?characterEncoding=utf8&useSSL=false&serverTimezone=GMT%2b8&rewriteBatchedStatements=true&allowMultiQueries=true&zeroDateTimeBehavior=CONVERT_TO_NULL
+spring.datasource.druid.username=root
+spring.datasource.druid.password=inlong
+```
+
+### Run
+The startup script is `inlong-audit/audit-store/bin/start.sh`.
+```shell
+sh bin/start.sh
+```
+
+### Stop
+The stop script is `inlong-audit/audit-store/bin/stop.sh`.
+```shell
+sh bin/stop.sh
+```
\ No newline at end of file
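Once audit-source and audit-store are running, the stored audit records can be reconciled by comparing the counts of two audit IDs over the same log time, inlong_group_id, and inlong_stream_id. The sketch below only illustrates that idea: the function `reconcile` and the plain-dict record layout are assumptions made for this example, with field names borrowed from the `AuditMessageBody` message.

```python
from collections import defaultdict


def reconcile(records, upstream_id, downstream_id):
    """Return {(log_ts, group, stream): downstream_count - upstream_count} for mismatches.

    A negative value suggests data loss between the two audit points,
    a positive value suggests duplication.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        key = (rec["log_ts"], rec["inlong_group_id"], rec["inlong_stream_id"])
        totals[key][rec["audit_id"]] += rec["count"]

    diffs = {}
    for key, by_id in totals.items():
        delta = by_id.get(downstream_id, 0) - by_id.get(upstream_id, 0)
        if delta != 0:
            diffs[key] = delta
    return diffs
```

For example, passing `upstream_id="4"` (Agent sent) and `downstream_id="5"` (DataProxy received) compares what the Agent sent with what DataProxy received for each minute.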
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/deployment/bare_metal.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/deployment/bare_metal.md
index b12b035..013a5a9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/deployment/bare_metal.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/deployment/bare_metal.md
@@ -19,7 +19,7 @@ sidebar_position: 4
 | 5 | inlong-dataproxy | 无 | [InLong DataProxy](modules/dataproxy/quick_start.md)                  |  |
 | 6 | inlong-sort | ZooKeeper, Flink | [InLong Sort](modules/sort/quick_start.md)                            |  |
 | 7 | inlong-agent | 无 | [InLong Agent](modules/agent/quick_start.md)                          |  |
-
+| 8 | inlong-audit | MySQL 或者 Elasticsearch | [InLong Audit](modules/audit/quick_start.md)                          |  |
 ## 创建数据流
 InLong 集群部署成功后,你可以参考[用户手册](user_guide/user_manual.md)创建一个数据流开始使用。
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_api.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_api.png
new file mode 100644
index 0000000..240f1d3
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_api.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png
new file mode 100644
index 0000000..3aa6743
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_architecture.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_mysql.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_mysql.png
new file mode 100644
index 0000000..5ccd355
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_mysql.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_proxy.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_proxy.png
new file mode 100644
index 0000000..faf160c
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_proxy.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk.png
new file mode 100644
index 0000000..243403d
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk_disaster_recovery.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk_disaster_recovery.png
new file mode 100644
index 0000000..e07f7ca
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_sdk_disaster_recovery.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_ui.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_ui.png
new file mode 100644
index 0000000..4de5f2e
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/audit_ui.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_index.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_index.png
new file mode 100644
index 0000000..a93f2a1
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_index.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_overview.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_overview.png
new file mode 100644
index 0000000..e472566
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_overview.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_write.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_write.png
new file mode 100644
index 0000000..0fefe47
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/img/elasticsearch_write.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
new file mode 100644
index 0000000..581a2f2
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/overview.md
@@ -0,0 +1,211 @@
+---
+title: 审计设计
+sidebar_position: 1
+---
+
+## 总览
+
+InLong审计是独立于InLong的一个子系统,对InLong系统的Agent、DataProxy、Sort模块的入流量、出流量进行实时审计对账。
+对账的粒度有分钟、小时、天三种粒度。
+
+审计对账以日志上报时间为统一的口径,参与审计的各个服务将按照相同的日志时间进行实时对账。通过审计对账,我们可以清晰的了解InLong
+各个模块的传输情况,以及数据流是否有丢失或者重复
+
+![](img/audit_architecture.png)
+1. 审计SDK嵌套在需要审计的服务,对服务进行审计,将审计结果发送到审计接入层。
+2. 审计接入层将审计数据写到MQ(kafka或者pulsar)。
+3. 分发服务消费MQ的审计数据,将审计数据写到MySQL、Elasticsearch。
+4. 接口层将MySQL、Elasticsearch的数据进行封装。
+5. 应用场景主要包括报表展示、审计对账等等。
+
+## 审计维度
+| | | || | | | | | |
+| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| 机器ip |  容器ID | 线程ID | 日志时间(分钟) | 审计ID | inlong_group_id | inlong_stream_id | 条数 | 大小 | 传输时延(ms) |
+
+## 审计项ID
+每个模块的接收与发送分别为一个独立的审计项ID
+
+|Inlong服务模块 |审计ID |
+|----|----|
+|Inlong api接收成功	|1 |
+|Inlong api发送成功	|2|
+|Inlong agent接收成功	|3|
+|Inlong agent发送成功	|4|
+|Inlong DataProxy接收成功	|5|
+|Inlong DataProxy发送成功	|6|
+|Inlong分发服务1接收成功	|7|
+|Inlong分发服务1发送成功	|8|
+|Inlong分发服务2接收成功	|9|
+|Inlong分发服务2发送成功	|10|
+
+## 数据传输协议
+sdk、接入层、分发层之间的传输协议为Protocol Buffers
+```markdown
+syntax = "proto3";
+
+package org.apache.inlong.audit.protocol;
+
+message BaseCommand {
+    enum Type {
+        PING          = 0;
+        PONG          = 1;
+        AUDITREQUEST  = 2;
+        AUDITREPLY    = 3;
+    }
+    Type type                            = 1;
+    optional AuditRequest audit_request  = 2;
+    optional AuditReply audit_reply      = 3;
+    optional Ping ping                   = 4;
+    optional Pong pong                   = 5;
+}
+
+message Ping {
+}
+
+message Pong {
+}
+
+message AuditRequest {
+  AuditMessageHeader msg_header = 1;   //包头
+  repeated AuditMessageBody msg_body = 2;   //包体
+}
+
+message AuditMessageHeader {
+  string ip = 1;            //sdk客户端ip
+  string docker_id = 2;     //sdk所在容器ID
+  string thread_id = 3;     //sdk所在的线程ID
+  uint64 sdk_ts = 4;        //sdk上报时间
+  uint64 packet_id = 5;     //sdk上报的包ID
+}
+
+message AuditMessageBody {
+  uint64 log_ts = 1;    //日志时间
+  string inlong_group_id= 2;   //inlong_group_id
+  string inlong_stream_id= 3; //inlong_stream_id
+  string audit_id = 4;   //审计ID
+  uint64 count = 5;     //条数
+  uint64 size = 6;      //大小
+  int64  delay = 7;      //总传输延时
+}
+
+message AuditReply {
+  enum RSP_CODE {
+    SUCCESS  = 0;  //成功
+    FAILED   = 1;   //失败
+    DISASTER = 2; //容灾
+  }
+  RSP_CODE rsp_code = 1;   //服务端返回码
+  optional string message = 2;
+}
+```
+## 审计SDK实现细节
+### 目标
+***1.支持本地容灾***  
+***2.数据唯一性***  
+***3.减少异常重启导致的数据丢失***  
+
+### 主要逻辑图
+![](img/audit_sdk.png)  
+1.sdk对外提供add接口,参数为:audit_id, inlong_group_id,inlong_stream_id,条数,大小 
+2.sdk以日志时间+audit_id+inlong_group_id+inlong_stream_id为key,进行实时统计  
+3.满足发送周期或者业务程序主动触发,SDK将统计结果进行PB协议组包,发送审计接入层  
+4.如果(3)发送失败,则放入失败队列,下个周期继续发送  
+5.当失败队列大于阈值时,通过本地文件进行容灾  
+
+### 服务发现
+审计sdk与接入层之间的名字发现,支持插件化,包括域名、vip等
+
+### 容灾逻辑
+![](img/audit_sdk_disaster_recovery.png)   
+1.sdk发送接入层失败时,会放入失败队列  
+2.失败队列达到阈值时,将写到本地容灾文件  
+3.本地容灾文件达到阈值时,将淘汰旧数据(按时间淘汰)  
+
+## 接入层实现细节
+### 目标
+***1.高可靠***
+
+***2.at least once***
+
+### 主要逻辑
+![](img/audit_proxy.png)
+1.接入层收到sdk发送的包之后,写消息队列  
+2.写消息队列成功之后,则对sdk返回成功  
+3.消息队列的数据协议为PB协议  
+4.写消息队列的ack设置成-1或者all  
+
+## Elasticsearch分发实现
+### 目标
+***1.高实时性(分钟级)***  
+***2.可运营每天百亿级别的审计数据***  
+***3.可去重***  
+
+### 主要逻辑图
+![](img/elasticsearch_overview.png)
+1.分发服务AuditDds实时消费消息  
+2.根据审计数据中的审计ID,将数据路由到对应的Elasticsearch集群  
+3.每个审计ID对应一个Elasticsearch索引  
+
+### 索引设计
+#### 索引名  
+索引名由日期+审计项ID组成,如20211019_1,20211019_2  
+#### 索引字段格式
+
+|字段               |类型        |说明 |
+|----               |----       |----|
+|audit_id	        |keyword    |审计ID |
+|inlong_group_id	|keyword    |inlong_group_id |
+|inlong_stream_id	|keyword    |inlong_stream_id |
+|docker_id	        |keyword    |sdk所在容器ID |
+|thread_id	        |keyword    |线程ID |
+|packet_id	        |keyword    |sdk上报的包ID |
+|ip	                |keyword    |机器IP |
+|log_ts	            |keyword    |日志时间 |
+|sdk_ts	            |long       |审计SDK上报时间 |
+|count	            |long       |日志条数 |
+|size	            |long       |日志大小 |
+|delay	            |long       |日志传输时间,等于当前机器时间减去日志时间 |
+
+#### 索引的存储周期
+按天存储,存储周期动态可配置
+
+## Elasticsearch写入设计
+### inlong_group_id、inlong_stream_id、审计ID与Elasticsearch索引的关系
+![](img/elasticsearch_index.png)
+系统设计与服务实现上inlong_group_id、inlong_stream_id、审计ID与Elasticsearch索引为1:N的关系  
+
+### 写入路由策略
+![](img/elasticsearch_write.png)
+使用inlong_group_id、inlong_stream_id路由到Elasticsearch分片,保证相同的inlong_group_id、inlong_stream_id存储在相同的分片
+将相同的inlong_group_id、inlong_stream_id写到同一个分片,查询以及聚合的时候,只需要处理一个分片,能够大大提高性能  
+
+### 可选按doc_id去重
+Elasticsearch实时去重比较耗资源,此功能通过配置可选。
+
+### 使用bulk批量方式
+使用bulk写入,每批5000条,提高Elasticsearch集群的写入性能
+
+## MySQL分发实现
+### 目标
+***1.高实时性(分钟级)***   
+***2.部署简单***  
+***3.可去重***
+
+### 主要逻辑图
+![](img/audit_mysql.png)
+MySQL分发支持根据审计ID分发到不同的MySQL实例,支持水平扩展。
+
+### 使用介绍
+  1.当业务的审计规模比较小,小于千万级/天时,就可以考虑采用MySQL作为审计的存储。因为MySQL的部署相对Elasticsearch要简单的多, 资源成本也会少很多。   
+  2.如果审计数据规模很大,MySQL支撑不了时,就可以考虑采用Elasticsearch作为存储,毕竟单个Elasticsearch集群能够支持百亿级别的审计数据,也支持水平扩容。
+  
+## 审计使用接口设计
+### 主要逻辑图
+![](img/audit_api.png)
+审计接口层通过SQL查MySQL或者restful查Elasticsearch。接口具体怎么查哪一种存储,取决使用了哪一种存储
+
+### UI 界面展示
+### 主要逻辑图
+![](img/audit_ui.png)
+前端页面通过接口层,拉取各个模块的审计数据,进行展示
\ No newline at end of file
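The Elasticsearch index naming, write routing, and batching described in the overview can be sketched as below. This is an illustration under stated assumptions, not the AuditDds implementation: the function names, the `|` separator in the routing key, the use of UTC for the index date, and the plain-dict action layout are all hypothetical. No Elasticsearch client is called here.

```python
from datetime import datetime, timezone


def index_name(log_ts_ms, audit_id):
    """Build '<date>_<audit item ID>', such as 20211019_1 (UTC is an assumption)."""
    day = datetime.fromtimestamp(log_ts_ms / 1000, tz=timezone.utc).strftime("%Y%m%d")
    return f"{day}_{audit_id}"


def routing_key(group_id, stream_id):
    # Same group and stream -> same routing key -> same shard.
    return f"{group_id}|{stream_id}"  # hypothetical separator


def to_batches(records, batch_size=5000):
    """Group records into write batches of at most batch_size actions."""
    batch = []
    for rec in records:
        batch.append({
            "index": index_name(rec["log_ts"], rec["audit_id"]),
            "routing": routing_key(rec["inlong_group_id"], rec["inlong_stream_id"]),
            "doc": rec,
        })
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch
```

Each yielded batch would then be handed to whatever bulk write call the chosen Elasticsearch client provides, with the routing value passed along so that queries for one inlong_group_id and inlong_stream_id touch a single shard.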
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md
new file mode 100644
index 0000000..3ead9be
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/audit/quick_start.md
@@ -0,0 +1,50 @@
+---
+title: 安装部署
+---
+
+## audit-source安装与部署
+### 配置消息队列
+配置文件`inlong-audit/audit-source/conf/audit.conf`. 
+```html
+agent1.sinks.pulsar-sink-msg1.pulsar_server_url= pulsar://PULSAR_BROKER_LIST
+agent1.sinks.pulsar-sink-msg1.topic = persistent://PULSAR_TOPIC
+```
+
+## 启动
+启动脚本 `inlong-audit/audit-source/bin/start.sh`
+```shell script
+sh bin/start.sh
+```
+
+## 停止
+停止脚本 `inlong-audit/audit-source/bin/stop.sh`
+```shell script
+sh bin/stop.sh
+```
+
+## audit-store安装与部署
+### 配置
+配置文件 `inlong-audit/audit-store/conf/application.properties`. 
+#### 配置消息队列
+```html
+audit.pulsar.server.url=pulsar://127.0.0.1:6650
+audit.pulsar.topic=persistent://public/default/audit
+```
+#### 配置MySQL
+```html
+spring.datasource.druid.url= jdbc:mysql://127.0.0.1:3306/apache_inlong_audit?characterEncoding=utf8&useSSL=false&serverTimezone=GMT%2b8&rewriteBatchedStatements=true&allowMultiQueries=true&zeroDateTimeBehavior=CONVERT_TO_NULL
+spring.datasource.druid.username=root
+spring.datasource.druid.password=inlong
+```
+
+## 启动
+启动脚本 `inlong-audit/audit-store/bin/start.sh`
+```shell script
+sh bin/start.sh
+```
+
+## 停止
+停止脚本 `inlong-audit/audit-store/bin/stop.sh`
+```shell script
+sh bin/stop.sh
+```
\ No newline at end of file