Posted to commits@dolphinscheduler.apache.org by zh...@apache.org on 2022/03/10 06:04:14 UTC

[dolphinscheduler-website] branch master updated: Add news (#728)

This is an automated email from the ASF dual-hosted git repository.

zhongjiajie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler-website.git


The following commit(s) were added to refs/heads/master by this push:
     new a86994d  Add news  (#728)
a86994d is described below

commit a86994d926eb3e49a073b8e5669f2d6d1d7bc9be
Author: lifeng <53...@users.noreply.github.com>
AuthorDate: Thu Mar 10 14:04:07 2022 +0800

    Add news  (#728)
---
 ...nd_practice_of_Tujia_Big_Data_Platform_Based.md | 170 +++++++++++++++++++++
 ...nd_practice_of_Tujia_Big_Data_Platform_Based.md | 145 ++++++++++++++++++
 img/2022-3-9/1.jpeg                                | Bin 0 -> 46154 bytes
 img/2022-3-9/10.png                                | Bin 0 -> 68953 bytes
 img/2022-3-9/2.png                                 | Bin 0 -> 1552323 bytes
 img/2022-3-9/3.png                                 | Bin 0 -> 154400 bytes
 img/2022-3-9/4.png                                 | Bin 0 -> 97894 bytes
 img/2022-3-9/5.png                                 | Bin 0 -> 69376 bytes
 img/2022-3-9/6.png                                 | Bin 0 -> 58068 bytes
 img/2022-3-9/7.png                                 | Bin 0 -> 116677 bytes
 img/2022-3-9/8.png                                 | Bin 0 -> 116933 bytes
 img/2022-3-9/9.png                                 | Bin 0 -> 72502 bytes
 img/2022-3-9/Eng/1.jpeg                            | Bin 0 -> 46154 bytes
 img/2022-3-9/Eng/2.png                             | Bin 0 -> 1552323 bytes
 img/2022-3-9/Eng/3.png                             | Bin 0 -> 51355 bytes
 img/2022-3-9/Eng/4.png                             | Bin 0 -> 177842 bytes
 img/2022-3-9/Eng/5.png                             | Bin 0 -> 23944 bytes
 img/2022-3-9/Eng/6.png                             | Bin 0 -> 123556 bytes
 img/2022-3-9/Eng/7.png                             | Bin 0 -> 76331 bytes
 img/2022-3-9/Eng/8.png                             | Bin 0 -> 21516 bytes
 img/2022-3-9/Eng/9.png                             | Bin 0 -> 68953 bytes
 site_config/blog.js                                |  14 ++
 site_config/home.jsx                               |  28 ++--
 23 files changed, 343 insertions(+), 14 deletions(-)

diff --git a/blog/en-us/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md b/blog/en-us/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md
new file mode 100644
index 0000000..5cf70c1
--- /dev/null
+++ b/blog/en-us/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md
@@ -0,0 +1,170 @@
+---
+title: Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler
+keywords: Apache,DolphinScheduler,scheduler,big data,ETL,airflow,hadoop,orchestration,dataops,Meetup,Tujia
+description: Tujia introduced Apache DolphinScheduler in 2019. At the recent Apache DolphinScheduler Meetup in February
+---
+# Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/1.jpeg"/>
+
+</div>
+
+Tujia introduced Apache DolphinScheduler in 2019. At the recent Apache DolphinScheduler Meetup in February, Tujia Big Data Engineer Xuchao Zan gave a detailed introduction to how Tujia adopted Apache DolphinScheduler and the functional improvements the team has made.
+
+<div align=center>
+
+<img style="width: 60%;" src="/img/2022-3-9/Eng/2.png"/>
+
+</div>
+
+Xuchao Zan, Big Data Engineer at Tujia, is mainly responsible for the development, maintenance, and tuning of the big data platform.
+
+**Watch the recording here:** [https://www.bilibili.com/video/BV1Ki4y117WV?spm_id_from=333.999.0.0](https://www.bilibili.com/video/BV1Ki4y117WV?spm_id_from=333.999.0.0)
+
+This speech consists of 4 parts. The first part covers the current status of Tujia's platform: how data flows through it, how Tujia provides data services, and the role of Apache DolphinScheduler in the platform. The second part introduces Tujia's scheduler selection process, covering the features we looked for and the adoption process. The third part is about our improvements and functional extensions to the system, including support for table dependencies, mail task extensions, and data synchronization. The fourth part covers functions newly added for business needs, such as publishing-system support for Spark jar packages, the integration of scheduling with data quality, and table lineage display.
+
+## Status of Tujia Big Data Platform
+
+### 01 Big Data Platform Architecture
+
+First, let's introduce the architecture of the Tujia Big Data Platform and the role of Apache DolphinScheduler in it.
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/3.png"/>
+
+</div>
+
+The architecture of Tujia Big Data Platform
+
+The picture above shows the architecture of our data platform, which mainly includes **data sources, data collection, data storage, data management, and finally service provision**.
+
+**The data sources** include three parts: data synchronization from the business MySQL databases, API data involving the Dubbo and HTTP interfaces, and tracking data embedded in web pages.
+
+**Data collection** uses both real-time and offline synchronization: business data is incrementally synchronized via Canal, while logs are collected by Flume, gathered into Kafka in real time, and land on HDFS (a minimal landing sketch follows this overview).
+
+**The data storage process** mainly involves data synchronization services. After the data lands on HDFS, it is cleaned, processed, and then pushed online to provide services.
+
+**At the data management level**, the data dictionary records the metadata of the business, the model definitions, and the mappings between layers, making it easy for users to find the data they care about; logs record each task's run history, and alarms are configured with fault information. The scheduling system, as the command center of big data, allocates scheduling resources rationally and serves the business better. The metrics library records dimensions, attributes, and the standardized definitions of business process metrics for better data management and use. Abtest records the impact of different metrics and strategies on product features; data quality is the basis for effective and accurate data analysis.
+
+The last part is **data services**, which mainly include ad hoc data queries, report preview, data download and upload analysis, and online business data publishing.
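+
+Tying the collection and storage steps above together, below is a minimal sketch of the Kafka-to-HDFS landing leg, written with Spark Structured Streaming. The broker address, topic, and paths are hypothetical placeholders; the post does not describe the exact landing job, so treat this as an illustration of the pattern rather than Tujia's implementation.
+
+```scala
+import org.apache.spark.sql.SparkSession
+
+object LogLandingJob {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder()
+      .appName("kafka-to-hdfs-landing")
+      .getOrCreate()
+
+    // Read the raw log stream that Flume delivers into Kafka.
+    val logs = spark.readStream
+      .format("kafka")
+      .option("kafka.bootstrap.servers", "kafka-1:9092") // placeholder brokers
+      .option("subscribe", "app_logs")                   // placeholder topic
+      .load()
+      .selectExpr("CAST(value AS STRING) AS line")
+
+    // Land the stream on HDFS; downstream cleaning jobs pick it up from there.
+    logs.writeStream
+      .format("text")
+      .option("path", "hdfs:///data/raw/app_logs")
+      .option("checkpointLocation", "hdfs:///checkpoints/app_logs")
+      .start()
+      .awaitTermination()
+  }
+}
+```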
+
+### 02 The role of Apache DolphinScheduler in the platform
+
+The following focuses on the role of the scheduler in the platform. It drives data tunnel synchronization, pulling incremental data on a schedule every morning; after cleaning and processing, the data is pushed online to provide services. It also handles data model processing, where UI-based configuration greatly improves productivity. The scheduled report service pushes emails and supports attachments, tables in the body, and line charts. With the report push function, analysts configure data dashboards after the data is processed, and DataX pushes the computed data to MySQL every day for report display.
+
+## Introducing Apache DolphinScheduler
+
+The second part is about the work we did to introduce Apache DolphinScheduler.
+
+
+Apache DolphinScheduler is advanced in many aspects. As the command center of big data, it is undoubtedly reliable: its decentralized design avoids the single-point-of-failure problem, and once a node has a problem, its tasks are automatically restarted on other nodes, which greatly improves the system's reliability.
+
+In addition, Apache DolphinScheduler is simple and practical, which reduces learning costs and improves work efficiency. Many staff in the company now use it, including analysts, product and operations staff, and developers.
+
+The scalability of scheduling is also very important, because as the task volume grows, the cluster can add resources in time to keep providing services. Its wide range of applications is another key reason we chose Apache DolphinScheduler. It supports a variety of task types: Shell, MR, Spark, SQL (MySQL, PostgreSQL, Hive, SparkSQL), Python, Sub_Process, Procedure, etc., and enables workflow timing scheduling, dependency scheduling, manual scheduling, and manual pause/stop/resume, as well as failure retry/alarm, recovering failed workflows from a specified node, killing tasks, and more.
+
+Next comes the upgrade of our timed scheduling.
+
+Before adopting Apache DolphinScheduler, our scheduling was quite chaotic: some people deployed their own local Crontab, some used Oozie, and some relied on in-system timed schedules. Management was messy, timeliness and accuracy could not be guaranteed, and without one unified scheduling platform, tasks went missing from time to time. In addition, the self-built scheduler was not stable enough, lacking dependency configuration and data output guarantees, and its features were limited, supporting only a narrow range of task types.
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/4.png"/>
+
+</div>
+
+In 2019, we introduced Apache DolphinScheduler, and it has been running stably for nearly 3 years.
+
+Below is some data from our system migration.
+
+
+We have built an Apache DolphinScheduler cluster with a total of 4 physical machines. Currently, a single machine supports 100 concurrent scheduled tasks.
+
+Algorithm tasks also run on dedicated machines with isolated resources.
+
+
+Most of the migrated tasks came from Oozie, mainly Spark and Hive tasks. There were also some scripts on Crontab, some mailing tasks, and scheduled tasks from the reporting system.
+
+## The Scheduling System Rebuilt Based on Apache DolphinScheduler
+
+Before the rebuild, we had already optimized the system, for example by supporting table-level dependencies and extending the mail function. After introducing Apache DolphinScheduler, we rebuilt the scheduling system on top of it to provide better services.
+
+**Firstly, we added support for table-dependency synchronization.** At that time, task migration could run in parallel, so tasks could not all be synchronized at once, and a marker was needed to show that a table's task had run successfully. We therefore developed a function to solve the dependency problem during task migration. However, users' different naming styles made it difficult to locate a table's task when configuring dependencies: we could not identify which tables a task contained, nor determine which task produced a given table, which caused us considerable trouble.
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/5.png"/>
+
+</div>
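+
+A minimal sketch of the success-marker idea described above: each table's producing task records a marker when it finishes, and a dependent task polls for that marker before starting. The `table_success_marker` table, its columns, and the connection details are hypothetical; the real function is built into our scheduler and is not shown in this post.
+
+```scala
+import java.sql.DriverManager
+
+object TableDependencyCheck {
+  // Returns true once the upstream table has a success marker for the given date.
+  def upstreamReady(table: String, dt: String): Boolean = {
+    val conn = DriverManager.getConnection(
+      "jdbc:mysql://mysql-host:3306/scheduler", "user", "password") // placeholders
+    try {
+      val stmt = conn.prepareStatement(
+        "SELECT COUNT(*) FROM table_success_marker WHERE table_name = ? AND dt = ?")
+      stmt.setString(1, table)
+      stmt.setString(2, dt)
+      val rs = stmt.executeQuery()
+      rs.next() && rs.getInt(1) > 0
+    } finally conn.close()
+  }
+
+  def main(args: Array[String]): Unit = {
+    // Poll until the table this task depends on has been produced, then run.
+    while (!upstreamReady("ods.orders", "2022-03-09")) Thread.sleep(60 * 1000L)
+    // ... downstream task logic goes here ...
+  }
+}
+```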
+
+**Secondly, mail tasks support multiple tables.** The scheduler has a built-in mail push function, but it only supports a single table. As business requirements grew, we needed to configure multiple tables and multiple sheets, with different row counts displayed in the body and the attachments, all of which had to be configurable. We also needed line charts to enrich the body pages. Furthermore, users wanted to add notes in the body or under each table to explain the metrics. We implemented the mail push function with a Spark jar package, supporting exception warnings, missing table dependency alerts, and more.
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/6.png"/>
+
+</div>
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/7.png"/>
+
+</div>
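+
+To illustrate the multi-table mail body, here is a condensed sketch of how a Spark jar task might render several result tables into one HTML email. The helper names, SMTP host, and addresses are hypothetical, and chart rendering and attachments are omitted for brevity.
+
+```scala
+import java.util.Properties
+import javax.mail.{Message, Session, Transport}
+import javax.mail.internet.{InternetAddress, MimeMessage}
+import org.apache.spark.sql.DataFrame
+
+object MultiTableMail {
+  // Render one DataFrame as an HTML table with a note displayed under it.
+  def toHtmlTable(df: DataFrame, note: String, maxRows: Int): String = {
+    val header = df.columns.map(c => s"<th>$c</th>").mkString
+    val rows = df.take(maxRows)
+      .map(r => "<tr>" + r.toSeq.map(v => s"<td>$v</td>").mkString + "</tr>")
+      .mkString
+    s"<table border='1'><tr>$header</tr>$rows</table><p>$note</p>"
+  }
+
+  // Concatenate the per-table sections into one HTML body and send it.
+  def send(subject: String, to: String, sections: Seq[String]): Unit = {
+    val props = new Properties()
+    props.put("mail.smtp.host", "smtp.example.com") // placeholder SMTP host
+    val msg = new MimeMessage(Session.getInstance(props))
+    msg.setFrom(new InternetAddress("report@example.com")) // placeholder sender
+    msg.setRecipients(Message.RecipientType.TO, to)
+    msg.setSubject(subject)
+    msg.setContent(sections.mkString("<br/>"), "text/html; charset=utf-8")
+    Transport.send(msg)
+  }
+}
+```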
+
+**Thirdly, it supports rich data source synchronization.** Because of problems in data transmission, the previous migration process required us to modify a lot of configuration code, then compile, package, and upload; the process was cumbersome and error-prone, and since data sources were not unified, test data and online data could not be separated. In terms of development efficiency, the code contained a lot of repetition, there was no unified configuration tool, and unreasonable parameter configuration put MySQL under heavy pressure with a risk of downtime. After transmission there was no duplicate-value check, and with large data volumes, full updates also put MySQL under heavy pressure. Finally, MySQL transmission had a single-point-of-failure problem, and task delays affected online services.
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/8.png"/>
+
+</div>
+
+<div align=center>
+
+<img src="/img/2022-3-9/Eng/9.png"/>
+
+</div>
+
+We simplified the data development process, added PXC/MHA high-availability support for MySQL, and improved the efficiency of data synchronization.
+
+
+Input data sources support relational databases and FTP synchronization; with Spark as the computing engine, output data sources support various relational databases, as well as the message middleware Kafka and MQ, and Redis.
+
+Next, let's walk through our implementation.
+
+
+We extended the data sources of Apache DolphinScheduler to support Kafka, MQ, and namespaces. Before MySQL synchronization, we first compute an increment locally and synchronize only the incremental data to MySQL. Spark also supports MySQL PXC/MHA high availability. In addition, since there are QPS limits when pushing to MQ and Redis, we control the number of Spark partitions and the concurrency according to the data volume, as sketched below.
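+
+Below is a rough sketch of that throttling idea: derive the partition count from the row count and a QPS budget, so the combined write rate across partitions stays near the limit. The constants, Redis address, and two-column (key, value) schema are illustrative assumptions, not the production values.
+
+```scala
+import org.apache.spark.sql.DataFrame
+import redis.clients.jedis.Jedis
+
+object ThrottledRedisPush {
+  // Cap the partition count so n * perPartitionQps stays within the QPS budget.
+  def partitionsFor(rowCount: Long, qpsBudget: Int, perPartitionQps: Int): Int = {
+    val sizeBased = (rowCount / 100000L).toInt + 1
+    math.max(1, math.min(qpsBudget / perPartitionQps, sizeBased))
+  }
+
+  def push(df: DataFrame, qpsBudget: Int): Unit = {
+    val n = partitionsFor(df.count(), qpsBudget, perPartitionQps = 200)
+    df.repartition(n).rdd.foreachPartition { rows =>
+      val jedis = new Jedis("redis-host", 6379) // placeholder address
+      try {
+        rows.foreach { r =>
+          jedis.set(r.getString(0), r.getString(1)) // assumes (key, value) rows
+          Thread.sleep(5L)                          // ~200 writes/sec per partition
+        }
+      } finally jedis.close()
+    }
+  }
+}
+```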
+
+## Improvements
+
+The fourth part is mainly about the functions added to the system, including:
+
+1. Spark tasks support the publishing system
+2. Data quality integration
+3. Data lineage display
+
+### 01 Spark tasks support the publishing system
+
+More than 80% of our usual scheduled tasks are Spark jar tasks, but the task release process and code modification were not standardized, and everyone maintained their own copy of the code. This led to code inconsistencies from time to time, and in severe cases even caused online problems.
+
+This required us to refine the task release process. We mainly use the publishing system with the Jenkins packaging function: compiling and packaging generates a btag; after testing is complete, publishing generates an rtag and the code is merged into master. This avoids the code inconsistency problem and removes the manual jar upload step. After the jar package is built, the system automatically pushes it to the resource center of Apache DolphinScheduler, so users only need to configure parameters and select the jar for test and release. When running a Spark task, the file no longer needs to be pulled to the local machine; the jar is read directly from HDFS, as sketched below.
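+
+As an illustration of submitting with the jar taken straight from HDFS rather than a local copy, here is a sketch using Spark's `SparkLauncher`; the resource path and main class are placeholders, since the post does not give them.
+
+```scala
+import org.apache.spark.launcher.SparkLauncher
+
+object SubmitFromHdfs {
+  def main(args: Array[String]): Unit = {
+    // The application jar is referenced directly on HDFS (e.g., the path the
+    // publishing system pushed to the resource center), so no local pull is needed.
+    val handle = new SparkLauncher()
+      .setMaster("yarn")
+      .setDeployMode("cluster")
+      .setAppResource("hdfs:///dolphinscheduler/resources/etl-jobs.jar") // placeholder
+      .setMainClass("com.example.etl.DailySync")                         // placeholder
+      .addAppArgs("2022-03-09")
+      .startApplication()
+
+    // Block until YARN reports a terminal state.
+    while (!handle.getState.isFinal) Thread.sleep(2000L)
+  }
+}
+```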
+
+### 02 Data quality integration
+
+Data quality is the basis for ensuring the validity and accuracy of analytical conclusions. We need a complete data monitoring and output process to make the data more convincing. The quality platform ensures data quality from four aspects: accuracy, completeness, consistency, and timeliness, and it supports multiple alarm methods, such as telephone, WeCom, and email, to notify users.
+
+
+Next, we introduce how the data quality platform and the scheduling system are connected. After a scheduled task completes, a message record is sent; the data quality platform consumes the message and triggers the corresponding data quality monitoring rules. Depending on the rule, the downstream operation is blocked or an alarm message is sent.
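+
+A simplified sketch of that hand-off, assuming the completion record travels over Kafka (the post does not name the channel): the quality platform consumes a task-completion message and evaluates the rules bound to that task, blocking downstream or alarming on failure. The topic, message format, and the `RuleEngine`/`Scheduler` hooks are hypothetical stubs.
+
+```scala
+import java.time.Duration
+import java.util.{Collections, Properties}
+import org.apache.kafka.clients.consumer.KafkaConsumer
+import scala.collection.JavaConverters._
+
+object RuleEngine { def evaluate(taskId: String): Boolean = true }  // stub
+object Scheduler { def blockDownstream(taskId: String): Unit = () } // stub
+
+object QualityRuleTrigger {
+  def main(args: Array[String]): Unit = {
+    val props = new Properties()
+    props.put("bootstrap.servers", "kafka-1:9092") // placeholder
+    props.put("group.id", "data-quality-platform")
+    props.put("key.deserializer",
+      "org.apache.kafka.common.serialization.StringDeserializer")
+    props.put("value.deserializer",
+      "org.apache.kafka.common.serialization.StringDeserializer")
+
+    val consumer = new KafkaConsumer[String, String](props)
+    consumer.subscribe(Collections.singletonList("task-finished")) // placeholder topic
+
+    while (true) {
+      // Each record is assumed to carry the finished task's id in its value.
+      consumer.poll(Duration.ofSeconds(1)).asScala.foreach { rec =>
+        val taskId = rec.value()
+        if (!RuleEngine.evaluate(taskId)) Scheduler.blockDownstream(taskId)
+      }
+    }
+  }
+}
+```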
+
+### 03 Data lineage display
+
+Data lineage is key to metadata management, data governance, and data quality. It tracks the source, processing, and provenance of data, provides a basis for data value assessment, and describes the flow of source data among processes, tables, reports, and ad hoc queries, as well as the dependencies between tables, between tables and offline ETL tasks, and between scheduling platforms and computing engines. Our data warehouse is built on Hive; Hive's raw data often comes from the production DB, and computed results are also exported to external storage, so tables across heterogeneous data sources have lineage relationships.
+
+
+* **Data tracing:** When data is abnormal, lineage helps trace the cause of the abnormality; it also supports impact analysis by tracing the source of the data and its processing steps.
+* **Data evaluation:** Provides a basis for evaluating data value in terms of data audience, update magnitude, and update frequency.
+* **Life cycle:** Intuitively shows the entire life cycle of data, providing a basis for data governance.
+
+The lineage collection process is mainly as follows: by hooking into the Spark API, Spark listens for the SQL and the inserted tables, obtains the Spark execution plan, and parses it.
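+
+A minimal sketch of that collection step using Spark's public `QueryExecutionListener` hook, which exposes the executed plan; extracting table names is simplified here (only Hive relations on the read side), and the real parsing of insert targets is more involved.
+
+```scala
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.HiveTableRelation
+import org.apache.spark.sql.execution.QueryExecution
+import org.apache.spark.sql.util.QueryExecutionListener
+
+object LineageListener {
+  def register(spark: SparkSession): Unit = {
+    spark.listenerManager.register(new QueryExecutionListener {
+      override def onSuccess(funcName: String, qe: QueryExecution,
+                             durationNs: Long): Unit = {
+        // Walk the analyzed plan and collect the Hive tables that were read.
+        val inputs = qe.analyzed.collect {
+          case r: HiveTableRelation => r.tableMeta.identifier.unquotedString
+        }
+        // In the real system the insert target is parsed out as well, and the
+        // (inputs -> output) edge is reported to the lineage service.
+        println(s"$funcName read: ${inputs.mkString(", ")}")
+      }
+      override def onFailure(funcName: String, qe: QueryExecution,
+                             exception: Exception): Unit = ()
+    })
+  }
+}
+```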
+
diff --git a/blog/zh-cn/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md b/blog/zh-cn/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md
new file mode 100644
index 0000000..2489ae1
--- /dev/null
+++ b/blog/zh-cn/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.md
@@ -0,0 +1,145 @@
+# Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler
+
+<div align=center>
+<img src="/img/2022-3-9/1.jpeg"/>
+</div>
+
+>Tujia introduced Apache DolphinScheduler in 2019. At the recent Apache DolphinScheduler Meetup in February, **Tujia Big Data Engineer Xuchao Zan** gave a detailed introduction to how Tujia adopted Apache DolphinScheduler and the functional improvements the team has made.
+<div align=center>
+<img style="width: 60%;" src="/img/2022-3-9/2.png"/>
+</div>
+
+Xuchao Zan is a big data engineer at Tujia, mainly responsible for the development, maintenance, and tuning of the big data platform.
+
+This speech consists of 4 parts. The first part covers the current status of Tujia's platform: how data flows through it, how Tujia provides data services, and the role Apache DolphinScheduler plays in the platform. The second part covers scheduler selection, mainly the scheduler's features and the adoption process. The third part introduces our improvements and functional extensions to the system, including support for table dependencies, mail task extensions, and data synchronization. The fourth part covers functions newly added for business needs, such as publishing-system support for Spark jar packages, the integration of scheduling with data quality, and table lineage display.
+
+## Status of the Tujia Data Platform
+### 01 Data Architecture
+
+
+First, let's introduce the architecture of the Tujia data platform and the role Apache DolphinScheduler plays in it.
+
+<div align=center>
+<img src="/img/2022-3-9/3.png"/>
+</div>
+
+The architecture of the Tujia data platform
+
+
+The picture above shows the architecture of our company's data platform, which mainly includes **data sources, data collection, data storage, data management, and finally service provision**.
+
+**Data sources** include three parts: data synchronization from the business MySQL databases, API data involving the Dubbo and HTTP interfaces, and tracking data embedded in web pages.
+
+**Data collection** uses both real-time and offline synchronization: business data is incrementally synchronized via Canal, while logs are collected by Flume, gathered into Kafka in real time, and land on HDFS.
+
+**Data storage** mainly involves data synchronization services: after the data lands on HDFS, it is cleaned and processed, then pushed online to provide services.
+
+**At the data management level**, the data dictionary records the metadata of the business, the model definitions, and the mappings between layers, making it easy for users to find the data they care about; logs record each task's run history, and alarms are configured with fault information. The scheduling system, as the command center of big data, allocates scheduling resources rationally and serves the business better. The metrics library records dimensions, attributes, and the standardized definitions of business process metrics for better data management and use. Abtest records the impact of different metrics and strategies on product features; data quality is the basis for effective and accurate data analysis.
+
+Finally, the **data services** part mainly includes ad hoc data queries, report preview, data download and upload analysis, and online business data publishing.
+
+### 02 The role of Apache DolphinScheduler in the platform
+
+
+The following focuses on the role the scheduling system plays in the platform. It drives data tunnel synchronization, pulling incremental data on a schedule every morning; after cleaning and processing, the data is pushed online to provide services. It also handles data model processing, where UI-based configuration greatly improves development efficiency. The scheduled report service pushes emails and supports attachments, tables in the body, and line charts. With the report push function, analysts configure data dashboards after the data is processed, and DataX pushes the computed data to MySQL every day for report display.
+
+## Adopting DS
+
+The second part introduces the work we did to adopt Apache DolphinScheduler.
+
+
+Apache DolphinScheduler has many advantages. As the command center of big data, its reliability is beyond doubt: the decentralized design avoids the single-point-of-failure problem, and when a node has a problem, its tasks automatically restart on other nodes, which greatly improves the system's reliability.
+
+
+In addition, the scheduling system is simple and practical, which reduces learning costs and improves work efficiency. Many people in the company now use our scheduling system, including analysts, product operations staff, and developers.
+
+The scalability of scheduling also matters: as the task volume grows, the cluster can add resources in time to keep providing services. Its wide range of applications is another important reason we chose it. It supports a rich set of task types: Shell, MR, Spark, SQL (MySQL, PostgreSQL, Hive, SparkSQL), Python, Sub_Process, Procedure, etc., and it supports workflow timing scheduling, dependency scheduling, manual scheduling, and manual pause/stop/resume, as well as failure retry/alarm, recovering failures from a specified node, killing tasks, and more. It has too many advantages to list in one go; you only appreciate them all once you start using it.
+
+
+Next came the upgrade of our timed scheduling.
+
+Before adopting Apache DolphinScheduler, our scheduling was quite chaotic: some deployed their own local Crontab, some used Oozie for scheduling, and some relied on in-system timed schedules. Management was messy; without a unified scheduling and management platform, timeliness and accuracy could not be guaranteed, managing tasks was troublesome, and tasks went missing from time to time. In addition, the self-built scheduler was not stable enough, lacked dependency configuration, offered no guarantee of data output, and its features were limited, supporting only a narrow range of task scheduling.
+
+<div align=center>
+<img src="/img/2022-3-9/4.png"/>
+</div>
+
+We introduced Apache DolphinScheduler in 2019; it has now been in use for nearly three years and is very comfortable to work with.
+
+Below is some data from our system migration.
+
+
+We built a DS cluster with a total of 4 physical machines; currently, a single machine supports 100 concurrent scheduled tasks.
+
+
+Algorithm tasks also have dedicated machines, with resource isolation in place.
+
+Oozie tasks made up the majority, mainly Spark and Hive tasks, plus some scripts on Crontab, some mailing tasks, and scheduled tasks from the reporting system.
+
+## Building the Scheduling System on DS
+
+Before this, we had also optimized the system, such as supporting table-level dependencies and extending the mail function. After adopting Apache DolphinScheduler, we built the scheduling system on top of it to provide better services.
+
+
+**First, support for table-dependency synchronization.** Considering that task migration could run in parallel, tasks could not all be synchronized at once, and a marker was needed to show that a table's task had run successfully, so we developed a version of this function and solved the dependency problem in task migration. However, everyone's naming style differs, which made it hard to locate a table's task when configuring dependencies; we also could not identify which tables a task contained or determine which task a table belonged to, which caused us considerable trouble.
+
+<div align=center>
+<img src="/img/2022-3-9/5.png"/>
+</div>
+<div align=center>
+<img src="/img/2022-3-9/6.png"/>
+</div>
+
+**Second, mail tasks support multiple tables.** The scheduler's built-in mail push function only supports a single table. As business requirements grew, we needed to configure multiple tables and multiple sheets, with different row counts displayed in the body and the attachments, all configurable; we also needed line chart support to enrich the body pages. In addition, users wanted to add notes in the body or under each table to explain the metrics. We implemented the mail push function with a Spark jar package, supporting exception warnings, missing table dependency alerts, and more.
+
+<div align=center>
<img src="/img/2022-3-9/7.png"/>
+</div>
+<div align=center>
<img src="/img/2022-3-9/8.png"/>
+</div>
+
+**Third, support for rich data source synchronization.** Because of problems in data transmission, the previous migration process required modifying a lot of configuration code, then compiling, packaging, and uploading; the process was cumbersome, and missed or wrong changes frequently caused online failures. Data sources were not unified, so test data and online data could not be separated. In terms of development efficiency, the code contained much repetition, there was no unified configuration tool, and unreasonable parameter configuration put MySQL under heavy pressure with a risk of downtime. After data transmission there was no duplicate-value check, and with large data volumes, full updates put MySQL under heavy pressure. MySQL transmission also had a single-point-of-failure problem, and task delays affected online services.
+
+
+<div align=center>
+<img src="/img/2022-3-9/9.png"/>
+</div>
+<div align=center>
+<img src="/img/2022-3-9/10.png"/>
+</div>
+
+In this process, we simplified the data development workflow, added PXC/MHA high-availability support for MySQL, and improved the efficiency of data synchronization.
+
+Input data sources support relational databases and FTP synchronization; with Spark as the computing engine, output data sources support various relational databases, as well as the message middleware Kafka and MQ, and Redis.
+
+Next, let's talk about our implementation process.
+
+We extended the data sources of Apache DolphinScheduler to support Kafka, MQ, and namespaces. Before MySQL synchronization, we first compute an increment locally and synchronize the incremental data to MySQL; Spark also supports MySQL PXC/MHA high availability. In addition, since there are QPS limits when pushing to MQ and Redis, we control the number of Spark partitions and the concurrency according to the data volume.
+
+## Improvements
+
+The fourth part mainly covers functions newly added to round out the system, including the following three points:
+
+1. Spark tasks support the publishing system
+2. Data quality integration
+3. Data lineage display
+
+### 01 Spark tasks support the publishing system
+
+More than 80% of our usual scheduled tasks are Spark jar tasks, but the release process lacked standards: code was modified at will, there was no complete process specification, and everyone maintained their own copy of the code. This led to frequent code inconsistencies and, in severe cases, online problems.
+
+This required us to improve the task release process. We mainly use the publishing system with the Jenkins packaging function: compiling and packaging generates a btag; after testing is complete, publishing generates an rtag and the code is merged into master. This avoids the code inconsistency problem and removes the jar upload step. After the jar package is built, the system automatically pushes it to the Apache DolphinScheduler resource center, so users only need to configure parameters and select the jar for test release. When running a Spark task, the file no longer needs to be pulled to the local machine; the jar on HDFS is read directly.
+
+### 02 Data quality integration
+
+Data quality is the basis for ensuring the validity and accuracy of analytical conclusions. We need a complete data monitoring and output process to make the data more convincing. The quality platform ensures data quality from four aspects: accuracy, completeness, consistency, and timeliness, and it supports multiple alarm methods, such as telephone, WeCom, and email, to notify users.
+
+Next, we introduce how the data quality platform and the scheduling system are connected. After a scheduled task finishes running, a message record is sent; the data quality platform consumes the message and triggers data quality rule monitoring. Based on the monitoring rules, downstream runs are blocked or alarm messages are sent.
+
+### 03 Data lineage display
+
+Data lineage is an important part of metadata management, data governance, and data quality. It can track the source, processing, and provenance of data, provide a basis for data value assessment, and describe the flow among source data processes, tables, reports, and ad hoc queries, as well as the dependencies between tables, between tables and offline ETL tasks, and among scheduling platforms and computing engines. The data warehouse is built on Hive, while Hive's raw data often comes from the production DB and computed results are also exported to external storage, so tables across heterogeneous data sources have lineage relationships.
+
+
+* **Data tracing:** When data is abnormal, lineage helps trace the cause of the abnormality; it also supports impact analysis by tracing the source of the data and its processing steps.
+* **Data value evaluation:** Provides a basis for evaluating data value in terms of data audience, update magnitude, and update frequency.
+* **Life cycle:** Intuitively shows the entire life cycle of data, providing a basis for data governance.
+
+The lineage collection process is mainly as follows: by hooking into the Spark API, Spark listens for the SQL and the inserted tables, obtains the Spark execution plan, and parses it.
diff --git a/img/2022-3-9/1.jpeg b/img/2022-3-9/1.jpeg
new file mode 100644
index 0000000..b5cdfa8
Binary files /dev/null and b/img/2022-3-9/1.jpeg differ
diff --git a/img/2022-3-9/10.png b/img/2022-3-9/10.png
new file mode 100644
index 0000000..d3ff46a
Binary files /dev/null and b/img/2022-3-9/10.png differ
diff --git a/img/2022-3-9/2.png b/img/2022-3-9/2.png
new file mode 100644
index 0000000..8c97829
Binary files /dev/null and b/img/2022-3-9/2.png differ
diff --git a/img/2022-3-9/3.png b/img/2022-3-9/3.png
new file mode 100644
index 0000000..86f35a9
Binary files /dev/null and b/img/2022-3-9/3.png differ
diff --git a/img/2022-3-9/4.png b/img/2022-3-9/4.png
new file mode 100644
index 0000000..fbf4919
Binary files /dev/null and b/img/2022-3-9/4.png differ
diff --git a/img/2022-3-9/5.png b/img/2022-3-9/5.png
new file mode 100644
index 0000000..5dce9dc
Binary files /dev/null and b/img/2022-3-9/5.png differ
diff --git a/img/2022-3-9/6.png b/img/2022-3-9/6.png
new file mode 100644
index 0000000..1187949
Binary files /dev/null and b/img/2022-3-9/6.png differ
diff --git a/img/2022-3-9/7.png b/img/2022-3-9/7.png
new file mode 100644
index 0000000..a768fab
Binary files /dev/null and b/img/2022-3-9/7.png differ
diff --git a/img/2022-3-9/8.png b/img/2022-3-9/8.png
new file mode 100644
index 0000000..60b04db
Binary files /dev/null and b/img/2022-3-9/8.png differ
diff --git a/img/2022-3-9/9.png b/img/2022-3-9/9.png
new file mode 100644
index 0000000..09e3933
Binary files /dev/null and b/img/2022-3-9/9.png differ
diff --git a/img/2022-3-9/Eng/1.jpeg b/img/2022-3-9/Eng/1.jpeg
new file mode 100644
index 0000000..b5cdfa8
Binary files /dev/null and b/img/2022-3-9/Eng/1.jpeg differ
diff --git a/img/2022-3-9/Eng/2.png b/img/2022-3-9/Eng/2.png
new file mode 100644
index 0000000..8c97829
Binary files /dev/null and b/img/2022-3-9/Eng/2.png differ
diff --git a/img/2022-3-9/Eng/3.png b/img/2022-3-9/Eng/3.png
new file mode 100644
index 0000000..07e0d09
Binary files /dev/null and b/img/2022-3-9/Eng/3.png differ
diff --git a/img/2022-3-9/Eng/4.png b/img/2022-3-9/Eng/4.png
new file mode 100644
index 0000000..1c8c465
Binary files /dev/null and b/img/2022-3-9/Eng/4.png differ
diff --git a/img/2022-3-9/Eng/5.png b/img/2022-3-9/Eng/5.png
new file mode 100644
index 0000000..923c99a
Binary files /dev/null and b/img/2022-3-9/Eng/5.png differ
diff --git a/img/2022-3-9/Eng/6.png b/img/2022-3-9/Eng/6.png
new file mode 100644
index 0000000..740de4c
Binary files /dev/null and b/img/2022-3-9/Eng/6.png differ
diff --git a/img/2022-3-9/Eng/7.png b/img/2022-3-9/Eng/7.png
new file mode 100644
index 0000000..f229a5a
Binary files /dev/null and b/img/2022-3-9/Eng/7.png differ
diff --git a/img/2022-3-9/Eng/8.png b/img/2022-3-9/Eng/8.png
new file mode 100644
index 0000000..52468ff
Binary files /dev/null and b/img/2022-3-9/Eng/8.png differ
diff --git a/img/2022-3-9/Eng/9.png b/img/2022-3-9/Eng/9.png
new file mode 100644
index 0000000..d3ff46a
Binary files /dev/null and b/img/2022-3-9/Eng/9.png differ
diff --git a/site_config/blog.js b/site_config/blog.js
index 7bbba0a..2ff9464 100644
--- a/site_config/blog.js
+++ b/site_config/blog.js
@@ -4,6 +4,13 @@ export default {
         postsTitle: 'All posts',
         list: [
             {
                title: 'Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler',
+                author: 'Debra Chen',
+                dateStr: '2022-3-10',
                desc: 'Tujia introduced Apache DolphinScheduler in 2019.',
+                link: '/en-us/blog/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.html',
+            },
+            {
                 title: 'Release News! Apache DolphinScheduler 2_0_5 optimizes The Fault Tolerance Process of Worker',
                 author: 'Debra Chen',
                 dateStr: '2022-3-7',
@@ -146,6 +153,13 @@ export default {
        postsTitle: 'All posts',
         list: [
             {
                title: 'Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler',
+                author: 'Debra Chen',
+                dateStr: '2022-3-10',
                desc: 'Tujia introduced Apache DolphinScheduler in 2019.',
+                link: '/zh-cn/blog/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.html',
+            },
+            {
                 title: 'Apache DolphinScheduler 2_0_5 Released, Worker Fault Tolerance Process Optimized',
                 author: 'Debra Chen',
                 dateStr: '2022-3-7',
diff --git a/site_config/home.jsx b/site_config/home.jsx
index ed66ac2..21fd639 100644
--- a/site_config/home.jsx
+++ b/site_config/home.jsx
@@ -55,6 +55,13 @@ export default {
      title: 'Events & News',
       list: [
         {
+          img: '/img/2022-3-9/1.jpeg',
+          title: 'Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler',
+          content: 'Tujia introduced Apache DolphinScheduler in 2019. At the recent Apache DolphinScheduler...',
+          dateStr: '2022-3-10',
+          link: '/zh-cn/blog/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.html',
+        },
+        {
           img: '/img/2022-3-7/1.png',
           title: 'Apache DolphinScheduler 2.0.5 Released, Worker Fault Tolerance Process Optimized',
           content: 'Today, Apache DolphinScheduler announced the official release of version 2.0.5...',
@@ -68,13 +75,6 @@ export default {
           dateStr: '2022-2-22',
           link: '/zh-cn/blog/DolphinScheduler_Kubernetes_Technology_in_action.html',
         },
-        {
-          img: '/img/2022-02-26/07.png',
-          title: '# Live Registration Now Open | The First 2022 Apache DolphinScheduler Meetup!',
-          content: 'Hello, everyone following Apache DolphinScheduler! We believe you have all returned from the lively Spring Festival...',
-          dateStr: '2022-2-18',
-          link: '/zh-cn/blog/Meetup_2022_02_26.html',
-        },
       ],
     },
     ourusers: {
@@ -547,6 +547,13 @@ export default {
       title: 'Events & News',
       list: [
         {
+          img: '/img/2022-3-9/Eng/1.jpeg',
          title: 'Exploration and Practice of Tujia Big Data Platform Based on Apache DolphinScheduler',
+          content: 'Tujia introduced Apache DolphinScheduler in 2019...',
+          dateStr: '2022-3-10',
+          link: '/en-us/blog/Exploration_and_practice_of_Tujia_Big_Data_Platform_Based.html',
+        },
+        {
           img: '/img/2022-3-7/1.png',
           title: 'Release News! Apache DolphinScheduler 2_0_5 optimizes The Fault Tolerance Process of Worker',
           content: 'Today, Apache DolphinScheduler announced the official release of version 2.0.5....',
@@ -560,13 +567,6 @@ export default {
           dateStr: '2022-2-24',
           link: '/en-us/blog/DolphinScheduler_Kubernetes_Technology_in_action.html',
         },
-        {
-          img: '/img/2022-02-26/08.png',
-          title: ' Sign Up to Apache DolphinScheduler Meetup Online | We Are Waiting For You to Join the Grand Gathering on 2.26 2022!\n',
-          content: 'Hello the community! After having adjusted ourselves from the pleasure of...',
-          dateStr: '2022-2-18',
-          link: '/en-us/blog/Meetup_2022_02_26.html',
-        },
       ],
     },
     userreview: {