Posted to commits@dolphinscheduler.apache.org by gi...@apache.org on 2021/12/01 11:19:17 UTC
[dolphinscheduler-website] branch asf-site updated: Automated deployment: d0d5befab4a6485e9e62794ebd29915f0e7796aa

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 2d0f226  Automated deployment: d0d5befab4a6485e9e62794ebd29915f0e7796aa
2d0f226 is described below

commit 2d0f2269fa0b9e1c22b00ab7d73b5f91915aa810
Author: github-actions[bot] <gi...@users.noreply.github.com>
AuthorDate: Wed Dec 1 11:19:10 2021 +0000

    Automated deployment: d0d5befab4a6485e9e62794ebd29915f0e7796aa
---
 .../blog/Lizhi case study(en) blog correction.html |  36 ++++++++++++++-------
 .../blog/Lizhi case study(en) blog correction.json |   2 +-
 img/present1.jpg                                   | Bin 0 -> 24709 bytes
 img/present2.jpg                                   | Bin 0 -> 24026 bytes
 img/present3.jpg                                   | Bin 0 -> 24242 bytes
 img/streamline.png                                 | Bin 0 -> 24101 bytes
 6 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/en-us/blog/Lizhi case study(en) blog correction.html b/en-us/blog/Lizhi case study(en) blog correction.html
index f495200..6cfbeef 100644
--- a/en-us/blog/Lizhi case study(en) blog correction.html	
+++ b/en-us/blog/Lizhi case study(en) blog correction.html	
@@ -63,27 +63,39 @@
 <p>After selecting the DolphinScheduler, the Lizhi machine learning platform carries out re-development based on it and applies the achievements to actual business scenarios, which are mainly about recommendation and risk control. Recommendation scenarios cover recommendation of voice, anchor, live broadcast, podcast, friend, etc., and risk control scenarios cover risk control in payment, advertising, and comment, etc.
 At the technical level of the platform, Lizhi optimizes the extended modules for the five paradigms of machine learning, i.e. obtaining training samples, data preprocessing, model training, model evaluation, and model release.</p>
 <p>A simple xgboost case:</p>
-<div align=center>
-<img src="https://imgpp.com/images/2021/11/30/32db43420c7c44e28ff2fb7be27ec79c.md.png"/>
-</div>
+<p align="center">
+  <img src="/img/streamline.png" alt="streamline"  width="60%" />
+  <p align="center">
+        <em>streamline</em>
+  </p>
+</p>
 <h3>1. Obtaining training samples</h3>
 <p>At present, Lizhi does not directly select data from Hive, and joins the union, splitting the sample afterward, but directly processes the sample by shell nodes.</p>
 <h3>2. Data preprocessing</h3>
 <p>Transformer&amp; custom preprocessing configuration file, use the same configuration for online training, and feature preprocessing is performed after the feature is obtained. It contains the itemType and its feature set to be predicted, the user’s userType and its feature set, as well as the associated and crossed itemType and its feature set. Define the transformer function for each feature preprocessing, supports custom transformer and hot update, xgboost, and tf model feature prep [...]
-<div align=center>
-<img src="https://imgpp.com/images/2021/11/30/1afaee9a4142648f0.md.jpg"/>
-</div>
+<p align="center">
+  <img src="/img/present1.jpg" alt="training data preprocess"  width="60%" />
+  <p align="center">
+        <em>Training data preprocess</em>
+  </p>
+</p>
 <h3>3. Xgboost training</h3>
 <p>It supports w2v, xgboost, tf model training modules. The training modules are first packaged with TensorFlow or PyTorch and then packaged into DolphinScheduler modules.
 For example, in the xgboost training process, use Python to package the xgboost training script into the xgboost training node of DolphinScheduler, and show the parameters required for training on the interface. The file exported by “training set data preprocessing” is input to the training node through HDFS.</p>
-<div align=center>
-<img src="https://imgpp.com/images/2021/11/23/3.md.png"/>
-</div>
+<p align="center">
+  <img src="/img/present3.jpg" alt="Xgboost training"  width="60%" />
+  <p align="center">
+        <em>Xgboost training</em>
+  </p>
+</p>
 <h3>4. Model release</h3>
 <p>The release model will send the model and preprocessing configuration files to HDFS and insert records into the model release table. The model service will automatically identify the new model, update the model, and provide online prediction services to the external.</p>
-<div align=center>
-<img src="https://imgpp.com/images/2021/11/30/2c4b9ff8072e348ee.md.jpg"/>
-</div>
+<p align="center">
+  <img src="/img/present2.jpg" alt="Model release"  width="60%" />
+  <p align="center">
+        <em>model release</em>
+  </p>
+</p>
 <p>Haibin Yu said that due to historical and technical limitations, Lizhi has not yet built a machine learning platform like Ali PAI, but the practice has proved that similar platform functions can be achieved based on DolphinScheduler.</p>
 <p>In addition, Lizhi has also carried out many re-developments based on DolphinScheduler to make the scheduling system more in line with actual business needs, such as:</p>
 <ol>
diff --git a/en-us/blog/Lizhi case study(en) blog correction.json b/en-us/blog/Lizhi case study(en) blog correction.json
index f37b05e..e785ee0 100644
--- a/en-us/blog/Lizhi case study(en) blog correction.json	
+++ b/en-us/blog/Lizhi case study(en) blog correction.json	
@@ -1,6 +1,6 @@
 {
   "filename": "Lizhi case study(en) blog correction.md",
-  "__html": "<h1>A Formidable Combination of Lizhi Machine Learning Platform&amp; DolphinScheduler Creates New Paradigm for Data Process in the Future</h1>\n<blockquote>\n<p>Editor's word: The online audio industry is a blue ocean market in China nowadays. According to CIC data, the market size of China’s online audio industry has grown from 1.6 billion yuan in 2016 to 13.1 billion yuan in 2020, with a compound annual growth rate of 69.4%. With the popularity of the Internet of Things, a [...]
+  "__html": "<h1>A Formidable Combination of Lizhi Machine Learning Platform&amp; DolphinScheduler Creates New Paradigm for Data Process in the Future</h1>\n<blockquote>\n<p>Editor's word: The online audio industry is a blue ocean market in China nowadays. According to CIC data, the market size of China’s online audio industry has grown from 1.6 billion yuan in 2016 to 13.1 billion yuan in 2020, with a compound annual growth rate of 69.4%. With the popularity of the Internet of Things, a [...]
   "link": "/dist/en-us/blog/Lizhi case study(en) blog correction.html",
   "meta": {}
 }
\ No newline at end of file
diff --git a/img/present1.jpg b/img/present1.jpg
new file mode 100644
index 0000000..5503fc0
Binary files /dev/null and b/img/present1.jpg differ
diff --git a/img/present2.jpg b/img/present2.jpg
new file mode 100644
index 0000000..c17e645
Binary files /dev/null and b/img/present2.jpg differ
diff --git a/img/present3.jpg b/img/present3.jpg
new file mode 100644
index 0000000..eda8184
Binary files /dev/null and b/img/present3.jpg differ
diff --git a/img/streamline.png b/img/streamline.png
new file mode 100644
index 0000000..3ed685d
Binary files /dev/null and b/img/streamline.png differ