Posted to dev@linkis.apache.org by rita <ri...@163.com> on 2022/07/07 02:31:38 UTC
[DISCUSS] Data Quality Solution Design
Dear all,
The chat records of the WeChat group "Apache Linkis Community Development Group" are as follows:
————— 2022-7-5 —————
Lentils扁豆 09:31
@Andy@WDS Let me send you my data quality design ideas: https://www.yuque.com/docs/share/b8597a87-e2c7-4a40-b558-182d318f0afa?# 《Data Quality Solution Design》
Sargent Ti 09:41
Nice! It would be worth looking at how to define and manage streaming data sources based on Streamis and Linkis datasource.
Enjoyyin_尹强@WDS 10:01
This feels like a very good fit for an implementation based on Exchangis -> Streamis -> Linkis -> FlinkX @Lentils
Lentils扁豆 10:03
Yeah, a very good fit.
Jack 10:08
@Lentils How is the rule engine in the diagram implemented?
Lentils扁豆 10:09
Hi Jack, that part was developed in-house, so I can't share it.
Lentils扁豆 10:09
I can only tell you the general idea.
Jack 10:10
Roughly what methods would be used for the quality checks?
Jack 10:11
Lol, I designed one before too, but it was a bit crude.
Lentils 扁豆 10:12
1. Common logical judgments (including AND/OR/NOT, greater than/less than, etc.) 2. Common SQL-like statements 3. Common scripts
Lentils 扁豆 10:12
These are enough to cover quality checks across the various dimensions of data quality.
Lentils 扁豆 10:12
For example: fluctuation detection, edge detection, and so on.
Lentils 扁豆 10:13
Also things like maximum/minimum values, averages, and so on.
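The rule types listed in this exchange (logical judgments, fluctuation detection, extreme values and averages) could be sketched roughly as below. This is only an illustration in Python, not the in-house rule engine discussed here; the function names `in_range`, `fluctuation_ok`, and `summarize`, and all thresholds, are hypothetical.

```python
from statistics import mean

# Hypothetical rule helpers; names and thresholds are illustrative only.

def in_range(value, low, high):
    """Logical judgment: greater-than and less-than checks combined with AND."""
    return value >= low and value <= high

def fluctuation_ok(previous, current, max_ratio):
    """Fluctuation detection: relative change against the previous value."""
    if previous == 0:
        # Relative change is undefined for a zero baseline; accept only "no change".
        return current == 0
    return abs(current - previous) / abs(previous) <= max_ratio

def summarize(values):
    """Extreme values and average for a batch of numeric values."""
    return {"min": min(values), "max": max(values), "avg": mean(values)}
```

For instance, `fluctuation_ok(100, 120, 0.25)` passes because the value moved by 20%, while a jump from 100 to 150 would fail the same rule.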
Lentils 扁豆 10:17
To sum it up in one sentence: the core of data quality is the rule engine; stream processing only provides the means; the computation process stores only the computation logic, not the data.
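One way to picture "store the logic, not the data" is a checker that holds rules as plain functions and keeps only running aggregates while records flow past. The sketch below is an assumption about what that could look like in Python; `RunningStats` and `check_stream` are hypothetical names and do not come from Linkis, Streamis, or the design document.

```python
class RunningStats:
    """Keeps only aggregates (count, sum, min, max), never the records themselves."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def update(self, value):
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    @property
    def average(self):
        return self.total / self.count if self.count else None


def check_stream(records, rules):
    """Apply every named rule to every record as it arrives.

    `rules` maps a rule name to a predicate; only violation counts and
    running aggregates are retained, not the records.
    """
    violations = {name: 0 for name in rules}
    stats = RunningStats()
    for value in records:
        stats.update(value)
        for name, rule in rules.items():
            if not rule(value):
                violations[name] += 1
    return violations, stats
```

Because `records` can be any iterable, including a generator reading from a stream, memory use stays constant regardless of how many records pass through.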
Jack 10:22
Got it.
Jack 10:26
Streaming quality checks, nice. I had previously been thinking of handling it with SQL and regular expressions, and, if that really wasn't enough, dynamically loading quality-check methods from some classes.
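The regex-plus-dynamic-loading idea mentioned here might look something like the following. It is a Python analogue for illustration only (the discussion most likely has JVM classes in mind); `regex_rule` and `load_check` are hypothetical helpers, and the `module:attribute` path format is an assumption.

```python
import importlib
import re

def regex_rule(pattern):
    """Build a check that passes only when the whole string matches the pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None

def load_check(path):
    """Dynamically load a check callable from a 'module:attribute' path."""
    module_name, attribute = path.split(":")
    return getattr(importlib.import_module(module_name), attribute)

# A date-format check built from a regular expression.
is_iso_date = regex_rule(r"\d{4}-\d{2}-\d{2}")

# A check loaded by name at runtime; math.isfinite stands in for a custom check.
is_finite = load_check("math:isfinite")
```

Loading checks by name lets new quality-check methods be added through configuration rather than by changing the engine itself.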