Posted to dev@linkis.apache.org by rita <ri...@163.com> on 2022/07/07 02:31:38 UTC

[DISCUSS] Data Quality Solution Design

Dear:

The chat records of the WeChat group "Apache Linkis Community Development Group" are as follows:

—————  2022-7-5  —————

Lentils扁豆  09:31

@Andy@WDS  I'll send you my data quality design ideas: https://www.yuque.com/docs/share/b8597a87-e2c7-4a40-b558-182d318f0afa?# 《Data Quality Solution Design》

Sargent Ti  09:41

Nice! We could look at how to define and manage streaming data sources based on Streamis and Linkis datasource.

 

Enjoyyin_尹强@WDS  10:01

This feels very well suited to an implementation based on Exchangis -> Streamis -> Linkis -> FlinkX. @Lentils

 

Lentils扁豆  10:03

Yeah, very suitable.

 

Jack  10:08

@Lentils How is the rule engine in the diagram implemented?

 

Lentils扁豆  10:09

Hi Jack, that part is self-developed, so I can't provide it.

 

Lentils扁豆  10:09

I can only tell you the approach.

 

Jack  10:10

Roughly what methods will be used for the quality inspection?

 

Jack  10:11

Haha, I designed one before too, but it was fairly crude.

 

Lentils 扁豆  10:12

1. Common logical judgments (including AND/OR/NOT, greater than/less than, etc.) 2. Common SQL-like statements 3. Common scripts

 

Lentils 扁豆  10:12

These are enough to cover quality inspection across the various dimensions of data quality.

 

Lentils 扁豆  10:12

For example: fluctuation detection, edge detection, and so on.

 

Lentils 扁豆  10:13

There are also things like maximum/minimum values, averages, etc.
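
A minimal sketch of how the checks mentioned above (AND/OR/NOT logic, comparisons, fluctuation detection, min/max/average) might be expressed as composable rules. This is not the self-developed rule engine from the discussion; every name here (`all_of`, `negate`, `fluctuation_ok`, `summarize`, the `amount` field) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Predicate = Callable[[Record], bool]


# Hypothetical combinators for AND / OR / NOT rules.
def all_of(*rules: Predicate) -> Predicate:
    return lambda rec: all(rule(rec) for rule in rules)


def any_of(*rules: Predicate) -> Predicate:
    return lambda rec: any(rule(rec) for rule in rules)


def negate(rule: Predicate) -> Predicate:
    return lambda rec: not rule(rec)


# Hypothetical comparison rules on a single field.
def greater_than(field: str, limit: float) -> Predicate:
    return lambda rec: rec[field] > limit


def less_than(field: str, limit: float) -> Predicate:
    return lambda rec: rec[field] < limit


def fluctuation_ok(previous: float, current: float, max_ratio: float) -> bool:
    """True when the relative change from previous to current is within max_ratio."""
    if previous == 0:
        # Assumption: with a zero baseline, only "no change" counts as stable.
        return current == 0
    return abs(current - previous) / abs(previous) <= max_ratio


def summarize(values: List[float]) -> Dict[str, float]:
    """Minimum, maximum and average of a non-empty list of values."""
    return {
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
    }


# Example rule: amount must be positive and must not exceed 1000.
amount_rule = all_of(greater_than("amount", 0), negate(greater_than("amount", 1000)))
```

With these definitions, `amount_rule({"amount": 10})` passes and `amount_rule({"amount": 5000})` fails, while `fluctuation_ok(100, 150, 0.2)` flags a 50% jump against a 20% tolerance.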

 

Lentils 扁豆  10:17

In a word: the core of data quality is the rule engine; stream processing only provides the means, and the computation process stores only the computation logic, not the data.
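
One possible reading of "stores only the computation logic, not the data" is a streaming check that keeps running aggregates and discards each record after evaluating it. The sketch below assumes that reading; `RunningStats` and `scan` are hypothetical names and are not part of Linkis, Streamis, or the design document.

```python
from typing import Iterable, Optional, Tuple


class RunningStats:
    """Keeps only aggregates (count, sum, min, max); no record is retained."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0
        self.minimum: Optional[float] = None
        self.maximum: Optional[float] = None

    def update(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    @property
    def average(self) -> float:
        if self.count == 0:
            raise ValueError("no values seen yet")
        return self.total / self.count


def scan(stream: Iterable[float], low: float, high: float) -> Tuple[RunningStats, int]:
    """Consume the stream once, counting values outside the range [low, high]."""
    stats = RunningStats()
    violations = 0
    for value in stream:
        stats.update(value)
        if not (low <= value <= high):
            violations += 1
        # The value itself is dropped here; only the aggregates survive.
    return stats, violations
```

Because the state is a handful of numbers, memory use stays constant regardless of how many records pass through.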

 

Jack  10:22

Got it.

 

Jack  10:26

Stream quality checking, nice. I previously thought of relying on SQL and regular expressions, and, if that really didn't work, dynamically loading some classes that provide quality-check methods.
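
A rough sketch of the two fallbacks Jack mentions, a regular-expression check and dynamically loaded check code, using only the Python standard library. The helper names `matches_pattern` and `load_check` and the `module:attribute` path format are assumptions made for this example; the chat does not specify how such loading would actually be done.

```python
import importlib
import re
from typing import Any, Callable


def matches_pattern(pattern: str, value: str) -> bool:
    """True when the whole value matches the regular expression."""
    return re.fullmatch(pattern, value) is not None


def load_check(path: str) -> Callable[..., Any]:
    """Load a check from a 'module:attribute' path at runtime.

    The path format is an assumption for this sketch.
    """
    module_name, _, attr_name = path.partition(":")
    if not module_name or not attr_name:
        raise ValueError("expected 'module:attribute', got %r" % path)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)
```

For instance, `matches_pattern(r"\d{4}-\d{2}-\d{2}", "2022-07-05")` accepts a zero-padded date, and `load_check("math:isfinite")` returns the standard library's `math.isfinite`, which can then be applied to a field value like any other rule.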