Posted to dev@linkis.apache.org by jie xu <ti...@gmail.com> on 2022/05/26 07:09:19 UTC

2022-05-25 Minutes of the bi-weekly meeting of Apache Linkis(incubating)

1. [Fixed topic] Apache Linkis incubation and release progress update - Xu Jie
https://docs.qq.com/sheet/DSFJyTld3Y0JGeU54?tab=uf5xax
2. [Fixed topic] Apache Linkis 1.1.1 progress update - Xia Chen
https://docs.qq.com/sheet/DVWZLYlFVTWVrdmlr
Voting has closed. The release announcement email goes out on 2022-05-25,
and a promotional article will be published on the official account this
week. The release is essentially complete.
3. [Fixed topic] Apache Linkis 1.1.2 progress update - Tao Kelu
Testing and documentation are complete; waiting for the 1.1.1 release to
finish.
4. [Fixed topic] Apache Linkis 1.2.0 progress update - Wang Zhen
The code has been merged; documentation and test cases are being prepared.
In addition, the new-feature code for 1.1.3 has been submitted, unit tests
have been added, and the documentation is being improved. Sun Shun is
leading 1.1.3.
5. [Fixed topic] Apache Linkis community operations progress update - Li Wen
The operating metrics of the open source community are growing normally,
and growth in core metrics such as committers and PRs is in line with
expectations. Judging from the official-account posts published in the past
two weeks, developers care most about technical content and about new
releases of Linux- or WDS-related components; articles about other
activities and promotional pieces attract only moderate interest.
28 people have registered for contributor certificates and desk displays,
out of 80 in total; further follow-up is needed.
A video on how to become a developer has been published on Bilibili and
other sites. The recommendation is to start with documentation, which is
the simpler route. Seven volunteers have already started contributing to
the community through documentation.
The Xingce community meetup drew 4000+ viewers but did not bring traffic
growth to the community.
A community meetup will be held on the evening of Thursday, June 9, and the
bi-weekly meeting is postponed by one week to June 15.
6. [Temporary topic] Test environment whitelist access mechanism - Wang
Heping
For security reasons, the existing test environment cannot be accessed
normally. External resources such as Alibaba Cloud need to be purchased for
redeployment, and the resource request needs further discussion within
WeBank.
7. [Temporary topic] Apache Linkis 1.1.3 feature introduction (Prometheus
monitoring) - Sun Shun
The talk covered the architecture design and deployment plan for Linkis
Prometheus monitoring. The monitored data is mainly JVM and network
metrics. If needed, Tianyi Cloud can contribute monitoring of metrics that
are closer to the business, such as the numbers of successful and failed
tasks.
8. [Temporary topic] Linkis containerization kickoff - Tao Kelu
Development of Linkis containerization is starting, and Tianyi Cloud can
contribute some code and ideas.
9. [Temporary topic] Adaptation status of peripheral components for
Linkis & DSS 1.X - Di Shuai
Qiang Ge reported the version plans of each system and the dependencies
between the versions.
10. [Fixed topic] Host of the next regular meeting; volunteers are welcome.
The meeting is postponed by one week to June 15 and will be hosted by Hui Ge.
11. [Temporary topic] Answers to developer questions
https://docs.qq.com/sheet/DUlJIREJKaHlVVUVU?u=444994d9ce1841d3af5bb0d65efbe4a9&tab=ggfj6i
For resource usage issues, users are advised to upgrade from 1.0.2 to 1.0.3
or later; 1.1.2 will be the long-term support version.
There is a problem when deploying Linkis on k8s: because an overlay network
is used, service IPs and ports are random. Linkis stores the Eureka address
in its local database, so after a restart it still connects to the previous
address. The recommendation is not to keep unnecessary information in the
database.
Data sources run through the whole of data governance, so configuring them
in Linkis is not conducive to extension and use. Linkis data sources are
mainly intended for tool development, not as the data sources used
company-wide.
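
The stale Eureka address problem described in item 11 can be illustrated
with a small sketch. This is not Linkis code: the function name, the
EUREKA_URL environment variable, and the example addresses are all
hypothetical. It only shows the idea of resolving the registry address at
startup instead of trusting a value that was persisted before the restart.

```python
from typing import Mapping, Optional


def resolve_registry_url(env: Mapping[str, str], persisted: Optional[str]) -> str:
    """Pick the Eureka address to use at startup.

    Hypothetical helper: the value injected by the current deployment
    (an assumed EUREKA_URL environment variable) wins over whatever was
    saved in the database before the restart.
    """
    current = env.get("EUREKA_URL", "").strip()
    if current:
        return current
    if persisted:
        # Fallback only: on an overlay network this may point at an
        # address that no longer exists.
        return persisted
    raise ValueError("no Eureka address available")
```

In a real deployment the first argument would be os.environ; a plain dict
is enough to try the function out.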

Configuration parameter changes: should an md file recording parameter
changes be added to the source repository?
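
For the business-level metrics mentioned in item 7 (numbers of successful
and failed tasks), a minimal sketch of what such a counter could look like
when rendered in the Prometheus text exposition format. The class and the
metric name linkis_task_finished_total are assumptions for illustration,
not existing Linkis metrics, and a real integration would normally use a
Prometheus client library rather than hand-built strings.

```python
from collections import Counter


class TaskMetrics:
    """Counts finished tasks by outcome and renders them for scraping."""

    # Hypothetical metric name, not an existing Linkis metric.
    NAME = "linkis_task_finished_total"

    def __init__(self) -> None:
        self._counts = Counter()

    def record(self, status: str) -> None:
        """Count one finished task; status is 'success' or 'failed'."""
        if status not in ("success", "failed"):
            raise ValueError(f"unknown status: {status}")
        self._counts[status] += 1

    def render(self) -> str:
        """Return the counters in Prometheus text exposition format."""
        lines = [
            f"# HELP {self.NAME} Finished tasks by outcome.",
            f"# TYPE {self.NAME} counter",
        ]
        for status in ("success", "failed"):
            lines.append(f'{self.NAME}{{status="{status}"}} {self._counts[status]}')
        return "\n".join(lines) + "\n"
```

Using one metric with a status label, rather than two separate metrics,
keeps success and failure counts easy to compare in a single query.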
Recent developments
merged
1. https://github.com/apache/incubator-linkis/pull/2168/files FileSource
supports variable configuration for file types (Jie Longping). Merged.
2. https://github.com/apache/incubator-linkis/issues/2124 Separate the
result set path by date to solve the problem of too many subdirectories in
a single folder. Result set paths for different dates are in the same
folder, such as "/tmp/linkis/hadoop/linkis/20220516_210525/IDE/40099",
which may result in too many files in one folder, and HDFS limits the
number of directories. Merged.
3. https://github.com/apache/incubator-linkis/pull/2109 Add support for
the sqoop engine plugin. Merged.
4. https://github.com/apache/incubator-linkis/issues/2103 Fixed a bug where
the kinit thread was started when executing JDBC engine tasks even though
Kerberos was not in use. Merged.
5. https://github.com/apache/incubator-linkis/issues/2110 Removed the
binary file .mvn/wrapper/maven-wrapper.jar in the source code, and adjusted
the LICENSE instructions related to .mvn/*
6. https://github.com/apache/incubator-linkis/pull/2113 Upgrade
py4j-0.10.7-src.zip to py4j-0.10.9.5-src.zip
7. https://github.com/apache/incubator-linkis/pull/2116 The linkis-storage
module replaces cglib with Spring's built-in cglib
8. https://github.com/apache/incubator-linkis/pull/2131 Remove the pandas
import to solve the problem that the Python engine fails to start when the
dependency is missing
9. https://github.com/apache/incubator-linkis/pull/2133 The temporary
storage paths of the Kafka and Hive data sources are now checked, and the
directory is created automatically
10. https://github.com/apache/incubator-linkis/pull/2142 Fix the problem
that JDBC engine console configuration changes do not take effect
immediately (the cache time is now a configuration item)
11. https://github.com/apache/incubator-linkis/pull/2160 The consumer
queue for task submission supports configuring specific high-capacity
users
12. https://github.com/apache/incubator-linkis/pull/2161 Added support for
automatic-formatting parameters when exporting a result set to an Excel
file
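
Item 2 in the merged list is about separating result set paths by date. A
minimal sketch of such a layout, assuming that the "20220516_210525"
segment in the example path is a timestamp and that it is split into a day
directory and a time directory. The function name and the exact layout are
hypothetical and are not taken from the linked issue.

```python
from datetime import datetime
from pathlib import PurePosixPath


def result_set_path(root: str, user: str, created: datetime,
                    creator: str, task_id: int) -> str:
    """Build a result set directory grouped by day.

    Assumed layout:
    <root>/<user>/linkis/<yyyy-mm-dd>/<HHMMSS>/<creator>/<task_id>
    """
    path = (
        PurePosixPath(root)
        / user
        / "linkis"
        / created.strftime("%Y-%m-%d")   # one directory per day
        / created.strftime("%H%M%S")     # time of day below it
        / creator
        / str(task_id)
    )
    return str(path)
```

With this layout the directory below "linkis" gains at most one new entry
per day, instead of one entry per timestamp.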

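Item 10 in the merged list makes the JDBC engine's cache time a
configuration item. The general idea of a cache whose lifetime is
configurable can be sketched as follows. The class name, its parameters,
and the injectable clock are assumptions for illustration and do not
mirror the Linkis implementation.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Tuple, TypeVar

V = TypeVar("V")


class TtlCache(Generic[V]):
    """Caches loaded values and reloads them once they are older than the TTL."""

    def __init__(self, loader: Callable[[Hashable], V], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader          # fetches the current value for a key
        self._ttl = ttl_seconds        # configurable lifetime of an entry
        self._clock = clock            # injectable so tests can control time
        self._entries: Dict[Hashable, Tuple[float, V]] = {}

    def get(self, key: Hashable) -> V:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        # Missing or expired: reload and remember when it was loaded.
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value
```

A TTL of 0 makes every lookup reload the value, which corresponds to
configuration changes taking effect immediately at the cost of more loads.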
to be merged
1. https://github.com/apache/incubator-linkis/pull/2173 Add support for
presto engine plugin
2. https://github.com/apache/incubator-linkis/pull/2164 entrance: support
a parameter for the number of task retries (RetryCountLabel). The ASF
header is missing.
3. https://github.com/apache/incubator-linkis/pull/2163 Record which EC
executed each task; the EC information is recorded in the task's Metrics
field
4. https://github.com/apache/incubator-linkis/pull/2159 EC logs support
rolling by size and time
5. https://github.com/apache/incubator-linkis/pull/2150 The common and
entrance modules both contain custom variable substitution logic; this is
consolidated into the common module
6. https://github.com/apache/incubator-linkis/pull/2147 Dependabot: upgrade
gson from 2.8.5 to 2.8.9
7. https://github.com/apache/incubator-linkis-website/pull/265
Additions to the engine implementation details document. To be merged; the
build failed.
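
Item 2 in the list above adds a retry-count parameter for tasks. What such
a parameter controls can be sketched with a small retry loop. This is a
generic illustration: run_with_retries and its arguments are hypothetical,
retry_count merely stands in for the value a RetryCountLabel would carry,
and how the entrance module actually applies it is not shown here.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(task: Callable[[], T], retry_count: int) -> T:
    """Run task once, then retry up to retry_count more times on failure."""
    if retry_count < 0:
        raise ValueError("retry_count must be >= 0")
    attempts = retry_count + 1
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts:
                raise
    raise AssertionError("unreachable")
```

A retry_count of 0 means the task runs exactly once, so the parameter can
be left at 0 to keep the behaviour without retries.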

issue
1. https://github.com/apache/incubator-linkis/issues/2144 Granting +x
permission to shell scripts fails when compiling on different systems. Tao
Kelu is following up.
2. https://github.com/apache/incubator-linkis/issues/2141 Change dbcp in
JDBC engine to dbcp2

---------------------

2022/05/25 Apache Linkis(incubating) 双周例会会议纪要

1. 【固定议题】Apache Linkis 孵化&版本 进展同步。 —— 徐杰
https://docs.qq.com/sheet/DSFJyTld3Y0JGeU54?tab=uf5xax
2. 【固定议题】Apache Linkis1.1.1 进展同步。 —— 夏晨
https://docs.qq.com/sheet/DVWZLYlFVTWVrdmlr
投票结束,2022-05-25发版邮件,这周发公众号软文,基本完成。
3. 【固定议题】Apache Linkis1.1.2 进展同步。 —— 陶克路
测试完成,文档完成,等待1.1.1发版完成
4. 【固定议题】Apache Linkis1.2.0 进展同步。 —— 王震
代码已完成合并,文档与测试用例正在准备准备
另外1.1.3 新特性代码已提交,补充单元测试,文档在完善,孙顺主持
5. 【固定议题】Apache Linkis 社区运营 进展同步。 —— 李文
开源社区运行指标增长正常,committer与pr等核心指标增长符合预期,从最近两周发出的公众号的推文来看,开发者们是比较关注技术层面的,还有Linux或者WDS相关的组件的一些新版本的发布,而对于其他的活动类的些软文类的文章关注度一般。
贡献者证书和摆台登记人数28人,总数为80人,需要进一步联系。
已经在B站等网站推出如何成为开发者视频,建议是从文档较为简单的方式开始。已经有7个志愿者通过文档开始给社区做贡献。
星策社区meetup,观看人数达到4000+,但是没有带给社区带来流量增长。
6月9日周四晚上,开展社区meetup活动,双周会延期一周至6月15日举行。
6. 【临时议题】测试环境白名单访问机制 —— 王和平
安全原因,现有的测试环境不能正常接入,需要采购如阿里云等外部资源进行重新部署,需要微众内部进一步进行沟通资源申请问题。
7. 【临时议题】Apache Linkis1.1.3特性介绍(Prometheus监控) —— 孙顺
主要介绍了Linkis
Prometheus监控架构涉及及部署方案,主要监控的数据在JVM、NETWORK,如有需要天翼云这边可贡献成功与失败任务数等更贴近业务的指标监控。
8. 【临时议题】Linkis容器化开始启动 —— 陶克路
Linis容器化开始启动开发,天翼云可提供一些代码与思路配合。
9. 【临时议题】基于linkis&dss 1.X版本的,周边组件适配情况—— 邸帅
强哥汇报了各个系统的版本规划情况,及各个版本之间的依赖关系。
10. 【固定议题】下一场例会的主持人,欢迎认领。
延期一周,6月15日举行,辉哥主持
11. 【临时议题】开发者问题答疑
https://docs.qq.com/sheet/DUlJIREJKaHlVVUVU?u=444994d9ce1841d3af5bb0d65efbe4a9&tab=ggfj6i
针对资源使用问题,建议用户从1.0.2升级至1.0.3以上版本,1.1.2将作为为长期支持版本。
k8s部署linkis存在,由于采用overlap网络,服务ip和端口是随机的,再重启的时候,linkis把eurka地址存在本地数据库中,导致重启后还是连接之前地址,建议不保留不必要的信息在数据库中,
数据源贯穿整个数据治理,如果在linkis中配置,不利于扩展和使用,主要Linkis数据源是针对工具开发使用,而非公司统一使用的数据源。


配置参数变化 是否需要在源码库中增加一个参数变化记录的md文档
近期动态
已合并
1. https://github.com/apache/incubator-linkis/pull/2168/files
FileSource中文件类型支持变量配置 介龙平 已合并
2. https://github.com/apache/incubator-linkis/issues/2124
 优化结果集路径以日期分隔,解决单个文件夹子目录过多问题
不同日期的resustset路径在同一个文件夹,如“/tmp/linkis/hadoop/linkis/20220516_210525/IDE/40099”,可能会导致一个文件夹下文件太多
已合并  hdfs目录个数有限制
3. https://github.com/apache/incubator-linkis/pull/2109   添加sqoop引擎插件的支持
已合并
4. https://github.com/apache/incubator-linkis/issues/2103 修复了未使用
kerberos,在执行JDBC引擎任务时 kinit 线程启动的错误  已合并
5. https://github.com/apache/incubator-linkis/issues/2110
移除了源码中的二进制文件.mvn/wrapper/maven-wrapper.jar,调整.mvn/*相关的LICENSE说明
6. https://github.com/apache/incubator-linkis/pull/2113   升级
py4j-0.10.7-src.zip 至 py4j-0.10.9.5-src.zip
7. https://github.com/apache/incubator-linkis/pull/2116   linkis-storage
模块将 cglib 替换为spring 内置的cglib
8. https://github.com/apache/incubator-linkis/pull/2131
移除对pandas的引入,解决python引擎因为缺失依赖导致启动失败的问题
9. https://github.com/apache/incubator-linkis/pull/2133
数据源kafka与hive的临时存储路径增加检查自动创建目录功能
10. https://github.com/apache/incubator-linkis/pull/2142 修复JDBC Engine
控制台配置修改后无法立即生效的问题(cache时间调整为配置项)
11. https://github.com/apache/incubator-linkis/pull/2160
任务提交的消费队列支持配置特定大容量用户
12. https://github.com/apache/incubator-linkis/pull/2161  新增对结果集导出到
excel文件时,自动格式化参数的支持

待合并
1. https://github.com/apache/incubator-linkis/pull/2173 添加presto引擎插件的支持
2. https://github.com/apache/incubator-linkis/pull/2164 entrance
支持任务重试次数的参数- RetryCountLabel  Asf头部缺失
3. https://github.com/apache/incubator-linkis/pull/2163
增加任务与执行EC的记录,EC信息记录到任务的 Metrics字段中
4. https://github.com/apache/incubator-linkis/pull/2159
EC的log日志支持按大小和时间切割滚动
5. https://github.com/apache/incubator-linkis/pull/2150
common和entrance模块都存在自定义变量替换的逻辑,优化聚集到common模块中处理
6. https://github.com/apache/incubator-linkis/pull/2147
dependabot的gson从2.8.5升级至2.8.9
7. https://github.com/apache/incubator-linkis-website/pull/265  引擎实现细节文档补充
待合并 构建失败

issue
1. https://github.com/apache/incubator-linkis/issues/2144
不同系统编译出现shell脚本授权+x权限失败问题  陶克路跟进
2. https://github.com/apache/incubator-linkis/issues/2141 将 JDBC 引擎中的 dbcp
更改为 dbcp2