Posted to commits@linkis.apache.org by GitBox <gi...@apache.org> on 2022/02/17 10:10:24 UTC

[GitHub] [incubator-linkis] Andywli opened a new issue #1502: [Bug] General troubleshooting method for "insufficient resources" errors in Linkis 1.X

Andywli opened a new issue #1502:
URL: https://github.com/apache/incubator-linkis/issues/1502


   ### Search before asking
   
   - [X] I searched the [issues](https://github.com/apache/incubator-linkis/issues) and found no similar issues.
   
   
   ### Linkis Component
   
   linkis-commons
   
   ### What happened + What you expected to happen
   
   General troubleshooting method for "insufficient resources" errors in Linkis 1.X
   
   ### Relevant platform
   
   ·
   
   ### Reproduction script
   
   ·
   
   ### Anything else
   
   This question comes from the QA documentation of the Linkis community
   QA Link: https://docs.qq.com/doc/DSGZhdnpMV3lTUUxq
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@linkis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@linkis.apache.org
For additional commands, e-mail: commits-help@linkis.apache.org


[GitHub] [incubator-linkis] Andywli closed issue #1502: [Bug] General troubleshooting method for "insufficient resources" errors in Linkis 1.X

Posted by GitBox <gi...@apache.org>.
Andywli closed issue #1502:
URL: https://github.com/apache/incubator-linkis/issues/1502


   




[GitHub] [incubator-linkis] Andywli commented on issue #1502: [Bug] General troubleshooting method for "insufficient resources" errors in Linkis 1.X

Posted by GitBox <gi...@apache.org>.
Andywli commented on issue #1502:
URL: https://github.com/apache/incubator-linkis/issues/1502#issuecomment-1042783119


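   "Insufficient resources" errors fall into two cases:
   1. The server itself is short of resources.
   2. The user's own resources are exhausted (Linkis enforces limits on each user's resources).
   
   In Linkis 1.X both kinds of resource are recorded in linkis_cg_manager_label_resource and linkis_cg_manager_linkis_resources. The former is the association table between labels and resources; the latter is the resource table.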
   
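   Normally, Linkis 1.0's resource accounting is safe under high concurrency, and forcibly resetting a user's resource records by editing table rows is not recommended. During installation and debugging, however, execution environments differ, so an engine may fail to start, microservices may be restarted repeatedly while an engine is starting so that resources are not released cleanly, or the monitor may not yet have run its automatic cleanup (which lags on the order of hours). Any of these can cause insufficient-resource errors, and in severe cases most of a user's resources end up locked. To troubleshoot, follow these steps:
   a. In the management console, check whether the ECM's remaining resources exceed the resources the engine requests. If the ECM has very little left, requests for new engines fail, and you need to manually shut down some idle engines in the ECM. Linkis also releases idle engines automatically, but the default idle timeout is relatively long.
   b. If the ECM has enough resources, then the user's remaining resources must be too low to request a new engine. First identify the label generated when the user runs the task. For example, when user hadoop runs a spark2.4.3 script in Scriptis, the linkis_cg_manager_label table contains the following record: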
   ![image](https://user-images.githubusercontent.com/55732213/154454237-73354218-c5b2-40aa-831c-f38f7cdc29f2.png)
   
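   Take the id of this label, look up the corresponding resourceId in the association table linkis_cg_manager_label_resource, and use that resourceId to find the label's resource record in linkis_cg_manager_linkis_resources. Then check the remaining resources in that record.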
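
The lookup chain described above (label id, then association row, then resource row) can be sketched as a single join. This is only an illustration under stated assumptions: it runs against an in-memory SQLite stand-in rather than the real Linkis database, the column names `label_id`, `resource_id`, `label_value` and `left_resource` are assumed and should be checked against your own schema, and the sample rows are made up.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the three tables (simplified, assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE linkis_cg_manager_label (
            id INTEGER PRIMARY KEY, label_value TEXT);
        CREATE TABLE linkis_cg_manager_label_resource (
            id INTEGER PRIMARY KEY, label_id INTEGER, resource_id INTEGER);
        CREATE TABLE linkis_cg_manager_linkis_resources (
            id INTEGER PRIMARY KEY, left_resource TEXT);
        -- Made-up sample rows, not real Linkis data.
        INSERT INTO linkis_cg_manager_label
            VALUES (7, 'hadoop / Scriptis / spark-2.4.3 (sample)');
        INSERT INTO linkis_cg_manager_label_resource VALUES (1, 7, 42);
        INSERT INTO linkis_cg_manager_linkis_resources
            VALUES (42, '{"memory": 0, "cores": 0}');
        """
    )
    return conn


def find_label_resource(conn: sqlite3.Connection, label_id: int):
    """Follow label id -> association row -> resource row; None if no match."""
    return conn.execute(
        """
        SELECT r.id, r.left_resource
        FROM linkis_cg_manager_label_resource AS lr
        JOIN linkis_cg_manager_linkis_resources AS r ON r.id = lr.resource_id
        WHERE lr.label_id = ?
        """,
        (label_id,),
    ).fetchone()


if __name__ == "__main__":
    demo = open_demo_db()
    # Prints the resource id and the remaining-resource field for label 7.
    print(find_label_resource(demo, 7))
```

Against a real deployment the same join would be run with the database client you already use; only the query shape is the point here.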
   
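   If this resource record is judged to be abnormal, that is, it does not match the resources actually consumed by running engines, it can be restored as follows:
   After confirming that all engines under the label have been shut down, delete this resource record together with its association row in linkis_cg_manager_label_resource. The record is then reset automatically on the next request.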
   
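   Note: in the example above, "all engines under the label have been shut down" means that every spark2.4.3 engine started by user hadoop in Scriptis has been stopped. If some are still running, the label's resource record may become abnormal again. All engine instances started by the user are listed under Resource Management in the management console.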
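
The recovery step can be sketched as one transaction that removes the association row and the resource row together. Again this is a hedged illustration, not a Linkis tool: `reset_label_resource` and `make_demo_db` are hypothetical helpers, the column names `label_id` and `resource_id` are assumed, and the database is an in-memory SQLite stand-in with made-up rows. The `all_engines_stopped` flag only represents the manual check in the management console; the code cannot verify it for you.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for the two tables involved (simplified, assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE linkis_cg_manager_label_resource (
            id INTEGER PRIMARY KEY, label_id INTEGER, resource_id INTEGER);
        CREATE TABLE linkis_cg_manager_linkis_resources (
            id INTEGER PRIMARY KEY, left_resource TEXT);
        -- Made-up sample rows, not real Linkis data.
        INSERT INTO linkis_cg_manager_label_resource VALUES (1, 7, 42);
        INSERT INTO linkis_cg_manager_linkis_resources VALUES (42, '{}');
        """
    )
    return conn


def reset_label_resource(conn, label_id, all_engines_stopped):
    """Delete a label's resource row and association row in one transaction."""
    if not all_engines_stopped:
        # The caller must confirm this by hand in the management console.
        raise RuntimeError("stop every engine under this label first")
    with conn:  # commits on success, rolls back if any statement fails
        rows = conn.execute(
            "SELECT resource_id FROM linkis_cg_manager_label_resource"
            " WHERE label_id = ?",
            (label_id,),
        ).fetchall()
        resource_ids = [row[0] for row in rows]
        conn.execute(
            "DELETE FROM linkis_cg_manager_label_resource WHERE label_id = ?",
            (label_id,),
        )
        conn.executemany(
            "DELETE FROM linkis_cg_manager_linkis_resources WHERE id = ?",
            [(rid,) for rid in resource_ids],
        )
    return resource_ids


if __name__ == "__main__":
    demo = make_demo_db()
    # Prints the ids of the resource rows that were removed.
    print(reset_label_resource(demo, 7, all_engines_stopped=True))
```

Doing both deletes in a single transaction avoids leaving an association row that points at a resource record that no longer exists. Back up the affected rows before trying anything like this on a real database.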
   
   

