Posted to dev@linkis.apache.org by rita <ri...@163.com> on 2022/08/09 03:40:42 UTC

[DISCUSS]Gateway module is faulty

Dear all,

The chat records of the WeChat group "Apache Linkis Community Development Group" are as follows:

—————  2022-8-6  —————

Xi  10:50

@peacewong@WDS In our deployment, the Gateway module occasionally crashes outright. Have you ever run into this?

 

Mr flash闪电先生  10:56

I have actually noticed that Eureka dies a lot when it runs as a single instance.

 

Mr flash闪电先生  10:57

Deploying two gateways might improve this.

 

peacewong@WDS  11:11

Did you see why it died? Is the memory setting too small? Ours has never crashed.

 

Xi  11:14

Yes, I checked and it was an OOM.

 

peacewong@WDS  11:15

How much memory is it set to?

 

Xi  11:15

It is set to 2G.

 

peacewong@WDS  11:15

Eureka shouldn't normally die either.

 

Xi  11:15

I raised it to 6G.

 

peacewong@WDS  11:16

Take a dump and see which objects take up the most memory. Ours has never crashed with 2G.

 

Xi  11:16

From the backend logs here, other services frequently time out when accessing Eureka.

 

peacewong@WDS  11:16

That shouldn't normally happen. Eureka doesn't receive many requests.

 

Xi  11:18

When multiple gateways are deployed, is wds.linkis.gateway.url= also comma-separated?

 

Xi  11:24

Is there a solution to the problem of it not being shared?

 

Xi  11:25

When logging in with scripts, does it still have to point to just one of the gateways?

 

Xi  11:26

Does logging in to the Linkis backend also have to point to just one of the gateways?

 

Xi  11:26

@夏晨 @casion

 

Heisenbao 海森堡  11:26

If you want it shared, you would probably need to add a shared cache.

 

Heisenbao 海森堡  11:28

A shared cache would store the users' login state. Currently each gateway keeps its own login state, and it is not shared with the other gateways.

 

peacewong@WDS  11:28

With that approach you don't need a shared cache.

 

Heisenbao 海森堡  11:29

As I recall, the gateway records the user's login state.
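
For readers following the shared-cache point, here is a toy model of the difference between per-gateway login state and a shared store. It is only an illustration of the idea discussed above, not Linkis code: the Gateway class, the ticket format and the user name are all made up for this sketch.

```python
class Gateway:
    """Toy gateway (hypothetical, not the Linkis implementation)."""

    def __init__(self, name, sessions=None):
        self.name = name
        # Each gateway gets its own private store unless a shared one is passed in.
        self.sessions = {} if sessions is None else sessions

    def login(self, user):
        ticket = f"ticket-{user}"  # made-up ticket format
        self.sessions[ticket] = user
        return ticket

    def is_logged_in(self, ticket):
        return ticket in self.sessions


# Case 1: every gateway keeps its own login state.
gw_a, gw_b = Gateway("a"), Gateway("b")
ticket = gw_a.login("xi")
seen_by_a = gw_a.is_logged_in(ticket)  # the gateway that handled the login
seen_by_b = gw_b.is_logged_in(ticket)  # a different gateway

# Case 2: both gateways read and write one shared store
# (a plain dict here, standing in for an external cache).
shared_store = {}
gw_c, gw_d = Gateway("c", shared_store), Gateway("d", shared_store)
shared_ticket = gw_c.login("xi")
seen_by_d = gw_d.is_logged_in(shared_ticket)
```

In case 1 only the gateway that handled the login recognizes the ticket, so a later request landing on the other gateway looks unauthenticated. In case 2 either gateway can validate it.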

 

Xi  11:30

I'll give it a try.

 

casion  11:32

Right. Using Nginx's ip_hash for load balancing ensures that, as long as a user's IP address does not change, requests are always sent to the same gateway.
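
To show why hashing on the client IP pins a user to one gateway, here is a minimal sketch of the idea. It is an assumption-laden illustration, not Nginx's actual algorithm: the gateway addresses are hypothetical, and SHA-256 is used only as a convenient stable hash. Like Nginx's ip_hash for IPv4, it keys on the first three octets of the address.

```python
import hashlib

# Hypothetical gateway addresses, for illustration only.
GATEWAYS = ["gateway-1:9001", "gateway-2:9001"]


def pick_gateway(client_ip, gateways=GATEWAYS):
    """Map an IPv4 client address to one gateway in a stable way."""
    # Key on the first three octets, so a whole /24 maps to the same gateway.
    key = ".".join(client_ip.split(".")[:3])
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(gateways)
    return gateways[index]


first = pick_gateway("172.21.32.17")
again = pick_gateway("172.21.32.17")
```

The same IP always yields the same gateway, so the login state kept on that gateway is found again on every request. If the client's IP changes, or the list of gateways changes, the mapping can change and the user may have to log in again.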

 

casion  11:34

If it still hits OOM at 6G, something is probably wrong somewhere and it is worth investigating. The gateway's processing logic is fairly lightweight, so it should not need much memory.

 

Xi  11:38

This error is reported when killing a task. I'm not sure whether it is related; it feels like a code logic problem.

 

Xi  11:39

 



 

Xi  11:39

/api/rest_j/v1/entrance/exec_id018017linkis-cg-entrance172.21.32.17:9104IDE_eps_dw_spark_2159/kill

 

Xi  11:39

Killing a task keeps failing here. It looks like the execid is the problem.

 

Xi  11:40

This error was reported at 2G. I have raised it to 6G and restarted, and will keep observing.

 

peacewong@WDS  11:43

This is not an OOM. It looks like the passed parameter was parsed incorrectly?

 

Xi  12:12

 



 

Xi  12:12

The OOM is here.

 

Xi  12:13

The earlier one was a flood of WARN messages, which look like they were logged when killing tasks.

 

peacewong@WDS  12:18

Take a look at the GC logs and the dump.