Posted to dev@zookeeper.apache.org by "yeshuangshuang (JIRA)" <ji...@apache.org> on 2018/12/12 08:31:00 UTC
[jira] [Updated] (ZOOKEEPER-3211) zookeeper standalone mode,found a high level bug in kernel of centos7.0 ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT ,this lead to zk can't work for client any more
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yeshuangshuang updated ZOOKEEPER-3211:
--------------------------------------
Environment:
1.zoo.cfg
server.1=127.0.0.1:2902:2903
2.kernel
kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
JDK:
java version "1.7.0_181"
OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
zk: 3.4.5
was:
1.Deployment configuration
server.1=127.0.0.1:2902:2903
2.Deployment versions
Kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
JDK:
java version "1.7.0_181"
OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
zk: 3.4.5
Remaining Estimate: 168h (was: 72h)
Original Estimate: 168h (was: 72h)
Description:
1.config--zoo.cfg
server.1=127.0.0.1:2902:2903
2.kernel version
version:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
JDK:
java version "1.7.0_181"
OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
zk: 3.4.5
3.bug details:
The problem is intermittent, but the recurrence probability is extremely high. At first, reads and writes time out after about 6s; after a few minutes, all connections (including long-lived ones) are in the CLOSE_WAIT state.
4.Workaround: when all connections are found to be in close_wait, we actively restart the zookeeper server.
was:
1.Deployment configuration
server.1=127.0.0.1:2902:2903
2.Deployment versions
Kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
JDK:
java version "1.7.0_181"
OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
zk: 3.4.5
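3.Symptoms: the problem does not occur every time, but the recurrence probability is extremely high. At first, reads and writes time out after about 6s; after a few minutes, all connections (including long-lived connections) are in the CLOSE_WAIT state.
4.Current measure: when all connections are found to be in close_wait, actively restart the zookeeper server.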
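Summary: zookeeper standalone mode,found a high level bug in kernel of centos7.0 ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT ,this lead to zk can't work for client any more (was: zookeeper standalone deployment: a fairly serious problem occurs occasionally on the centos7.0 kernel; all 60 default connections on the server side go into the CLOSE_WAIT state and stay there for a long time, so zk cannot provide service normally)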
> zookeeper standalone mode,found a high level bug in kernel of centos7.0 ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT ,this lead to zk can't work for client any more
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3211
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3211
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.5
> Environment: 1.zoo.cfg
> server.1=127.0.0.1:2902:2903
> 2.kernel
> kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> JDK:
> java version "1.7.0_181"
> OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
> OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
> zk: 3.4.5
> Reporter: yeshuangshuang
> Priority: Blocker
> Fix For: 3.4.5
>
> Attachments: 1.log, zklog.rar
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> 1.config--zoo.cfg
> server.1=127.0.0.1:2902:2903
> 2.kernel version
> version:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> JDK:
> java version "1.7.0_181"
> OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00)
> OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)
> zk: 3.4.5
> 3.bug details:
> The problem is intermittent, but the recurrence probability is extremely high. At first, reads and writes time out after about 6s; after a few minutes, all connections (including long-lived ones) are in the CLOSE_WAIT state.
> 4.Workaround: when all connections are found to be in close_wait, we actively restart the zookeeper server.
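The check described in item 4 (noticing that the server's connections have all gone to CLOSE_WAIT) could be scripted roughly as below. This is only a sketch under stated assumptions, not part of the report: it assumes a Linux host, assumes the server listens for clients on port 2181 (the report does not state the client port), and uses the figure of 60 connections from the summary as the alert threshold. The names count_close_wait, CLIENT_PORT and CONNECTION_LIMIT are hypothetical. It reads the kernel's socket tables in /proc/net/tcp and /proc/net/tcp6, where state code 08 means CLOSE_WAIT.

```python
# Sketch: count sockets in CLOSE_WAIT on a given local port (Linux only).
CLOSE_WAIT = "08"        # Linux kernel TCP state code for CLOSE_WAIT
CLIENT_PORT = 2181       # assumption: client port is not given in the report
CONNECTION_LIMIT = 60    # connection count mentioned in the report's summary


def count_close_wait(proc_net_tcp_text, local_port):
    """Count CLOSE_WAIT entries for local_port in /proc/net/tcp-style text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" for the local end, fields[3] the state
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if port == local_port and fields[3] == CLOSE_WAIT:
            count += 1
    return count


if __name__ == "__main__":
    total = 0
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as table:
                total += count_close_wait(table.read(), CLIENT_PORT)
        except OSError:
            pass  # table not available (non-Linux host or IPv6 disabled)
    print("CLOSE_WAIT sockets on port %d: %d" % (CLIENT_PORT, total))
    if total >= CONNECTION_LIMIT:
        print("all connection slots appear stuck; a server restart may be needed")
```

Run periodically (for example from cron), this would only report the condition; whether to restart the server automatically is left to the operator.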
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)