Posted to dev@zookeeper.apache.org by 田毅群 <ti...@qiyi.com> on 2018/10/21 06:11:53 UTC

ZooKeeper two issues review

Hi, all
I opened Jira issues to contribute code to ZooKeeper and was asked to follow the process for new issues, so I am first sending this email to describe my two issues.

First one:
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-3167
Purpose: Add an API to get the total count of recursive sub nodes of a node
Description:

1. In production environments, a node often has a large number of recursive sub nodes, and we need to count all of them. For example, in the picture below we want to count all the sub nodes of nodeA.

[Attached image: image002.jpg]

2. Currently we can only use the getChildren API, which returns a List<String> of the first-level sub nodes (that is, we can only get the list of nodeB nodes directly). To find all recursive sub nodes, we have to call it on every sub node in turn, which costs a lot of time.

3. On the server side, ZooKeeper uses a HashMap<String, DataNode> to store nodes, where each key is the path of a node. We can iterate over the map to get the total number of sub nodes at all levels under one node.
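
The idea in point 3 can be sketched roughly as follows. This is only an illustration under assumptions, not the actual server code: the class name SubtreeCountSketch and the method countDescendants are hypothetical, and a plain path-keyed map stands in for the server's node map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: count every node below `path` by scanning a
// path-keyed map once, instead of calling getChildren level by level.
final class SubtreeCountSketch {

    static int countDescendants(Map<String, ?> nodes, String path) {
        // Append "/" so that "/nodeA" does not match a sibling like "/nodeAB".
        // The root path "/" already ends with a slash.
        String prefix = path.equals("/") ? "/" : path + "/";
        int count = 0;
        for (String key : nodes.keySet()) {
            // The length check excludes the node itself (relevant for the root).
            if (key.length() > prefix.length() && key.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for the server-side map; values are irrelevant for counting.
        Map<String, byte[]> nodes = new ConcurrentHashMap<>();
        String[] paths = {
            "/nodeA", "/nodeA/nodeB1", "/nodeA/nodeB1/nodeC1", "/nodeA/nodeB2", "/nodeAB"
        };
        for (String p : paths) {
            nodes.put(p, new byte[0]);
        }
        System.out.println(countDescendants(nodes, "/nodeA")); // prints 3
    }
}
```

The scan visits every key in the map, so its cost grows with the total number of nodes on the server, not with the size of the subtree. Whether that is acceptable for a large tree is one of the things to review.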

Second one:
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-3168
Purpose: Reduce session revalidation time after zxid roll over
Description:

1. Sometimes a ZooKeeper cluster receives a lot of client connections; the number can even exceed 10,000. When the zxid rolls over, the clients reconnect and revalidate their sessions.

2. In ZooKeeper's design, when a follower receives a session revalidation request, it forwards the request to the leader, which is responsible for session revalidation.

[Attached image: image004.png]

When LearnerZooKeeperServer receives a reconnection, it sends a revalidation request to LeaderZooKeeperServer, so LeaderZooKeeperServer comes under heavy load.

3. The leader has to handle a large number of requests in a short time. Statistics I collected with a tool show that some clients wait more than 20 seconds, which is too long for some clients, such as ResourceManager.

4. My proposal: when a zxid rollover happens, the leader records the exact time. When the re-election finishes, all servers get the rollover time. When clients reconnect and revalidate their sessions, every server can then judge the session itself. This reduces the load on the cluster considerably, and clients wait less time.
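
One possible reading of point 4 is sketched below. It is only an interpretation, not the patch itself: the class RolloverRevalidationSketch and all of its methods are hypothetical, and the rule used (accept a known session locally if it reconnects within its session timeout after the rollover time) is an assumption about what "judge" could mean.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-server check that avoids asking the leader
// for sessions reconnecting shortly after a zxid rollover.
final class RolloverRevalidationSketch {
    // Session id -> negotiated timeout in milliseconds, as known to this server.
    private final Map<Long, Integer> sessionTimeoutsMs = new ConcurrentHashMap<>();
    // Rollover time learned after re-election; -1 means "not known".
    private volatile long rolloverTimeMs = -1L;

    void trackSession(long sessionId, int timeoutMs) {
        sessionTimeoutsMs.put(sessionId, timeoutMs);
    }

    void onRolloverTimeLearned(long timeMs) {
        rolloverTimeMs = timeMs;
    }

    // Returns true if this server can accept the session on its own.
    // Returns false if it should fall back to asking the leader.
    boolean canRevalidateLocally(long sessionId, long nowMs) {
        Integer timeoutMs = sessionTimeoutsMs.get(sessionId);
        if (timeoutMs == null || rolloverTimeMs < 0) {
            return false; // unknown session or no rollover recorded
        }
        long elapsedMs = nowMs - rolloverTimeMs;
        // Assumed rule: the session was alive at rollover and has not yet
        // exceeded its timeout since then.
        return elapsedMs >= 0 && elapsedMs < timeoutMs;
    }

    public static void main(String[] args) {
        RolloverRevalidationSketch server = new RolloverRevalidationSketch();
        server.trackSession(1L, 30_000);
        server.onRolloverTimeLearned(1_000L);
        System.out.println(server.canRevalidateLocally(1L, 5_000L));  // true
        System.out.println(server.canRevalidateLocally(1L, 40_000L)); // false
    }
}
```

Any case the local check cannot decide still goes to the leader, so the existing path stays as the fallback.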

These are my two issues. Please help review whether the solutions are right. Thank you very much.
田毅群
Technology Product Center, Cloud Platform
iQIYI
QIYI.com, Inc.
Address: 6F, iQIYI Innovation Building, 365 Linhong Road, Changning District, Shanghai
Postal code: 201103
Mobile: +86 157 2140 1256
Email: tianyiqun@qiyi.com