Posted to user@zookeeper.apache.org by Jung Young Seok <ju...@gmail.com> on 2014/04/29 10:30:33 UTC

Regarding large number of watch count

Dear Zookeeper-user,

We have a 3-node ZooKeeper cluster.
ZooKeeper is used as an application lock coordinator.
A client creates a node (key) when it needs a lock, then releases the lock and
deletes the node (key) when it is done.

Strangely, zk_watch_count keeps increasing.
On the leader, zk_watch_count has reached 6804.

==================================================================
Detailed information (mntr output) is below.

1. Zoo-1 (Follower)
Status Information: OK, ZooKeeper State: follower
zk_avg_latency 2
zk_max_latency 215
zk_min_latency 0
zk_packets_received 9001127
zk_packets_sent 9027569
zk_num_alive_connections 3
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 12
zk_watch_count 1786
zk_ephemerals_count 2
zk_approximate_data_size 525
zk_open_file_descriptor_count 29
zk_max_file_descriptor_count 4096
Performance Data: zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT

2. Zoo-2 (Follower)
Status Information: OK, ZooKeeper State: follower
zk_avg_latency 0
zk_max_latency 6
zk_min_latency 0
zk_packets_received 3539
zk_packets_sent 3538
zk_num_alive_connections 2
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 12
zk_watch_count 0
zk_ephemerals_count 2
zk_approximate_data_size 525
zk_open_file_descriptor_count 28
zk_max_file_descriptor_count 4096
Performance Data: zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT

3. Zoo-3 (Leader)
Status Information: OK, ZooKeeper State: leader
zk_avg_latency 1
zk_max_latency 214
zk_min_latency 0
zk_packets_received 21575604
zk_packets_sent 21638420
zk_num_alive_connections 4
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 18
zk_watch_count 6804
zk_ephemerals_count 5
zk_approximate_data_size 954
zk_open_file_descriptor_count 32
zk_max_file_descriptor_count 4096
zk_followers 2
zk_synced_followers 2
zk_pending_syncs 0
Performance Data: zk_version 3.4.6-1569965, built on 02/20/2014 09:09 GMT

==================================================================

My questions are:
1. Is it normal for zk_watch_count to reach 6804?

2. Why does zk_watch_count keep increasing?
- We use Tomcat + Apache Curator + ZooKeeper 3.4.6

3. Would it cause trouble if zk_watch_count grows very large?

4. Is there any way I can reduce zk_watch_count?

I'd be glad to hear any opinions.
Thank you in advance.

Regards,
Youngseok Jung

Re: Regarding large number of watch count

Posted by Jung Young Seok <ju...@gmail.com>.
Thank you for your suggestion.

I've checked the watched nodes with wchp.
As you can see, zk_watch_count is 3587 (output pasted below).
However, we delete every node after use.

# zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /aws/user
[]


# echo wchp | nc localhost 2181 | wc -l
7175  (zk_watch_count is 3587; wchp prints each watched path plus an indented
line per watching session, hence roughly two lines per watch)


# echo wchp | nc localhost 2181
/aws/user/9cb3ecea-fe4a-4b47/_c_f6b8c5ca-f4e3-4755-9bab-df27c4ff239f-lock-0000000000
        0x145983186f10006
/aws/user/6534f8b7-8707-4641/_c_5dbf1dea-7392-436c-97fe-ba97d45068fe-lock-0000000001
        0x145983186f10007
/aws/user/00e2422a-8afb-4ea9/_c_4bcc9937-ddac-4b67-ad79-c9d36f8da0b0-lock-0000000000
        0x145983186f10007
/aws/user/08ebc05d-acb1-4243/_c_362a838a-deda-48cb-b6ed-d37417191dc9-lock-0000000003
        0x145983186f10007
/aws/user/05da9afd-4e01-4843/_c_e18fd405-e509-4b2c-9419-6a4b5d85477a-lock-0000000000
        0x145983186f10007
/aws/user/4da11842-a033-4016/_c_6edbba60-ad3c-4dad-acfa-2c2fb49a9c89-lock-0000000000
        0x145983186f10007
/aws/user/a8b4055b-7248-45ef/_c_df409fd1-9b0e-4c72-a4a4-5ee0f5a09dc8-lock-0000000000
        0x145983186f10006
/aws/user/4bf4c860-e3ad-442c/_c_5cdca3aa-490f-475a-8430-3f1e56dd6bae-lock-0000000000
        0x145983186f10006
/aws/user/4952533f-8d34-4041/_c_72f4bd75-ad6f-454c-a734-9bc85948f3e4-lock-0000000000
        0x145983186f10007
/aws/user/a9d04409-9c69-4fb6/_c_2d4f1df0-0a0f-4608-ab28-94460f8d0773-lock-0000000000
        0x145983186f10007

.... omit ....


We use ZooKeeper through the Curator framework. Here's how we use it:

...
// acquire part

if (vo == null) {
    // Lazily create and cache one lock object per path.
    vo = new LockVO();
    vo.fullPath = fullPath;
    vo.lock = new InterProcessMutex(client, vo.fullPath);
    lockFullPathMap.put(fullPath, vo);
}

try {
    // Acquire with a timeout; count how many times the lock is held.
    if (!vo.lock.acquire(lockTimeout, TimeUnit.MILLISECONDS)) {
        throw new IllegalStateException("Fail to acquire lock");
    }
    vo.count++;
} catch (final RuntimeException e) {
    throw e;
}
...


...
// release part
vo.lock.release();
if (vo.count == 0) {
    // Once the lock is no longer held, remove the lock path itself.
    try {
        client.delete().forPath(fullPath);
    } catch (final NoNodeException | NotEmptyException e) {
        /* ignore exception */
    }
}
...
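
For reference, here is a minimal, self-contained sketch of the same
acquire/release pattern using Curator's InterProcessMutex directly. The
connection string, lock path, and class name below are illustrative
placeholders, not taken from our actual configuration:

import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LockSketch {
    public static void main(String[] args) throws Exception {
        // Connect to the ensemble (connection string is a placeholder).
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zoo-1:2181,zoo-2:2181,zoo-3:2181",
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // One mutex per lock key; the path is a placeholder.
        InterProcessMutex lock = new InterProcessMutex(client, "/aws/user/some-key");

        // Try to acquire the lock, waiting at most 5 seconds.
        if (!lock.acquire(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("Fail to acquire lock");
        }
        try {
            // ... critical section ...
        } finally {
            // Release in a finally block so the lock node is removed even if
            // the critical section throws.
            lock.release();
        }

        client.close();
    }
}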

I don't understand why the watch count keeps growing even though we delete
every single node after lock usage.
The watch count increases slowly, not on every single lock/release request
(it took 7 days to reach 3587).

When we restart all of the WAS instances connected to ZooKeeper, the watch
count drops back to zero.

Do you have any idea why it keeps increasing?
Is there any way to clear the watch count in a running environment?
I'm worried that it could suddenly stop due to a memory leak or a crash.

I'd appreciate any ideas.
Thanks,

Sincerely,
Youngseok Jung


2014-04-30 2:00 GMT+09:00 Raúl Gutiérrez Segalés <rg...@itevenworks.net>:

> Hi,
>
> You can introspect the watches via wchs (summary), wchc (watches by
> session) and wchp (watches by path). That'll give you an idea of what's
> going on. For example, on one of my servers:
>
> $ echo wchp | nc localhost 2181
> /messaging/00/0019/L2383
>         0x45aa3508a3ab77
>         0x45aa35089f8dce
>         0x45aa3508a2837d
> /search/member_0000283539
>         0x145aa2cc2b345d7
> /messaging/00/0019/L2384
>         0x45aa35089f8de4
> ...
>
>
> -rgs
>

Re: Regarding large number of watch count

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
Hi,


You can introspect the watches via wchs (summary), wchc (watches by
session) and wchp (watches by path). That'll give you an idea of what's
going on. For example, on one of my servers:

$ echo wchp | nc localhost 2181
/messaging/00/0019/L2383
        0x45aa3508a3ab77
        0x45aa35089f8dce
        0x45aa3508a2837d
/search/member_0000283539
        0x145aa2cc2b345d7
/messaging/00/0019/L2384
        0x45aa35089f8de4
...
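
If nc isn't available on a host, the same four-letter-word commands can be
sent from plain Java as well. A minimal sketch (host, port, and the class name
are just placeholders):

import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Sends a ZooKeeper four-letter-word command (wchs, wchc, wchp, mntr, ...)
// and prints the reply; equivalent to `echo wchs | nc localhost 2181`.
public class FourLetterWord {
    public static void main(String[] args) throws Exception {
        String cmd = args.length > 0 ? args[0] : "wchs";
        try (Socket socket = new Socket("localhost", 2181)) {
            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            socket.shutdownOutput();   // nothing more to send to the server
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) >= 0) {
                System.out.write(buf, 0, n);
            }
            System.out.flush();
        }
    }
}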


-rgs