Posted to dev@zookeeper.apache.org by wangmiao <wa...@163.com> on 2023/07/08 04:34:04 UTC

Session Issue

Hello Team,
 We are using a three-node ZooKeeper cluster to serve HBase. We found that another client and a regionserver shared the same session ID, which caused the regionserver to crash.
ZooKeeper version: 3.4.5


Could you please help us to debug this issue?


Follower's log
 2023-07-05 02:25:08,990 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /10.11.1.10:51432
 2023-07-05 02:25:08,991 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0xff88d1721b6c53bc with negotiated timeout 180000 for client /10.11.1.10:51432
 2023-07-05 02:25:35,880 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /10.11.1.10:51432 which had sessionid 0xff88d1721b6c53bc

Session 0xff88d1721b6c53bc belongs to the client 10.11.1.10:51432. At 2023-07-05 02:25:35,880 the server closed the connection, apparently because it considered the session timed out.


Leader's log


 2023-07-05 02:25:35,880 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0xff88d1721b6c53bc
 2023-07-05 02:25:35,880 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /10.11.8.18:41402 which had sessionid 0xff88d1721b6c53bc

In this log, session 0xff88d1721b6c53bc belongs to the client 10.11.8.18:41402.
10.11.8.18 is my regionserver, which restarted after its session was disconnected.

Regionserver's log
 2023-07-05 02:25:35,213 INFO org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Slow sync cost: 272 ms, current pipeline: [DatanodeInfoWithStorage[10.11.8.18:9866,DS-a43ef0f9-d9e8-4fbd-a69b-1f2aec83cb8d,DISK], DatanodeInfoWithStorage[10.11.1.11:9866,DS-3a9a9b9d-a405-4a1b-af87-9c0d38eb7fc6,DISK], DatanodeInfoWithStorage[10.11.8.19:9866,DS-8466a61b-419c-44e7-85e2-30d28ff16c0f,DISK]]
 2023-07-05 02:25:35,881 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0xff88d1721b6c53bc, likely server has closed socket, closing socket connection and attempting reconnect
 2023-07-05 02:25:36,573 INFO org.apache.hadoop.hbase.regionserver.throttle.PressureAwareThroughputController: a40720868195bb1f851f94e01162801b#cf#compaction#13639 average throughput is 29.28 MB/second, slept 0 time(s) and total slept time is 0 ms. 1 active operations remaining, total limit is 69.23 MB/second
 2023-07-05 02:25:36,591 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server dx-hbaseservice01.dx/10.11.39.10:2181. Will not attempt to authenticate using SASL (unknown error)
 2023-07-05 02:25:36,592 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /10.11.8.18:29508, server: dx-hbaseservice01.dx/10.11.39.10:2181
 2023-07-05 02:25:36,597 WARN org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0xff88d1721b6c53bc has expired
 2023-07-05 02:25:36,597 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down for session: 0xff88d1721b6c53bc
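
To cross-check the observation above, the server logs can be scanned for session IDs that appear with more than one client IP. The following is only a minimal sketch: the function names (clients_by_session, shared_sessions) are hypothetical, the regular expressions are written against the two log message shapes quoted in this message, and other ZooKeeper versions may word these lines differently.

```python
import re
from collections import defaultdict

# Patterns modelled on the log lines quoted in this thread (an assumption:
# other ZooKeeper versions may format these messages differently).
ESTABLISHED = re.compile(
    r"Established session (0x[0-9a-f]+) .*?for client\s+/([\d.]+):(\d+)"
)
CLOSED = re.compile(
    r"Closed socket connection for client /([\d.]+):(\d+) "
    r"which had sessionid (0x[0-9a-f]+)"
)


def clients_by_session(lines):
    """Map each session ID to the set of client IPs seen with it."""
    seen = defaultdict(set)
    for line in lines:
        m = ESTABLISHED.search(line)
        if m:
            seen[m.group(1)].add(m.group(2))
            continue
        m = CLOSED.search(line)
        if m:
            seen[m.group(3)].add(m.group(1))
    return seen


def shared_sessions(lines):
    """Return only the sessions that were seen with more than one client IP."""
    return {
        sid: ips for sid, ips in clients_by_session(lines).items() if len(ips) > 1
    }


# Lines taken from the follower and leader logs above.
SAMPLE = [
    "2023-07-05 02:25:08,991 INFO org.apache.zookeeper.server.ZooKeeperServer: "
    "Established session 0xff88d1721b6c53bc with negotiated timeout 180000 "
    "for client /10.11.1.10:51432",
    "2023-07-05 02:25:35,880 INFO org.apache.zookeeper.server.NIOServerCnxn: "
    "Closed socket connection for client /10.11.1.10:51432 which had sessionid "
    "0xff88d1721b6c53bc",
    "2023-07-05 02:25:35,880 INFO org.apache.zookeeper.server.NIOServerCnxn: "
    "Closed socket connection for client /10.11.8.18:41402 which had sessionid "
    "0xff88d1721b6c53bc",
]

if __name__ == "__main__":
    for sid, ips in shared_sessions(SAMPLE).items():
        print(sid, sorted(ips))
```

In practice the lines from all three servers' logs would be concatenated and passed to shared_sessions, so that a session ID reported by two different servers for two different clients shows up in one place.
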
Thanks,
Miao Wang

Re:Re: Session Issue

Posted by Miao Wang <wa...@163.com>.


Thank you for your reply.

1. The ZooKeeper service runs as a distributed ensemble, not in standalone mode. We checked the logs and found no sign of split brain.

2. We plan to upgrade to a newer version. Which version would you recommend as the most stable?

At 2023-07-21 06:30:10, "Enrico Olivelli" <eo...@gmail.com> wrote:
>Miao,
>Sorry for the late reply.
>
>Are you sure that the servers are not running in standalone mode? (Split
>brain).
>
>Second thing: your version is very old (3.4.5); it would be better to move
>to a more recent version.
>
>Enrico

Re: Session Issue

Posted by Enrico Olivelli <eo...@gmail.com>.
Miao,
Sorry for the late reply.

Are you sure that the servers are not running in standalone mode? (Split
brain).

Second thing: your version is very old (3.4.5); it would be better to move
to a more recent version.
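
As background for the version point, here is a rough sketch of how the top byte of a session ID can be inspected. It rests on an assumption that this thread does not confirm: that 3.4.x seeds session IDs as (currentTimeMillis << 24) >> 8, OR-ed with the server id shifted into the top 8 bits, using a signed shift, and that later releases use an unsigned shift instead. The function names are hypothetical and the code only emulates Java long arithmetic; it is not ZooKeeper code, and it does not prove the cause of the crash.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x):
    """Interpret the low 64 bits of x as a Java-style signed long."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def seed_signed_shift(server_id, now_ms):
    """Assumed old seeding: arithmetic (sign-extending) right shift."""
    shifted = to_signed64(now_ms << 24) >> 8  # Python >> on a negative int sign-extends
    return (shifted | to_signed64(server_id << 56)) & MASK64


def seed_unsigned_shift(server_id, now_ms):
    """Assumed newer seeding: logical (zero-filling) right shift."""
    shifted = ((now_ms << 24) & MASK64) >> 8
    return (shifted | (server_id << 56)) & MASK64


def top_byte(session_id):
    """Top 8 bits of a 64-bit session ID."""
    return (session_id >> 56) & 0xFF


# 2023-07-05 00:00:00 UTC and 2013-01-01 00:00:00 UTC, in milliseconds.
T_2023 = 1688515200000
T_2013 = 1356998400000

if __name__ == "__main__":
    print(hex(top_byte(0xFF88D1721B6C53BC)))  # session ID from the logs
    for sid in (1, 2, 3):
        print(sid,
              hex(top_byte(seed_signed_shift(sid, T_2013))),
              hex(top_byte(seed_signed_shift(sid, T_2023))),
              hex(top_byte(seed_unsigned_shift(sid, T_2023))))
```

If the assumption holds, seeds generated with the signed shift at 2023 timestamps no longer carry the server id in the top byte, so IDs issued by different servers are not kept apart by that byte. That would be one more reason to upgrade, but it should be checked against the source of the version actually deployed.
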

Enrico
