Posted to issues@zookeeper.apache.org by "Huizhi Lu (Jira)" <ji...@apache.org> on 2021/01/08 23:40:00 UTC

[jira] [Created] (ZOOKEEPER-4053) ConnectionLossException is vague for failing to read/write large znode

Huizhi Lu created ZOOKEEPER-4053:
------------------------------------

             Summary: ConnectionLossException is vague for failing to read/write large znode
                 Key: ZOOKEEPER-4053
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4053
             Project: ZooKeeper
          Issue Type: Improvement
          Components: java client
    Affects Versions: 3.6.2
            Reporter: Huizhi Lu
            Assignee: Huizhi Lu


h2. Description

[Related discussion thread|https://mail-archives.apache.org/mod_mbox/zookeeper-dev/202101.mbox/ajax/%3CCAHVM2p%3DJ2GE1jQ3_rs2npSZ%2Bm8evszATKTvBQrmjqMdM5is22Q%40mail.gmail.com%3E]

Assume the default 1 MB jute.maxbuffer. If a zk client tries to write a
large znode (> 1 MB), the server rejects the request: it logs "Len error"
and closes the connection, and the client sees only a connection loss. A
third-party ZkClient lib (e.g. I0IZkClient) keeps retrying the operation
on connection loss. Because the oversized request can never succeed, the
client retries forever, and this could take down the zk server.
h2. Log

{noformat}
 2021/01/04 18:49:06.372 WARN [ClientCnxn] [main-SendThread(localhost:2181)] Session 0x776989df3190104 for server localhost:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Broken pipe
 
2021/01/04 20:03:22.535 WARN [ClientCnxn] [main-SendThread(localhost:2181)] Session 0x776989df3190104 for server localhost:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
{noformat}
In fact, the error log on the server has more meaningful information:
{noformat}
 2021-01-04 19:19:38,467 [myid:8] - WARN  [NIOServerCxn.Factory:/0.0.0.0:2181:NIOServerCnxn@373] - Exception causing close of session 0x976988b591a010b due to java.io.IOException: Len error 1076482
2021-01-04 19:19:38,842 [myid:8] - WARN
{noformat}
h2. Proposed Solution

The client should also block large writes: add a sanity check on the buffer size of the outgoing request, and throw a new KeeperException that signals clients to stop retrying the same operation. This is more efficient because the request is never sent to the server, so a round trip is saved and the server does not have to close the connection.
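
A minimal sketch of what such a client-side check could look like. This is not the actual patch and not part of the ZooKeeper API: the names RequestSizeGuard, RequestTooLargeException and ASSUMED_MAX_BUFFER are hypothetical, the limit is the 1 MB default assumed in the description, and a real implementation would presumably throw a KeeperException subclass and read the configured jute.maxbuffer value instead.

```java
/**
 * Hypothetical exception type for illustration only. It marks a request
 * that can never succeed, so callers should not retry it.
 */
class RequestTooLargeException extends Exception {
    RequestTooLargeException(int length, int maxBuffer) {
        super("Request of " + length + " bytes exceeds the limit of "
                + maxBuffer + " bytes");
    }
}

/** Hypothetical client-side guard; not part of the ZooKeeper API. */
class RequestSizeGuard {
    // Assumption: the 1 MB jute.maxbuffer default mentioned in the description.
    static final int ASSUMED_MAX_BUFFER = 1024 * 1024;

    /**
     * Rejects an oversized serialized request locally, before it is sent,
     * so the server never has to close the connection.
     */
    static void check(byte[] serializedRequest, int maxBuffer)
            throws RequestTooLargeException {
        if (serializedRequest.length > maxBuffer) {
            throw new RequestTooLargeException(serializedRequest.length, maxBuffer);
        }
    }
}
```

With a check like this, a retrying wrapper can treat the new exception as non-retryable and fail fast, while still retrying on a genuine connection loss. Note that the check applies to the whole serialized request, which is slightly larger than the znode data itself.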


