You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2022/01/26 12:11:20 UTC

[GitHub] [zookeeper] symat commented on a change in pull request #1799: ZOOKEEPER-4453: NettyServerCnxnFactory: allow to configure the early TLS connection drop feature

symat commented on a change in pull request #1799:
URL: https://github.com/apache/zookeeper/pull/1799#discussion_r792552073



##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxn.java
##########
@@ -522,7 +522,10 @@ private void receiveMessage(ByteBuf message) {
                         }
                         ZooKeeperServer zks = this.zkServer;
                         if (zks == null || !zks.isRunning()) {
-                            throw new IOException("ZK down");
+                            LOG.warn("Closing connection to {} because the server is not ready",
+                                    getRemoteSocketAddress());
+                            close(DisconnectReason.IO_EXCEPTION);

Review comment:
       nit: I know we were throwing IOException before which resulted in DisconnectReason.IO_EXCEPTION, but now that we do proper close here, we might use some better DisconnectReason. I see e.g. DisconnectReason.SERVER_SHUTDOWN. Although I'm not sure when this part is triggered... I guess this can be triggered either during initialization or shutdown. Maybe DisconnectReason.IO_EXCEPTION is good enough.
   
   Also: before this change we did log the 'getRemoteSocketAddress()' in line 534, which might be handy to add to this warning above. (if someone is trying to figure out in the ZK log why a session terminated) 

##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/NettyServerCnxnTest.java
##########
@@ -180,42 +181,70 @@ public void testNonMTLSRemoteConn() throws Exception {
         when(zks.isRunning()).thenReturn(true);
         ServerStats.Provider providerMock = mock(ServerStats.Provider.class);
         when(zks.serverStats()).thenReturn(new ServerStats(providerMock));
-        testNonMTLSRemoteConn(zks);
+        testNonMTLSRemoteConn(zks, false, false);
     }
 
     @Test
     public void testNonMTLSRemoteConnZookKeeperServerNotReady() throws Exception {
-        testNonMTLSRemoteConn(null);
+        testNonMTLSRemoteConn(null, false, false);
+    }
+
+    @Test
+    public void testNonMTLSRemoteConnZookKeeperServerNotReadyEarlyDropEnabled() throws Exception {
+        testNonMTLSRemoteConn(null, false, true);
+    }
+
+    @Test
+    public void testMTLSRemoteConnZookKeeperServerNotReadyEarlyDropEnabled() throws Exception {
+        testNonMTLSRemoteConn(null, true, true);
+    }
+
+    @Test
+    public void testMTLSRemoteConnZookKeeperServerNotReadyEarlyDropDisabled() throws Exception {
+        testNonMTLSRemoteConn(null, true, true);
     }
 
     @SuppressWarnings("unchecked")
-    private void testNonMTLSRemoteConn(ZooKeeperServer zks) throws Exception {
-        Channel channel = mock(Channel.class);
-        ChannelId id = mock(ChannelId.class);
-        ChannelFuture success = mock(ChannelFuture.class);
-        ChannelHandlerContext context = mock(ChannelHandlerContext.class);
-        ChannelPipeline channelPipeline = mock(ChannelPipeline.class);
-
-        when(context.channel()).thenReturn(channel);
-        when(channel.pipeline()).thenReturn(channelPipeline);
-        when(success.channel()).thenReturn(channel);
-        when(channel.closeFuture()).thenReturn(success);
-
-        InetSocketAddress address = new InetSocketAddress(0);
-        when(channel.remoteAddress()).thenReturn(address);
-        when(channel.id()).thenReturn(id);
-        NettyServerCnxnFactory factory = new NettyServerCnxnFactory();
-        factory.setZooKeeperServer(zks);
-        Attribute atr = mock(Attribute.class);
-        Mockito.doReturn(atr).when(channel).attr(
-                Mockito.any()
-        );
-        doNothing().when(atr).set(Mockito.any());
-        factory.channelHandler.channelActive(context);
-
-        if (zks != null) {
-            assertEquals(0, zks.serverStats().getNonMTLSLocalConnCount());
-            assertEquals(1, zks.serverStats().getNonMTLSRemoteConnCount());
+    private void testNonMTLSRemoteConn(ZooKeeperServer zks, boolean secure, boolean earlyDrop) throws Exception {
+        System.setProperty(NettyServerCnxnFactory.EARLY_DROP_SECURE_CONNECTION_HANDSHAKES, earlyDrop + "");
+        try {
+            Channel channel = mock(Channel.class);
+            ChannelId id = mock(ChannelId.class);
+            ChannelFuture success = mock(ChannelFuture.class);
+            ChannelHandlerContext context = mock(ChannelHandlerContext.class);
+            ChannelPipeline channelPipeline = mock(ChannelPipeline.class);
+
+            when(context.channel()).thenReturn(channel);
+            when(channel.pipeline()).thenReturn(channelPipeline);
+            when(success.channel()).thenReturn(channel);
+            when(channel.closeFuture()).thenReturn(success);
+
+            InetSocketAddress address = new InetSocketAddress(0);
+            when(channel.remoteAddress()).thenReturn(address);
+            when(channel.id()).thenReturn(id);
+            NettyServerCnxnFactory factory = new NettyServerCnxnFactory();
+            factory.setSecure(secure);
+            factory.setZooKeeperServer(zks);
+            Attribute atr = mock(Attribute.class);
+            Mockito.doReturn(atr).when(channel).attr(
+                    Mockito.any()
+            );
+            doNothing().when(atr).set(Mockito.any());
+            factory.channelHandler.channelActive(context);
+
+            if (zks != null)  {
+                assertEquals(0, zks.serverStats().getNonMTLSLocalConnCount());
+                assertEquals(1, zks.serverStats().getNonMTLSRemoteConnCount());
+            } else {
+                if (earlyDrop && secure) {
+                    // the channel must have been forcibly closed
+                    Mockito.verify(channel, times(1)).close();
+                } else {
+                    Mockito.verify(channel, times(0)).close();
+                }
+            }
+        } finally {
+            System.clearProperty(NettyServerCnxnFactory.EARLY_DROP_SECURE_CONNECTION_HANDSHAKES);

Review comment:
       nit: if you clear this in the finally block, then let's move the setProperty call into the try block. (now if the test killed just before the try block, the property won't be cleared. - very-very unlikely, but still...)
   
   Alternatively I would be OK to put the clearProperty call to the afterEach() method and then you don't need the try-finally block.
   
   Also just double-checking: we don't run multiple test classes (or methods in the same class) paralel in the same JVM, right? Let's make sure we avoid some flaky execution.

##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxnFactory.java
##########
@@ -227,11 +228,14 @@ public void channelActive(ChannelHandlerContext ctx) throws Exception {
 
             // Check the zkServer assigned to the cnxn is still running,
             // close it before starting the heavy TLS handshake
-            if (!cnxn.isZKServerRunning()) {
-                LOG.warn("Zookeeper server is not running, close the connection before starting the TLS handshake");
-                ServerMetrics.getMetrics().CNXN_CLOSED_WITHOUT_ZK_SERVER_RUNNING.add(1);
-                channel.close();
-                return;
+            if (secure && !cnxn.isZKServerRunning()) {
+                boolean earlyDropSecureConnectionHandshakes = Boolean.getBoolean(EARLY_DROP_SECURE_CONNECTION_HANDSHAKES);
+                if (earlyDropSecureConnectionHandshakes) {
+                    LOG.warn("Zookeeper server is not running, close the connection before starting the TLS handshake");

Review comment:
       nit: I know we had warning level before, but what do you think about INFO level instead? I like to have logs here, just not sure if this is really something the user should worry.

##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxn.java
##########
@@ -522,7 +522,10 @@ private void receiveMessage(ByteBuf message) {
                         }
                         ZooKeeperServer zks = this.zkServer;
                         if (zks == null || !zks.isRunning()) {
-                            throw new IOException("ZK down");
+                            LOG.warn("Closing connection to {} because the server is not ready",
+                                    getRemoteSocketAddress());
+                            close(DisconnectReason.IO_EXCEPTION);

Review comment:
       we have the same "ZK down" exception in line 477. Maybe the logic should be changed there as well?

##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxnFactory.java
##########
@@ -227,11 +228,14 @@ public void channelActive(ChannelHandlerContext ctx) throws Exception {
 
             // Check the zkServer assigned to the cnxn is still running,
             // close it before starting the heavy TLS handshake
-            if (!cnxn.isZKServerRunning()) {
-                LOG.warn("Zookeeper server is not running, close the connection before starting the TLS handshake");
-                ServerMetrics.getMetrics().CNXN_CLOSED_WITHOUT_ZK_SERVER_RUNNING.add(1);
-                channel.close();
-                return;
+            if (secure && !cnxn.isZKServerRunning()) {
+                boolean earlyDropSecureConnectionHandshakes = Boolean.getBoolean(EARLY_DROP_SECURE_CONNECTION_HANDSHAKES);

Review comment:
       nit: should we maybe use a final field initialized in a constructor of the NettyServerCnxnFactory instead of parsing the system property all the time? 

##########
File path: zookeeper-docs/src/main/resources/markdown/zookeeperAdmin.md
##########
@@ -1163,6 +1163,12 @@ property, when available, is noted below.
     effect due to TLS handshake timeout when there are too many in-flight TLS
     handshakes. Set it to something like 250 is good enough to avoid herd effect.
 
+* *netty.server.earlyDropSecureConnectionHandshakes*

Review comment:
       I like this to be false by default. However, AFACT this behaviour was enabled in 3.7.0 by default. Maybe we should mention it in the documentation. (and also in the release doc of 3.7.1 and 3.8.0 if we don't forget)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org