You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/06/30 11:58:55 UTC

[GitHub] [zookeeper] Hinterwaeldlers opened a new pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Hinterwaeldlers opened a new pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391


   Use common zxid between all ZooKeeperServers used within a server instance


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-781294229


   @Hinterwaeldlers  
   Sorry for our late. I'll recheck this patch at this weekend and nudge it to land. Thanks for your contribution.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-674369940


   > it seems to work too. Not sure if there are no side effects, from not keeping the zxid in sync, but if it works, it might be the better solution
   
   - A good question. If we cannot make sure the zxid we set at the startup is freshest, we will still have consistency issues.
   - For my understanding, we can ensure the zxid is latest. Look the workflow of code(when network partition happens, leader changes state to `LOOKING`):
   
   `leader.lead() ---> leader.shutdown ---> zk.shutdown() ---> ZooKeeperServer#shutdown ---> zkDb.fastForwardDataBase()(fullyShutDown==false)---> snapLog.fastForwardFromEdits ---> processTransaction ---> dt.processTxn ---> DataTree#processTxn ---> if (rc.zxid > lastProcessedZxid) `
   
   @Hinterwaeldlers  Am I right?
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r486101068



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;

Review comment:
       Thx
   Is fixed now and should work in the new version




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r486101152



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;

Review comment:
       Appended




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r470951044



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;

Review comment:
       You can use `mvn checkstyle:check` to check whether you can pass the CI




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-651771189


   Hi thats correct, I am the same person.
   
   The problem within the regular processing is, that when the quorum is lost and the server switches into the readonly mode, a new ReadOnlyZooKeeper is created, where the zxid is never set and using the initial value of 0.
   Whenever a clients connect and the connection is closed with a timeout, an entry to the log is created with the incremented zxid. As the value between the regular ZooKeeperServer and the ReadOnlyServer are not shared an entry with 0x01 is created and stored in the common zkDatabase.
   Within the further leader election, the commit difference is done based on the maxCommitLog Id. As result, the first leader election will use the previous used commitLogId of the ZooKeeperServer. Assuming no further change happened on both sides, a diff with 0 elements is exchanged and a new epoche with a new commitLogId is started.
   Any further server, joining the quorum will receive the new commitLogId and the diff is done based on the maxCommitLogId. As the value is < 0x100000000 a truncate is performed.
   So far this fits the description done in Jira.
   
   My initial thought was to use a static zxid value, so it is shared between all ZooKeeperServer instances. As multiple servers are running within the unit tests, using a shared zxid is not and option so it must be defined in a common place and forwarded into all instances of the ZooKeeperServer. As far as I've checked the code the QuorumPeer is this point.
   
   As result, all tests using the ZooKeeperServer must provide the zxid.
   One possible solution would be the definition of a constructor, which provide the AtomicLong for the tests cases and removes the need to define it there. The drawback of this solution is the presence of a constructor, which is meant to be used only for testing purpose.
   Therefore I preferred a solution, which keep the current behaviour (using a none shared AtomicLong) and do not introduce test only constructors. As I've found the usage of the same constructor with same parameters in different tests, I didn't thought this has any drawbacks.
   What I've noticed was that some tests make use QuorumPeer so maybe change the visibility from private to protected and make use of it might be an option


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r470950999



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;

Review comment:
       Don't have a License




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-653732591


   - A quick comment. IIRC, my fix for this issue few days ago is like this: 
   ```
   # ReadOnlyZooKeeperServer#startup
   setZxid(zkDb.getDataTreeLastProcessedZxid());
   super.startup();
   ```
   - For my understanding, the `hzxid` inited with 0 and never be reassigned with last processed `zxid` in that situation. We should think the following question: 
        - When a follower changes to leader, why we don't have this issue? 
        - Is it because that `ZooKeeperServer` instance reload the `zkDataBase` and assign `hzxid` with last processed `zxid`?
       - We can take an example of the code of `NIOServerCnxnFactory#startup`, how a `ZooKeeperServer` instance reload its state space from a failure, then `setupRequestProcessors` to server the requests?
    
   - Put `hzxid` as a param of `ZooKeeperServer` constructor is not a good idea?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r470953776



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper.States;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class QuorumPeerReadOnlyTest extends QuorumPeerTestBase {
+    
+    String prePropertyReadOnlyValue = null;
+    
+    @Before
+    public void setReadOnlySystemProperty() {
+        prePropertyReadOnlyValue = System.getProperty("readonlymode.enabled");
+        System.setProperty("readonlymode.enabled", "true");
+    }
+    
+    @After
+    public void restoreReadOnlySystemProperty() {
+        if (prePropertyReadOnlyValue == null) {
+            System.clearProperty("readonlymode.enabled");
+        } else {
+            System.setProperty("readonlymode.enabled", prePropertyReadOnlyValue);
+        }
+    }
+    
+    /**
+     * Test zxid is not set to a truncate value
+     */
+    @Test
+    public void testReadOnlyZxid() throws Exception {
+        ClientBase.setupTestEnv();
+        final int SERVER_COUNT = 3;
+        final int[] clientPorts = new int[SERVER_COUNT];
+        StringBuilder sb = new StringBuilder();
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            clientPorts[i] = PortAssignment.unique();
+            sb.append("server." + i + "=127.0.0.1:" + PortAssignment.unique() + ":" + PortAssignment.unique() + "\n");
+        }
+        String quorumCfgSection = sb.toString();
+
+        MainThread[] mt = new MainThread[SERVER_COUNT];
+        ZooKeeper[] zk = new ZooKeeper[SERVER_COUNT];
+        
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i] = new MainThread(i, clientPorts[i], quorumCfgSection + "\nclientPort=" + clientPorts[i]);
+            mt[i].start();
+            
+        }
+        
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        
+        waitForAll(zk, States.CONNECTED);
+        
+        zk[0].create("/test", "Test".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
+
+        // Shutdown all clients first, to ensure that the close zxid is seen and logged by all servers
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i].close();
+        }
+        // Shutdown all servers, before starting one again, as the commonly used data storage is cleaned
+        // up, when the first server is stopped
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i].shutdown();
+        }
+        
+        mt[SERVER_COUNT-1].start();
+        // Ensure that close will trigger a timeout
+        zk[SERVER_COUNT-1] = new ZooKeeper("127.0.0.1:" + clientPorts[SERVER_COUNT-1], ClientBase.CONNECTION_TIMEOUT, this, true) {
+//            @Override
+            public void close2() {
+                getTestable().injectSessionExpiration();
+                cnxn.disconnect();
+                try {
+                    Thread.sleep(ClientBase.CONNECTION_TIMEOUT * 2);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+            }
+        };
+        
+        waitForAll(new ZooKeeper[] {zk[SERVER_COUNT-1]}, States.CONNECTEDREADONLY);
+        
+        // Read some data
+        assertTrue(Arrays.equals(zk[SERVER_COUNT-1].getData("/test", false, null), "Test".getBytes()));
+        
+        // Kill the client, without notification
+        zk[SERVER_COUNT-1].close();
+       
+        // Start enough server for a quorum, but not all yet. Don't start any client, to avoid changing the zxid
+        for (int i=SERVER_COUNT-1; i >= ((SERVER_COUNT-1)/2); i--) {
+            if (i != SERVER_COUNT-1) {
+                mt[i].start();
+            }
+        }
+        ensureLeaderIsPresent(mt);
+
+        // Seems like without a waiting time, the test does succeeded sometimes without the fix applied
+        Thread.sleep(1000);
+        
+        // Start the remaining servers
+        for (int i = (SERVER_COUNT-1)/2-1; i >= 0; i--) {
+            mt[i].start();
+        }
+        ensureRemainingserversPresent(mt);
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        waitForAll(zk, States.CONNECTED);
+
+        // Ensure that all clients can read the data
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            assertTrue(Arrays.equals(zk[i].getData("/test", false, null), "Test".getBytes()));
+        }
+        

Review comment:
       For better consistency check, I suggest that we can create multiply znodes(due to `TRUNC`), then assert the znode count and data content.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-690264383


   I am not sure, if I've followed the path correctly, but looks reasonable.
   As far, as I've understood the code flow only the Leader and ReadOnlyServer (only in close due to timeout irc) set a new zxid, which is stored in a snapshot.
   
   The regular server has shutdown and the set the value within the database correctly, so by loading the zxid from the database with your propossed ` setZxid(getZKDatabase().getDataTreeLastProcessedZxid());` should do the correction part.
   I am still not sure, if the other Server types will always have a correct zxid value, but couldn't find any path which could lead to it's usage which does influence the stored data in any way.
   
   I hope the changes are sufficient, to ensure the working principle. Sadly the CI did somehow failed in an unaffected spot. Running it locally worked one PC of mine and failed on the other with and without the patch. So hope, that it is not relevant.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling edited a comment on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling edited a comment on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-653732591


   - A quick comment. IIRC, my fix for this issue few days ago is like this: (maybe I'm missing something)
   ```
   # ReadOnlyZooKeeperServer#startup
   setZxid(zkDb.getDataTreeLastProcessedZxid());
   super.startup();
   ```
   - For my understanding, the `hzxid` inited with 0 and never be reassigned with last processed `zxid` in that situation. We should think the following question: 
        - When a follower changes to leader, why we don't have this issue? 
        - Is it because that `ZooKeeperServer` instance reloads the `zkDataBase` and assigns `hzxid` with last processed `zxid`?
       - We can take an example of the code of `NIOServerCnxnFactory#startup`, how a `ZooKeeperServer` instance reloads its state space from a failure, then `setups RequestProcessors` to server the requests?
    
   - Put `hzxid` as a param of `ZooKeeperServer` constructor is not a good idea?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r486102366



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper.States;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class QuorumPeerReadOnlyTest extends QuorumPeerTestBase {
+    
+    String prePropertyReadOnlyValue = null;
+    
+    @Before
+    public void setReadOnlySystemProperty() {
+        prePropertyReadOnlyValue = System.getProperty("readonlymode.enabled");
+        System.setProperty("readonlymode.enabled", "true");
+    }
+    
+    @After
+    public void restoreReadOnlySystemProperty() {
+        if (prePropertyReadOnlyValue == null) {
+            System.clearProperty("readonlymode.enabled");
+        } else {
+            System.setProperty("readonlymode.enabled", prePropertyReadOnlyValue);
+        }
+    }
+    
+    /**
+     * Test zxid is not set to a truncate value
+     */
+    @Test
+    public void testReadOnlyZxid() throws Exception {
+        ClientBase.setupTestEnv();
+        final int SERVER_COUNT = 3;
+        final int[] clientPorts = new int[SERVER_COUNT];
+        StringBuilder sb = new StringBuilder();
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            clientPorts[i] = PortAssignment.unique();
+            sb.append("server." + i + "=127.0.0.1:" + PortAssignment.unique() + ":" + PortAssignment.unique() + "\n");
+        }
+        String quorumCfgSection = sb.toString();
+
+        MainThread[] mt = new MainThread[SERVER_COUNT];
+        ZooKeeper[] zk = new ZooKeeper[SERVER_COUNT];
+        
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i] = new MainThread(i, clientPorts[i], quorumCfgSection + "\nclientPort=" + clientPorts[i]);
+            mt[i].start();
+            
+        }
+        
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        
+        waitForAll(zk, States.CONNECTED);
+        
+        zk[0].create("/test", "Test".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
+
+        // Shutdown all clients first, to ensure that the close zxid is seen and logged by all servers
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i].close();
+        }
+        // Shutdown all servers, before starting one again, as the commonly used data storage is cleaned
+        // up, when the first server is stopped
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i].shutdown();
+        }
+        
+        mt[SERVER_COUNT-1].start();
+        // Ensure that close will trigger a timeout
+        zk[SERVER_COUNT-1] = new ZooKeeper("127.0.0.1:" + clientPorts[SERVER_COUNT-1], ClientBase.CONNECTION_TIMEOUT, this, true) {
+//            @Override
+            public void close2() {
+                getTestable().injectSessionExpiration();
+                cnxn.disconnect();
+                try {
+                    Thread.sleep(ClientBase.CONNECTION_TIMEOUT * 2);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+            }
+        };
+        
+        waitForAll(new ZooKeeper[] {zk[SERVER_COUNT-1]}, States.CONNECTEDREADONLY);
+        
+        // Read some data
+        assertTrue(Arrays.equals(zk[SERVER_COUNT-1].getData("/test", false, null), "Test".getBytes()));
+        
+        // Kill the client, without notification
+        zk[SERVER_COUNT-1].close();
+       
+        // Start enough server for a quorum, but not all yet. Don't start any client, to avoid changing the zxid
+        for (int i=SERVER_COUNT-1; i >= ((SERVER_COUNT-1)/2); i--) {
+            if (i != SERVER_COUNT-1) {
+                mt[i].start();
+            }
+        }
+        ensureLeaderIsPresent(mt);
+
+        // Seems like without a waiting time, the test does succeeded sometimes without the fix applied
+        Thread.sleep(1000);
+        
+        // Start the remaining servers
+        for (int i = (SERVER_COUNT-1)/2-1; i >= 0; i--) {
+            mt[i].start();
+        }
+        ensureRemainingserversPresent(mt);
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        waitForAll(zk, States.CONNECTED);
+
+        // Ensure that all clients can read the data
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            assertTrue(Arrays.equals(zk[i].getData("/test", false, null), "Test".getBytes()));
+        }
+        

Review comment:
       I am not 100% sure what you mean with this point.
   I've appended now a couple of nodes, which are created at the beginning and checked at the end.
   Adding data in between does hide the problem, iirc. At least if it is done after the quorum was able to restore, before all server do show up.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r470950952



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;

Review comment:
       useless import




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-798230468


   @Hinterwaeldlers 
   Please transfer the fix to `master` branch([PR-1390](https://github.com/apache/zookeeper/pull/1390)) at first, then we can backport it to branch-3.6 at that convenience. Thanks for your contribution. This is a bug we cannot ignore which‘s related to the consistency(sorry for our late). I'll ping another guy to review it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on a change in pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#discussion_r491304059



##########
File path: zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerReadOnlyTest.java
##########
@@ -0,0 +1,172 @@
+package org.apache.zookeeper.server.quorum;
+
+import static org.junit.Assert.assertTrue;
+
+import java.util.Arrays;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.KeeperException.RuntimeInconsistencyException;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper.States;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class QuorumPeerReadOnlyTest extends QuorumPeerTestBase {
+    
+    String prePropertyReadOnlyValue = null;
+    
+    @Before
+    public void setReadOnlySystemProperty() {
+        prePropertyReadOnlyValue = System.getProperty("readonlymode.enabled");
+        System.setProperty("readonlymode.enabled", "true");
+    }
+    
+    @After
+    public void restoreReadOnlySystemProperty() {
+        if (prePropertyReadOnlyValue == null) {
+            System.clearProperty("readonlymode.enabled");
+        } else {
+            System.setProperty("readonlymode.enabled", prePropertyReadOnlyValue);
+        }
+    }
+    
+    /**
+     * Test zxid is not set to a truncate value
+     */
+    @Test
+    public void testReadOnlyZxid() throws Exception {
+        ClientBase.setupTestEnv();
+        final int SERVER_COUNT = 3;
+        final int[] clientPorts = new int[SERVER_COUNT];
+        StringBuilder sb = new StringBuilder();
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            clientPorts[i] = PortAssignment.unique();
+            sb.append("server." + i + "=127.0.0.1:" + PortAssignment.unique() + ":" + PortAssignment.unique() + "\n");
+        }
+        String quorumCfgSection = sb.toString();
+
+        MainThread[] mt = new MainThread[SERVER_COUNT];
+        ZooKeeper[] zk = new ZooKeeper[SERVER_COUNT];
+        
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i] = new MainThread(i, clientPorts[i], quorumCfgSection + "\nclientPort=" + clientPorts[i]);
+            mt[i].start();
+            
+        }
+        
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        
+        waitForAll(zk, States.CONNECTED);
+        
+        zk[0].create("/test", "Test".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
+
+        // Shutdown all clients first, to ensure that the close zxid is seen and logged by all servers
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i].close();
+        }
+        // Shutdown all servers, before starting one again, as the commonly used data storage is cleaned
+        // up, when the first server is stopped
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            mt[i].shutdown();
+        }
+        
+        mt[SERVER_COUNT-1].start();
+        // Ensure that close will trigger a timeout
+        zk[SERVER_COUNT-1] = new ZooKeeper("127.0.0.1:" + clientPorts[SERVER_COUNT-1], ClientBase.CONNECTION_TIMEOUT, this, true) {
+//            @Override
+            public void close2() {
+                getTestable().injectSessionExpiration();
+                cnxn.disconnect();
+                try {
+                    Thread.sleep(ClientBase.CONNECTION_TIMEOUT * 2);
+                } catch (InterruptedException e) {
+                    e.printStackTrace();
+                }
+            }
+        };
+        
+        waitForAll(new ZooKeeper[] {zk[SERVER_COUNT-1]}, States.CONNECTEDREADONLY);
+        
+        // Read some data
+        assertTrue(Arrays.equals(zk[SERVER_COUNT-1].getData("/test", false, null), "Test".getBytes()));
+        
+        // Kill the client, without notification
+        zk[SERVER_COUNT-1].close();
+       
+        // Start enough server for a quorum, but not all yet. Don't start any client, to avoid changing the zxid
+        for (int i=SERVER_COUNT-1; i >= ((SERVER_COUNT-1)/2); i--) {
+            if (i != SERVER_COUNT-1) {
+                mt[i].start();
+            }
+        }
+        ensureLeaderIsPresent(mt);
+
+        // Seems like without a waiting time, the test does succeeded sometimes without the fix applied
+        Thread.sleep(1000);
+        
+        // Start the remaining servers
+        for (int i = (SERVER_COUNT-1)/2-1; i >= 0; i--) {
+            mt[i].start();
+        }
+        ensureRemainingserversPresent(mt);
+
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            zk[i] = new ZooKeeper("127.0.0.1:" + clientPorts[i], ClientBase.CONNECTION_TIMEOUT, this, true);
+        }
+        waitForAll(zk, States.CONNECTED);
+
+        // Ensure that all clients can read the data
+        for (int i = 0; i < SERVER_COUNT; i++) {
+            assertTrue(Arrays.equals(zk[i].getData("/test", false, null), "Test".getBytes()));
+        }
+        

Review comment:
       IIRC, I reproduced this issue by finding the loss of some children nodes. Essentially, they're all update loss by `TRUNC`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers closed pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers closed pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-658246185


   I was checking your solution @maoling and it seems to work too. Not sure if there are no side effects, from not keeping the zxid in sync, but if it works, it might be the better solution
   
   Related to the change from Follower to leader. I wasn't able to get the whole code yet, but as far as I understood:
   - The follower is not responsible for setting the next zxid, so it's actual value doesn't matter
   - After the leader is lost, there is a short period where no answers are sent out and a new leader is elected
   - When the leader is started, it will start loading the data and set the zxid accordingly
   So there shouldn't a problem here


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] maoling commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
maoling commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-695203688


   > Sadly the CI did somehow failed in an unaffected.....................
   
   @Hinterwaeldlers 
   The CI was broken a few days ago and now It's back to normal.
   re-trigger the CI by the following alternative ways:
   1.  close this PR and reopen it
   2. `git commit --amend` and `git push origin ZOOKEEPER-3526-3.6 -f`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zookeeper] Hinterwaeldlers edited a comment on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Posted by GitBox <gi...@apache.org>.
Hinterwaeldlers edited a comment on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-658246185


   I was checking your solution @maoling and it seems to work too. Not sure if there are no side effects, from not keeping the zxid in sync, but if it works, it might be the better solution
   
   Related to the change from Follower to leader. I wasn't able to get the whole code yet, but as far as I understood:
   - The follower is not responsible for setting the next zxid, so it's actual value doesn't matter
   - After the leader is lost, there is a short period where no answers are sent out and a new leader is elected
   - When the leader is started, it will start loading the data and set the zxid accordingly
   So there shouldn't a problem here
   
   I've changed the patch accordingly, if it fits, I can adept the other pull-requests accordingly


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org