Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/01/13 17:20:07 UTC

[GitHub] [zookeeper] mayawang opened a new pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

mayawang opened a new pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219
 
 
   SnapshotComparer is a tool that loads and compares two snapshots, with configurable thresholds and various filters. It is useful for snapshot analysis, such as offline data consistency checking and data trending analysis (e.g. what is growing under which zNode path, and when).
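
To make the thresholds concrete, here is a minimal Java sketch of how a byte threshold and a node threshold could gate which zNodes get reported. Everything in it is an assumption for illustration: the `ThresholdFilterSketch` and `NodeStats` names, the use of absolute deltas, and the rule that reaching either threshold is enough. The actual SnapshotComparer may combine them differently.

```java
/** Hypothetical sketch of threshold filtering; names and the "either threshold" rule are assumptions. */
final class ThresholdFilterSketch {

    /** Totals for one zNode path as seen in one snapshot. */
    static final class NodeStats {
        final long bytes;        // data size under this path
        final long descendants;  // number of descendant nodes

        NodeStats(long bytes, long descendants) {
            this.bytes = bytes;
            this.descendants = descendants;
        }
    }

    /** Reports a path when the byte delta or the descendant delta reaches its threshold. */
    static boolean shouldReport(NodeStats left, NodeStats right,
                                long byteThreshold, long nodeThreshold) {
        long byteDelta = Math.abs(right.bytes - left.bytes);
        long nodeDelta = Math.abs(right.descendants - left.descendants);
        return byteDelta >= byteThreshold || nodeDelta >= nodeThreshold;
    }

    public static void main(String[] args) {
        // Made-up totals for a single path in the left and right snapshots.
        NodeStats left = new NodeStats(0L, 4L);
        NodeStats right = new NodeStats(0L, 10L);
        // Same threshold values as the -b 100000 -n 100 invocations in this thread.
        System.out.println(shouldReport(left, right, 100000L, 100L));
    }
}
```

With thresholds this high and snapshots this small, a filter of this shape would report nothing, which is one plausible reading of why the sample runs show only summary lines.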
   
   See detailed usage doc in `zookeeperTools.md`.
   
   This PR replaces https://github.com/apache/zookeeper/pull/984 and addresses the review comments and feedback raised there.
   
   Change Notes:
   1. Added support for `gz` and `snappy` compressed snapshot formats, in addition to the uncompressed snapshot format.
   2. Fixed a minor bug in CompareLine(), and improved error handling and docs.
   3. Added a shell script to make the SnapshotComparer tool more user friendly.
   4. Added detailed usage docs in `zookeeperTools.md`.
   
   # Testing:
   <!-- # Unit tests on zookeeper server. All Passed
   mvn test -pl zookeeper-server -->
   
   ### Style check
   mvn -DskipTests checkstyle:check
   
   ### Local testing: running the shell script to compare different file formats
   Example results:
   1. `snappy` vs `gz`:
   ```
   $ bin/zkSnapshotComparer.sh -l /zookeeper-data/backup/snapshot.0.snappy -r /zookeeper-data/backup/snapshot.8.gz -b 100000 -n 100
   ...
   Deserialized snapshot in snapshot.0.snappy in 0.030391 seconds
   Processed data tree in 0.098400 seconds
   2020-01-13 00:37:09,089 [myid:] - INFO  [main:WatchManagerFactory@42] - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
   2020-01-13 00:37:09,089 [myid:] - INFO  [main:WatchManagerFactory@42] - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
   Deserialized snapshot in snapshot.8.gz in 0.000734 seconds
   Processed data tree in 0.000140 seconds
   Node count: 4
   Total size: 0
   Max depth: 3
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 1
   Count of nodes at depth 3: 2
   
   Node count: 8
   Total size: 0
   Max depth: 4
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 2
   Count of nodes at depth 3: 4
   Count of nodes at depth 4: 1
   
   Analysis for depth 0
   Analysis for depth 1
   Analysis for depth 2
   Analysis for depth 3
   All layers compared.
   ```
   2. `snappy` vs `snappy`
   ```
   $ bin/zkSnapshotComparer.sh -l /zookeeper-data/backup/snapshot.0.snappy -r /zookeeper-data/backup/snapshot.d.snappy -b 100000 -n 100
   ...
   Deserialized snapshot in snapshot.d.snappy in 0.000593 seconds
   Processed data tree in 0.000211 seconds
   Node count: 4
   Total size: 0
   Max depth: 3
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 1
   Count of nodes at depth 3: 2
   
   Node count: 10
   Total size: 0
   Max depth: 4
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 2
   Count of nodes at depth 3: 4
   Count of nodes at depth 4: 3
   
   Analysis for depth 0
   Analysis for depth 1
   Analysis for depth 2
   Analysis for depth 3
   All layers compared.
   ```
   3. uncompressed vs `snappy`
   ```
   $ bin/zkSnapshotComparer.sh -l /zookeeper-data/backup/snapshot.0 -r /zookeeper-data/backup/snapshot.d.snappy -b 100000 -n 100
   ...
   Deserialized snapshot in snapshot.0 in 0.003973 seconds
   Processed data tree in 0.077579 seconds
   2020-01-13 00:41:19,566 [myid:] - INFO  [main:WatchManagerFactory@42] - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
   2020-01-13 00:41:19,566 [myid:] - INFO  [main:WatchManagerFactory@42] - Using org.apache.zookeeper.server.watch.WatchManager as watch manager
   Deserialized snapshot in snapshot.d.snappy in 0.032078 seconds
   Processed data tree in 0.000609 seconds
   Node count: 4
   Total size: 0
   Max depth: 3
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 1
   Count of nodes at depth 3: 2
   
   Node count: 10
   Total size: 0
   Max depth: 4
   Count of nodes at depth 1: 1
   Count of nodes at depth 2: 2
   Count of nodes at depth 3: 4
   Count of nodes at depth 4: 3
   
   Analysis for depth 0
   Analysis for depth 1
   Analysis for depth 2
   Analysis for depth 3
   All layers compared.
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [zookeeper] maoling commented on issue #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
maoling commented on issue #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219#issuecomment-573985308
 
 
   > and data trending analysis (e.g. what's growing under which zNode path during when).
   
   As I understand it, this tool should also tell users which specific znode paths were added, updated, or deleted between one snapshot and another.
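
One way to picture this request is a three-way classification of paths. The sketch below is hypothetical and not part of the PR: it assumes each snapshot has already been flattened into a map from znode path to data size, and it treats a changed size as an update, whereas a real implementation would more likely compare the data bytes or stat fields. The `PathDiffSketch` class and its `Change` enum are made-up names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: classify znode paths as added, deleted, or updated between two snapshots. */
final class PathDiffSketch {

    enum Change { ADDED, DELETED, UPDATED }

    /**
     * Compares two path-to-data-size maps. Data size stands in for node content here,
     * so an update that keeps the size unchanged would go unnoticed.
     */
    static Map<String, Change> diff(Map<String, Integer> left, Map<String, Integer> right) {
        Map<String, Change> changes = new TreeMap<>();
        // Paths present on the left are either deleted, updated, or unchanged.
        for (Map.Entry<String, Integer> entry : left.entrySet()) {
            Integer rightSize = right.get(entry.getKey());
            if (rightSize == null) {
                changes.put(entry.getKey(), Change.DELETED);
            } else if (!rightSize.equals(entry.getValue())) {
                changes.put(entry.getKey(), Change.UPDATED);
            }
        }
        // Paths present only on the right were added.
        for (String path : right.keySet()) {
            if (!left.containsKey(path)) {
                changes.put(path, Change.ADDED);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new TreeMap<>();
        left.put("/app/config", 10);
        left.put("/app/old", 4);
        Map<String, Integer> right = new TreeMap<>();
        right.put("/app/config", 12);
        right.put("/app/new", 7);
        diff(left, right).forEach((path, change) -> System.out.println(change + " " + path));
    }
}
```

Unchanged paths are left out of the result, so the output stays proportional to the amount of change rather than to the snapshot size.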


[GitHub] [zookeeper] maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219#discussion_r366138831
 
 

 ##########
 File path: zookeeper-docs/src/main/resources/markdown/zookeeperTools.md
 ##########
 @@ -205,6 +206,67 @@ USAGE: SnapshotFormatter [-d|-json] snapshot_file
 [[1,0,{"progname":"SnapshotFormatter.java","progver":"0.01","timestamp":1559788148637},[{"name":"\/","asize":0,"dsize":0,"dev":0,"ino":1001},[{"name":"zookeeper","asize":0,"dsize":0,"dev":0,"ino":1002},{"name":"config","asize":0,"dsize":0,"dev":0,"ino":1003},[{"name":"quota","asize":0,"dsize":0,"dev":0,"ino":1004},[{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1005},{"name":"zookeeper_limits","asize":52,"dsize":52,"dev":0,"ino":1006},{"name":"zookeeper_stats","asize":15,"dsize":15,"dev":0,"ino":1007}]]],{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1008}]]
 ```
 
+### zkSnapshotComparer.sh
 
 Review comment:
   `<a name="zkSnapshotComparer"></a>`


[GitHub] [zookeeper] mayawang closed pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
mayawang closed pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219
 
 
   


[GitHub] [zookeeper] mayawang commented on issue #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
mayawang commented on issue #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219#issuecomment-574018369
 
 
   @maoling Thank you very much for the review! Those are all very helpful comments. I just updated https://github.com/apache/zookeeper/pull/984 to preserve the previous discussion. I created this PR because I was blocked by a mvn compilation issue, which I have been able to work around for now.
   
   I will close this PR and address your comments in https://github.com/apache/zookeeper/pull/984. 


[GitHub] [zookeeper] maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219#discussion_r366135873
 
 

 ##########
 File path: zookeeper-docs/src/main/resources/markdown/zookeeperTools.md
 ##########
 @@ -205,6 +206,67 @@ USAGE: SnapshotFormatter [-d|-json] snapshot_file
 [[1,0,{"progname":"SnapshotFormatter.java","progver":"0.01","timestamp":1559788148637},[{"name":"\/","asize":0,"dsize":0,"dev":0,"ino":1001},[{"name":"zookeeper","asize":0,"dsize":0,"dev":0,"ino":1002},{"name":"config","asize":0,"dsize":0,"dev":0,"ino":1003},[{"name":"quota","asize":0,"dsize":0,"dev":0,"ino":1004},[{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1005},{"name":"zookeeper_limits","asize":52,"dsize":52,"dev":0,"ino":1006},{"name":"zookeeper_stats","asize":15,"dsize":15,"dev":0,"ino":1007}]]],{"name":"test","asize":0,"dsize":0,"dev":0,"ino":1008}]]
 ```
 
+### zkSnapshotComparer.sh
+SnapshotComparer is a tool that loads and compares two snapshots, with configurable threshold and various filters. It's useful in use cases that involve snapshot analysis, such as offline data consistency checking, and data trending analysis (e.g. what's growing under which zNode path during when).
+
+This tool only outputs information about permanent nodes, ignoring both sessions and ephemeral nodes.
+
+It provides two tuning parameters to help filter out noise: 
+1. `--nodes` Threshold number of children added/removed; 
+2. `--bytes` Threshold number of bytes added/removed.
+
+#### Locate Snapshots
+Snapshots can be found in [Zookeeper Data Directory](zookeeperAdmin.html#The+Data+Directory) which configured in [conf/zoo.cfg](zookeeperStarted.html#sc_InstallingSingleMode) when set up Zookeeper server. 
+
+#### Supported Snapshot Formats
+This tool supports uncompressed snapshot format, and compressed snapshot file formats: `snappy` and `gz`. Snapshots with different formats can be compared using this tool directly without decompression.
+
+#### Running the Tool
+Running the tool with no command line argument or an unrecognized argument, it outputs the following help page:
+```
+usage: java -cp <classPath> org.apache.zookeeper.server.SnapshotComparer
 
 Review comment:
   A space is needed before ``` to keep markdown happy. The same applies in other places.


[GitHub] [zookeeper] maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.

Posted by GitBox <gi...@apache.org>.
maoling commented on a change in pull request #1219: ZOOKEEPER-3427: Introduce SnapshotComparer that assists debugging with snapshots.
URL: https://github.com/apache/zookeeper/pull/1219#discussion_r366129666
 
 

 ##########
 File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/SnapshotComparer.java
 ##########
 @@ -0,0 +1,458 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.io.File;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Scanner;
+import java.util.zip.CheckedInputStream;
+import org.apache.commons.cli.BasicParser;
+import org.apache.commons.cli.CommandLine;
+import org.apache.commons.cli.HelpFormatter;
+import org.apache.commons.cli.OptionBuilder;
+import org.apache.commons.cli.Options;
+import org.apache.commons.cli.ParseException;
+import org.apache.jute.BinaryInputArchive;
+import org.apache.jute.InputArchive;
+import org.apache.zookeeper.server.persistence.FileSnap;
+import org.apache.zookeeper.server.persistence.SnapStream;
+
+/**
+ * SnapshotComparer is a tool that loads and compares two snapshots, with configurable threshold and various filters.
+ * It's useful in use cases that involve snapshot analysis, such as offline data consistency checking, and data trending analysis (e.g. what's growing under which zNode path during when).
+ * Only outputs information about permanent nodes, ignoring both sessions and ephemeral nodes.
+ */
+public class SnapshotComparer {
+  private final Options options;
+  private static final String leftOption = "left";
+  private static final String rightOption = "right";
+  private static final String byteThresholdOption = "bytes";
+  private static final String nodeThresholdOption = "nodes";
+  private static final String debugOption = "debug";
+  private static final String interactiveOption = "interactive";
+
+  @SuppressWarnings("static")
+  private SnapshotComparer() {
+    options = new Options();
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .isRequired(true)
+            .withLongOpt(leftOption)
+            .withDescription("(Required) The left snapshot file.")
+            .withArgName("LEFT")
+            .withType(File.class)
+            .create("l"));
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .isRequired(true)
+            .withLongOpt(rightOption)
+            .withDescription("(Required) The right snapshot file.")
+            .withArgName("RIGHT")
+            .withType(File.class)
+            .create("r"));
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .isRequired(true)
+            .withLongOpt(byteThresholdOption)
+            .withDescription("(Required) The node data delta size threshold, in bytes, for printing the node.")
+            .withArgName("BYTETHRESHOLD")
+            .withType(String.class)
+            .create("b"));
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .isRequired(true)
+            .withLongOpt(nodeThresholdOption)
+            .withDescription("(Required) The descendant node delta size threshold, in nodes, for printing the node.")
+            .withArgName("NODETHRESHOLD")
+            .withType(String.class)
+            .create("n"));
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .withLongOpt(debugOption)
+            .withDescription("Use debug output.")
+            .withArgName("DEBUG")
+            .withType(String.class)
+            .create("d"));
+    options.addOption(
+        OptionBuilder
+            .hasArg()
+            .withLongOpt(interactiveOption)
+            .withDescription("Enter interactive mode.")
+            .withArgName("INTERACTIVE")
+            .withType(String.class)
+            .create("i"));
+  }
+
+  private void usage() {
+    HelpFormatter help = new HelpFormatter();
+
+    help.printHelp(
+        120,
+        "java -cp <classPath> " + SnapshotComparer.class.getName(),
+        "",
+        options,
+        "");
+  }
+
+  public static void main(String[] args) throws Exception {
+    SnapshotComparer app = new SnapshotComparer();
+    app.compareSnapshots(args);
+  }
+
+  private void compareSnapshots(String[] args) throws Exception {
+    CommandLine parsedOptions;
+    try {
+      parsedOptions = new BasicParser().parse(options, args);
+    } catch (ParseException e) {
+      System.err.println(e.getMessage());
+      usage();
+      System.exit(-1);
+      return;
 
 Review comment:
   > [ERROR] org.apache.zookeeper.server.SnapshotComparer.compareSnapshots(String[]) invokes System.exit(...), which shuts down the entire virtual machine [org.apache.zookeeper.server.SnapshotComparer] At SnapshotComparer.java:[line 136] DM_EXIT
   
   Try `ServiceUtils.requestSystemExit` instead.
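
For context, the idea behind such a helper is to route exit requests through a replaceable procedure, so the tool no longer calls `System.exit` directly and tests can substitute a harmless handler. The following sketch is a self-contained stand-in that only illustrates the pattern: `ExitSketch`, `setExitProcedure`, `requestExit`, and `checkArgs` are hypothetical names, not the ZooKeeper `ServiceUtils` API, and the exit code is arbitrary.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in showing a replaceable exit procedure; this is not ZooKeeper's ServiceUtils. */
final class ExitSketch {

    /** By default an exit request really terminates the JVM. */
    private static volatile Consumer<Integer> exitProcedure = System::exit;

    /** Lets tests or embedding code replace what an exit request does. */
    static void setExitProcedure(Consumer<Integer> procedure) {
        exitProcedure = procedure;
    }

    /** Single place through which all exit requests flow. */
    static void requestExit(int code) {
        exitProcedure.accept(code);
    }

    /** Returns true when the arguments are usable; otherwise requests an exit and returns false. */
    static boolean checkArgs(String[] args) {
        if (args.length < 2) {
            System.err.println("expected two snapshot files");
            requestExit(1);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Swap in a recording procedure so this demo does not terminate the JVM.
        int[] recorded = {0};
        setExitProcedure(code -> recorded[0] = code);
        checkArgs(new String[0]);
        System.out.println("requested exit code: " + recorded[0]);
    }
}
```

Keeping the `return false` after the exit request matters in this pattern, because a substituted procedure returns normally and the caller must still stop processing.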
