You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/06/21 23:32:45 UTC

[GitHub] [hbase] bharathv opened a new pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

bharathv opened a new pull request #1945:
URL: https://github.com/apache/hbase/pull/1945


   Writing a test for this is tricky. There is enough coverage for
   functional tests. Only concern is performance, but there is enough
   logging for it to detect timed out/badly performing sync calls.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] sguggilam commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
sguggilam commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r443775794



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       Why can't we instead pass ctx as the last argument instead of NULL during the actual API call ? Just in case we want to use this field somewhere else in future to get the context in callback.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445695334



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -479,29 +502,15 @@ public ZNodePaths getZNodePaths() {
     return znodePaths;
   }
 
-  /**
-   * Method called from ZooKeeper for events and connection status.
-   * <p>
-   * Valid events are passed along to listeners.  Connection status changes
-   * are dealt with locally.
-   */
-  @Override
-  public void process(WatchedEvent event) {
-    LOG.debug(prefix("Received ZooKeeper Event, " +
-        "type=" + event.getType() + ", " +
-        "state=" + event.getState() + ", " +
-        "path=" + event.getPath()));
-
+  void processEvent(WatchedEvent event) {

Review comment:
       keep it private?

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -530,10 +539,26 @@ public void process(WatchedEvent event) {
         break;
       }
       default:
-        throw new IllegalStateException("Received event is not valid: " + event.getState());
+        LOG.error("Invalid event of type {} received for path {}. Ignoring.",

Review comment:
       Yeah, this looks better.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue
+  // and further prevents deadlocks if the process method itself makes other zookeeper calls.
+  // It is ok to do it in a single thread because the Zookeeper ClientCnxn already serializes the
+  // requests using a single while loop and hence there is no performance degradation.
+  private final ExecutorService zkEventProcessor =
+      Executors.newSingleThreadExecutor(Threads.getNamedThreadFactory("zk-event-processor"));

Review comment:
       Why not use `new ThreadFactoryBuilder()` and set name format?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649805385


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 43s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 58s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 37s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m  3s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 20s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 39s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 39s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 38s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 39s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 44s |  hbase-zookeeper in the patch passed.  |
   | +1 :green_heart: |  unit  | 208m 57s |  hbase-server in the patch passed.  |
   |  |   | 247m 53s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 50c23ba9ac38 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1378776a91 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/testReport/ |
   | Max. process+thread count | 3220 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-647431579


   Test failures are relevant, I just ran `TestVisibilityLabelsWithACL` locally and it is failing with the patch.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445736557



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -530,10 +539,26 @@ public void process(WatchedEvent event) {
         break;
       }
       default:
-        throw new IllegalStateException("Received event is not valid: " + event.getState());
+        LOG.error("Invalid event of type {} received for path {}. Ignoring.",
+            event.getState(), event.getPath());
     }
   }
 
+  /**
+   * Method called from ZooKeeper for events and connection status.
+   * <p>
+   * Valid events are passed along to listeners.  Connection status changes
+   * are dealt with locally.
+   */
+  @Override
+  public void process(WatchedEvent event) {
+    LOG.debug(prefix("Received ZooKeeper Event, " +
+        "type=" + event.getType() + ", " +
+        "state=" + event.getState() + ", " +
+        "path=" + event.getPath()));
+    zkEventProcessor.submit(() -> processEvent(event));

Review comment:
       Never mind, as long as it's understood this is a minor pain for backporting.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
bharathv commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445723003



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -479,29 +502,15 @@ public ZNodePaths getZNodePaths() {
     return znodePaths;
   }
 
-  /**
-   * Method called from ZooKeeper for events and connection status.
-   * <p>
-   * Valid events are passed along to listeners.  Connection status changes
-   * are dealt with locally.
-   */
-  @Override
-  public void process(WatchedEvent event) {
-    LOG.debug(prefix("Received ZooKeeper Event, " +
-        "type=" + event.getType() + ", " +
-        "state=" + event.getState() + ", " +
-        "path=" + event.getPath()));
-
+  void processEvent(WatchedEvent event) {

Review comment:
       Done.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue
+  // and further prevents deadlocks if the process method itself makes other zookeeper calls.
+  // It is ok to do it in a single thread because the Zookeeper ClientCnxn already serializes the
+  // requests using a single while loop and hence there is no performance degradation.
+  private final ExecutorService zkEventProcessor =
+      Executors.newSingleThreadExecutor(Threads.getNamedThreadFactory("zk-event-processor"));

Review comment:
       The reason is ThreadFactoryBuilder is from a third party library (guava). Usually I prefer to reuse the code we already have rather than pulling it from elsewhere. Also, making that util public doesn't matter much I guess, its a static utility anyway.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r443718162



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);
+    try {
+      if (!latch.await(zkSyncTimeout, TimeUnit.MILLISECONDS)) {
+        LOG.error("sync() operation to ZK timed out. Configured timeout: {}ms. This usually points "

Review comment:
       This should be WARN.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       This has to be backported to branch-1, which requires Java 7 compatible code, and there's no real need for a lambda here. We have guaranteed code divergence between the branches in exchange for perhaps some improved readability. Is it worth it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
bharathv commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r443737875



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       I kind of like the fact that the code is a bit concise (since there is no real logic in this lambda in this case) but I get the concern about readability. I don't have a strong preference though, I can switch it to an inline class.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);
+    try {
+      if (!latch.await(zkSyncTimeout, TimeUnit.MILLISECONDS)) {
+        LOG.error("sync() operation to ZK timed out. Configured timeout: {}ms. This usually points "

Review comment:
       Any particular reason this has to be WARN? This can potentially be a correctness issue at this point and hence I chose the higher log level.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       ok will do.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-647234057


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 33s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 41s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 44s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 30s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 36s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 36s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 31s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 22s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 43s |  hbase-zookeeper in the patch passed.  |
   | -1 :x: |  unit  | 143m 27s |  hbase-server in the patch failed.  |
   |  |   | 173m 32s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.11 Server=19.03.11 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 0a37442ae4ee 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8f1353b447 |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/testReport/ |
   | Max. process+thread count | 4682 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649731033


   Approved, assuming a clean precommit prior to merge.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445736122



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -530,10 +539,26 @@ public void process(WatchedEvent event) {
         break;
       }
       default:
-        throw new IllegalStateException("Received event is not valid: " + event.getState());
+        LOG.error("Invalid event of type {} received for path {}. Ignoring.",
+            event.getState(), event.getPath());
     }
   }
 
+  /**
+   * Method called from ZooKeeper for events and connection status.
+   * <p>
+   * Valid events are passed along to listeners.  Connection status changes
+   * are dealt with locally.
+   */
+  @Override
+  public void process(WatchedEvent event) {
+    LOG.debug(prefix("Received ZooKeeper Event, " +
+        "type=" + event.getType() + ", " +
+        "state=" + event.getState() + ", " +
+        "path=" + event.getPath()));
+    zkEventProcessor.submit(() -> processEvent(event));

Review comment:
       Do we need the lambda here? branch-1....




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r444414654



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);
+    try {
+      if (!latch.await(zkSyncTimeout, TimeUnit.MILLISECONDS)) {
+        LOG.error("sync() operation to ZK timed out. Configured timeout: {}ms. This usually points "

Review comment:
       WARN implies to me as an operator that something bad happened that is noteworthy but not an emergency. ERROR implies a serious problem that probably should cause someone to be paged.
   
   sync is going to occasionally time out. We should be handling such without correctness problems. ERROR logging doesn't substitute for handling the timeout properly. So, assuming we are handling timeouts correctly, WARN level logging is most appropriate. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r443368727



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);
+    try {
+      if (!latch.await(zkSyncTimeout, TimeUnit.MILLISECONDS)) {
+        LOG.error("sync() operation to ZK timed out. Configured timeout: {}ms. This usually points "

Review comment:
       IMHO `error` is fine but better to keep it `warn`. We anyways have `warn` for `InterruptedException` also.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       `RecoverableZooKeeper` is IA.Private. Can we remove last argument `Object ctx` as part of this PR? Anyways we pass null context to actual API.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445734505



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue
+  // and further prevents deadlocks if the process method itself makes other zookeeper calls.
+  // It is ok to do it in a single thread because the Zookeeper ClientCnxn already serializes the
+  // requests using a single while loop and hence there is no performance degradation.
+  private final ExecutorService zkEventProcessor =
+      Executors.newSingleThreadExecutor(Threads.getNamedThreadFactory("zk-event-processor"));

Review comment:
       We should definitely avoid dependencies on Guava. Hadoop, HBase, and Phoenix do not stay in sync (Hadoop 3 is up on 21 and that will be painful for the rest!) and Google regularly breaks compatibility among Guava major versions. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649193126


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 40s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 41s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   3m 11s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 25s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 39s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 13s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 37s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 39s |  The patch does not generate ASF License warnings.  |
   |  |   |  37m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux a49a55728512 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7fee4b5fb6 |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445742794



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue
+  // and further prevents deadlocks if the process method itself makes other zookeeper calls.
+  // It is ok to do it in a single thread because the Zookeeper ClientCnxn already serializes the
+  // requests using a single while loop and hence there is no performance degradation.
+  private final ExecutorService zkEventProcessor =
+      Executors.newSingleThreadExecutor(Threads.getNamedThreadFactory("zk-event-processor"));

Review comment:
       Okk that makes sense




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv merged pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
bharathv merged pull request #1945:
URL: https://github.com/apache/hbase/pull/1945


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649707495


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 32s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 29s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 44s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 21s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 40s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m  6s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 46s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 36s |  The patch does not generate ASF License warnings.  |
   |  |   |  37m 54s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 9858995e84aa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1378776a91 |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r444415303



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       Every Java 8 ism makes a backport to branch-1 that more painful. If you don't need to do it, please don't. Unless the change is not intended for branch-1 then who cares. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649766401


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 16s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 51s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 58s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 20s |  hbase-common in master failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in master failed.  |
   | -0 :warning: |  javadoc  |   0m 17s |  hbase-zookeeper in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 48s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 48s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 48s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-common in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 18s |  hbase-zookeeper in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 24s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 45s |  hbase-zookeeper in the patch passed.  |
   | +1 :green_heart: |  unit  | 129m 35s |  hbase-server in the patch passed.  |
   |  |   | 161m 51s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 610148e44d7c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1378776a91 |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-zookeeper.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-zookeeper.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/testReport/ |
   | Max. process+thread count | 3765 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-649182536






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] apurtell commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
apurtell commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445735442



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue

Review comment:
       I see, this was at the root of the test failures. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445715522



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -79,8 +86,22 @@
   // listeners to be notified
   private final List<ZKListener> listeners = new CopyOnWriteArrayList<>();
 
+  // Single threaded executor pool that processes event notifications from Zookeeper. Events are
+  // processed in the order in which they arrive (pool backed by an unbounded fifo queue). We do
+  // this to decouple the event processing from Zookeeper's ClientCnxn's EventThread context.
+  // EventThread internally runs a single while loop to serially process all the events. When events
+  // are processed by the listeners in the same thread, that blocks the EventThread from processing
+  // subsequent events. Processing events in a separate thread frees up the event thread to continue
+  // and further prevents deadlocks if the process method itself makes other zookeeper calls.
+  // It is ok to do it in a single thread because the Zookeeper ClientCnxn already serializes the
+  // requests using a single while loop and hence there is no performance degradation.
+  private final ExecutorService zkEventProcessor =
+      Executors.newSingleThreadExecutor(Threads.getNamedThreadFactory("zk-event-processor"));

Review comment:
       And also we can keep `Threads.getNamedThreadFactory()` private(unmodified) with this change




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-647201906


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 33s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 39s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 40s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 27s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 37s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 40s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   4m 28s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 37s |  The patch does not generate ASF License warnings.  |
   |  |   |  40m 54s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.11 Server=19.03.11 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux bb2ba8b247e7 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8f1353b447 |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#issuecomment-647232060


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 28s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m  4s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 49s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 46s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-common in master failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in master failed.  |
   | -0 :warning: |  javadoc  |   0m 17s |  hbase-zookeeper in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 58s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 57s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 57s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 42s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 17s |  hbase-common in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 17s |  hbase-zookeeper in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 35s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 43s |  hbase-zookeeper in the patch passed.  |
   | -1 :x: |  unit  | 134m 57s |  hbase-server in the patch failed.  |
   |  |   | 166m 46s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.11 Server=19.03.11 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1945 |
   | JIRA Issue | HBASE-24603 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 3e6c6571357a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8f1353b447 |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-zookeeper.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-zookeeper.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/testReport/ |
   | Max. process+thread count | 4132 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-zookeeper hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1945/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv commented on a change in pull request #1945: HBASE-24603: Make Zookeeper sync() call synchronous

Posted by GitBox <gi...@apache.org>.
bharathv commented on a change in pull request #1945:
URL: https://github.com/apache/hbase/pull/1945#discussion_r445276172



##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       That sounds better Sandeep. Fixed it. (I don't have a strong preference either way)

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);
+    try {
+      if (!latch.await(zkSyncTimeout, TimeUnit.MILLISECONDS)) {
+        LOG.error("sync() operation to ZK timed out. Configured timeout: {}ms. This usually points "

Review comment:
       Fair point. Changed.

##########
File path: hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKWatcher.java
##########
@@ -595,9 +604,28 @@ private void connectionEvent(WatchedEvent event) {
    * data of an existing node and delete or transition that node, utilizing the
    * previously read version and data.  We want to ensure that the version read
    * is up-to-date from when we begin the operation.
+   * <p>
    */
-  public void sync(String path) throws KeeperException {
-    this.recoverableZooKeeper.sync(path, null, null);
+  public void syncOrTimeout(String path) throws KeeperException {
+    final CountDownLatch latch = new CountDownLatch(1);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    this.recoverableZooKeeper.sync(path, (i, s, o) -> latch.countDown(), null);

Review comment:
       Ya, this is meant to be back ported to branch-1, but I'll take care of it with a small rewrite, hope thats okay.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org