Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/03/19 18:25:08 UTC

[GitHub] [hbase] saintstack opened a new pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

saintstack opened a new pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311
 
 
   …rdown

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601352676
 
 
   Updated the changeset comment:
   ```
   
   
       HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in teardown
   
       hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        Change parameter name and add javadoc to make it more clear what the
        param actually is.
   
       hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
        Move postOpenDeployTasks so if it fails to talk to the Master -- which
        can happen on cluster shutdown -- then we will do cleanup of state;
        without this the RS can get stuck and won't go down.
   
       hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
        Add handleException so CRH looks more like UnassignRegionHandler and
        AssignRegionHandler around exception handling. Add a bit of doc on
        why CRH.
   
       hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
        Indent most of the body of process so we can add a finally
        that cleans up rs.getRegionsInTransitionInRS on exception
        (otherwise outstanding entries can stop an RS going down on cluster
        shutdown)
   ```
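The cleanup-in-finally idea from the changeset comment can be sketched in isolation. This is a minimal illustration, not the patch itself: `InTransitionCleanupSketch`, `CloseStep`, and the `String`-keyed map are hypothetical stand-ins (the real code works against `rs.getRegionsInTransitionInRS()`), and the sketch registers the marker itself only so that it is self-contained.

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class InTransitionCleanupSketch {
  // Hypothetical stand-in for the RS-side in-transition map:
  // region name -> TRUE (opening) or FALSE (closing).
  static final ConcurrentMap<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();

  /** Hypothetical unit of close work; it may throw, e.g. when the Master is unreachable. */
  interface CloseStep {
    void run() throws Exception;
  }

  static void processClose(String encodedName, CloseStep step) throws Exception {
    // Mark the region as closing (done here only to keep the sketch self-contained).
    regionsInTransition.put(encodedName, Boolean.FALSE);
    try {
      step.run();
    } finally {
      // Runs on success and on exception, so no stale entry is left behind.
      // The two-argument remove only clears our own "closing" marker.
      regionsInTransition.remove(encodedName, Boolean.FALSE);
    }
  }

  public static void main(String[] args) {
    try {
      processClose("abc123", () -> {
        throw new IOException("master unreachable");
      });
    } catch (Exception e) {
      System.out.println("close failed: " + e.getMessage());
    }
    System.out.println("still in transition: " + regionsInTransition.containsKey("abc123"));
  }
}
```

Without the `finally`, the failing step would leave the entry in the map, which is the kind of outstanding state the comment says can keep an RS from going down.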


[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601422712
 
 
   Thanks @ndimiduk. Will wait to see if this patch looks good to @Apache9.


[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601557357
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 37s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 30s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  branch-2 passed  |
   | -1 :x: |  shadedjars  |   0m 12s |  branch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 11s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  6s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  6s |  the patch passed  |
   | -1 :x: |  shadedjars  |   0m 11s |  patch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 37s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -0 :warning: |  unit  |  68m 26s |  hbase-server in the patch failed.  |
   |  |   |  87m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 18256abb643f 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Default Java | 2020-01-14 |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/branch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/testReport/ |
   | Max. process+thread count | 5611 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601410578
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 43s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 50s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   4m 50s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 18s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  7s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 55s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 106m 15s |  hbase-server in the patch failed.  |
   |  |   | 135m 54s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 1e3dc7b68916 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/testReport/ |
   | Max. process+thread count | 5504 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601867415
 
 
   One of the test failures -- a perversion around Region handling in TestRegionObserverInterface -- exposed an issue w/ the CloseRegionHandler refactor, which was trying to make it look like the other handlers around regionsInTransitionInRS handling. Fixed (and fixed the issue @Apache9 noted above). Added region name logging to this journal stuff -- otherwise it's just opaque... that's just a log change.


[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601558645
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 50s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m  6s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   4m 39s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 58s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 58s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   4m 27s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |  67m 34s |  hbase-server in the patch failed.  |
   |  |   |  94m 32s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 5de009b95d02 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/testReport/ |
   | Max. process+thread count | 5609 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] saintstack commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395325828
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 ##########
 @@ -133,4 +123,10 @@ public void process() {
         remove(this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);
     }
   }
+
+  @Override protected void handleException(Throwable t) {
 
 Review comment:
   Yes, inconsistently used. Here trying to keep w/ the herd.


[GitHub] [hbase] Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395452954
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 ##########
 @@ -92,45 +91,41 @@ public RegionInfo getRegionInfo() {
   }
 
   @Override
-  public void process() {
-    try {
-      String name = regionInfo.getEncodedName();
-      LOG.trace("Processing close of {}", name);
-      String encodedRegionName = regionInfo.getEncodedName();
-      // Check that this region is being served here
-      HRegion region = (HRegion)rsServices.getRegion(encodedRegionName);
-      if (region == null) {
-        LOG.warn("Received CLOSE for region {} but currently not serving - ignoring", name);
-        // TODO: do better than a simple warning
-        return;
-      }
+  public void process() throws IOException {
+    String name = regionInfo.getEncodedName();
+    LOG.trace("Processing close of {}", name);
+    String encodedRegionName = regionInfo.getEncodedName();
+    // Check that this region is being served here
+    HRegion region = (HRegion)rsServices.getRegion(encodedRegionName);
+    if (region == null) {
+      LOG.warn("Received CLOSE for region {} but currently not serving - ignoring", name);
+      // TODO: do better than a simple warning
+      return;
+    }
+
+    // Close the region
+    if (region.close(abort) == null) {
+      // This region got closed.  Most likely due to a split.
+      // The split message will clean up the master state.
+      LOG.warn("Can't close region {}, was already closed during close()", name);
+      return;
+    }
 
-      // Close the region
-      try {
-        if (region.close(abort) == null) {
-          // This region got closed.  Most likely due to a split.
-          // The split message will clean up the master state.
-          LOG.warn("Can't close region {}, was already closed during close()", name);
-          return;
-        }
-      } catch (IOException ioe) {
-        // An IOException here indicates that we couldn't successfully flush the
-        // memstore before closing. So, we need to abort the server and allow
-        // the master to split our logs in order to recover the data.
-        server.abort("Unrecoverable exception while closing region " +
-          regionInfo.getRegionNameAsString() + ", still finishing close", ioe);
-        throw new RuntimeException(ioe);
-      }
+    this.rsServices.removeRegion(region, destination);
+    rsServices.reportRegionStateTransition(new RegionStateTransitionContext(TransitionCode.CLOSED,
+      HConstants.NO_SEQNUM, Procedure.NO_PROC_ID, -1, regionInfo));
 
-      this.rsServices.removeRegion(region, destination);
-      rsServices.reportRegionStateTransition(new RegionStateTransitionContext(TransitionCode.CLOSED,
-        HConstants.NO_SEQNUM, Procedure.NO_PROC_ID, -1, regionInfo));
+    // Done!  Region is closed on this RS
+    this.rsServices.getRegionsInTransitionInRS().
+      remove(this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);
+    LOG.debug("Closed " + region.getRegionInfo().getRegionNameAsString());
 
 Review comment:
   LOG.debug("Closed {}", region.getRegionInfo().getRegionNameAsString());


[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601406352
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 49s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 36s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  branch-2 passed  |
   | -1 :x: |  shadedjars  |   0m 10s |  branch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 42s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  6s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  6s |  the patch passed  |
   | -1 :x: |  shadedjars  |   0m 10s |  patch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -0 :warning: |  unit  | 105m 39s |  hbase-server in the patch failed.  |
   |  |   | 125m 49s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux eda6b4b0615f 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Default Java | 2020-01-14 |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/branch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/testReport/ |
   | Max. process+thread count | 5729 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601369907
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 25s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 15s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m 11s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 35s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 18s |  hbase-server: The patch generated 0 new + 53 unchanged - 4 fixed = 53 total (was 57)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  17m 44s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  spotbugs  |   2m 20s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 13s |  The patch does not generate ASF License warnings.  |
   |  |   |  45m 39s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 5f26a9ea0127 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Max. process+thread count | 83 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/1/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395316718
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 ##########
 @@ -106,20 +105,11 @@ public void process() {
       }
 
       // Close the region
-      try {
-        if (region.close(abort) == null) {
-          // This region got closed.  Most likely due to a split.
-          // The split message will clean up the master state.
-          LOG.warn("Can't close region {}, was already closed during close()", name);
-          return;
-        }
-      } catch (IOException ioe) {
-        // An IOException here indicates that we couldn't successfully flush the
-        // memstore before closing. So, we need to abort the server and allow
-        // the master to split our logs in order to recover the data.
-        server.abort("Unrecoverable exception while closing region " +
-          regionInfo.getRegionNameAsString() + ", still finishing close", ioe);
-        throw new RuntimeException(ioe);
 
 Review comment:
   After reading the above comment and seeing you discarded the throwing of this exception, I initially choked. But reading through the actual use of these `Handlers` in the `ExecutorService` instance hanging off of `HRegionServer` and `HMaster`, I can only conclude that the above throw was only wishful thinking. There's even a [comment](https://github.com/apache/hbase/blob/678b142da2b75ab6697125bbfdd33e32851650bf/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1381-L1387) (emphasis mine):
   
   > Start up all services. **If any of these threads gets an unhandled exception**
   > **then they just die with a logged message.**  This should be fine because
   > in general, we do not expect the master to get such unhandled exceptions
   > as OOMEs; it should be lightly loaded. See what HRegionServer does if
   > need to install an unexpected exception handler.
   
   The author of the above comment speaks wistfully of what I can only assume is [`HRegionServer#uncaughtExceptionHandler`](https://github.com/apache/hbase/blob/8e26761fd01408471a25d481afed64faf34f7574/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L625-L626). However, it doesn't appear that this is threaded down into the executor service, which means this line's `throw` statement is simply logged and ignored.
   
   So yes, I think removing the `throw` is the right choice. It removes the false sense of handling this error condition correctly. It's really the `abort` that protects the content of the memstore.
   
   Also, why is there not a named exception thrown by the memstore when it cannot flush? Seems like a useful point in that data structure's API.
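The "logged and ignored" behavior described in this review comment can be reproduced with plain `java.util.concurrent`. The sketch below is an assumption-laden illustration, not HBase code: `SketchHandler` is a hypothetical base class whose `run()` catches everything thrown by `process()` and hands it to `handleException`, which is the shape the comment implies. Under that assumption, an uncaught-exception handler installed on the pool thread never fires.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class SwallowedThrowSketch {
  /** Hypothetical handler base class for illustration; NOT HBase's EventHandler. */
  abstract static class SketchHandler implements Runnable {
    final AtomicReference<Throwable> handled = new AtomicReference<>();

    abstract void process() throws Exception;

    // Stands in for "log the exception"; here it is only recorded.
    void handleException(Throwable t) {
      handled.set(t);
    }

    @Override
    public final void run() {
      try {
        process();
      } catch (Throwable t) {
        // The throw stops here and never reaches the thread's uncaught handler.
        handleException(t);
      }
    }
  }

  /** Runs one handler on a pool thread and returns how often the uncaught handler fired. */
  static int uncaughtCallsFor(SketchHandler handler) throws InterruptedException {
    AtomicInteger uncaught = new AtomicInteger();
    ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "sketch-handler");
      t.setUncaughtExceptionHandler((thread, e) -> uncaught.incrementAndGet());
      return t;
    });
    pool.execute(handler);
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    return uncaught.get();
  }

  static SketchHandler failingHandler() {
    return new SketchHandler() {
      @Override
      void process() {
        throw new RuntimeException("could not flush memstore");
      }
    };
  }

  public static void main(String[] args) throws InterruptedException {
    SketchHandler h = failingHandler();
    System.out.println("uncaught handler calls: " + uncaughtCallsFor(h));
    System.out.println("seen by handleException: " + h.handled.get().getMessage());
  }
}
```

In this shape the only thing a `throw` from `process()` achieves is a call to `handleException`, so any real protection has to come from an explicit action such as the `abort` mentioned above.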


[GitHub] [hbase] saintstack merged pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack merged pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311
 
 
   


[GitHub] [hbase] Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395407484
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
 ##########
 @@ -94,40 +93,42 @@ public void process() throws IOException {
       }
       return;
     }
-    HRegion region = rs.getRegion(encodedName);
-    if (region == null) {
-      LOG.debug(
-        "Received CLOSE for a region {} which is not online, and we're not opening/closing.",
-        encodedName);
-      rs.getRegionsInTransitionInRS().remove(encodedNameBytes, Boolean.FALSE);
-      return;
-    }
-    String regionName = region.getRegionInfo().getEncodedName();
-    LOG.info("Close {}", regionName);
-    if (region.getCoprocessorHost() != null) {
-      // XXX: The behavior is a bit broken. At master side there is no FAILED_CLOSE state, so if
-      // there are exception thrown from the CP, we can not report the error to master, and if here
-      // we just return without calling reportRegionStateTransition, the TRSP at master side will
-      // hang there for ever. So here if the CP throws an exception out, the only way is to abort
-      // the RS...
-      region.getCoprocessorHost().preClose(abort);
-    }
-    if (region.close(abort) == null) {
-      // XXX: Is this still possible? The old comment says about split, but now split is done at
-      // master side, so...
-      LOG.warn("Can't close region {}, was already closed during close()", regionName);
+    try {
+      HRegion region = rs.getRegion(encodedName);
+      if (region == null) {
+        LOG.debug(
+          "Received CLOSE for a region {} which is not online, and we're not opening/closing.",
+          encodedName);
+        return;
+      }
+      String regionName = region.getRegionInfo().getEncodedName();
+      LOG.info("Close {}", regionName);
+      if (region.getCoprocessorHost() != null) {
+        // XXX: The behavior is a bit broken. At master side there is no FAILED_CLOSE state, so if
+        // there are exception thrown from the CP, we can not report the error to master, and if
+        // here we just return without calling reportRegionStateTransition, the TRSP at master side
+        // will hang there for ever. So here if the CP throws an exception out, the only way is to
+        // abort the RS...
+        region.getCoprocessorHost().preClose(abort);
+      }
+      if (region.close(abort) == null) {
+        // XXX: Is this still possible? The old comment says about split, but now split is done at
+        // master side, so...
+        LOG.warn("Can't close region {}, was already closed during close()", regionName);
+        return;
+      }
+      rs.removeRegion(region, destination);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.CLOSED, HConstants.NO_SEQNUM, closeProcId,
+          -1, region.getRegionInfo()))) {
+        throw new IOException("Failed to report close to master: " + regionName);
+      }
+      // Cache the close region procedure id after report region transition succeed.
+      rs.finishRegionProcedure(closeProcId);
+      LOG.info("Closed {}", regionName);
+    } finally {
 
 Review comment:
   So this is the actual fix here?
   
   If you really want to do this to let the test pass, I suggest you add the removal in the handleException method, and add a FIXME or TODO comment saying that this is just to make the test pass and should be addressed later.

----------------------------------------------------------------

[GitHub] [hbase] ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395309129
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 ##########
 @@ -38,9 +37,13 @@
 /**
  * Handles closing of a region on a region server.
  * <p/>
- * Now for regular close region request, we will use {@link UnassignRegionHandler} instead. But when
- * shutting down the region server, will also close regions and the related methods still use this
- * class so we keep it here.
+ * In normal operation, we use {@link UnassignRegionHandler} closing Regions but when shutting down
+ * the region server and closing out Regions, we use this handler instead; it does not expect to
+ * be able to communicate the close back to the Master.
+ * <p>Expects that the close has been registered in the hosting RegionServer before
+ * submitting this Handler; i.e. <code>rss.getRegionsInTransitionInRS().putIfAbsent(
+ * this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);</code> has been called first.
+ * In here when done, we do the deregister.</p>
 
 Review comment:
   helpful observation.

----------------------------------------------------------------

[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601916982
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 25s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m  8s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   4m 57s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 44s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   4m 58s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 101m  0s |  hbase-server in the patch failed.  |
   |  |   | 129m  6s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 3e382637ced1 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 8320f73c8c |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/testReport/ |
   | Max. process+thread count | 4172 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------

[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601535096
 
 
   bq. If you really want to do this to let the test pass, I suggest you add the removal in the handleException method, and add a FIXME or TODO comment saying that this is just to make the test pass and should be addressed later.
   
   I can move it to handleException, np. I will NOT note that it is a UT fix only. There is an obvious hole here that holds up shutdowns, and shutdowns are not UT only.
   
   These Handlers strike me as arbitrary as regards where stuff goes; no wonder there are holes.
   
   Let me put up another patch w/ your suggestions.

----------------------------------------------------------------

[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601551244
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 56s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   7m 49s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 26s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m 40s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 48s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 27s |  hbase-server: The patch generated 0 new + 53 unchanged - 4 fixed = 53 total (was 57)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  22m 19s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  spotbugs  |   2m 27s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 13s |  The patch does not generate ASF License warnings.  |
   |  |   |  56m  6s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux c521f1c30def 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ffb2359146 |
   | Max. process+thread count | 83 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/2/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------

[GitHub] [hbase] ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395312284
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 ##########
 @@ -133,4 +123,10 @@ public void process() {
         remove(this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);
     }
   }
+
+  @Override protected void handleException(Throwable t) {
 
 Review comment:
   Maybe it's just because I'm new to the *Handler code, but it's not clear to me why one would handle exceptions locally vs. handle them from this `handleException` method. I guess it's all hooks for operating within the confines of a `Runnable` off on a thread pool somewhere.

----------------------------------------------------------------

[GitHub] [hbase] ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395307356
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 ##########
 @@ -132,11 +131,11 @@ public void process() throws IOException {
       // opening can not be interrupted by a close request any more.
       region = HRegion.openHRegion(regionInfo, htd, rs.getWAL(regionInfo), rs.getConfiguration(),
         rs, null);
+      rs.postOpenDeployTasks(new PostOpenDeployContext(region, openProcId, masterSystemTime));
 
 Review comment:
   Yikes! Yeah, this seems better here. Good.

----------------------------------------------------------------

[GitHub] [hbase] ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
ndimiduk commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395318061
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
 ##########
 @@ -94,40 +93,42 @@ public void process() throws IOException {
       }
       return;
     }
-    HRegion region = rs.getRegion(encodedName);
-    if (region == null) {
-      LOG.debug(
-        "Received CLOSE for a region {} which is not online, and we're not opening/closing.",
-        encodedName);
-      rs.getRegionsInTransitionInRS().remove(encodedNameBytes, Boolean.FALSE);
-      return;
-    }
-    String regionName = region.getRegionInfo().getEncodedName();
-    LOG.info("Close {}", regionName);
-    if (region.getCoprocessorHost() != null) {
-      // XXX: The behavior is a bit broken. At master side there is no FAILED_CLOSE state, so if
-      // there are exception thrown from the CP, we can not report the error to master, and if here
-      // we just return without calling reportRegionStateTransition, the TRSP at master side will
-      // hang there for ever. So here if the CP throws an exception out, the only way is to abort
-      // the RS...
-      region.getCoprocessorHost().preClose(abort);
-    }
-    if (region.close(abort) == null) {
-      // XXX: Is this still possible? The old comment says about split, but now split is done at
-      // master side, so...
-      LOG.warn("Can't close region {}, was already closed during close()", regionName);
+    try {
+      HRegion region = rs.getRegion(encodedName);
+      if (region == null) {
+        LOG.debug(
+          "Received CLOSE for a region {} which is not online, and we're not opening/closing.",
+          encodedName);
+        return;
+      }
+      String regionName = region.getRegionInfo().getEncodedName();
+      LOG.info("Close {}", regionName);
+      if (region.getCoprocessorHost() != null) {
+        // XXX: The behavior is a bit broken. At master side there is no FAILED_CLOSE state, so if
+        // there are exception thrown from the CP, we can not report the error to master, and if
+        // here we just return without calling reportRegionStateTransition, the TRSP at master side
+        // will hang there for ever. So here if the CP throws an exception out, the only way is to
+        // abort the RS...
+        region.getCoprocessorHost().preClose(abort);
+      }
+      if (region.close(abort) == null) {
+        // XXX: Is this still possible? The old comment says about split, but now split is done at
+        // master side, so...
+        LOG.warn("Can't close region {}, was already closed during close()", regionName);
+        return;
+      }
+      rs.removeRegion(region, destination);
+      if (!rs.reportRegionStateTransition(
+        new RegionStateTransitionContext(TransitionCode.CLOSED, HConstants.NO_SEQNUM, closeProcId,
+          -1, region.getRegionInfo()))) {
+        throw new IOException("Failed to report close to master: " + regionName);
+      }
+      // Cache the close region procedure id after report region transition succeed.
+      rs.finishRegionProcedure(closeProcId);
+      LOG.info("Closed {}", regionName);
+    } finally {
       rs.getRegionsInTransitionInRS().remove(encodedNameBytes, Boolean.FALSE);
 
 Review comment:
   good.

----------------------------------------------------------------

[GitHub] [hbase] saintstack commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395434478
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 ##########
 @@ -132,11 +131,11 @@ public void process() throws IOException {
       // opening can not be interrupted by a close request any more.
       region = HRegion.openHRegion(regionInfo, htd, rs.getWAL(regionInfo), rs.getConfiguration(),
         rs, null);
+      rs.postOpenDeployTasks(new PostOpenDeployContext(region, openProcId, masterSystemTime));
 
 Review comment:
   bq. IIRC, the design here is that, postOpenDeployTasks is the PONR, if we arrive here, then we can not revert back, the only way to address the exception is to abort the region server.
   
   Ok. That helps. Let me add the above as a comment, ensure the above happens, and get my fix in.

----------------------------------------------------------------

[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601540785
 
 
   New push. It benefits from @Apache9's feedback. The main change is restoring these Handlers to how they were (but with the PONR comment added) and then, in handleException, just removing the entry from the RS RIT map right before the call to abort. Gets me what I want and leaves the rest of the code as it was.
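
The shape of that change can be sketched roughly as follows. This is a self-contained illustration, not the patch itself: `FakeRegionServer`, `Handler`, `CloseHandler`, and the `String`-keyed map are hypothetical stand-ins (the real handlers key the map by encoded-name bytes and call into `HRegionServer`), and the assumption that the base class routes any `Throwable` from `process()` into `handleException` is mine.

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RitCleanupSketch {

  /** Hypothetical stand-in for the region server: a RIT map plus an abort flag. */
  static class FakeRegionServer {
    final ConcurrentMap<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();
    volatile boolean aborted = false;

    void abort(String why, Throwable cause) {
      aborted = true;
    }
  }

  /** Assumed base-class contract: failures in process() are routed to handleException(). */
  abstract static class Handler implements Runnable {
    protected abstract void process() throws IOException;

    protected abstract void handleException(Throwable t);

    @Override
    public final void run() {
      try {
        process();
      } catch (Throwable t) {
        handleException(t);
      }
    }
  }

  static class CloseHandler extends Handler {
    private final FakeRegionServer rs;
    private final String encodedName;

    CloseHandler(FakeRegionServer rs, String encodedName) {
      this.rs = rs;
      this.encodedName = encodedName;
    }

    @Override
    protected void process() throws IOException {
      // Simulate the Master being unreachable during cluster shutdown.
      throw new IOException("Failed to report close to master: " + encodedName);
    }

    @Override
    protected void handleException(Throwable t) {
      // Deregister first, so a stale RIT entry cannot hold up shutdown, then abort.
      rs.regionsInTransition.remove(encodedName, Boolean.FALSE);
      rs.abort("Failed to close " + encodedName, t);
    }
  }

  /** Runs the failing handler and reports whether the entry was removed and abort was called. */
  public static boolean cleanedUpAndAborted() {
    FakeRegionServer rs = new FakeRegionServer();
    rs.regionsInTransition.putIfAbsent("abc123", Boolean.FALSE);
    new CloseHandler(rs, "abc123").run();
    return rs.aborted && !rs.regionsInTransition.containsKey("abc123");
  }

  public static void main(String[] args) {
    System.out.println("cleaned up and aborted: " + cleanedUpAndAborted());
  }
}
```

The two-argument `remove(key, value)` only deletes the entry if it still maps to `Boolean.FALSE`, so in this sketch the cleanup would not clobber an entry that some other path had since replaced with a different value.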

----------------------------------------------------------------

[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601936046
 
 
   Test report shows no failures. The test output got dropped because of the error below. I've been running this in tests locally and it seems fine. Will merge and keep an eye on it.
   
   Post stage
   [Pipeline] junit
   [2020-03-20T21:24:40.217Z] Recording test results
   [2020-03-20T21:24:43.747Z] Remote call on H1 failed
   Error when executing always post condition:
   java.io.IOException: Remote call on H1 failed
   	at hudson.remoting.Channel.call(Channel.java:963)
   	at hudson.FilePath.act(FilePath.java:1072)
   	at hudson.FilePath.act(FilePath.java:1061)
   	at hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:114)
   	at hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:137)
   	at hudson.tasks.junit.JUnitResultArchiver.parseAndAttach(JUnitResultArchiver.java:167)
   	at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:52)
   	at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:25)
   	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins
   	at hudson.ExtensionList.lookup(ExtensionList.java:433)
   	at hudson.tasks.junit.TestNameTransformer.all(TestNameTransformer.java:40)
   	at hudson.tasks.junit.TestNameTransformer.getTransformedName(TestNameTransformer.java:33)
   	at hudson.tasks.junit.CaseResult.getTransformedTestName(CaseResult.java:273)
   	at hudson.tasks.junit.SuiteResult.casesByName(SuiteResult.java:134)
   	at hudson.tasks.junit.SuiteResult.addCase(SuiteResult.java:297)
   	at hudson.tasks.junit.SuiteResult.<init>(SuiteResult.java:270)
   	at hudson.tasks.junit.SuiteResult.parseSuite(SuiteResult.java:209)
   	at hudson.tasks.junit.SuiteResult.parse(SuiteResult.java:181)
   	at hudson.tasks.junit.TestResult.parse(TestResult.java:348)
   	at hudson.tasks.junit.TestResult.parsePossiblyEmpty(TestResult.java:281)
   	at hudson.tasks.junit.TestResult.parse(TestResult.java:206)
   	at hudson.tasks.junit.TestResult.parse(TestResult.java:178)
   	at hudson.tasks.junit.TestResult.<init>(TestResult.java:143)
   	at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:146)
   	at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:118)
   	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:3052)
   	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
   	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
   	at hudson.remoting.Request$2.run(Request.java:369)
   	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
   	... 4 more
   	Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to H1
   		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
   		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
   		at hudson.remoting.Channel.call(Channel.java:957)
   		at hudson.FilePath.act(FilePath.java:1072)
   		at hudson.FilePath.act(FilePath.java:1061)
   		at hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:114)
   		at hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:137)
   		at hudson.tasks.junit.JUnitResultArchiver.parseAndAttach(JUnitResultArchiver.java:167)
   		at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:52)
   		at hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:25)
   		at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
   		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   		... 4 more
   

----------------------------------------------------------------

[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601884588
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 18s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m  9s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 43s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 23s |  hbase-server: The patch generated 0 new + 259 unchanged - 4 fixed = 259 total (was 263)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 54s |  Patch does not cause any errors with Hadoop 2.10.0 or 3.1.2.  |
   | +1 :green_heart: |  spotbugs  |   2m 18s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 14s |  The patch does not generate ASF License warnings.  |
   |  |   |  40m 14s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 0053669810d7 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 8320f73c8c |
   | Max. process+thread count | 83 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------

[GitHub] [hbase] Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601911675
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 23s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  8s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  10m 43s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 51s |  branch-2 passed  |
   | -1 :x: |  shadedjars  |   0m 13s |  branch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   1m  2s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   9m 49s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 46s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 46s |  the patch passed  |
   | -1 :x: |  shadedjars  |   0m 24s |  patch has 7 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   1m  9s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  82m 49s |  hbase-server in the patch passed.  |
   |  |   | 113m 27s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/1311 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 9bf12a52404a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 8320f73c8c |
   | Default Java | 2020-01-14 |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk11-hadoop3-check/output/branch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk11-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/testReport/ |
   | Max. process+thread count | 5634 (vs. ulimit of 10000) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1311/3/console |
   | versions | git=2.17.1 maven=2018-06-17T18:33:14Z) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hbase] Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on a change in pull request #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#discussion_r395406458
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
 ##########
 @@ -132,11 +131,11 @@ public void process() throws IOException {
       // opening can not be interrupted by a close request any more.
       region = HRegion.openHRegion(regionInfo, htd, rs.getWAL(regionInfo), rs.getConfiguration(),
         rs, null);
+      rs.postOpenDeployTasks(new PostOpenDeployContext(region, openProcId, masterSystemTime));
 
 Review comment:
   No...
   
   IIRC, the design here is that postOpenDeployTasks is the PONR (point of no return): once we arrive here we cannot revert, and the only way to handle an exception is to abort the region server.
   
   If we haven't told the master anything yet, it is fine for us to close the region and report the failure to the master. But once we have called the master with the success message, even if the RPC call fails, we do not know whether the other side (the master) has already received and processed the request. So the only option is to retry forever, and if that cannot be done, to abort ourselves...
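
The rule in the review comment above can be illustrated with a minimal sketch. It is not HBase code: `MasterReporter`, `Abortable`, and `reportAfterPonr` are hypothetical names made up for this example, and the retry loop is bounded by `maxAttempts` only so that it terminates, whereas the comment describes retrying forever. The point it shows is that a failed report after the point of no return leads to an abort rather than a local rollback, because the master may already have processed the request.

```java
import java.io.IOException;

final class PonrSketch {
  /** Hypothetical stand-in for the RPC that tells the master a region opened. */
  interface MasterReporter {
    void reportOpened(String regionName) throws IOException;
  }

  /** Hypothetical stand-in for aborting the region server process. */
  interface Abortable {
    void abort(String why, Throwable cause);
  }

  /**
   * Reports a successful open after the point of no return.
   * A failed RPC is retried, never rolled back, since the master may have
   * received it anyway. If every attempt fails, the server is aborted.
   * Returns true if the report went through, false if we aborted.
   */
  static boolean reportAfterPonr(MasterReporter master, Abortable server,
      String regionName, int maxAttempts, long pauseMillis) {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        master.reportOpened(regionName);
        return true;
      } catch (IOException e) {
        last = e; // outcome on the master side is unknown; try again
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break; // stop retrying and fall through to abort
        }
      }
    }
    // Closing the region locally here could contradict what the master believes.
    server.abort("Failed to report open of " + regionName, last);
    return false;
  }
}
```

A caller would invoke `reportAfterPonr` only after the region is open and nothing can be undone locally; any failure before that point can still be handled by closing the region and reporting the failure.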


[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601871233
 
 
   Thanks for the review, @Apache9. I'd filed HBASE-24015 a few days ago because it seemed plain this issue had opened a can of worms -- and that was before you showed up. You want to go more radical than the scope of HBASE-24015, so I made HBASE-24026 for the shutdown redo. Thanks.


[GitHub] [hbase] saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #1311: HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea…
URL: https://github.com/apache/hbase/pull/1311#issuecomment-601534556
 
 
   bq. Let's focus on just making the UT pass here, without changing other code.
   
   It is not just about the unit test.
   
   bq. I suggest we open a follow on issue, to discuss the abort behavior.
   
   You are welcome to. I'm currently just interested in landing a fix for cluster shutdown/RS aborts and concurrent assigns/unassigns, which cause flakey test failures and hangs in the wild.
   
   bq. To me, the operations in abort method do not make sense. Maybe we just need to try our best to close the connection to zk to let master know we are dead, and then just do a System.exit(1). For now we will do lots of clean up work and even want to flush all the regions? This is not a abort I'd say, it is almost like a graceful shutdown...
   
   For new issue.
