Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/07/21 19:26:49 UTC

[GitHub] [hbase] taklwu opened a new pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

taklwu opened a new pull request #2114:
URL: https://github.com/apache/hbase/pull/2114


   …ting a new cluster pointing at the same file system
   
   HBase currently does not handle `Unknown Servers` automatically and requires
   users to run hbck2 `scheduleRecoveries` when they see unknown servers on
   the HBase report UI.
   
   This has become a blocker for HBase 2 adoption, especially when a table wasn't
   disabled before shutting down an HBase cluster in the cloud or in any dynamic
   environment where hostnames may change frequently. Once the cluster restarts,
   hbase:meta still holds the old hostnames/IPs from the previous cluster,
   so those region servers become `Unknown Servers` and are never recycled.
   
   Our fix is to trigger a repair as soon as the CatalogJanitor
   finds any `Unknown Servers`, by submitting an HBCKServerCrashProcedure
   so that regions on an `Unknown Server` can be reassigned to other online
   servers.
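
The idea described above can be illustrated with a minimal, self-contained sketch. It does not use any HBase classes: `UnknownServerRepairSketch`, `findUnknownServers`, and `planReassignments` are hypothetical names, server names are plain strings, and "unknown" is simplified to "referenced by hbase:meta but not currently online". The actual patch submits an HBCKServerCrashProcedure instead of computing a plan like this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative only: models the repair idea without HBase classes. */
final class UnknownServerRepairSketch {

  /** Returns servers that meta still references but that are not online (simplified definition). */
  static Set<String> findUnknownServers(Map<String, String> metaAssignments,
      Set<String> onlineServers) {
    Set<String> unknown = new TreeSet<>();
    for (String server : metaAssignments.values()) {
      if (!onlineServers.contains(server)) {
        unknown.add(server);
      }
    }
    return unknown;
  }

  /** Maps each region stuck on an unknown server to an online server, round-robin. */
  static Map<String, String> planReassignments(Map<String, String> metaAssignments,
      Set<String> onlineServers) {
    Set<String> unknown = findUnknownServers(metaAssignments, onlineServers);
    List<String> targets = new ArrayList<>(new TreeSet<>(onlineServers));
    Map<String, String> plan = new TreeMap<>();
    if (targets.isEmpty()) {
      return plan; // nothing online to move regions to
    }
    int next = 0;
    // Iterate regions in sorted order so the plan is deterministic.
    for (Map.Entry<String, String> e : new TreeMap<>(metaAssignments).entrySet()) {
      if (unknown.contains(e.getValue())) {
        plan.put(e.getKey(), targets.get(next % targets.size()));
        next++;
      }
    }
    return plan;
  }
}
```

Regions already hosted on an online server are left out of the plan, which mirrors the intent that only regions on an `Unknown Server` get reassigned.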


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] z-york commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
z-york commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-865234155


   @petersomogyi It's probably worth pinging on https://github.com/apache/hbase/pull/2113 and subsequent PRs, as that's where most of the conversation happened. I thought Stephen had offered to put this behind a config before, but maybe I'm mistaken. Anyway, it's worth revisiting this issue IMO.



[GitHub] [hbase] joshelser commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-865145985


   > I'm suggesting to hide this behind a feature flag
   
   Makes sense to me. I think that addresses some of the other concerns from @Apache9 (mentioning him to make sure that's OK with him).
   
   If @taklwu is OK with it (and can grant you edit perms), maybe you can update this PR with your changes? Or, close this and open a new one with your modifications.





[GitHub] [hbase] Apache-HBase commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-662135462


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 37s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 52s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 58s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 35s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 30s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 55s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 55s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 32s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 143m 31s |  hbase-server in the patch failed.  |
   |  |   | 167m 59s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2114 |
   | JIRA Issue | HBASE-24286 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 68b843a4f0a7 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f35c5eaadd |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/testReport/ |
   | Max. process+thread count | 4659 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hbase] joshelser commented on a change in pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#discussion_r458341484



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
##########
@@ -173,6 +175,8 @@ int scan() throws IOException {
       this.lastReport = scanForReport();
       if (!this.lastReport.isEmpty()) {
         LOG.warn(this.lastReport.toString());
+        // expires unknown servers
+        repairUnknownServers();

Review comment:
       I guess this is not an issue for master (which doesn't have the separate namespace table), but on other branches do we still have `hbase.master.namespace.init.timeout` setting an upper bound on how long we wait for hbase:namespace to get assigned? My thinking is that waiting for the CatalogJanitor to run will be a pretty "slow" solution (up to a 5 min wait), and we may get a master crash if the ns init timeout is 5 mins, the same as the catalog janitor's interval.
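
The timing concern above can be made concrete with a small sketch. It assumes the repair only fires on a periodic CatalogJanitor scan, so the worst-case delay before a repair starts is one full interval, and it ignores how long the repair itself takes. `JanitorTimingSketch` and its methods are hypothetical names, and the 5 minute values are the ones quoted in the comment rather than values read from any configuration.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: compares the janitor interval with the namespace init timeout. */
final class JanitorTimingSketch {

  /** Worst case, an unknown server shows up right after a scan and waits one full interval. */
  static long worstCaseRepairDelayMs(long janitorIntervalMs) {
    return janitorIntervalMs;
  }

  /** True when the namespace init timeout can expire before the next scan triggers a repair. */
  static boolean masterMayAbortBeforeRepair(long janitorIntervalMs, long nsInitTimeoutMs) {
    return worstCaseRepairDelayMs(janitorIntervalMs) >= nsInitTimeoutMs;
  }

  public static void main(String[] args) {
    long fiveMinutes = TimeUnit.MINUTES.toMillis(5);
    // Both set to 5 minutes, as in the scenario described in the review comment.
    System.out.println(masterMayAbortBeforeRepair(fiveMinutes, fiveMinutes));
  }
}
```

Under these assumptions, the race disappears only when the scan interval is comfortably shorter than the timeout, or when the repair is triggered by something other than the periodic scan.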







[GitHub] [hbase] Apache-HBase commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-662078817


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 36s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 50s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m  5s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 28s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 20s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 48s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 14s |  The patch does not generate ASF License warnings.  |
   |  |   |  35m 41s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2114 |
   | JIRA Issue | HBASE-24286 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux b0c22b5b487c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f35c5eaadd |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hbase] petersomogyi commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
petersomogyi commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-865087242


   I recently tested this patch to start an HBase cluster on a pre-existing HBase root directory. Currently I have to use `HBCK2 recoverUnknown` or SCPs, but this patch automates the startup procedure. In my testing the patch works well, and HBase successfully reassigns the regions that are recorded in the hbase:meta table under different hostnames (a.k.a. unknown servers).
   
   Since some installations might not require this, I'm suggesting to hide this behind a feature flag.
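
A feature flag of the kind suggested here could look roughly like the sketch below. It uses `java.util.Properties` as a stand-in for the HBase configuration object, and the key `hbase.master.repair.unknown.servers.enabled`, its default of `false`, and the class and method names are all hypothetical; nothing in this thread fixes an actual property name.

```java
import java.util.Properties;

/** Illustrative only: gates the automatic repair behind an opt-in flag. */
final class RepairFlagSketch {

  // Hypothetical key and default; not taken from HBase.
  static final String REPAIR_KEY = "hbase.master.repair.unknown.servers.enabled";
  static final boolean REPAIR_DEFAULT = false;

  /** Reads the flag, falling back to the default when the key is absent. */
  static boolean isRepairEnabled(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(REPAIR_KEY, String.valueOf(REPAIR_DEFAULT)));
  }

  /** Returns how many recoveries would be scheduled for the given number of unknown servers. */
  static int recoveriesToSchedule(Properties conf, int unknownServerCount) {
    if (unknownServerCount <= 0 || !isRepairEnabled(conf)) {
      return 0; // flag off or nothing to repair: keep today's manual HBCK2 workflow
    }
    return unknownServerCount;
  }
}
```

Defaulting the flag to off would keep existing installations on the current behavior, while deployments that restart on a pre-existing root directory could opt in.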



[GitHub] [hbase] Apache-HBase commented on pull request #2114: HBASE-24286: HMaster won't become healthy after after cloning or crea…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2114:
URL: https://github.com/apache/hbase/pull/2114#issuecomment-662157497


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 18s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  2s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 50s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   7m  6s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 58s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 10s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 29s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 51s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 44s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 206m 34s |  hbase-server in the patch failed.  |
   |  |   | 238m  7s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2114 |
   | JIRA Issue | HBASE-24286 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 0742f5feebba 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f35c5eaadd |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/testReport/ |
   | Max. process+thread count | 3856 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2114/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

