You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/08/10 22:20:01 UTC

[GitHub] [hbase] taklwu opened a new pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

taklwu opened a new pull request #2237:
URL: https://github.com/apache/hbase/pull/2237


   …it's not partial
   
   Add a check on meta bootstrap to skip removing meta table directory
   if ZK data does not exist when hmaster restart. here the existence
   of clusterID in ZK indicate if the meta is partial if we hit the
   INIT_META_WRITE_FS_LAYOUT in InitMetaProcedure


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-675370513


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 33s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 49s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   0m 58s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   5m 12s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 25s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 10s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 150m  2s |  hbase-server in the patch failed.  |
   |  |   | 173m 32s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 51731f86518e 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 7335dbc834 |
   | Default Java | 1.8.0_232 |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/testReport/ |
   | Max. process+thread count | 3277 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-914593891


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 34s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 31s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 10s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  6s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  7s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 10s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 135m 40s |  hbase-server in the patch passed.  |
   |  |   | 165m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux b6554c2ebd46 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 62c58fc11a |
   | Default Java | AdoptOpenJDK-11.0.10+9 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/testReport/ |
   | Max. process+thread count | 3817 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/console |
   | versions | git=2.17.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] z-york commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
z-york commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r558573235



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "

Review comment:
       Should we mention the HBCK sub command to help fix this? (Is this offline meta repair?) Since we're adding a new failure mode, it should be clear how to fix this case. 

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;

Review comment:
       Change to shouldDelete? isPartial is a bit confusing to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-674265891


   I apologized for the dev@ email, but I was thinking differently overnight about your suggestion (sorry that I reread many times until I found the gap this morning)
   
   > You can see the code in finisheActiveMasterInitialization and also the code in AssignmentManager.start. In AssignmentManager.start, we will try to load the start of meta region from zookeeper, and if there is none, we will create a region node in offline state. And in finishActiveMasterInitialization, if we find that the state of meta region node is offline, we will schedule InitMetaProcedure.
   So what you need to do here, is to put the meta region znode to zookeeper, before you restart the hbase cluster. So we will not schedule InitMetaProcedure again.
   
   
   didn't the coming up master region that store the meta location in [HBASE-24408](https://issues.apache.org/jira/browse/HBASE-24408) and [PR#1746](https://github.com/apache/hbase/pull/1746/commits/976d0c4e5b732a23773bd306f79e8017344b58f3) solve our conflict of interests that we don't need to relaying on ZK for getting the server name (old host) for the meta region ? such even if we don't have the ZK, we can move on and don't submit the InitMetaProcedure because the state of the meta region is not `OFFLINE`. 
   
   if you confirm above, I may say bring this PR and keep highlighting the zookeeper discussion is my mistake and I should have learnt the master region ahead of this PR. (then we just need to move to the coming up version, and we can still restart on the cloud use cases)
   
   [update] I will follow your suggestion and throws a exception if we find the meta table is not empty/partial


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r564936787



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "

Review comment:
       good point on documentation, I need more time to figure out if there is existing command(s)/option(s) other than manually sidelining the meta directory to a different location. let's mark it as a blocker/requirement before I merging this PR. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   sorry for the delay response.
   
   >  Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   
   no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures. 
   
   > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online. 
   
   > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? 
   
   I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having 
   1. if the master is not up and running, then we cannot proceed 
   2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta. 
   3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.
   
   Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck. 
   
   So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] z-york commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
z-york commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-677947479


   I am currently -1 on the implementation pending answers to my questions. I have not been able to find any valid failure modes that will cause data loss or corrupting actions if hbase:meta is onlined. If this is true, we can make InitMetaProcedure idempotent (see above) and we can handle any InitMetaProcedure failure condition.
   
   In my investigation I was actually looking at the master branch which contains an additional state for INIT_META_CREATE_NAMESPACES. There are additional bugs associated with this approach that I will open JIRAs for. I will not bring the additional complexity of folding the namespace table into the meta table into this discussion.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673357573


   OK, so your point, is to add a feature that, we could start a cluster without no data on zk? I would vote a -1 here as I do not think it is possible. And even we could make it work for now, we can not force all later developpers to keep in mind that, all data on zk are not stable, you should not depend on them... It does not make sense.
   
   And on the InitMetaProcedure, maybe we could avoid deleting the whole meta, if there are HFiles in it. This is a possible improvement, to avoid damage the cluster if there are something wrong with zk. As for InitMetaProcedure, if we are at the first state, we can make sure that only directories are created and no actual data? Anyway, even if we add this check, I think we should fail the procedure and fail the start up, as we know that there are something wrong. Users should use HBCK or other tools to fix the inconsistency before moving forward.
   
   Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-674372607


   > didn't the coming up master region that store the meta location in HBASE-24408 and PR#1746 solve our conflict of interests that we don't need to relaying on ZK for getting the server name (old host) for the meta region ? such even if we don't have the ZK, we can move on and don't submit the InitMetaProcedure because the state of the meta region is not OFFLINE.
   
   HBASE-24388 is only on a feature branch and we are still discussing whether to go this way... There is another solution to introduce a general root table, then we will have InitRootProcedure and the state of the root region is still on zookeeper. Of course I'm a sponsor of HBASE-24388 as it is implemented by me...
   And I'm a fan of removing zookeeper dependency completely if possible(or at least just use it as an external storage so on cloud we could easily use other storage systems to replace it, such as dynamodb).
   
   > [update] I will follow your suggestion and throws a exception if we find the meta table is not empty/partial
   Yes let's do this first to prevent damaging a cluster.
   
   And in general, I do not think it is very hard to introduce a tool for your scenario? Just put a dummy meta state node on zookeeper can solve the problem. And for the regions, as said before, you'd better disable them allbefore shutting down the old cluster, and after starting the new cluster, enable them. Even if you do not do this, I think you could just scan the meta regions, to find out all the recorded region servers and schedule HBCKSCP for them to get the cluster back.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673880783


   > how we could start a cluster with no data on zookeeper? 
   
   IMO the title of the google design may be going on the cloud use cases that has been restarting on just HFiles without WAL and without Zookeeper (but all the user tables are flushed and disabled before terminating it). I knew that may not be mentioned in the [book tutorial](https://hbase.apache.org/book.html), and it may be a good time to clarify how that cases are actually working and some users has been using in HBase-1.4.x and HBase maybe before 2.1.7. Then we can see what the gaps maybe now in branch-2.2+ to support it back (basically, that's the intention of having this PR and [PR#2113](https://github.com/apache/hbase/pull/2113) )
   
   > As you want to start the HMaster and recover from the inconsistency
   
   what does `inconsistency` mean here? I see your point that using `InitMetaProcedure#INIT_META_WRITE_FS_LAYOUT` to `indicate` inconsistency, but if we don't delete meta and just starting the cluster, IMO HBCK `-detail` will show a clean result without any inconsistency? we may not hit any `inconsistency` when getting into `InitMetaProcedure`. For this topic, I may just start a email on if `InitMetaProcedure` should delete meta without checking `partial` and `consistency`, please bear with me, and this may be the only thing I want a quick discussion instead of a long design doc on the cloud use cases. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-675299128


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 36s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   1m 59s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 10s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 32s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 11s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 15s |  The patch does not generate ASF License warnings.  |
   |  |   |  32m 46s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 7a93156de9ca 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 7335dbc834 |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673347350


   Thanks Duo..  I was about to put the flow why we do InitMeta..  In case of the cluster recreate what @taklwu says, the WAL data itself not there as well as zk.  So as what u said, if one has to do, then they have to do 2 things
   1. When the initial cluster was dropped, before that the WAL fs (HDFS backed by managed disk in cloud) need to be backedup. Also the zk data (the meta server location) to be backed up
   2. When the cluster recreated from existing data, some tool (hbck or some thing) need to recreate the zk node and put that old location value. Before that it has to recover back the WAL FS data onto the new HDFS cluster.
   The WAL fs data backup and restore make sense. But IMHO the zk data thing looks another hack. Till 2.1.6 this was not needed. Even if the meta location is not there zk, the init meta will kick in and that will create meta region from existing data and assign to some RS.  But the meta cleanup make the existing entire data to be deleted. I think that is not good. We are adding more things to META table these days.  The NS info itself is in another CF in META table.. There is some other discussion around adding the committed HFiles data into META (This is not concluded but looks like we keep increasing the responsibility of META table).   So dropping all these information is not that good. 
   So my thinking was to add such cluster recreate as a 1st class feature in HBase itself. This can be used by anyone anywhere.  As long as the data is persisted, we can drop the cluster and recreate later.   Now for that 2 blockers and the biggest one is this META dir delete as part of InitMetaProc.  I agree to the intent of adding that cleanup.  But now if we have to have this support also how can we avoid this delete?   Recreate entire META using some hbck options should be the very last option to think off IMO. If we can solve other ways why not?
    


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472680071



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       [updated] yeah, I saw the `UnsupportedOperationException` and the timeout in the unit test logs, then the master didn't stop, such that I added a section in HMaster to catch this exception. Did I do it wrong ? or any idea how I can fail the procedure right? 
   
   ```
         initMetaProc.await();
         if (initMetaProc.isFailed() && initMetaProc.hasException()) {
           throw new IOException("Failed to initialize meta table", initMetaProc.getException());
         }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-672701120


   Ya the biggest issue now is the META FS dir delete which happens after such a cluster recreate.  An HBCK kind of tool to recreate the META data is too much IMO.  Also we many other meta data also in META.  So IMHO we should some how avoid this META delete.  After cluster recreate this happens because in zk we dont have a META location and also no SCP in MasterProcWal which is targeting an RS which was holding META.  So ya the same Q..  What is a partial META we refer?  As part of META bootstrap we create the META tableInfo. That is an atomic op. So can be partial here? Sorry if am missing some thing.. Did not check code in detail.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-914608446


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   4m 13s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 36s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 29s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 45s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 146m 53s |  hbase-server in the patch passed.  |
   |  |   | 177m 48s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux e6c4aa84f5e1 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 62c58fc11a |
   | Default Java | AdoptOpenJDK-1.8.0_282-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/testReport/ |
   | Max. process+thread count | 4237 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/console |
   | versions | git=2.17.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-914528088


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |  12m 18s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 21s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   4m 23s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 40s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 11s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m  8s |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 48s |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 48s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  13m  2s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 22s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 13s |  The patch does not generate ASF License warnings.  |
   |  |   |  61m 19s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 1ef067af5fe6 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 62c58fc11a |
   | Default Java | AdoptOpenJDK-1.8.0_282-b08 |
   | Max. process+thread count | 86 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/5/console |
   | versions | git=2.17.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673384573


   The biggest issue now is META dir delete..  In one direction we are trying NOT to keep any persisted data in zk no? Till 2.1.7 even for META table also, we never had an issue even if that is not there.. The bootstrap code was assigning existing META table to new RS.  Some way or the other, I dont mind, we need a solution for this.  As per the existing code itself, am not sure why we need to delete the entire dir with assumption of partial bootstrap.  There is a dir creation steps and then tableInfo create.. The tableInfo file create is atomic too. Every step we check if dir/file exists . Correct?   If this META delete was not there  things would have been easy.  We have supported cluster drop and create with version <2.1.6..  But from 2.1.7 this is very diff because of this META delete.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-751833447


   thanks Duo for your understanding and comments ;) and happy holidays! 
   
   @z-york your point on idempotent is good, but at this point will you agree that we create a follow up on first add exception and discuss/handle this data loss issues in a different JIRA ?  will you reconsider to change your -1 vote ? (btw I can rebase it after we agree this change)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-675360847


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 36s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  8s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 14s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  4s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   5m 46s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  5s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 46s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 127m 20s |  hbase-server in the patch passed.  |
   |  |   | 154m 31s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux bd5910b16740 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 7335dbc834 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/testReport/ |
   | Max. process+thread count | 4189 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r469412600



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -915,6 +917,11 @@ private void finishActiveMasterInitialization(MonitoredTask status)
       this.tableDescriptors.getAll();
     }
 
+    // check cluster Id stored in ZNode before, and use it to indicate if a cluster has been
+    // restarted with an existing Zookeeper quorum.
+    isClusterRestartWithExistingZNodes =

Review comment:
       Ya at proc level..  But before that itself one more thing.
   Say there is a cluster recreate over existing data and so no zk node for clusterId. We will get true for 'isClusterRestartWithExistingZNodes '. In next lines, we will create and write the zk node for clusterId.  Now assume after executing that lines, the HM restarted.  So the zk node was created but the InitMetaProc was NOT submitted.  Now after restart, when we come here, we have zk data for clusterId.  So 'isClusterRestartWithExistingZNodes ' will become false.  Now this time the InitMetaProc started and as part of that we will end up deleting the Meta dir.
   So this says the need to keep this boolean info somewhere once we find that and even before creating the zk node for ClusterId.  Am I making the concern clear this time?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673400372


   Anyway, let's get back to the initial goal? I think it is to support restart a cluster with nothing on zookeeper? Let's have an overall design first? As I said above, it is not easy to force all developpers to follow this rule. If we want to do this, we need to find a suitable way to force the later developpers know, they should not depend on the data on zookeeper...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472680071



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       [updated] yeah, I saw the `UnsupportedOperationException` and the timeout in the unit test logs, then the master didn't stop, such that I added a section in HMaster to catch this exception and fail the master startup with another IOException. 
   
   Did I do it wrong ? or any idea how I can fail the procedure right? 
   
   ```
       // wait meta to be initialized after we start procedure executor
       if (initMetaProc != null) {
         initMetaProc.await();
         if (initMetaProc.isFailed() && initMetaProc.hasException()) {
           throw new IOException("Failed to initialize meta table", initMetaProc.getException());
         }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-941241199


   merged and if we need a new documentation, we can have a followup JIRA


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673253206


   first of all, thanks Duo again. 
   
   > I think for the scenario here, we just need to write the cluster id and other things to zookeeper? Just make sure that the current code in HBase will not consider us as a fresh new cluster. We do not need to rebuild meta?
   
   So, let me confirm your suggestion, that means if we add one more field in ZNode, e.g. a boolean `completedMetaBoostrap`, if we find both `clusterId` and `completedMetaBoostrap` in ZK, we will not delete meta directory ?
   
   followup if ZK Znode data is used to determine if this is a fresh new cluster, can we skip the delete meta directory if `clusterId` and `completedMetaBoostrap` are never set but we found meta directory?  this is the cloud use cases which we don't have ZK to make the decision; such we don't know if the meta is partial, and IMO, we should just leave the meta directory and if anything bad happens, the operator can still run HBCK. (if we do the other way around and always delete the meta, then we're losing the possibility the cluster can heal itself, and we cannot confirm if this is partial, doesn't it?)
   
    > For the InitMetaProcedure, the assumption is that, if we found that the meta table directory is there, then it means the procedure itself has crashed before finishing the creation of meta table, i.e, the meta table is 'partial'. So it is safe to just remove it and create again. I think this is a very common trick in distributed system for handling failures?
   
   do you mean `idempotent` is the `trick` ? `InitMetaProcedure` may be idempotent and can make `hbase:meta` online (as a empty table), but I don't think if the cluster/HM itself is `idempotent` automatically; and yeah, it can rebuild the data content of the original meta with the help of HBCK, but just if HM continues the flow with some existing data, e.g. the namespace table (sorry for branch-2 we have namespace table) and HM restart with a empty meta, based on the experiment I did, the cluster hangs and HM cannot be initialized. 
   
   if we step back to just think on the definition of `partial` meta, it would be great if the meta table itself can tell if it's partial, because it's still a table in HBase and HFiles are immutable. e.g. can we tell if a user table is partial by looking at its data? I may be wrong, but it seems like we're not able to tell from HFiles itself, and we need ZK and WAL to define it.  
   
   So, again, IMO data content in a table is sensitive ([updated] sorry if you guys think data in meta table is not sensitive), I'm proposing not to delete meta directory if possible (it's also like running a hbck to delete and rebuild). 
   
   Based on our discussion here, IMO we have two proposal mentioned to define `partial meta` . 
   
   1. add a boolean in WAL like a proc-level data
   2. write a boolean in ZNode to tell if the bootstrap completes
   *. no matter we choose 1) and 2) above, we have an additional condition, if we don't find any WAL or ZK about this condition, we should not delete the meta table. 
   
   seems 2) + *) should be the simplest solution, what do you guys think?  
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   sorry for the delayed response.
   
   >  Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   
   no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures. 
   
   > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online. 
   
   > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? 
   
   I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having 
   1. if the master is not up and running, then we cannot proceed 
   2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta. 
   3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.
   
   Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck. 
   
   So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673393155


   I do not see why we need to change this. Consider the usage of InitMetaProcedure, if there is a meta directory there, it must be 'partial'. And even it is not partial, we are still safe to delete it, as there should be no data in it.
   
   If this is not the case, we should have inconsistency in our cluster. For safety, I think we could skip the deletion of the meta directory, but we should still fail the procedure and thus fail the initialization of master. Users should fix the inconsistency before starting again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] z-york commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
z-york commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760432544


   I think we can agree to disagree on the automatic/idempotent IMP, we can propose a plan for that if I have more time to devote to it in the future. For now we can keep that as a patch. I think that this PR can still go in with throwing an exception instead of deleting meta (I don't see any reason to submit a separate PR, let's keep this discussion). 
   
   I think we should fail instead of allowing an automatic delete of the meta directory (or at least have an option to fail and not delete) since we are continuing to add more metadata into meta and it will become more and more costly to rebuild. I have seen many cases where operators clear out ZK nodes + restart master to unblock some assignment issues, but admittedly that is on 1.x versions of HBase, I think the recovery options might be better in 2.x.
   
   We already have offline meta repair that I believe should be able to solve these issues if an exception is thrown.
   
   
   btw @saintstack for the double assigned meta scenario wouldn't the IO fencing/lease on the meta WAL handle that? Or will it try to write to a unique WAL each assignment? Just curious, not blocking this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu merged pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu merged pull request #2237:
URL: https://github.com/apache/hbase/pull/2237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-671666623


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   2m 48s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 11s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 38s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 33s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 49s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 44s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 137m  7s |  hbase-server in the patch failed.  |
   |  |   | 169m 27s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 79f0b2d535a9 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 17a0c2aabf |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/testReport/ |
   | Max. process+thread count | 3679 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-671971576


   I still feel a bit strange that we should deal with this in normal code path...
   
   Can we add a new HBCK option for this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673880783


   > how we could start a cluster with no data on zookeeper? 
   
   IMO the title of the google design may be going on the cloud use cases that has been restarting on just HFiles without WAL and without Zookeeper (but all the user tables are flushed and disabled before terminating it). I knew that may not be mentioned in the [book tutorial](https://hbase.apache.org/book.html), and it may be a good time to clarify how that cases are actually working and some users has been using in HBase-1.4.x and HBase maybe before 2.1.7. Then we can see what the gaps maybe now in branch-2.2+ to support it back (basically, that's the intention of having this PR and [PR#2113](https://github.com/apache/hbase/pull/2113) )
   
   > As you want to start the HMaster and recover from the inconsistency
   
   what does `inconsistency` mean here? I see your point that using `InitMetaProcedure#INIT_META_WRITE_FS_LAYOUT` to `indicate` inconsistency, but if we don't delete meta and just starting the cluster, IMO HBCK `-detail` will show a clean result without any inconsistency? (correct me if I'm wrong) . For this topic, I may just start a email on if `InitMetaProcedure` should delete meta without checking `partial` and `consistency`, please bear with me, and this may be the only thing I want a quick discussion instead of a long design doc on the cloud use cases. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472482587



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       @Apache9, we're failing the `InitMetaProcedure` with an `IOException`, and HMaster will fail the master startup if `InitMetaProcedure` is `FAILED` with an exception. 
   
   Still, alternatively, we could continue the bootstrap without throwing (but this is not good as you recommended)
   
   so, do you think this change align with your comments? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472680071



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       yeah, I saw it the not support rollback then the master didn't stop, such that I added a section in HMaster to catch this exception. Did I do it wrong ? or any idea how I can fail the procedure? 
   
   ```
         initMetaProc.await();
         if (initMetaProc.isFailed() && initMetaProc.hasException()) {
           throw new IOException("Failed to initialize meta table", initMetaProc.getException());
         }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-674265891


   I apologized for the dev@ email, but I was thinking differently overnight about your suggestion (sorry that I reread many times until I found the gap this morning)
   
   > You can see the code in finisheActiveMasterInitialization and also the code in AssignmentManager.start. In AssignmentManager.start, we will try to load the start of meta region from zookeeper, and if there is none, we will create a region node in offline state. And in finishActiveMasterInitialization, if we find that the state of meta region node is offline, we will schedule InitMetaProcedure.
   So what you need to do here, is to put the meta region znode to zookeeper, before you restart the hbase cluster. So we will not schedule InitMetaProcedure again.
   
   
   didn't the coming up master region that store the meta location in [HBASE-24408](https://issues.apache.org/jira/browse/HBASE-24408) and [PR#1746](https://github.com/apache/hbase/pull/1746/commits/976d0c4e5b732a23773bd306f79e8017344b58f3) solve our conflict of interests that we don't need to relaying on ZK for getting the server name (old host) for the meta region ? such even if we don't have the ZK, we can move on and don't submit the InitMetaProcedure because the state of the meta region is not `OFFLINE`. 
   
   if you confirm above, I may say bring this PR and keep highlighting the zookeeper discussion is my mistake and I should have learnt the master region ahead of this PR. (then we just need to move to the coming up version, and we can still restart on the cloud use cases)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760575353


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 14s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  8s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 46s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 28s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  branch-2 passed  |
   | -0 :warning: |  patch  |   8m 21s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 12s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 12s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 27s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 197m 39s |  hbase-server in the patch passed.  |
   |  |   | 228m 55s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 3d27205dc3dd 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 2bb7beb448 |
   | Default Java | AdoptOpenJDK-11.0.6+10 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/testReport/ |
   | Max. process+thread count | 2576 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/console |
   | versions | git=2.17.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r469554017



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -915,6 +917,11 @@ private void finishActiveMasterInitialization(MonitoredTask status)
       this.tableDescriptors.getAll();
     }
 
+    // check cluster Id stored in ZNode before, and use it to indicate if a cluster has been
+    // restarted with an existing Zookeeper quorum.
+    isClusterRestartWithExistingZNodes =

Review comment:
       > So this says the need to keep this boolean info somewhere once we find that and even before creating the zk node for ClusterId. Am I making the concern clear this time?
   
   ack and I got the concern. 
   
   but before I move to implement the proc level information, let's wait a bit on @Apache9 for the question on META's `tableInfo` and `partial meta`.  The ideal case is that we may be able to use `FSTableDescriptors.getTableDescriptorFromFs()` if found and can be readed to indicate meta is not partial. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760501642


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 54s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 49s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   3m 11s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m  0s |  branch-2 passed  |
   | -0 :warning: |  patch  |   2m  9s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 25s |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 27s |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 27s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  12m 35s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 19s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 15s |  The patch does not generate ASF License warnings.  |
   |  |   |  43m 19s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 95de4940b43d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 2bb7beb448 |
   | Default Java | AdoptOpenJDK-1.8.0_232-b09 |
   | Max. process+thread count | 96 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/console |
   | versions | git=2.17.1 maven=3.6.3 spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673751605


   > should not depend on the data on zookeeper.
   
   I agreed with you that we may not be ready to totally skip relying on the data stored on zookeeper, that's definitely a boarder discussion on what HBase currently depends on Zookeeper (branch-2 and master), especially if data on Zookeeper could be ephemeral or removed. (I thought we're in the progress of moving data into ROOT region, aren't we ? e.g. [Proc-v2](https://issues.apache.org/jira/browse/HBASE-20610)
   
   Also, my initial goal is that the meta data/directory should not be deleted if possible, and we're trying to provide a persisted condition not to always delete meta if it's not `partial` (protected by the ZK data).
   
   sorry that I may be newbie on the proc-v2 and zk data, should we start a thread on the dev@ list to discuss about the following ? (my goal is to find a consensus how we can move this PR to either completes it or not fixed)
   
   1. should we delete meta directory when HM starts ? 
   2. after 2.2+, should not depend on the data on zookeeper and have more of the info into proc-v2 in the master region?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673350314


   To ur Q on the status on zk for clusterId @taklwu , The clusterId stored in zk itself is a PBed message.  So we can always add the boolean optional field there. And later change also.  So no need for new nodes or any.. But lets wait for a conclusion on the direction.  Like still we dont have consensus on whether adding this as a top class HBase thing itself? (Bootstrap the cluster over existing data). If top class feature, we dont have to do many tricks to satisfy such needs. IMHO that is worth adding as all can get benefited. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu merged pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu merged pull request #2237:
URL: https://github.com/apache/hbase/pull/2237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-671687631


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 23s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 17s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   5m 54s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  0s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  0s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 44s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 221m 44s |  hbase-server in the patch passed.  |
   |  |   | 248m 32s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 257932d04b1f 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 17a0c2aabf |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/testReport/ |
   | Max. process+thread count | 2313 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r469554017



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -915,6 +917,11 @@ private void finishActiveMasterInitialization(MonitoredTask status)
       this.tableDescriptors.getAll();
     }
 
+    // check cluster Id stored in ZNode before, and use it to indicate if a cluster has been
+    // restarted with an existing Zookeeper quorum.
+    isClusterRestartWithExistingZNodes =

Review comment:
       > So this says the need to keep this boolean info somewhere once we find that and even before creating the zk node for ClusterId. Am I making the concern clear this time?
   
   ack and I got the concern. 
   
   but before I move to implement the proc level information, let's wait a bit on @Apache9 for the question on META's `tableInfo` and `partial meta`. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-916299365


   @joshelser  should we wait till add the documentation in this commit ? or should we have a followup PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760640589


   I have fixed the conflicts, probably will push it in two days and see if you guys have any comments.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673386671


   For 2.2+ we have a completely different way to do master initialization. This is exactly what we I said above, it is not easy to force all the developpers to follow the rule, that the data on zookeeper are not stable. Maybe it can work for now, but maybe it will be broken in the future...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-672204387


   Thanks @Apache9 , I want to agree with you to have a HBCK option, but one concern I have and keep struggling about making this automated instead of HBCK options. If one HBase cluster has hundred of tables with thousand of regions, how would the operator recovery the cluster? does he/she (offline/online) repair the meta table by scanning the storage on each region ? (instead we can just load the meta without rebuilding it?) 
   
   Tbh, I felt bad to bring this meta table issue because normal HBase cluster does not assume Zookeeper (and WAL) could be gone after the cluster starts and restarts. 
   
   for this PR/JIRA, mainly, I'm questioning what a `partial meta` should be, any thoughts ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760552795


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 37s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  8s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 46s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 17s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  branch-2 passed  |
   | -0 :warning: |  patch  |   7m 10s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 14s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 59s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m  1s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 138m 42s |  hbase-server in the patch passed.  |
   |  |   | 164m 17s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux e39ecde1ac07 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 2bb7beb448 |
   | Default Java | AdoptOpenJDK-1.8.0_232-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/testReport/ |
   | Max. process+thread count | 4342 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/4/console |
   | versions | git=2.17.1 maven=3.6.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673896406


   The inconsistency is between the data on filesystem and zookeeper. In general, after the meta table has been initialized, it should have a directory on the file system, and also a znode to indicate its state on zookeeper.
   
   The problem here is that, in your scenario, the data on zookeeper are all gone. This is what I called 'inconsistency'.
   
   And I've also mentioned several times, it works for several versions does not mean it will work for all the versions. They are just different implementation choices, and you can not force the developpers to not use the data on zookeeper to determine what to do next.
   
   So I do not think it worth to ask on the dev list about 'whether' the InitMetaProcedure should delete meta without checking 'partial' or not. The problem here is you delete the state znode for meta region on zookeeper, not a problem for InitMetaProcedure. Either you add it back through HBCK2, or you should have a overall design on how we could fix the problem, without using HBCK2.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-671632659


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 42s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 47s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 48s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m 48s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 10s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 31s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  15m 53s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 56s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 16s |  The patch does not generate ASF License warnings.  |
   |  |   |  44m 25s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2237 |
   | JIRA Issue | HBASE-24833 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 96268feba611 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 17a0c2aabf |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2237/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673280729


   I do not think you fully understand what I proposed from the beginning of the related topic. Honestly I do not know what you are going to fix here, what is the ZNode you want to add a boolean to?
   
   My suggestion is always that, introducing a new HBCK option, to reconstruct the content on zookeeper, so we know that this is not a fresh new cluster, and we will not schedule InitMetaProcedure. So for InitMetaProcedure itself, we do not need to consider whether the meta region is partial or not.
   
   You can see the code in finisheActiveMasterInitialization and also the code in AssignmentManager.start. In AssignmentManager.start, we will try to load the start of meta region from zookeeper, and if there is none, we will create a region node in offline state. And in finishActiveMasterInitialization, if we find that the state of meta region node is offline, we will schedule InitMetaProcedure.
   So what you need to do here, is to put the meta region znode to zookeeper, before you restart the hbase cluster. So we will not schedule InitMetaProcedure again.
   
   Is this clear enough?
   
   Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] saintstack commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
saintstack commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-701935647


   > I am currently -1 on the implementation pending answers to my questions. I have not been able to find any valid failure modes that will cause data loss or corrupting actions if hbase:meta is onlined. If this is true, we can make InitMetaProcedure idempotent (see above) and we can handle any InitMetaProcedure failure condition.
   
   @z-york  Is clumsy operator deleting the meta location znode by mistake a valid failure mode? Throw in a Master restart soon after (Will IMP run? If so, double assign of meta if the rest of the cluster was up at the time which can make for dataloss as the two meta Regions fight over what hfiles make up the meta Region).
   
   The dataloss will be worse though if we just blanket delete meta dir if it exists already when IMP runs.
   
   bq. ....but I don't see a clear agreement among everyone from whether we should continue the bootstrap or fail hard on the bootstrap when we find the meta table in InitMetaProcedure.
   
   Dunno. Its called IMP. When it runs, there is supposed to be no meta. If there is, then something is not right: i.e. see above clumsy operator. Shouldn't remove the meta dir though if exists already? Fail the master startup? HBCK2 time?
   
   Could do Zach's idea of making it idempotent but IMP scope does not cover writing location in zk so can't have this as a 'step' in IMP.  What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? You just want the cluster to start up over hfiles?  Thanks.
   
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-679418281


   so.....how can we get consensus on this PR or this series of idempotent issues (for InitMetaProcedure) ? I don't mind to break them into more tasks as Zach has created those followup bugs (HBASE-24922 and HBASE-24923), but I don't see a clear agreement among everyone from whether we should continue the bootstrap or fail hard on the bootstrap when we find the meta table in InitMetaProcedure. 
   
   how do we move further? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r701499308



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "

Review comment:
       https://issues.apache.org/jira/browse/HBASE-24833?focusedCommentId=17409167&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17409167 rough sketch what I was able to do to fix it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673384573


   The biggest issue now is META dir delete..  In one direction we are trying NOT to keep any persisted data in zk no? Till 2.1.7 even for META table also, we never had an issue even if that is not there.. The bootstrap code was assigning existing META table to new RS.  Some way or the other, I dont mind, we need a solution for this.  As per the existing code itself, am not sure why we need to delete the entire dir with assumption of partial bootstrap.  There is a dir creation steps and then tableInfo create.. Every step we check if dir/file exists . Correct?   If this META delete was not there  things would have been easy.  We have supported cluster drop and create with version <2.1.6..  But from 2.1.7 this is very diff because of this META delete.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472680071



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       yeah, I saw the `UnsupportedOperationException` and the timeout in the unit test logs, then the master didn't stop, such that I added a section in HMaster to catch this exception. Did I do it wrong ? or any idea how I can fail the procedure? 
   
   ```
         initMetaProc.await();
         if (initMetaProc.isFailed() && initMetaProc.hasException()) {
           throw new IOException("Failed to initialize meta table", initMetaProc.getException());
         }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-751833447


   thanks Duo for your understanding and comments ;) and happy holidays! 
   
   @z-york your point on idempotent is good, but at this point will you agree that we create a follow up on first add exception and discuss/handle this data loss issues in a different JIRA ?  will you reconsider to change your -1 vote on this particular PR ? (btw I can rebase it after we agree this change)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673836007


   I think we'd better start from a high level design, on how we could start a cluster with no data on zookeeper. My proposal is always, introducing a new option in HBCK2 to reconstruct the data on zookeeper, but you actually start the cluster. If you guys think it is possible to do this in normal code path, please give an overall design. Open a google doc? So others could comments on it?
   
   For deleting meta, I agree that we should not delete it if it is not 'partial'. But as I said above many times, if there is no inconsitency, scheduling an InitMetaProcedure implies that we do not have a meta table yet, so in InitMetaProcedure it is always safe to delete the meta directory. So even if we add a check in InitMetaProcedure to determine whether the meta table is 'real' partial, the result should be an ERROR log to tell users something crtical happens, please fix the inconsistency before moving forward.
   
   I do not think this is what you guys want here? As you want to start the HMaster and recover from the inconsistency?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-674372607


   > didn't the coming up master region that store the meta location in HBASE-24408 and PR#1746 solve our conflict of interests that we don't need to relaying on ZK for getting the server name (old host) for the meta region ? such even if we don't have the ZK, we can move on and don't submit the InitMetaProcedure because the state of the meta region is not OFFLINE.
   
   HBASE-24388 is only on a feature branch and we are still discussing whether to go this way... There is another solution to introduce a general root table, then we will have InitRootProcedure and the state of the root region is still on zookeeper. Of course I'm a sponsor of HBASE-24388 as it is implemented by me...
   And I'm a fan of removing zookeeper dependency completely if possible(or at least just use it as an external storage so on cloud we could easily use other storage systems to replace it, such as dynamodb).
   
   > [update] I will follow your suggestion and throws a exception if we find the meta table is not empty/partial
   
   Yes let's do this first to prevent damaging a cluster.
   
   And in general, I do not think it is very hard to introduce a tool for your scenario? Just put a dummy meta state node on zookeeper can solve the problem. And for the regions, as said before, you'd better disable them allbefore shutting down the old cluster, and after starting the new cluster, enable them. Even if you do not do this, I think you could just scan the meta regions, to find out all the recorded region servers and schedule HBCKSCP for them to get the cluster back.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] anoopsjohn commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673390864


   InitMetaProcedure was there before also..  Only in 2.1.7, this delete of the META dir added. Am asking abt that only @Apache9 .  That is the major concern.  The init proc was before doing  FS init if the meta dir is not yet there. And then init the assign of the META in the cluster.  Now it will do both what so ever, the cur status of the meta dir.  The req is a change for that.  Is that some thing u veto?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   >  Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures. 
   
   > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online. 
   
   > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? 
   I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having 
   1. if the master is not up and running, then we cannot proceed 
   2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta. 
   3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.
   
   Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck. 
   
   So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673393155


   I do not see why we need to change this. Consider the usage of InitMetaProcedure, if there is a meta directory there, it must be 'partial'. And even it is not partial, we are still safe to delete it, as there should be no data in it.
   
   If this is not the case, we must have inconsistency in our cluster. For safety, I think we could skip the deletion of the meta directory, but we should still fail the procedure and thus fail the initialization of master. Users should fix the inconsistency before starting again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673253206


   first of all, thanks Duo again. 
   
   > I think for the scenario here, we just need to write the cluster id and other things to zookeeper? Just make sure that the current code in HBase will not consider us as a fresh new cluster. We do not need to rebuild meta?
   
   So, let me confirm your suggestion, that means if we add one more field in ZNode, e.g. a boolean `completedMetaBoostrap`, if we find both `clusterId` and `completedMetaBoostrap` in ZK, we will not delete meta directory ?
   
   followup if ZK Znode data is used to determine if this is a fresh new cluster, can we skip the delete meta directory if `clusterId` and `completedMetaBoostrap` are never set but we found meta directory?  this is the cloud use cases which we don't have ZK to make the decision; such we don't know if the meta is partial, and IMO, we should just leave the meta directory and if anything bad happens, the operator can still run HBCK. (if we do the other way around and always delete the meta, then we're losing the possibility the cluster can heal itself, and we cannot confirm if this is partial, doesn't it?)
   
    > For the InitMetaProcedure, the assumption is that, if we found that the meta table directory is there, then it means the procedure itself has crashed before finishing the creation of meta table, i.e, the meta table is 'partial'. So it is safe to just remove it and create again. I think this is a very common trick in distributed system for handling failures?
   
   do you mean `idempotent` is the `trick` ? `InitMetaProcedure` may be idempotent and can make `hbase:meta` online (as a empty table), but I don't think if the cluster/HM itself is `idempotent` automatically; and yeah, it can rebuild the data content of the original meta with the help of HBCK, but just if HM continues the flow with some existing data, e.g. the namespace table (sorry for branch-2 we have namespace table) and HM restart with a empty meta, based on the experiment I did, the cluster hangs and HM cannot be initialized. 
   
   if we step back to just think on the definition of `partial` meta, it would be great if the meta table itself can tell if it's partial, because it's still a table in HBase and HFiles are immutable. e.g. can we tell if a user table is partial by looking at its data? I may be wrong, but it seems like we're not able to tell from HFiles itself, and we need ZK and WAL to define it.  
   
   So, again, IMO data content in a table is sensitive especially the meta table, I'm proposing not to delete meta if possible here (it's also like running a hbck to delete and rebuild). 
   
   Based on our discussion here, IMO we have two proposal mentioned to define `partial meta` . 
   
   1. add a boolean in WAL like a proc-level data
   2. write a boolean in ZNode to tell if the bootstrap completes
   *. no matter we choose 1) and 2) above, we have an additional condition, if we don't find any WAL or ZK about this condition, we should not delete the meta table. 
   
   seems 2) + *) should be the simplest solution, what do you guys think?  
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673387968


   And for the problem here, it is why we introduce the InitMetaProcedure. For InitMetaProcedure itself, the assumption is that it will do the initialization work, it will be very strange that, we schedule a 'InitMeta' procedure, but we already have a meta in place? Why not just do not schedule the InitMetaProcedure at the first place?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-737011410


   > sorry for the delayed response.
   > 
   > > Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   > 
   > no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures.
   Yes, this is not a typical scenario in the open source version of HBase so I do not think adding the special logic in the open source version is a good idea. In the future new developers who do not know this background may change the code again and cause problems.
   > 
   > > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   > 
   > having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online.
   > 
   > > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you?
   > 
   > I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having
   > 
   > 1. if the master is not up and running, then we cannot proceed
   > 2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta.
   > 3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.
   I'm fine with adding a new command in HBCK2 to do these fix ups before starting a cluster. Personally I do not think HBCK2 'must' put all the fix logic at master side. Buy anyway, since the repo is called hbase-operator-tools, I think it is free for us to create a new sub module to place new scripts? Though for now it only happens on AWS, I think we could abstract it as a general scenario where we want to start a HBase cluster only on HFiles.
   > 
   > Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck.
   > 
   > So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ?
   This is a scenario for HBase on cloud, especially for AWS, so I think if you guys want to close it as unresolved, others will not have any strong opinon to object :) Take it easy.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] z-york commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
z-york commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-676829830


   I have been watching this for a while, and I still don't have good clarity on the following case.
   
   > 
   > > [update] I will follow your suggestion and throws a exception if we find the meta table is not empty/partial
   > 
   > Yes let's do this first to prevent damaging a cluster.
   
   @Apache9 Can you explain how a partial meta table will damage a cluster? I have been looking at the code for the InitMetaProcedure and the only thing that I can see that is not idempotent is the directory creation for the NS (and the delete of the meta table, but that is being used as the fix for making things idempotent). If InitMetaProcedure is idempotent, there should be no need to delete the directory.
   
   If we restart a cluster and schedule an InitMetaProcedure:
   1. Don't delete the meta table if it exists
   2. It will then move to Assign (which makes sense, to trigger an IMP it cannot already be assigned on a RS). 
   3. Then it will move to create NS (which if I understand the logic correctly):
       a. it can remove the create directory if they already exist
       b. still overwrite the row in the meta table since it should be the same (or check to see if the row exists first if we want to avoid an overwrite).
   
   After that case, if there is any inconsistency in the meta table, that can be handled via HBCK as usual.
   
   A couple tenants we should always have:
   1. Any potential data loss or corrupting action should be taken by HBCK where the operator is making a conscious choice on how to fix a scenario.
   2. As more information is being stored in hbase:meta, it becomes harder (or at least more time consuming) to regenerate the information from the filesystem. hbase:meta should only be deleted/completely regenerated in HBCK.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673253206


   first of all, thanks Duo again. 
   
   > I think for the scenario here, we just need to write the cluster id and other things to zookeeper? Just make sure that the current code in HBase will not consider us as a fresh new cluster. We do not need to rebuild meta?
   
   So, let me confirm your suggestion, that means if we add one more field in ZNode, e.g. a boolean `completedMetaBoostrap`, if we find both `clusterId` and `completedMetaBoostrap` in ZK, we will not delete meta directory ?
   
   followup if ZK Znode data is used to determine if this is a fresh new cluster, can we skip the delete meta directory if `clusterId` and `completedMetaBoostrap` are never set but we found meta directory?  this is the cloud use cases which we don't have ZK to make the decision; such we don't know if the meta is partial, and IMO, we should just leave the meta directory and if anything bad happens, the operator can still run HBCK. (if we do the other way around and always delete the meta, then we're losing the possibility the cluster can heal itself, and we cannot confirm if this is partial, doesn't it?)
   
    > For the InitMetaProcedure, the assumption is that, if we found that the meta table directory is there, then it means the procedure itself has crashed before finishing the creation of meta table, i.e, the meta table is 'partial'. So it is safe to just remove it and create again. I think this is a very common trick in distributed system for handling failures?
   
   do you mean `idempotent` is trick ? `InitMetaProcedure` may be idempotent and can make `hbase:meta` online (as a empty table), but I don't think if the cluster/HM itself is `idempotent` automatically; and yeah, it can rebuild the data content of the original meta with the help of HBCK, but just if HM continues the flow with some existing data, e.g. the namespace table (sorry for branch-2 we have namespace table) and HM restart with a empty meta, based on the experiment I did, the cluster hangs and HM cannot be initialized. 
   
   if we step back to just think on the definition of `partial` meta, it would be great if the meta table itself can tell if it's partial, because it's still a table in HBase and HFiles are immutable. e.g. can we tell if a user table is partial by looking at its data? I may be wrong, but it seems like we're not able to tell from HFiles itself, and we need ZK and WAL to define it.  
   
   So, again, IMO data content in a table is sensitive especially the meta table, I'm proposing not to delete meta if possible here (it's also like running a hbck to delete and rebuild). 
   
   Based on our discussion here, IMO we have two proposal mentioned to define `partial meta` . 
   
   1. add a boolean in WAL like a proc-level data
   2. write a boolean in ZNode to tell if the bootstrap completes
   *. no matter we choose 1) and 2) above, we have an additional condition, if we don't find any WAL or ZK about this condition, we should not delete the meta table. 
   
   seems 2) + *) should be the simplest solution, what do you guys think?  
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r703703740



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;

Review comment:
       changed to `shouldDelete`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-672204387


   Thanks @Apache9 , I want to agree with you to have a HBCK option, but one concern I have and keep struggling about making this automated instead of HBCK options. If one HBase cluster has hundred of tables with thousand of regions, how would the operator recovery the cluster? does he/she (offline/online) repair the meta table by scanning the storage on each region ? (instead we can just load the meta without rebuilding it?) 
   
   Tbh, I felt bad to bring this meta table issue because normal HBase cluster does not assume Zookeeper (and WAL) could be gone after the cluster starts and restarts. 
   
   [updated] for this PR/JIRA, mainly, I'm questioning what a `partial meta` should be (e.g. it's now relying on the state of `InitMetaProcedure` instead of the data of meta table), any thoughts ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673880783


   > how we could start a cluster with no data on zookeeper? 
   
   IMO the title of the google design may be going on the cloud use cases that has been restarting on just HFiles without WAL and without Zookeeper (but all the user tables are flushed and disabled before terminating it). I knew that may not be mentioned in the [book tutorial](https://hbase.apache.org/book.html), and it may be a good time to clarify how that cases are actually working and some users has been using in HBase-1.4.x and HBase maybe before 2.1.7. Then we can see what the gaps maybe now in [updated]branch-2.4+ (we don't need that in 2.3.0 and 2.2.5 becase `InitMetaProcedure` does not have `INIT_META_WRITE_FS_LAYOUT`) to support it back (basically, that's the intention of having this PR and [PR#2113](https://github.com/apache/hbase/pull/2113) )
   
   > As you want to start the HMaster and recover from the inconsistency
   
   what does `inconsistency` mean here? I see your point that using `InitMetaProcedure#INIT_META_WRITE_FS_LAYOUT` to `indicate` inconsistency, but if we don't delete meta and just starting the cluster, IMO HBCK `-detail` will show a clean result without any inconsistency? (correct me if I'm wrong) . For this topic, I may just start a email on if `InitMetaProcedure` should delete meta without checking `partial` and `consistency`, please bear with me, and this may be the only thing I want a quick discussion instead of a long design doc on the cloud use cases. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   >  Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   
   no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures. 
   
   > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online. 
   
   > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? 
   
   I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having 
   1. if the master is not up and running, then we cannot proceed 
   2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta. 
   3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.
   
   Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck. 
   
   So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673751605


   > should not depend on the data on zookeeper.
   
   I agreed with you that we may not be ready to totally skip relying on the data stored on zookeeper, that's definitely a boarder discussion on what HBase currently depends on Zookeeper (branch-2 and master), especially if data on Zookeeper could be ephemeral or removed. (I thought we're in the progress of moving data into ROOT region, aren't we ? e.g. [Proc-v2](https://issues.apache.org/jira/browse/HBASE-20610)
   
   Also, my initial goal is that the meta data/directory should not be deleted if possible, and we're trying to provide a persisted condition not to always delete meta if it's not `partial` (protected by the ZK data).
   
   sorry that I may be newbie on the proc-v2 and zk data, should we start a thread on the dev@ list to discuss about the following ? (my goal is to find a consensus how we can move this PR further)
   
   1. should we delete meta directory when HM starts ? 
   2. after 2.2+, should not depend on the data on zookeeper and have more of the info into proc-v2 in the master region?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673201873


   > Thanks @Apache9 , I want to agree with you to have a HBCK option, but one concern I have and keep struggling about making this automated instead of HBCK options. If one HBase cluster has hundred of tables with thousand of regions, how would the operator recovery the cluster? does he/she (offline/online) repair the meta table by scanning the storage on each region ? (instead we can just load the meta without rebuilding it?)
   
   I think for the scenario here, we just need to write the cluster id and other things to zookeeper? Just make sure that the current code in HBase will not consider us as a fresh new cluster. We do not need to rebuild meta?
   > 
   > Tbh, I felt bad to bring this meta table issue because normal HBase cluster does not assume Zookeeper (and WAL) could be gone after the cluster starts and restarts.
   > 
   > [updated] for this PR/JIRA, mainly, I'm questioning what a `partial meta` should be (e.g. it's now relying on the state of `InitMetaProcedure` instead of the data of meta table), any thoughts ?
   After introducing proc-v2, we reply on it to record the state of a multi-step operation. Here, I believe the problem is that, we schedule an InitMetaProcedure when we do have a meta table in place. For the InitMetaProcedure, the assumption is that, if we found that the meta table directory is there, then it means the procedure itself has crashed before finishing the creation of meta table, i.e, the meta table is 'partial'. So it is safe to just remove it and create again. I think this is a very common trick in distributed system for handling failures?
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
Apache9 commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r472640228



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       In general the approach is fine. The only concern is the implementation. I do not think InitMetaProcedure support rollback, so what will happen if we throw exception here? You will see a ERROR log in the output to say that the procedure does not support rollback?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r559183840



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -166,4 +170,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "

Review comment:
       Instead of specific commands to run, what about referencing something in the book (which could be updated if "what to do" changes across release versions)?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -167,4 +171,35 @@ protected void completionCleanup(MasterProcedureEnv env) {
   public void await() throws InterruptedException {
     latch.await();
   }
+
+  private static boolean deleteMetaTableDirectoryIfPartial(FileSystem rootDirectoryFs,
+    Path metaTableDir) throws IOException {
+    boolean isPartial = true;
+    try {
+      TableDescriptor metaDescriptor =
+        FSTableDescriptors.getTableDescriptorFromFs(rootDirectoryFs, metaTableDir);
+      // when entering the state of INIT_META_WRITE_FS_LAYOUT, if a meta table directory is found,
+      // the meta table should not have any useful data and considers as partial.
+      // if we find any valid HFiles, operator should fix the meta e.g. via HBCK.
+      if (metaDescriptor != null && metaDescriptor.getColumnFamilyCount() > 0) {
+        RemoteIterator<LocatedFileStatus> iterator = rootDirectoryFs.listFiles(metaTableDir, true);
+        while (iterator.hasNext()) {
+          LocatedFileStatus status = iterator.next();
+          if (StoreFileInfo.isHFile(status.getPath()) && HFile
+            .isHFileFormat(rootDirectoryFs, status.getPath())) {
+            isPartial = false;
+            break;
+          }
+        }
+      }
+    } finally {
+      if (!isPartial) {
+        throw new IOException("Meta table is not partial, please sideline this meta directory "
+          + "or run HBCK to fix this meta table, e.g. rebuild the server hostname in ZNode for the "
+          + "meta region");
+      }

Review comment:
       I would worry that the ProcedureExecutor is going to resubmit the IMP again. You might have to, instead, let IMP finish without an error and determine via some side-effect (e.g. set a variable in HMaster?) if we bailed out of IMP and then fail at that level. Truthfully, I don't remember what happens by default (if all procedures which throw an exception are implicitly retried.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-916299365


   @joshelser  should we wait till add the documentation in this commit ? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-941241199


   merged and if we need a new documentation, we can have a followup JIRA


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] taklwu commented on a change in pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Posted by GitBox <gi...@apache.org>.
taklwu commented on a change in pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#discussion_r468798519



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -915,6 +917,11 @@ private void finishActiveMasterInitialization(MonitoredTask status)
       this.tableDescriptors.getAll();
     }
 
+    // check cluster Id stored in ZNode before, and use it to indicate if a cluster has been
+    // restarted with an existing Zookeeper quorum.
+    isClusterRestartWithExistingZNodes =

Review comment:
       Thanks Anoop, if I get your suggestion correctly, should we add a master procedure (or a state in InitMetaProcedure) which record this boolean? this will have the assumption with the WAL exists when cluster recreates. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org