Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/12/02 06:00:32 UTC

[GitHub] [hbase] Apache9 commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

Apache9 commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-737011410


   > sorry for the delayed response.
   > 
   > > Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   > 
   > no this is a special case that we have been supporting, where the HBase cluster freshly restarts on top of only flushed HFiles and does not come with WAL or ZK. and we admitted that it's a bit different from the community stand points that WAL and ZK must be both pre-existed when master or/and RSs start on existing HFiles to resume the states left from any procedures.

   Yes, this is not a typical scenario in the open source version of HBase, so I do not think adding this special logic to the open source version is a good idea. In the future, new developers who do not know this background may change the code again and cause problems.
   > 
   > > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   > 
   > having extra step to check if RSs has any assigned may help, but I don't know if we can do that before the server manager find any region server is online.
   > 
   > > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you?
   > 
   > I think HBCK/HBCK2 is performing online repairing, there are few concerns we're having
   > 
   > 1. if the master is not up and running, then we cannot proceed
   > 2. even if the master is up, the repairing on hundreds or thousand of regions implies long scanning time, which IMO we can save this time by just reloading it from existing meta.
   > 3. having an additional steps/scripts to start a HBase cluster in the mentioned cloud use case seem a manual/semi-automated step we don't find a good fit to hold and maintain them.

   I'm fine with adding a new command in HBCK2 to do these fix-ups before starting a cluster. Personally, I do not think HBCK2 'must' put all the fix logic on the master side. But anyway, since the repo is called hbase-operator-tools, I think we are free to create a new submodule to hold new scripts? Though for now it only happens on AWS, I think we could abstract it as a general scenario where we want to start an HBase cluster on HFiles only.
   > 
   > Personally, it's fine to me with throwing exception as Duo suggested, and on our side we need to find a way to continue if we see this exception. then we improve it in the future when we need to completely getting rid of the extra step on hbck.
   > 
   > So, for this PR, if we don't hear any other critical suggestion, maybe I will leave it "close" as unresolved, do you guys agree ?

   This is a scenario for HBase on cloud, especially on AWS, so I think if you guys want to close it as unresolved, others will not have any strong opinion to object :) Take it easy.
   

