Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/12/01 19:37:47 UTC

[GitHub] [hbase] taklwu edited a comment on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   >  Is clumsy operator deleting the meta location znode by mistake a valid failure mode ?
   
   No, this is a special case that we have been supporting, where the HBase cluster restarts fresh on top of only flushed HFiles and comes with neither WAL nor ZK data. We admit that this differs from the community's standpoint that both WAL and ZK data must already exist when the master and/or RSs start on existing HFiles, so that they can resume the state left by any procedures.
   
   > What about adding extra step before assign where we wait asking Master a question about the cluster state such as if any of the RSs that are checking in have Regions on them; i.e. if Regions already assigned, if an already 'up' cluster? Would that help?
   
   Having an extra step to check whether any RS has regions assigned may help, but I don't know if we can do that before the server manager finds that any region server is online.
   
   > You fellows don't want to have to run a script beforehand? ZK is up and just put an empty location up or ask Master or hbck2 to do it for you? 
   
   I think HBCK/HBCK2 performs online repair, and we have a few concerns:
   1. If the master is not up and running, we cannot proceed.
   2. Even if the master is up, repairing hundreds or thousands of regions implies a long scanning time, which IMO we can save by just reloading from the existing meta.
   3. Additional steps/scripts to start an HBase cluster in the mentioned cloud use case would be a manual/semi-automated step that we don't find a good fit to hold and maintain.
   
   Personally, I'm fine with throwing an exception as Duo suggested; on our side we need to find a way to continue if we see this exception. We can then improve it in the future when we need to get rid of the extra hbck step completely.
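
To make the fail-fast option concrete, here is a minimal sketch of the decision it implies at bootstrap. It is not HBase code: `MetaBootstrapSketch`, `Action`, `decide`, and `bootstrap` are hypothetical names, and the two boolean inputs stand in for "a meta location is recorded in ZK" and "a meta table directory already exists on the filesystem". The mapping of states to actions is an assumption drawn from this thread, not the actual patch.

```java
/** Hypothetical sketch only; none of these names exist in HBase. */
final class MetaBootstrapSketch {

    /** Possible outcomes of the bootstrap check (assumed for illustration). */
    enum Action { INIT_NEW_META, LOAD_EXISTING_META, FAIL_FAST }

    /**
     * Decide what to do at startup.
     *
     * @param metaLocationInZk whether a meta location is recorded in ZK
     * @param metaDirOnFs      whether a meta table directory already exists
     */
    static Action decide(boolean metaLocationInZk, boolean metaDirOnFs) {
        if (!metaDirOnFs) {
            // Nothing on the filesystem: treat as a brand-new cluster.
            return Action.INIT_NEW_META;
        }
        if (metaLocationInZk) {
            // Both present: a normal restart, reuse the existing meta.
            return Action.LOAD_EXISTING_META;
        }
        // Meta data exists but ZK knows nothing about it: do not delete
        // the directory; surface the inconsistency to the operator.
        return Action.FAIL_FAST;
    }

    /** Applies the decision, throwing instead of deleting existing data. */
    static Action bootstrap(boolean metaLocationInZk, boolean metaDirOnFs) {
        Action action = decide(metaLocationInZk, metaDirOnFs);
        if (action == Action.FAIL_FAST) {
            throw new IllegalStateException(
                "meta directory exists but no meta location is recorded; "
                + "refusing to delete it");
        }
        return action;
    }
}
```

With this shape, the cloud use case described above (existing HFiles, no ZK data) would hit the exception, and whatever handles it would decide how to continue.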
   
   So, for this PR, if we don't hear any other critical suggestions, maybe I will close it as unresolved. Do you guys agree?

