Posted to commits@hbase.apache.org by st...@apache.org on 2020/07/29 14:45:16 UTC

[hbase] branch master updated: HBASE-24766 Document Remote Procedure Execution (#2131)

This is an automated email from the ASF dual-hosted git repository.

stack pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hbase.git


The following commit(s) were added to refs/heads/master by this push:
     new 652f1e6  HBASE-24766 Document Remote Procedure Execution (#2131)
652f1e6 is described below

commit 652f1e664b5131c971905f860e5245c38f7489bf
Author: Michael Stack <sa...@users.noreply.github.com>
AuthorDate: Wed Jul 29 07:44:01 2020 -0700

    HBASE-24766 Document Remote Procedure Execution (#2131)
    
    Add outline of how the mechanism works generally.
    
    Signed-off-by: Anoop Sam John anoopsamjohn@apache.org
    Signed-off-by: Viraj Jasani vjasani@apache.org
    Signed-off-by: Jan Hentschel jan.hentschel@ultratendency.com
---
 .../master/procedure/ServerRemoteProcedure.java    | 45 +++++++++++++++-------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerRemoteProcedure.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerRemoteProcedure.java
index 568b95d..9d35bcf 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerRemoteProcedure.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerRemoteProcedure.java
@@ -33,21 +33,40 @@ import org.slf4j.LoggerFactory;
 
 @InterfaceAudience.Private
 /**
- * This extract the common used methods of procedures which are send to remote servers. Developers
- * who extends this class only need to override remoteCallBuild() and complete(). This procedure
- * will help add the operation to {@link RSProcedureDispatcher}
+ * The base class for Procedures that run {@link java.util.concurrent.Callable}s on a (remote)
+ * RegionServer; e.g. asking a RegionServer to split a WAL file as a sub-procedure of
+ * the ServerCrashProcedure recovery process.
  *
- * If adding the operation to dispatcher failed, addOperationToNode will throw
- * FailedRemoteDispatchException, and this procedure will return null which procedure Executor will
- * mark this procedure as complete. Thus the upper layer of this procedure must have a way to
- * check if this procedure really succeed and how to deal with it.
+ * <p>To implement a new Procedure type, extend this class and override remoteCallBuild() and
+ * complete(). The dispatch and callback will be handled for you here, internally.
  *
- * If sending the operation to remote RS failed, dispatcher will call remoteCallFailed() to
- * handle this, which actually call remoteOperationDone with the exception.
- * If the targetServer crashed but this procedure has no response, than dispatcher will call
- * remoteOperationFailed() to handle this, which also calls remoteOperationDone with the exception.
- * If the operation is successful, then remoteOperationCompleted will be called and actually calls
- * the remoteOperationDone without exception.
+ * <p>The Procedure works as follows. It uses {@link RSProcedureDispatcher}, the same system
+ * used dispatching Region OPEN and CLOSE RPCs, to pass a Callable to a RegionServer. Examples
+ * include {@link org.apache.hadoop.hbase.regionserver.SplitWALCallable} and
+ * {@link org.apache.hadoop.hbase.replication.regionserver.RefreshPeerCallable}. Rather than
+ * assign/unassign, the Master calls #executeProcedures against the remote RegionServer wrapping
+ * a Callable in an {@link ExecuteProceduresRequest}. Upon successful dispatch,
+ * the Procedure then suspends itself on the Master-side and relinquishes its executor worker.
+ * On receipt, the RegionServer submits the Callable to its executor service. When the Callable
+ * completes, it adds itself to a queue on the RegionServer side for processing by a background
+ * thread, the {@link RemoteProcedureResultReporter}. It picks up the completed Callable from the
+ * queue and RPCs the master at #reportProcedureDone with the procedure id and whether success or
+ * failure. The master calls complete() setting success or failure state and then reschedules the
+ * suspended Procedure so it can finish.
+ *
+ * <p>Here are some details on operation:
+ * <p>If adding the operation to the dispatcher fails, addOperationToNode will throw
+ * FailedRemoteDispatchException, and this Procedure will return 'null'. The Procedure Executor
+ * will then mark this procedure as 'complete' (though we failed to dispatch our task). In this
+ * case, the upper layer of this procedure must have a way to check if this Procedure really
+ * succeeded or not and have appropriate handling.
+ *
+ * <p>If sending the operation to remote RS failed, dispatcher will call remoteCallFailed() to
+ * handle this which calls remoteOperationDone with the exception. If the targetServer crashed but
+ * this procedure has no response, then dispatcher will call remoteOperationFailed() which also
+ * calls remoteOperationDone with the exception. If the operation is successful, then
+ * remoteOperationCompleted will be called and actually calls the remoteOperationDone without
+ * exception.
  *
  * In remoteOperationDone, we'll check if the procedure is already get wake up by others. Then
  * developer could implement complete() based on their own purpose.