You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/04 00:55:08 UTC

[GitHub] [ozone] neils-dev commented on a diff in pull request #3389: HDDS-6433. Refactor OMFailoverProxyProvider to provide a base for OM and GrpcOM FailoverProxyProviders

neils-dev commented on code in PR #3389:
URL: https://github.com/apache/ozone/pull/3389#discussion_r886197608


##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/GrpcOMFailoverProxyProvider.java:
##########
@@ -48,93 +54,105 @@
  * connecting to another OM node from the list of proxies.
  */
 public class GrpcOMFailoverProxyProvider<T> extends
-    OMFailoverProxyProvider<T> {
-
-  private Map<String, String> omAddresses;
+    OMFailoverProxyProviderBase<T> {
+  public static final Logger LOG =
+      LoggerFactory.getLogger(GrpcOMFailoverProxyProvider.class);
 
   public GrpcOMFailoverProxyProvider(ConfigurationSource configuration,
-                                     UserGroupInformation ugi,
                                      String omServiceId,
                                      Class<T> protocol) throws IOException {
-    super(configuration, ugi, omServiceId, protocol);
+    super(configuration, omServiceId, protocol);
   }
 
   @Override
   protected void loadOMClientConfigs(ConfigurationSource config, String omSvcId)
       throws IOException {
-    // to be used for base class omProxies,
-    // ProxyInfo not applicable for gRPC, just need key set
-    Map<String, ProxyInfo<T>> omProxiesNodeIdKeyset = new HashMap<>();
-    // to be used for base class omProxyInfos
-    // OMProxyInfo not applicable for gRPC, just need key set
-    Map<String, OMProxyInfo> omProxyInfosNodeIdKeyset = new HashMap<>();
-    List<String> omNodeIDList = new ArrayList<>();
-    omAddresses = new HashMap<>();
 
     Collection<String> omNodeIds = OmUtils.getActiveOMNodeIds(config, omSvcId);
+    Map<String, ProxyInfo<T>> omProxies = new HashMap<>();
+    List<String> omNodeIDList = new ArrayList<>();
 
     for (String nodeId : OmUtils.emptyAsSingletonNull(omNodeIds)) {
-
       String rpcAddrKey = ConfUtils.addKeySuffixes(OZONE_OM_ADDRESS_KEY,
           omSvcId, nodeId);
-
       Optional<String> hostaddr = getHostNameFromConfigKeys(config,
           rpcAddrKey);
-
       OptionalInt hostport = HddsUtils.getNumberFromConfigKeys(config,
           ConfUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_GRPC_PORT_KEY,
               omSvcId, nodeId),
           OMConfigKeys.OZONE_OM_GRPC_PORT_KEY);
       if (nodeId == null) {
         nodeId = OzoneConsts.OM_DEFAULT_NODE_ID;
       }
-      omProxiesNodeIdKeyset.put(nodeId, null);
-      omProxyInfosNodeIdKeyset.put(nodeId, null);
       if (hostaddr.isPresent()) {
-        omAddresses.put(nodeId,
-            hostaddr.get() + ":"
-                + hostport.orElse(config
-                .getObject(GrpcOmTransport
-                    .GrpcOmTransportConfig.class)
-                .getPort()));
+        ProxyInfo<T> proxyInfo =
+            new ProxyInfo<>(createOMProxy(),
+                hostaddr.get() + ":"
+                    + hostport.orElse(config
+                    .getObject(GrpcOmTransport
+                        .GrpcOmTransportConfig.class)
+                    .getPort()));
+        omProxies.put(nodeId, proxyInfo);
       } else {
         LOG.error("expected host address not defined for: {}", rpcAddrKey);
         throw new ConfigurationException(rpcAddrKey + "is not defined");
       }
       omNodeIDList.add(nodeId);
     }
 
-    if (omProxiesNodeIdKeyset.isEmpty()) {
+    if (omProxies.isEmpty()) {
       throw new IllegalArgumentException("Could not find any configured " +
           "addresses for OM. Please configure the system with "
           + OZONE_OM_ADDRESS_KEY);
     }
+    setOmProxies(omProxies);
+    setOmNodeIDList(omNodeIDList);
+  }
 
-    // set base class omProxies, omProxyInfos, omNodeIDList
+  private T createOMProxy() throws IOException {
+    InetSocketAddress addr = new InetSocketAddress(0);
+    Configuration hadoopConf =
+        LegacyHadoopConfigurationSource.asHadoopConfiguration(getConf());
+    return (T) RPC.getProxy(getInterface(), 0, addr, hadoopConf);
+  }
 
-    // omProxies needed in base class
-    // omProxies.size == number of om nodes
-    // omProxies key needs to be valid nodeid
-    // omProxyInfos keyset needed in base class
-    setProxies(omProxiesNodeIdKeyset, omProxyInfosNodeIdKeyset, omNodeIDList);
+  /**
+   * Get the proxy object which should be used until the next failover event
+   * occurs. RPC proxy object is intialized lazily.
+   * @return the OM proxy object to invoke methods upon
+   */
+  @Override
+  public synchronized ProxyInfo<T> getProxy() {
+    return getOMProxyMap().get(getCurrentProxyOMNodeId());
   }
 
   @Override
-  protected Text computeDelegationTokenService() {
-    return new Text();
+  protected synchronized boolean shouldFailover(Exception ex) {
+    if (ex instanceof StatusRuntimeException) {
+      StatusRuntimeException srexp = (StatusRuntimeException)ex;
+      Status status = srexp.getStatus();
+      if (status.getCode() == Status.Code.RESOURCE_EXHAUSTED) {

Review Comment:
   Thanks @kerneltime for bringing up gRPC errors which we should not failover and retry.  From the list of possible codes, including `FAILED_PRECONDITION` and `INVALID_ARGUMENT,` for the most part they are a result of application level errors that are thrown and handled by the business logic.  In our case, for errors at the application layer, these are thrown as OMExceptions that are sent back to the caller through OMResponses.  Because of this errors such as `FAILED_PRECONDITION` and `PERMISSION_ERRORS` are not thrown and handled by the transport, rather they are thrown and handled by the business logic through an error OMResponse.
   
   From the list of possible codes, I am adding `DATA_LOSS` in addition to `RESOURCE_EXHAUSTED` as errors we do not perform failover.  See list of possible codes and client handling for each possibility in shared doc 
   https://docs.google.com/document/d/1EXepGLluPxQfX7dJE6nR6hSYk1wQz4YzVnwZ6xOe8Xo/edit?usp=sharing.
   
   Let me know if we should include additional codes that the client should NOT failover.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org