Posted to yarn-issues@hadoop.apache.org by "Peter Bacsko (JIRA)" <ji...@apache.org> on 2019/05/29 12:33:00 UTC

[jira] [Comment Edited] (YARN-9581) WebAppUtils#getRMWebAppURLWithScheme ignores rm2

    [ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850809#comment-16850809 ] 

Peter Bacsko edited comment on YARN-9581 at 5/29/19 12:32 PM:
--------------------------------------------------------------

Thanks for the patch [~Prabhu Joseph]. Overall looks good, but this piece of code keeps showing up:
{noformat}
    String webAppAddress = WebAppUtils.getRMWebAppURLWithScheme(conf, 0);
    try {
      return getAMContainerInfoFromRM(appId, webAppAddress);
    } catch (Exception e) {
      if (HAUtil.isHAEnabled(conf)) {
        webAppAddress = WebAppUtils.getRMWebAppURLWithScheme(conf, 1);
        return getAMContainerInfoFromRM(appId, webAppAddress);
      }
      throw e;
    }
  }
{noformat}
I've been thinking about removing the duplication, which requires lambdas. Right now we have this pattern in three places, so I'm not sure it's worth the trouble, but I'll show my solution regardless.

Add this to, e.g., {{WebAppUtils}}:
{noformat}
  // Runs func against rm1 and, if that fails and HA is enabled, retries on rm2.
  public static <T, R> R execOnActiveRM(Configuration conf,
      ThrowingBiFunction<String, T, R> func, T arg) throws Exception {
    String rm1Address = getRMWebAppURLWithScheme(conf, 0);
    try {
      return func.apply(rm1Address, arg);
    } catch (Exception e) {
      if (HAUtil.isHAEnabled(conf)) {
        String rm2Address = getRMWebAppURLWithScheme(conf, 1);
        LOG.info("RM on host {} is unavailable, trying {}", rm1Address, rm2Address);
        return func.apply(rm2Address, arg);
      }
      // Non-HA setup: there is no second RM to try, so propagate the failure.
      LOG.error("Error connecting to RM");
      throw e;
    }
  }

  @FunctionalInterface
  public interface ThrowingBiFunction<T, U, R> {
    R apply(T t, U u) throws Exception;
  }
{noformat}
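To make the control flow easy to try outside Hadoop, here is a minimal, self-contained sketch of the same fallback idea. The names {{FallbackDemo}}, {{execWithFallback}} and {{fetch}} are hypothetical stand-ins, and the HA check and address lookup are replaced by plain arguments; in the proposal above they would come from {{HAUtil.isHAEnabled}} and {{getRMWebAppURLWithScheme}}.

```java
import java.io.IOException;

public class FallbackDemo {

  /** Same shape as the ThrowingBiFunction proposed above. */
  @FunctionalInterface
  public interface ThrowingBiFunction<T, U, R> {
    R apply(T t, U u) throws Exception;
  }

  /**
   * Calls func against the primary address and, only when haEnabled is true,
   * retries once against the secondary address if the first call throws.
   */
  public static <T, R> R execWithFallback(String primary, String secondary,
      boolean haEnabled, ThrowingBiFunction<String, T, R> func, T arg)
      throws Exception {
    try {
      return func.apply(primary, arg);
    } catch (Exception e) {
      if (haEnabled) {
        return func.apply(secondary, arg);
      }
      throw e;
    }
  }

  /** Hypothetical stand-in for an RM call: only "rm2" is reachable. */
  public static String fetch(String address, String appId) throws IOException {
    if (!"rm2".equals(address)) {
      throw new IOException("Connection refused: " + address);
    }
    return appId + "@" + address;
  }

  public static void main(String[] args) throws Exception {
    // The call against rm1 throws, so it is retried against rm2.
    String result =
        execWithFallback("rm1", "rm2", true, FallbackDemo::fetch, "app_1");
    System.out.println(result); // prints "app_1@rm2"
  }
}
```

With {{haEnabled}} set to false the same call rethrows the {{IOException}} from the first attempt, which mirrors the non-HA branch of {{execOnActiveRM}}.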
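The repeated invocations are then reduced to the following. Note that {{func}} receives the RM address as its first argument, so each referenced helper has to take the address first and its own argument second: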

{{LogsCLI.java}}:
{noformat}
  protected List<JSONObject> getAMContainerInfoForRMWebService(
      Configuration conf, String appId) throws Exception {

    return WebAppUtils.execOnActiveRM(conf,
        this::getAMContainerInfoFromRM, appId);
  }
{noformat}
{{SchedConfCLI.java}}:
{noformat}
   ... (in SchedConfCLI.run())

    return WebAppUtils.execOnActiveRM(conf,
        this::updateSchedulerConfOnRMNode, updateInfo);
{noformat}
{{YarnWebServiceUtils.java}}:
{noformat}
  public static JSONObject getNodeInfoFromRMWebService(Configuration conf,
      String nodeId) throws ClientHandlerException,
      UniformInterfaceException {
    
    try {
      return WebAppUtils.execOnActiveRM(conf,
          YarnWebServiceUtils::getNodeInfoFromRM, nodeId);
    } catch (Exception e) {
      if (e instanceof ClientHandlerException) {
        throw ((ClientHandlerException) e);
      } else if (e instanceof UniformInterfaceException) {
        throw ((UniformInterfaceException) e);
      } else {
        throw new RuntimeException(e);
      }
    }
  }
{noformat}
Exception handling causes minor issues: e.g., {{Exception}} has to be caught in {{getNodeInfoFromRMWebService}}, because {{execOnActiveRM}} declares {{throws Exception}} while the individual methods have different {{throws}} clauses.
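One possible way to soften this, offered only as a sketch and not as part of the patch, is to give the functional interface a type parameter for the checked exception, so that the helper declares exactly what the callee declares. All names below ({{TypedFallbackDemo}}, the four-parameter {{ThrowingBiFunction}}, {{execWithFallback}}, {{fetch}}) are hypothetical, and the HA check and address lookup are again replaced by plain arguments.

```java
import java.io.IOException;

public class TypedFallbackDemo {

  /** Hypothetical variant: E is the checked exception the callee may throw. */
  @FunctionalInterface
  public interface ThrowingBiFunction<T, U, R, E extends Exception> {
    R apply(T t, U u) throws E;
  }

  /** Same fallback logic, but the throws clause follows the callee's. */
  public static <T, R, E extends Exception> R execWithFallback(
      String primary, String secondary, boolean haEnabled,
      ThrowingBiFunction<String, T, R, E> func, T arg) throws E {
    try {
      return func.apply(primary, arg);
    } catch (Exception e) {
      if (haEnabled) {
        return func.apply(secondary, arg);
      }
      // Precise rethrow: only E or an unchecked exception can reach here.
      throw e;
    }
  }

  /** Hypothetical stand-in for a callee that declares only IOException. */
  public static String fetch(String address, String id) throws IOException {
    if (!"rm2".equals(address)) {
      throw new IOException("Connection refused: " + address);
    }
    return id + "@" + address;
  }

  /** The caller needs no catch-all block; E is inferred as IOException. */
  public static String fetchWithFallback(String id) throws IOException {
    return execWithFallback("rm1", "rm2", true, TypedFallbackDemo::fetch, id);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(fetchWithFallback("node_1")); // prints "node_1@rm2"
  }
}
```

The trade-off is a noisier generic signature, and a callee that declares several unrelated checked exceptions would still need a common supertype for {{E}}.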

It's slightly more elegant, and we have the fallback logic in a single place. From a clean-code perspective, that's a win.

[~adam.antal] [~snemeth] opinions?


> WebAppUtils#getRMWebAppURLWithScheme ignores rm2
> ------------------------------------------------
>
>                 Key: YARN-9581
>                 URL: https://issues.apache.org/jira/browse/YARN-9581
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 3.2.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9581-001.patch, YARN-9581-002.patch, YARN-9581-003.patch
>
>
> {{yarn logs}} fails for a running job with RM HA enabled when rm2 is active and rm1 is down.
> {code}
> hrt_qa@prabhuYarn:~> /usr/hdp/current/hadoop-yarn-client/bin/yarn  logs -applicationId application_1558613472348_0004 -am 1
> 19/05/24 18:04:49 INFO client.AHSProxy: Connecting to Application History server at prabhuYarn/172.27.23.55:10200
> 19/05/24 18:04:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> Unable to get AM container informations for the application:application_1558613472348_0004
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: https://prabhuYarn:8090/ws/v1/cluster/apps/application_1558613472348_0004/appattempts
> Can not get AMContainers logs for the application:application_1558613472348_0004 with the appOwner:hrt_qa
> {code}
> For LogsCLI, getRMWebAppURLWithoutScheme only checks the first entry of the RM list yarn.resourcemanager.ha.rm-ids:
> {code}
> yarnConfig.set(YarnConfiguration.RM_HA_ID, rmIds.get(0));
> {code}
> SchedConfCLI also fails:
> {code}
> [ambari-qa@pjosephdocker-3 ~]$ yarn  schedulerconf -update root.default:maximum-capacity=90
> Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused (Connection refused)
> 	at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
> 	at com.sun.jersey.api.client.Client.handle(Client.java:652)
> 	at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
> {code}


