Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2016/09/16 19:55:21 UTC

[jira] [Comment Edited] (HIVE-14624) LLAP: Use FQDN for all communication

    [ https://issues.apache.org/jira/browse/HIVE-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497217#comment-15497217 ] 

Sergey Shelukhin edited comment on HIVE-14624 at 9/16/16 7:55 PM:
------------------------------------------------------------------

Fixed the build, added logging after address creation.
[~sseth] this addresses the case where the AM LLAP plugin sends a short hostname to LLAP as AmHost; LLAP is then unable to connect and fails with:
{noformat}
2016-09-15T23:42:22,092 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable: TezTaskRunner execution failed for : AppId=application_1473964966092_0005, containerId=container_222212222_0005_01_000001, Dag=insert into x values(1),(2)(Stage-1), Vertex=Map 1, FragmentNum=0, Attempt=0
java.lang.IllegalArgumentException: java.net.UnknownHostException: [snip]
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411) ~[hadoop-common-2.7.3.2.5.1.0-12.jar:?]
	at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:390) ~[hadoop-common-2.7.3.2.5.1.0-12.jar:?]
	at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:219) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
	at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:91) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.net.UnknownHostException: [snip]
	... 9 more
{noformat}
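
For readers following the trace, here is a minimal sketch of how an unresolvable short name can surface as an IllegalArgumentException wrapping an UnknownHostException. It is a stand-in written for illustration, not Hadoop's SecurityUtil code: the class TokenServiceSketch, its buildTokenService method, the host name "short-host" and the port are all hypothetical, and the assumption that the real check rejects unresolved addresses is inferred only from the exception chain above.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class TokenServiceSketch {

    // Hypothetical stand-in for the check that fails in the trace above.
    // Assumption: an address whose host could not be resolved is rejected.
    static String buildTokenService(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            // Wrapping the cause gives the "IllegalArgumentException: UnknownHostException: <host>" shape.
            throw new IllegalArgumentException(new UnknownHostException(addr.getHostString()));
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved never performs a lookup, so this models a short
        // name that the receiving side cannot resolve in its own domain.
        InetSocketAddress amAddress = InetSocketAddress.createUnresolved("short-host", 15001);
        try {
            buildTokenService(amAddress);
        } catch (IllegalArgumentException e) {
            System.out.println("Failed as in the trace: " + e);
        }
    }
}
```

Sending a name that the receiving side can resolve (for example the FQDN) avoids the unresolved branch entirely.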

[~leftylev] FQDN is a well-known, industry-wide abbreviation; I don't think it needs to be expanded.


was (Author: sershe):
Fixed the build, added logging after address creation.
[~sseth] this addresses the case where AM LLAP plugin sends short name to LLAP as AmHost; LLAP is then unable to connect due to 
{noformat}
2016-09-15T23:42:22,092 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable: TezTaskRunner execution failed for : AppId=application_1473964966092_0005, containerId=container_222212222_0005_01_000001, Dag=insert into x values(1),(2)(Stage-1), Vertex=Map 1, FragmentNum=0, Attempt=0
java.lang.IllegalArgumentException: java.net.UnknownHostException: [snip]
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411) ~[hadoop-common-2.7.3.2.5.1.0-12.jar:?]
	at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:390) ~[hadoop-common-2.7.3.2.5.1.0-12.jar:?]
	at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:219) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
	at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:91) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.9.0-SNAPSHOT.jar:0.9.0-SNAPSHOT]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.net.UnknownHostException: wn0-f363e2
	... 9 more
{noformat}

[~leftylev] fqdn is a well-known, industry-wide abbreviation, I don't think it needs to be expanded.

> LLAP: Use FQDN for all communication 
> -------------------------------------
>
>                 Key: HIVE-14624
>                 URL: https://issues.apache.org/jira/browse/HIVE-14624
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14624.01.patch, HIVE-14624.02.patch, HIVE-14624.patch
>
>
> {code}
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java:                + socketAddress.getHostName());
> llap-client/src/java/org/apache/hadoop/hive/llap/registry/impl/LlapFixedRegistryImpl.java:            host = socketAddress.getHostName();
> llap-common/src/java/org/apache/hadoop/hive/llap/metrics/MetricsUtils.java:  public static String getHostName() {
> llap-common/src/java/org/apache/hadoop/hive/llap/metrics/MetricsUtils.java:      return InetAddress.getLocalHost().getHostName();
> llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java:    String name = address.getHostName();
> llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java:    builder.setAmHost(address.getHostName());
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/AMReporter.java:    nodeId = LlapNodeId.getInstance(localAddress.get().getHostName(), localAddress.get().getPort());
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java:        localAddress.get().getHostName(), vertex.getDagName(), qIdProto.getDagIndex(),
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java:          new ExecutionContextImpl(localAddress.get().getHostName()), env,
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java:    String hostName = MetricsUtils.getHostName();
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapProtocolServerImpl.java:        .setBindAddress(addr.getHostName())
> llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java:          request.getContainerIdString(), executionContext.getHostName(), vertex.getDagName(),
> llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java:    String displayName = "LlapDaemonCacheMetrics-" + MetricsUtils.getHostName();
> llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java:    displayName = "LlapDaemonIOMetrics-" + MetricsUtils.getHostName();
> llap-server/src/test/org/apache/hadoop/hive/llap/daemon/impl/TestLlapDaemonProtocolServerImpl.java:          new LlapProtocolClientImpl(new Configuration(), serverAddr.getHostName(),
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java:    builder.setAmHost(getAddress().getHostName());
> llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java:      String displayName = "LlapTaskSchedulerMetrics-" + MetricsUtils.getHostName();
> {code}
> In systems where the hostname does not match the FQDN, calling getCanonicalHostName() instead of getHostName() allows the hostname to be resolved when it is accessed from a different base domain.
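
The difference between the two calls can be sketched with the JDK alone. This is an illustration, not the HIVE-14624 patch: the class FqdnSketch and the helper fqdnOf are hypothetical names, and the fallback for unresolved addresses is an assumption made so the helper never returns null.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class FqdnSketch {

    // Hypothetical helper: prefer the canonical (fully qualified) name of a socket address.
    static String fqdnOf(InetSocketAddress address) {
        InetAddress resolved = address.getAddress(); // null when the address is unresolved
        if (resolved == null) {
            // Nothing to canonicalize; return the name as given, without a lookup.
            return address.getHostString();
        }
        // Asks the resolver for the fully qualified name; the JDK falls back
        // to the textual IP address if that lookup cannot be performed.
        return resolved.getCanonicalHostName();
    }

    public static void main(String[] args) {
        try {
            InetAddress local = InetAddress.getLocalHost();
            // On hosts configured with a short name these two lines can differ.
            System.out.println("getHostName():          " + local.getHostName());
            System.out.println("getCanonicalHostName(): " + local.getCanonicalHostName());
        } catch (UnknownHostException e) {
            System.out.println("Local host name could not be resolved: " + e.getMessage());
        }
    }
}
```

Whether the two names differ depends on the local resolver configuration (for example /etc/hosts ordering or the DNS search domain), so the output is environment-specific.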


