Posted to user@kylin.apache.org by op <52...@qq.com> on 2017/10/13 07:13:32 UTC

Re: yarn configuration problem when building kylin

Hi, Shuangyin Ge,
Our cluster contains 63 datanodes; the resourcemanagers and namenodes are set up on the same 2 nodes, both with HA enabled. They have been working stably for some years. Do you think we have to change some configurations?
We put Kylin on client node 129, and the resourcemanagers are on nodes 225 and 236.


In addition, can you speak Chinese?


thanks
------------------ Original Message ------------------
From: "Shuangyin Ge";<go...@gmail.com>;
Sent: Friday, October 13, 2017, 3:03 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello op,


Can you try to specify yarn.resourcemanager.hostname.rm1 and yarn.resourcemanager.hostname.rm2 in yarn-site.xml as well following https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html?
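For example, the two properties could look like this in yarn-site.xml (a sketch only; hadoop001 and hadoop011 are taken from the rm1/rm2 addresses in the configuration quoted below, so adjust them to your actual ResourceManager hosts):

<property>
   <name>yarn.resourcemanager.hostname.rm1</name>
   <value>hadoop001</value>
</property>

<property>
   <name>yarn.resourcemanager.hostname.rm2</name>
   <value>hadoop011</value>
</property>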

2017-10-13 14:44 GMT+08:00 op <52...@qq.com>:
When I am building my cube, the progress is always pending. Then I found this in kylin.log: it can't connect to the correct resourcemanager address. I've checked my environment; can you give me some advice?


2017-10-13 14:33:48,978 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] client.RMProxy:56 : Connecting to ResourceManager at /0.0.0.0:8032
2017-10-13 14:33:50,061 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:51,062 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:52,063 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:53,064 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:54,065 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:55,067 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:56,068 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:57,069 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:58,070 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:59,071 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:00,072 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:01,073 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:02,074 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:03,075 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:04,076 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)



My YARN has HA enabled; here are some of the configurations:


<property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>boh</value>
   <final>false</final>
</property>  


<property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm1</name>
   <value>hadoop001:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm1</name>
   <value>hadoop001:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>hadoop001:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm1</name>
   <value>hadoop001:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm1</name>
   <value>hadoop001:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm1</name>
   <value>hadoop001:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm2</name>
   <value>hadoop011:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm2</name>
   <value>hadoop011:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
   <value>hadoop011:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm2</name>
   <value>hadoop011:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm2</name>
   <value>hadoop011:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm2</name>
   <value>hadoop011:23141</value>
   <final>false</final>
</property>

Re: yarn configuration problem when building kylin

Posted by op <52...@qq.com>.
My Hadoop version:
[hadoop@hadoop006 root]$ hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /opt/beh/core/hadoop/share/hadoop/common/hadoop-common-2.6.4.jar



My yarn-site.xml:


thanks


-------------------------------------------------------------------------------
<configuration>


<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
   <final>false</final>
</property> 


<property>
   <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>boh</value>
   <final>false</final>
</property>  


<property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm1</name>
   <value>hadoop001:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm1</name>
   <value>hadoop001:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>hadoop001:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm1</name>
   <value>hadoop001:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm1</name>
   <value>hadoop001:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm1</name>
   <value>hadoop001:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm2</name>
   <value>hadoop011:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm2</name>
   <value>hadoop011:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
   <value>hadoop011:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm2</name>
   <value>hadoop011:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm2</name>
   <value>hadoop011:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm2</name>
   <value>hadoop011:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.address</name>
   <value>0.0.0.0:23998</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.webapp.address</name>
   <value>0.0.0.0:23999</value>
   <final>false</final>
</property>


<property>
   <name>mapreduce.shuffle.port</name>
   <value>23080</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.ha.enabled</name>
   <value>true</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
   <value>true</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.recovery.enabled</name>
   <value>true</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.recovery.enabled</name>
   <value>true</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.recovery.dir</name>
   <value>/opt/beh/data/yarn/yarn-nm-recovery</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.connect.retry-interval.ms</name>
   <value>2000</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.zk-address</name>
   <value>hadoop001:2181,hadoop012:2181,hadoop011:2181</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.zk.state-store.address</name>
   <value>hadoop001:2181,hadoop012:2181,hadoop011:2181</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.store.class</name>
   <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.local-dirs</name>
   <value>/data/disk1/tmp/yarn/local,/data/disk2/tmp/yarn/local,/data/disk3/tmp/yarn/local,/data/disk4/tmp/yarn/local,/data/disk5/tmp/yarn/local,/data/disk6/tmp/yarn/local,/data/disk7/tmp/yarn/local,/data/disk8/tmp/yarn/local,/data/disk9/tmp/yarn/local,/data/disk10/tmp/yarn/local,/data/disk11/tmp/yarn/local,/data/disk12/tmp/yarn/local</value>
   <final>false</final>
</property> 




<!--Compute Resource Setting Start-->
<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <value>122880</value>
   <final>false</final>
   <description>Amount of physical memory, in MB, that can be allocated for containers.Default:8192</description>
</property>


<property>
   <name>yarn.scheduler.minimum-allocation-mb</name>
   <value>4096</value>
   <description>The minimum allocation for every container request at the RM, in MBs. Memory requests lower than this will throw a InvalidResourceRequestException.Default:1024</description>
</property>


<property>
   <name>yarn.scheduler.maximum-allocation-mb</name>
   <value>16384</value>
   <description>The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this will throw a InvalidResourceRequestException.Default:8192</description>
</property>


<property>
   <name>yarn.nodemanager.resource.cpu-vcores</name>
   <value>48</value>
   <final>false</final>
   <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of physical cores used by YARN containers.Default:8</description>
</property>


<property>
   <name>yarn.scheduler.minimum-allocation-vcores</name>
   <value>1</value>
   <final>false</final>
   <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this will throw a InvalidResourceRequestException.Default:1</description>
</property>


<property>
   <name>yarn.scheduler.maximum-allocation-vcores</name>
   <value>4</value>
   <final>false</final>
   <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this will throw a InvalidResourceRequestException.Default:32</description>
</property>


<property>
   <name>yarn.nodemanager.vmem-pmem-ratio</name>
   <value>2.1</value>
   <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.Default:2.1</description>
</property>


<property>
   <name>yarn.nodemanager.pmem-check-enabled</name>
   <value>true</value>
   <description>Whether physical memory limits will be enforced for containers.</description>
</property>


<property>
   <name>yarn.nodemanager.vmem-check-enabled</name>
   <value>false</value>
   <description>Whether virtual memory limits will be enforced for containers.</description>
</property>


<!--Compute Resource Setting End-->


<!--Uber Mode Start-->
<property>
   <name>mapreduce.job.ubertask.enable</name>
   <value>true</value>
   <description>Whether to enable the small-jobs "ubertask" optimization, which runs "sufficiently small" jobs sequentially within a single JVM. "Small" is defined by the following maxmaps, maxreduces, and maxbytes settings. Note that configurations for application masters also affect the "Small" definition - yarn.app.mapreduce.am.resource.mb must be larger than both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb, and yarn.app.mapreduce.am.resource.cpu-vcores must be larger than both mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores to enable ubertask. Users may override this value.
   </description>
</property>
   
<property>
   <name>mapreduce.job.ubertask.maxmaps</name>
   <value>9</value>
   <description>Threshold for number of maps, beyond which job is considered too big for the ubertasking optimization. Users may override this value, but only downward.</description>
</property>


<property>
   <name>mapreduce.job.ubertask.maxreduces</name>
   <value>1</value>
   <description>Threshold for number of reduces, beyond which job is considered too big for the ubertasking optimization. CURRENTLY THE CODE CANNOT SUPPORT MORE THAN ONE REDUCE and will ignore larger values. (Zero is a valid max, however.) Users may override this value, but only downward.</description>
</property>


<property>
   <name>mapreduce.job.ubertask.maxbytes</name>
   <value>134217728</value>
   <description>Threshold for number of input bytes, beyond which job is considered too big for the ubertasking optimization. If no value is specified, dfs.block.size is used as a default. Be sure to specify a default value in mapred-site.xml if the underlying filesystem is not HDFS. Users may override this value, but only downward.</description>
</property>


<property>
   <name>yarn.app.mapreduce.am.env</name>
   <value>LD_LIBRARY_PATH=$HADOOP_HOME/lib/native</value>
</property>
<!--Uber Mode Stop-->


<property>
   <name>yarn.nodemanager.log-dirs</name>
   <value>/opt/beh/logs/yarn/userlogs</value>
   <final>false</final>
</property>


<!--LOG Aggregation Start-->
<property>
   <name>yarn.log-aggregation-enable</name>
   <value>true</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.localizer.address</name>
   <value>0.0.0.0:23344</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.remote-app-log-dir</name>
   <value>hdfs://boh/var/log/hadoop-yarn/apps</value>
   <final>false</final>
</property>




<property>
   <name>yarn.log-aggregation.retain-seconds</name>
   <value>2592000</value>
   <final>false</final>
</property>




<property>
   <name>yarn.log-aggregation.retain-check-interval-seconds</name>
   <value>3600</value>
   <final>false</final>
</property>


<property>
   <name>yarn.nodemanager.log-aggregation.compression-type</name>
   <value>none</value>
</property>


<!--LOG Aggregation End-->


<property>
   <name>yarn.resourcemanager.scheduler.class</name>
   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
   <description>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</description>
   <final>false</final>
</property>


<property>
   <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
   <value>5000</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
   <value>50</value>
   <final>false</final>
   <description>Number of threads to handle resource tracker calls.default:50</description>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.client.thread-count</name>
   <value>50</value>
   <final>false</final>
   <description>Number of threads to handle scheduler interface.default:50</description>
</property>


<!--Disk Health Checker Start-->


<property>
   <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
   <value>0.25</value>
   <description>The minimum fraction of number of disks to be healthy for the nodemanager to launch new containers. This correspond to both yarn-nodemanager.local-dirs and yarn.nodemanager.log-dirs. i.e. If there are less number of healthy local-dirs (or log-dirs) available, then new containers will not be launched on this node.Default:0.25</description>
</property>


<property>
   <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
   <value>90.0</value>
   <description>The maximum percentage of disk space utilization allowed after which a disk is marked as bad. Values can range from 0.0 to 100.0. If the value is greater than or equal to 100, the nodemanager will check for full disk. This applies to yarn-nodemanager.local-dirs and yarn.nodemanager.log-dirs.Default:90.0</description>
</property>


<property>
   <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
   <value>1024</value>
   <description>The minimum space that must be available on a disk for it to be used. This applies to yarn-nodemanager.local-dirs and yarn.nodemanager.log-dirs.Default:0</description>
</property>


<!--Disk Health Checker End-->


<!--Debug Delay Start-->
<property>
   <name>yarn.nodemanager.delete.debug-delay-sec</name>
   <value>600</value>
   <description>Number of seconds after an application finishes before the nodemanager's DeletionService will delete the application's localized file directory and log directory. To diagnose Yarn application problems, set this property's value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. After changing the property's value, you must restart the nodemanager in order for it to have an effect. The roots of Yarn applications' work directories is configurable with the yarn.nodemanager.local-dirs property (see below), and the roots of the Yarn applications' log directories is configurable with the yarn.nodemanager.log-dirs property (see also below).Default:0</description>
</property>
<!--Debug Delay End-->




<!--Health Checker Start-->
<property>
   <name>yarn.nodemanager.health-checker.interval-ms</name>
   <value>60000</value>
   <description>Frequency of running node health script.Default:60000</description>
</property>


<property>
   <name>yarn.nodemanager.health-checker.script.timeout-ms</name>
   <value>60000</value>
   <description>Script time out period.Default:120000</description>
</property>
<!--Health Checker End-->


<!--YARN Timeline Server Start-->


<property>
   <name>yarn.timeline-service.enabled</name>
   <value>true</value>
</property>


<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>hadoop001</value>
</property>


<property>
   <name>yarn.timeline-service.address</name>
   <value>hadoop001:10200</value>
</property>


<property>
   <name>yarn.timeline-service.webapp.address</name>
   <value>hadoop001:8188</value>
</property>


<property>
   <name>yarn.timeline-service.webapp.https.address</name>
   <value>hadoop001:8190</value>
</property>


<property>
   <name>yarn.timeline-service.ttl-enable</name>
   <value>true</value>
</property>


<property>
   <name>yarn.timeline-service.ttl-ms</name>
   <value>2678400000</value>
</property>


<property>
   <name>yarn.timeline-service.store-class</name>
   <value>org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore</value>
</property>


<property>
   <name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
   <value>300000</value>
</property>


<property>
   <name>yarn.timeline-service.leveldb-timeline-store.path</name>
   <value>/opt/beh/data/yarn/timeline</value>
</property>


<property>
   <name>yarn.timeline-service.generic-application-history.enabled</name>
   <value>true</value>
</property>




<property>
   <name>yarn.timeline-service.generic-application-history.store-class</name>
   <value>org.apache.hadoop.yarn.server.applicationhistoryservice.NullApplicationHistoryStore</value>
</property>


<property>
   <description>Handler thread count to serve the client RPC requests.</description>
   <name>yarn.timeline-service.handler-thread-count</name>
   <value>10</value>
</property>




<property>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>


<property>
  <description>Enables cross-origin support (CORS) for web services wherecross-origin web response headers are needed. For example, javascript making a web services request to the timeline server.</description>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
  <!-- Supported only since Hadoop 2.6.0; for details see http://search-hadoop.com/m/tQlTsMD%26subj=Tez+nbsp+taskcount+log+visualization -->
</property>


<property>
  <description>Comma separated list of origins that are allowed for web
  services needing cross-origin (CORS) support. Wildcards (*) and patterns
  allowed</description>
  <name>yarn.timeline-service.http-cross-origin.allowed-origins</name>
  <value>*</value>
</property>


<property>
  <description>Comma separated list of methods that are allowed for web
  services needing cross-origin (CORS) support.</description>
  <name>yarn.timeline-service.http-cross-origin.allowed-methods</name>
  <value>GET,POST,HEAD</value>
</property>


<property>
  <description>Comma separated list of headers that are allowed for web
  services needing cross-origin (CORS) support.</description>
  <name>yarn.timeline-service.http-cross-origin.allowed-headers</name>
  <value>X-Requested-With,Content-Type,Accept,Origin</value>
</property>


<property>
  <description>The number of seconds a pre-flighted request can be cached
  for web services needing cross-origin (CORS) support.</description>
  <name>yarn.timeline-service.http-cross-origin.max-age</name>
  <value>1800</value>
</property>




<!--YARN Timeline Server END-->


<!--ACL Start-->
<property>
   <name>yarn.acl.enable</name>
   <value>true</value>
</property>
                    
<property>
   <name>yarn.admin.acl</name>
   <value>*</value>
</property>
<!--ACL End-->


</configuration>



------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@apache.org>;
Sent: Sunday, October 15, 2017, 9:24 AM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Kylin has supported YARN HA since a very early version, and many users are running it this way.

If you can provide the Hadoop version and the full yarn-site.xml, that would make the investigation easier.


2017-10-14 16:24 GMT+08:00 op <52...@qq.com>:
Doesn't apache-kylin-2.0.0-bin-hbase098.tar.gz support YARN HA? I just changed my yarn-site.xml to disable YARN HA, and then the resourcemanager could be detected successfully.
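For reference, a minimal sketch of what the non-HA settings might look like in yarn-site.xml (hadoop001 and port 23140 are simply taken from the rm1 entries elsewhere in this thread; the exact values used are not shown here):

<property>
   <name>yarn.resourcemanager.ha.enabled</name>
   <value>false</value>
</property>

<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>hadoop001</value>
</property>

<property>
   <name>yarn.resourcemanager.address</name>
   <value>hadoop001:23140</value>
</property>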


------------------ Original Message ------------------
From: "╰╮爱ャ国灬";<52...@qq.com>;
Sent: Friday, October 13, 2017, 5:21 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello ShaoFeng,
The situation above is: when building a cube, the first two steps finish successfully, but it gets stuck at the third step, with the log printing "Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried x time(s)" without stopping.







------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@apache.org>;
Sent: Friday, October 13, 2017, 5:02 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Obviously, Kylin wasn't aware of your yarn-site.xml, causing it to connect to a non-existent address.

Please check whether the right yarn-site.xml is in the Hadoop configuration folder, e.g., /etc/hadoop/conf. You can also try to run a sample Hadoop job from the Kylin node, to verify whether the node is properly configured.
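For example, a quick check from the Kylin node could be submitting one of the bundled example jobs (a sketch; the examples jar path assumes a standard Hadoop 2.6.4 binary layout under $HADOOP_HOME):

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar pi 2 10

If this job also hangs at "Connecting to ResourceManager at /0.0.0.0:8032", the problem is in the node's Hadoop client configuration rather than in Kylin.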


BTW, English is the recommended language for communication, because Kylin users are from different countries. 


2017-10-13 15:13 GMT+08:00 op <52...@qq.com>:
Hi, Shuangyin Ge,
Our cluster contains 63 datanodes; the resourcemanagers and namenodes are set up on the same 2 nodes, both with HA enabled. They have been working stably for some years. Do you think we have to change some configurations?
We put Kylin on client node 129, and the resourcemanagers are on nodes 225 and 236.


In addition, can you speak Chinese?


thanks
------------------ Original Message ------------------
From: "Shuangyin Ge";<go...@gmail.com>;
Sent: Friday, October 13, 2017, 3:03 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello op,


Can you try to specify yarn.resourcemanager.hostname.rm1 and yarn.resourcemanager.hostname.rm2 in yarn-site.xml as well following https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html?

2017-10-13 14:44 GMT+08:00 op <52...@qq.com>:
When I am building my cube, the progress is always pending. Then I found this in kylin.log: it can't connect to the correct resourcemanager address. I've checked my environment; can you give me some advice?


2017-10-13 14:33:48,978 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] client.RMProxy:56 : Connecting to ResourceManager at /0.0.0.0:8032
2017-10-13 14:33:50,061 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:51,062 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:52,063 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:53,064 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:54,065 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:55,067 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:56,068 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:57,069 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:58,070 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:59,071 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:00,072 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:01,073 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:02,074 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:03,075 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:04,076 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)



My YARN has HA enabled; here are some of the configurations:


<property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>boh</value>
   <final>false</final>
</property>  


<property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm1</name>
   <value>hadoop001:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm1</name>
   <value>hadoop001:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>hadoop001:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm1</name>
   <value>hadoop001:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm1</name>
   <value>hadoop001:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm1</name>
   <value>hadoop001:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm2</name>
   <value>hadoop011:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm2</name>
   <value>hadoop011:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
   <value>hadoop011:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm2</name>
   <value>hadoop011:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm2</name>
   <value>hadoop011:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm2</name>
   <value>hadoop011:23141</value>
   <final>false</final>
</property>












-- 
Best regards,

Shaofeng Shi 史少锋






 









-- 
Best regards,

Shaofeng Shi 史少锋

Re: yarn configuration problem when building kylin

Posted by ShaoFeng Shi <sh...@apache.org>.
Kylin has supported YARN HA since a very early version, and many users are
running it this way.

If you can provide the Hadoop version and the full yarn-site.xml, that would
make the investigation easier.

2017-10-14 16:24 GMT+08:00 op <52...@qq.com>:

> Doesn't apache-kylin-2.0.0-bin-hbase098.tar.gz support YARN HA? I just
> changed my yarn-site.xml to disable YARN HA, and then the resourcemanager
> could be detected successfully.
>
> ------------------ Original Message ------------------
> *From:* "╰╮爱ャ国灬";<52...@qq.com>;
> *Sent:* Friday, October 13, 2017, 5:21 PM
> *To:* "user"<us...@kylin.apache.org>;
> *Subject:* Re: yarn configuration problem when building kylin
>
> Hello ShaoFeng,
> The situation above is: when building a cube, the first two steps finish
> successfully, but it gets stuck at the third step, with the log printing
> "Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried x time(s)"
> without stopping.
>
>
>
> ------------------ Original Message ------------------
> *From:* "ShaoFeng Shi";<sh...@apache.org>;
> *Sent:* Friday, October 13, 2017, 5:02 PM
> *To:* "user"<us...@kylin.apache.org>;
> *Subject:* Re: yarn configuration problem when building kylin
>
> Obviously, Kylin wasn't aware of your yarn-site.xml, causing it to connect
> to a non-existent address.
>
> Please check whether the right yarn-site.xml is in the Hadoop
> configuration folder, e.g., /etc/hadoop/conf. You can also try to run a
> sample Hadoop job from the Kylin node, to verify whether the node is
> properly configured.
>
> BTW, English is the recommended language for communication, because Kylin
> users are from different countries.
>
> 2017-10-13 15:13 GMT+08:00 op <52...@qq.com>:
>
>> Hi, Shuangyin Ge,
>> Our cluster contains 63 datanodes; the resourcemanagers and namenodes are
>> set up on the same 2 nodes, both with HA enabled. They have been working
>> stably for some years. Do you think we have to change some configurations?
>> We put Kylin on client node 129, and the resourcemanagers are on nodes 225 and 236.
>>
>> In addition, can you speak Chinese?
>>
>> thanks
>> ------------------ Original Message ------------------
>> *From:* "Shuangyin Ge";<go...@gmail.com>;
>> *Sent:* Friday, October 13, 2017, 3:03 PM
>> *To:* "user"<us...@kylin.apache.org>;
>> *Subject:* Re: yarn configuration problem when building kylin
>>
>> Hello op,
>>
>> Can you try to specify yarn.resourcemanager.hostname.rm1 and
>> yarn.resourcemanager.hostname.rm2 in yarn-site.xml as well following
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/
>> hadoop-yarn-site/ResourceManagerHA.html?
>>
>> 2017-10-13 14:44 GMT+08:00 op <52...@qq.com>:
>>
>>> When I am building my cube, the progress is always pending. Then I found
>>> this in kylin.log: it can't connect to the correct resourcemanager
>>> address. I've checked my environment; can you give me some advice?
>>>
>>> 2017-10-13 14:33:48,978 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> client.RMProxy:56 : Connecting to ResourceManager at /0.0.0.0:8032
>>> 2017-10-13 14:33:50,061 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:51,062 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:52,063 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:53,064 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:54,065 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:55,067 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:56,068 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:57,069 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:58,070 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:33:59,071 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:34:00,072 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:34:01,073 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:34:02,074 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:34:03,075 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>> 2017-10-13 14:34:04,076 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63]
>>> ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
>>> Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50,
>>> sleepTime=1 SECONDS)
>>>
>>> My YARN has HA enabled; here are some of the configurations:
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.cluster-id</name>
>>>    <value>boh</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.ha.rm-ids</name>
>>>    <value>rm1,rm2</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.webapp.address.rm1</name>
>>>    <value>hadoop001:23188</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.webapp.https.address.rm1</name>
>>>    <value>hadoop001:23189</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
>>>    <value>hadoop001:23125</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.scheduler.address.rm1</name>
>>>    <value>hadoop001:23130</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.address.rm1</name>
>>>    <value>hadoop001:23140</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.admin.address.rm1</name>
>>>    <value>hadoop001:23141</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.webapp.address.rm2</name>
>>>    <value>hadoop011:23188</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.webapp.https.address.rm2</name>
>>>    <value>hadoop011:23189</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
>>>    <value>hadoop011:23125</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.scheduler.address.rm2</name>
>>>    <value>hadoop011:23130</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.address.rm2</name>
>>>    <value>hadoop011:23140</value>
>>>    <final>false</final>
>>> </property>
>>>
>>> <property>
>>>    <name>yarn.resourcemanager.admin.address.rm2</name>
>>>    <value>hadoop011:23141</value>
>>>    <final>false</final>
>>> </property>
>>>
>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>


-- 
Best regards,

Shaofeng Shi 史少锋

Re: yarn configuration problem when building kylin

Posted by op <52...@qq.com>.
Doesn't apache-kylin-2.0.0-bin-hbase098.tar.gz support YARN HA? I just changed my yarn-site.xml to disable YARN HA, and then the resourcemanager could be detected successfully.


------------------ Original Message ------------------
From: "╰╮爱ャ国灬";<52...@qq.com>;
Sent: Friday, October 13, 2017, 5:21 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello ShaoFeng,
The situation above is: when building a cube, the first two steps finish successfully, but it gets stuck at the third step, with the log printing "Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried x time(s)" without stopping.







------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@apache.org>;
Sent: Friday, October 13, 2017, 5:02 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Obviously, Kylin wasn't aware of your yarn-site.xml, causing it to connect to a non-existent address.

Please check whether the right yarn-site.xml is in the Hadoop configuration folder, e.g., /etc/hadoop/conf. You can also try to run a sample Hadoop job from the Kylin node, to verify whether the node is properly configured.


BTW, English is the recommended language for communication, because Kylin users are from different countries. 


2017-10-13 15:13 GMT+08:00 op <52...@qq.com>:
Hi, Shuangyin Ge,
Our cluster contains 63 datanodes; the resourcemanagers and namenodes are set up on the same 2 nodes, both with HA enabled. They have been working stably for some years. Do you think we have to change some configurations?
We put Kylin on client node 129, and the resourcemanagers are on nodes 225 and 236.


In addition, can you speak Chinese?


thanks
------------------ Original Message ------------------
From: "Shuangyin Ge";<go...@gmail.com>;
Sent: Friday, October 13, 2017, 3:03 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello op,


Can you try to specify yarn.resourcemanager.hostname.rm1 and yarn.resourcemanager.hostname.rm2 in yarn-site.xml as well following https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html?

2017-10-13 14:44 GMT+08:00 op <52...@qq.com>:
When I am building my cube, the progress is always pending. Then I found this in kylin.log: it can't connect to the correct resourcemanager address. I've checked my environment; can you give me some advice?


2017-10-13 14:33:48,978 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] client.RMProxy:56 : Connecting to ResourceManager at /0.0.0.0:8032
2017-10-13 14:33:50,061 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:51,062 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:52,063 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:53,064 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:54,065 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:55,067 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:56,068 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:57,069 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:58,070 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:59,071 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:00,072 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:01,073 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:02,074 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:03,075 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:04,076 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)



My YARN has HA enabled; here are some of the configurations:


<property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>boh</value>
   <final>false</final>
</property>  


<property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm1</name>
   <value>hadoop001:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm1</name>
   <value>hadoop001:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>hadoop001:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm1</name>
   <value>hadoop001:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm1</name>
   <value>hadoop001:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm1</name>
   <value>hadoop001:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm2</name>
   <value>hadoop011:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm2</name>
   <value>hadoop011:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
   <value>hadoop011:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm2</name>
   <value>hadoop011:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm2</name>
   <value>hadoop011:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm2</name>
   <value>hadoop011:23141</value>
   <final>false</final>
</property>












-- 
Best regards,

Shaofeng Shi 史少锋

Re: yarn configuration problem when building kylin

Posted by op <52...@qq.com>.
I tried it, but it doesn't work. Thanks.




------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@apache.org>;
Sent: Friday, October 13, 2017, 7:13 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



The first two steps are Hive actions; it means your Hive configuration is properly detected. The third step is the first job that submits MR from Kylin; the error indicates the YARN or MR configuration is not detected.

A simple way is to put the "core-site.xml", "yarn-site.xml" and "mapred-site.xml" into $KYLIN_HOME/conf, restart Kylin, and then resume the failed job. Just give it a try to see whether it solves the problem.

Re: yarn configuration problem when building kylin

Posted by ShaoFeng Shi <sh...@apache.org>.
The first two steps are Hive actions; it means your Hive configuration is
properly detected. The third step is the first job that submits MR from
Kylin; the error indicates the YARN or MR configuration is not detected.

A simple way is to put the "core-site.xml", "yarn-site.xml" and
"mapred-site.xml" into $KYLIN_HOME/conf, restart Kylin, and then resume the
failed job. Just give it a try to see whether it solves the problem.
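For example, a rough sketch on the Kylin node, assuming the cluster's Hadoop configuration lives in /etc/hadoop/conf (as suggested earlier in the thread) and $KYLIN_HOME points to the Kylin installation:

cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/mapred-site.xml $KYLIN_HOME/conf/
$KYLIN_HOME/bin/kylin.sh stop
$KYLIN_HOME/bin/kylin.sh start

After Kylin restarts, the stuck job can be resumed from the Kylin web UI.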

Re: yarn configuration problem when building kylin

Posted by op <52...@qq.com>.
Hello ShaoFeng,
The situation above is: when building a cube, the first two steps finish successfully, but it gets stuck at the third step, with the log printing "Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried x time(s)" without stopping.







------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@apache.org>;
Sent: Friday, October 13, 2017, 5:02 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Obviously, Kylin wasn't aware of your yarn-site.xml, causing it to connect to a non-existent address.

Please check whether the right yarn-site.xml is in the Hadoop configuration folder, e.g., /etc/hadoop/conf. You can also try to run a sample Hadoop job from the Kylin node, to verify whether the node is properly configured.


BTW, English is the recommended language for communication, because Kylin users are from different countries. 


2017-10-13 15:13 GMT+08:00 op <52...@qq.com>:
Hi, Shuangyin Ge,
Our cluster contains 63 datanodes; the resourcemanagers and namenodes are set up on the same 2 nodes, both with HA enabled. They have been working stably for some years. Do you think we have to change some configurations?
We put Kylin on client node 129, and the resourcemanagers are on nodes 225 and 236.


In addition, can you speak Chinese?


thanks
------------------ Original Message ------------------
From: "Shuangyin Ge";<go...@gmail.com>;
Sent: Friday, October 13, 2017, 3:03 PM
To: "user"<us...@kylin.apache.org>;

Subject: Re: yarn configuration problem when building kylin



Hello op,


Can you try to specify yarn.resourcemanager.hostname.rm1 and yarn.resourcemanager.hostname.rm2 in yarn-site.xml as well following https://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html?

2017-10-13 14:44 GMT+08:00 op <52...@qq.com>:
When I am building my cube, the progress is always pending. Then I found this in kylin.log: it can't connect to the correct resourcemanager address. I've checked my environment; can you give me some advice?


2017-10-13 14:33:48,978 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] client.RMProxy:56 : Connecting to ResourceManager at /0.0.0.0:8032
2017-10-13 14:33:50,061 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:51,062 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:52,063 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:53,064 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:54,065 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:55,067 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:56,068 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:57,069 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:58,070 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:33:59,071 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:00,072 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:01,073 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:02,074 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:03,075 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)
2017-10-13 14:34:04,076 INFO  [Job a8f48457-7c00-4cae-8857-c6e61c10213d-63] ipc.Client:783 : Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1 SECONDS)



My YARN has HA enabled; here are some of the configurations:


<property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>boh</value>
   <final>false</final>
</property>  


<property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm1</name>
   <value>hadoop001:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm1</name>
   <value>hadoop001:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>hadoop001:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm1</name>
   <value>hadoop001:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm1</name>
   <value>hadoop001:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm1</name>
   <value>hadoop001:23141</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.address.rm2</name>
   <value>hadoop011:23188</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.webapp.https.address.rm2</name>
   <value>hadoop011:23189</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
   <value>hadoop011:23125</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.scheduler.address.rm2</name>
   <value>hadoop011:23130</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.address.rm2</name>
   <value>hadoop011:23140</value>
   <final>false</final>
</property>


<property>
   <name>yarn.resourcemanager.admin.address.rm2</name>
   <value>hadoop011:23141</value>
   <final>false</final>
</property>
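Note that the excerpt above shows the per-RM service addresses, but neither the yarn.resourcemanager.hostname.rm1/rm2 entries asked about earlier nor yarn.resourcemanager.ha.enabled. If yarn.resourcemanager.ha.enabled is not set to true in the yarn-site.xml the job client actually reads, the client ignores the rm1/rm2 entries and falls back to the plain yarn.resourcemanager.address, whose default is exactly the 0.0.0.0:8032 seen in the log. If it is indeed missing, the entry would look like:

<property>
   <name>yarn.resourcemanager.ha.enabled</name>
   <value>true</value>
   <final>false</final>
</property>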












-- 
Best regards,

Shaofeng Shi 史少锋

Re: yarn configuration problem when building kylin

Posted by ShaoFeng Shi <sh...@apache.org>.
Obviously, Kylin isn't aware of your yarn-site.xml, causing it to connect
to a non-existent address.

Please check whether the right yarn-site.xml is in the Hadoop configuration
folder, e.g., /etc/hadoop/conf. You can also try to run a sample Hadoop job
from the Kylin node to verify that the node is properly configured.

BTW, English is the recommended language for communication, because Kylin
users are from different countries.



-- 
Best regards,

Shaofeng Shi 史少锋