Posted to mapreduce-user@hadoop.apache.org by Jeff Zhang <zj...@gmail.com> on 2015/08/18 05:40:50 UTC

Confusing Yarn RPC Configuration

I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait
when setting up the RM connection. The weird thing I found is that this
setting is not the real maximum wait time. YARN actually converts it into a
retry count using yarn.resourcemanager.connect.retry-interval.ms.

Let's say yarn.resourcemanager.connect.max-wait.ms=10000 and
yarn.resourcemanager.connect.retry-interval.ms=2000; then YARN will create
RetryUpToMaximumCountWithFixedSleep with max count = 5 (10000/2000).

However, each RM connection attempt also goes through the retry policy
inside Hadoop RPC. Let's say ipc.client.connect.retry.interval=1000 and
ipc.client.connect.max.retries=10; then each RM connection attempt will try
10 times and cost 10 seconds in total (1000 ms * 10). So overall the RM
connection would cost 50 seconds (10 s * 5), and this number is not
consistent with yarn.resourcemanager.connect.max-wait.ms, which confuses
users. I am not sure what the purpose of having 2 rounds of retry policy is
(YARN side and RPC internal side). Should there be only 1 round of retry
policy, with the YARN-related configuration just overriding the RPC
configuration?
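
The arithmetic above can be sketched in Python. This is only an illustration
of the calculation described in this message, not Hadoop code: the function
name and its parameters are hypothetical, and it assumes (as the message
does) that the YARN-level sleep between attempts is not counted.

```python
def worst_case_connect_ms(yarn_max_wait_ms, yarn_retry_interval_ms,
                          ipc_retry_interval_ms, ipc_max_retries):
    """Illustrative only: mirrors the arithmetic in this message."""
    # YARN side: max-wait is turned into a retry count (10000 / 2000 = 5).
    yarn_retry_count = yarn_max_wait_ms // yarn_retry_interval_ms
    # RPC side: each connection attempt retries internally (1000 ms * 10).
    per_attempt_ms = ipc_retry_interval_ms * ipc_max_retries
    # Assumption: the YARN-level sleep between attempts is ignored here.
    return yarn_retry_count * per_attempt_ms


total_ms = worst_case_connect_ms(10000, 2000, 1000, 10)
print(total_ms)  # 50000 ms, i.e. 50 seconds instead of the configured 10
```

With these example values the two retry layers multiply, which is why the
observed wait is five times the value set in max-wait.ms.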

BTW, I believe the same issue applies to the node manager connection.

-- 
Best Regards

Jeff Zhang

Re: Confusing Yarn RPC Configuration

Posted by Jeff Zhang <zj...@gmail.com>.

Thanks, looks like it is resolved in 2.7

On Wed, Aug 19, 2015 at 3:03 PM, Rohith Sharma K S <
rohithsharmaks@huawei.com> wrote:

> >>> I believe the same issue applies to the node manager connection
>
> This is probably related to the issues below:
>
> https://issues.apache.org/jira/i#browse/YARN-3944
>
> https://issues.apache.org/jira/i#browse/YARN-3238
>
>
>
>
>
> Thanks & Regards
>
> Rohith Sharma K S
>
>
>
> *From:* Jeff Zhang [mailto:zjffdu@gmail.com]
> *Sent:* 18 August 2015 09:11
> *To:* user@hadoop.apache.org
> *Subject:* Confusing Yarn RPC Configuration
>
>
>
>
> I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait
> when setting up the RM connection. The weird thing I found is that this
> setting is not the real maximum wait time. YARN actually converts it into a
> retry count using yarn.resourcemanager.connect.retry-interval.ms.
>
> Let's say yarn.resourcemanager.connect.max-wait.ms=10000 and
> yarn.resourcemanager.connect.retry-interval.ms=2000; then YARN will create
> RetryUpToMaximumCountWithFixedSleep with max count = 5 (10000/2000).
>
> However, each RM connection attempt also goes through the retry policy
> inside Hadoop RPC. Let's say ipc.client.connect.retry.interval=1000 and
> ipc.client.connect.max.retries=10; then each RM connection attempt will try
> 10 times and cost 10 seconds in total (1000 ms * 10). So overall the RM
> connection would cost 50 seconds (10 s * 5), and this number is not
> consistent with yarn.resourcemanager.connect.max-wait.ms, which confuses
> users. I am not sure what the purpose of having 2 rounds of retry policy is
> (YARN side and RPC internal side). Should there be only 1 round of retry
> policy, with the YARN-related configuration just overriding the RPC
> configuration?
>
>
>
> BTW, I believe the same issue applies to the node manager connection.
>
>
>
> --
>
> Best Regards
>
> Jeff Zhang
>



-- 
Best Regards

Jeff Zhang

RE: Confusing Yarn RPC Configuration

Posted by Rohith Sharma K S <ro...@huawei.com>.

>>> I believe the same issue applies to the node manager connection
This is probably related to the issues below:
https://issues.apache.org/jira/i#browse/YARN-3944
https://issues.apache.org/jira/i#browse/YARN-3238


Thanks & Regards
Rohith Sharma K S

From: Jeff Zhang [mailto:zjffdu@gmail.com]
Sent: 18 August 2015 09:11
To: user@hadoop.apache.org
Subject: Confusing Yarn RPC Configuration


I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait when setting up the RM connection. The weird thing I found is that this setting is not the real maximum wait time. YARN actually converts it into a retry count using yarn.resourcemanager.connect.retry-interval.ms.
Let's say yarn.resourcemanager.connect.max-wait.ms=10000 and yarn.resourcemanager.connect.retry-interval.ms=2000; then YARN will create RetryUpToMaximumCountWithFixedSleep with max count = 5 (10000/2000).
However, each RM connection attempt also goes through the retry policy inside Hadoop RPC. Let's say ipc.client.connect.retry.interval=1000 and ipc.client.connect.max.retries=10; then each RM connection attempt will try 10 times and cost 10 seconds in total (1000 ms * 10). So overall the RM connection would cost 50 seconds (10 s * 5), and this number is not consistent with yarn.resourcemanager.connect.max-wait.ms, which confuses users. I am not sure what the purpose of having 2 rounds of retry policy is (YARN side and RPC internal side). Should there be only 1 round of retry policy, with the YARN-related configuration just overriding the RPC configuration?

BTW, I believe the same issue applies to the node manager connection.

--
Best Regards

Jeff Zhang

Fwd: Confusing Yarn RPC Configuration

Posted by Jeff Zhang <zj...@gmail.com>.

+ yarn-dev

---------- Forwarded message ----------
From: Jeff Zhang <zj...@gmail.com>
Date: Tue, Aug 18, 2015 at 11:40 AM
Subject: Confusing Yarn RPC Configuration
To: user@hadoop.apache.org

I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait
when setting up the RM connection. The weird thing I found is that this
setting is not the real maximum wait time. YARN actually converts it into a
retry count using yarn.resourcemanager.connect.retry-interval.ms.

Let's say yarn.resourcemanager.connect.max-wait.ms=10000 and
yarn.resourcemanager.connect.retry-interval.ms=2000; then YARN will create
RetryUpToMaximumCountWithFixedSleep with max count = 5 (10000/2000).

However, each RM connection attempt also goes through the retry policy
inside Hadoop RPC. Let's say ipc.client.connect.retry.interval=1000 and
ipc.client.connect.max.retries=10; then each RM connection attempt will try
10 times and cost 10 seconds in total (1000 ms * 10). So overall the RM
connection would cost 50 seconds (10 s * 5), and this number is not
consistent with yarn.resourcemanager.connect.max-wait.ms, which confuses
users. I am not sure what the purpose of having 2 rounds of retry policy is
(YARN side and RPC internal side). Should there be only 1 round of retry
policy, with the YARN-related configuration just overriding the RPC
configuration?

BTW, I believe the same issue applies to the node manager connection.

-- 
Best Regards

Jeff Zhang

-- 
Best Regards

Jeff Zhang
