Posted to hdfs-user@hadoop.apache.org by Henry Hung <YT...@winbond.com> on 2014/06/05 04:34:30 UTC

[hadoop 2.2.0] map tasks failed with error "This token is expired"

Hi All,

A strange thing has been happening since I started using the Fair Scheduler: when executing a large MR job (around 660 maps and 1 reduce), some of the map tasks fail with this error:

2014-06-05 10:13:47,379 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Unauthorized request to start container.
This token is expired. current time is 1401934427379 found 1401933840832
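
For reference, the two numbers in the log line are epoch timestamps in milliseconds, so the gap between them can be worked out directly. The following is a minimal sketch, not part of the original report. The helper name `to_utc` is made up for illustration, and the last step assumes that the token's expiry was stamped as allocation time plus the default 600000 ms interval, which the log itself does not confirm.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the NodeManager log line above.
current_ms = 1401934427379  # "current time is"
expiry_ms = 1401933840832   # "found" (the token's expiry timestamp)

def to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime (hypothetical helper)."""
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc) + timedelta(milliseconds=ms % 1000)

# How long the token had already been expired when the start request arrived.
overdue_ms = current_ms - expiry_ms  # 586547 ms, a little under 10 minutes

# ASSUMPTION: expiry = allocation time + 600000 ms (the default interval).
# Under that assumption, the container waited this long between allocation and launch.
assumed_interval_ms = 600000
queued_ms = current_ms - (expiry_ms - assumed_interval_ms)

print("start requested (UTC):", to_utc(current_ms).isoformat())
print("token expired   (UTC):", to_utc(expiry_ms).isoformat())
print("expired for:", timedelta(milliseconds=overdue_ms))
print("assumed wait since allocation:", timedelta(milliseconds=queued_ms))
```

The start request time converts to 02:13:47 UTC, which matches the 10:13:47 local time in the log for a server at UTC+8, so the two timestamps are at least on the same clock basis.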

I have already double-checked the clocks on all the YARN servers and I'm very sure they are in sync. Then I found this JIRA, which discusses a Capacity Scheduler problem with how timestamps are used when reserving containers:
https://issues.apache.org/jira/browse/YARN-180

I want to know whether this problem also exists in the Fair Scheduler, and whether there is a fix for it.

Best regards,
Henry

________________________________
The privileged confidential information contained in this email is intended for use only by the addressees as indicated by the original sender of this email. If you are not the addressee indicated in this email or are not responsible for delivery of the email to such a person, please kindly reply to the sender indicating this fact and delete all copies of it from your computer and network server immediately. Your cooperation is highly appreciated. It is advised that any unauthorized use of confidential information of Winbond is strictly prohibited; and any information in this email irrelevant to the official business of Winbond shall be deemed as neither given nor endorsed by Winbond.

RE: [hadoop 2.2.0] map tasks failed with error "This token is expired"

Posted by Henry Hung <YT...@winbond.com>.
After looking further into this problem, I found another JIRA:
https://issues.apache.org/jira/browse/YARN-1417

which explains that the Fair Scheduler has the same problem when map tasks stay queued for too long.

The sad news is that the fix is in Hadoop 2.4.0, and I'm not able to upgrade right now because there are no resources to do it.

So, there are 2 questions I hope somebody can answer:

1.       One thing I can do is set "yarn.resourcemanager.rm.container-allocation.expiry-interval-ms" to a value larger than the default of 600000 ms, but I don't know what will happen to the overall system if I set it to 7 days.
I ask because right now there is an urgent job that will analyze 1 TB of data and could take 3 to 5 days to complete. I use the Fair Scheduler to limit the number of containers running this huge job, so that it does not impact the small jobs that need to run on a daily basis.

2.       Is there a way to port the fix to Hadoop 2.2.0? Could you give me some direction on which Java files I need to look at?
I have already tried comparing the 2.2.0 and 2.4.0 sources, but a lot has changed and I'm kind of going around in circles right now.
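
To make the first question concrete, here is a minimal sketch that converts 7 days to milliseconds and renders the matching `<property>` element in the form used by yarn-site.xml. The property name and the 600000 ms default are taken from the message above. The helper `property_xml` is hypothetical, and the sketch says nothing about the side effects of such a large value, which is the open question.

```python
from datetime import timedelta
from xml.sax.saxutils import escape

# Property name and default as quoted in the message above.
PROPERTY = "yarn.resourcemanager.rm.container-allocation.expiry-interval-ms"
DEFAULT_MS = 600000

def to_ms(delta):
    """Whole milliseconds in a timedelta, using integer arithmetic."""
    return delta // timedelta(milliseconds=1)

def property_xml(name, value_ms):
    """Render one Hadoop-style <property> element (hypothetical helper)."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{value_ms}</value>\n"
        "</property>"
    )

seven_days_ms = to_ms(timedelta(days=7))  # 604800000
snippet = property_xml(PROPERTY, seven_days_ms)

print(snippet)
print(f"{seven_days_ms // DEFAULT_MS}x the default interval")
```

Seven days is 604800000 ms, which is 1008 times the default interval.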

Best regards,
Henry Hung

