Posted to mapreduce-dev@hadoop.apache.org by "Daniel Dai (Created) (JIRA)" <ji...@apache.org> on 2012/02/24 21:50:01 UTC

[jira] [Created] (MAPREDUCE-3919) Redirecting to job history server takes hours

Redirecting to job history server takes hours
---------------------------------------------

                 Key: MAPREDUCE-3919
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.23.0
            Reporter: Daniel Dai


The following messages appear regularly. The job eventually succeeds, but reconnecting to the job history server takes a long time (sometimes more than 10 hours).

2012-02-24 03:49:05,226 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: hrt11n31.cc1.ygridcore.net/98.137.234.159:44716. Already tried 0 time(s).
2012-02-24 03:49:05,229 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-02-24 03:49:06,233 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).
2012-02-24 03:49:07,236 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).
2012-02-24 03:49:08,239 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).
2012-02-24 03:49:09,242 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s).
2012-02-24 03:49:10,245 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s).
2012-02-24 03:49:11,248 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).
2012-02-24 03:49:12,251 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).
2012-02-24 03:49:13,254 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s).
2012-02-24 03:49:14,257 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s).
2012-02-24 03:49:15,260 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s).
......
2012-02-24 18:10:35,711 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-02-24 18:10:36,714 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).
2012-02-24 18:10:37,717 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).
2012-02-24 18:10:38,784 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).
2012-02-24 18:10:39,787 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s).
2012-02-24 18:10:40,791 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s).
2012-02-24 18:10:41,793 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s).
2012-02-24 18:10:42,796 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s).
2012-02-24 18:10:43,799 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s).
2012-02-24 18:10:44,802 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s).
2012-02-24 18:10:45,805 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s).
2012-02-24 18:10:45,808 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-02-24 18:10:46,810 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s).
2012-02-24 18:10:47,813 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s).
2012-02-24 18:10:48,815 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s).
2012-02-24 18:10:49,120 [main] WARN  org.apache.hadoop.mapred.ClientServiceDelegate - Error from remote end: Unknown job job_1330051901509_0017
2012-02-24 18:10:49,120 [main] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:pigtester (auth:SIMPLE) cause:org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Unknown job job_1330051901509_0017
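
To put a number on the delay, the elapsed time between the first "Redirecting to job history server" message and the final "Unknown job" error can be computed from the two timestamps in the log above. This is only a sketch: the helper name parse_log_time is made up for illustration, and it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamp layout used in the log excerpt; the part after the comma is
# milliseconds, which %f reads as a fractional second.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(stamp: str) -> datetime:
    """Parse a timestamp such as '2012-02-24 03:49:05,229' (hypothetical helper)."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# First redirect message and final error, copied from the log excerpt.
first_redirect = parse_log_time("2012-02-24 03:49:05,229")
final_error = parse_log_time("2012-02-24 18:10:49,120")

elapsed = final_error - first_redirect
elapsed_hours = elapsed.total_seconds() / 3600

print(f"Time spent retrying: {elapsed} (about {elapsed_hours:.1f} hours)")
```

For this excerpt the result is a little over 14 hours, consistent with the ">10 hours" observation.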

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (MAPREDUCE-3919) Redirecting to job history server takes hours

Posted by "Daniel Dai (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved MAPREDUCE-3919.
-----------------------------------

    Resolution: Invalid

The problem did not recur after starting the JobHistoryServer. Thanks, Vinod!
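
For anyone hitting the same retry loop, a quick way to confirm whether anything is listening on the history server port is a plain TCP connect. The sketch below is an illustration, not part of Hadoop: the function name is_port_open and the host "localhost" are assumptions, and port 10020 is taken from the log in this issue. Adjust both to match the job history address configured on your cluster.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # Assumed values: run on the node where the JobHistoryServer should be.
    host, port = "localhost", 10020
    if is_port_open(host, port):
        print(f"Something is listening on {host}:{port}")
    else:
        print(f"Nothing is listening on {host}:{port}; is the JobHistoryServer running?")
```

A successful connect only shows that some process accepts connections on that port; it does not prove the process is the JobHistoryServer.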
                
> Redirecting to job history server takes hours
> ---------------------------------------------
>
>                 Key: MAPREDUCE-3919
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3919
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Daniel Dai
>            Priority: Critical
