Posted to mapreduce-issues@hadoop.apache.org by "Jiandan Yang (JIRA)" <ji...@apache.org> on 2019/03/06 09:38:00 UTC

[jira] [Updated] (MAPREDUCE-7191) JobHistoryServer should log exception when loading/parsing history file failed

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiandan Yang  updated MAPREDUCE-7191:
-------------------------------------
    Description: 
I'm testing a rolling upgrade from 2.7.2 to 3.2.0.
The RM and NMs have been upgraded to 3.2.0; the JobHistoryServer is still on 2.7.2.
When submitting an MR job with the 3.2.0 client, I found that the JobHistory URL could not be opened and showed "Could not load history file hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist"

The JobHistoryServer log file contains only the following loading messages and no exception info.
{code:java}
2019-03-06 16:24:19,489 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading job: job_1551697798944_0020 from file: hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist
2019-03-06 16:24:19,489 INFO org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Loading history file: [hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist]
{code}

After I added logging for the case where loading the history file fails, I got the exception shown below. 3.2.0 writes jhist files in *binary* format, but 2.7.2 uses *json* format. After I set mapreduce.jobhistory.jhist.format=json in the 3.2.0 client configuration, I could get the job info from the JHS.
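
For reference, the workaround above can be written as a property entry. This is only a sketch: the property name and value are the ones quoted above, while placing the entry in the client's mapred-site.xml is an assumption about where the client configuration lives.

```xml
<!-- Assumed location: mapred-site.xml on the 3.2.0 client -->
<property>
  <name>mapreduce.jobhistory.jhist.format</name>
  <value>json</value>
</property>
```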

Hadoop 3.2.0 still does not log this exception. I think adding a log message here would be very helpful for debugging.

The exception thrown while loading the jhist file is as follows:
{code:java}
2019-03-06 16:51:55,664 WARN org.apache.hadoop.mapreduce.v2.hs.CompletedJob: Could not load history file hdfs://NameNode:8020/tmp/hadoop-yarn/staging/history3/done/2019/03/06/000000/job_1551697798944_0020****.jhist
java.io.IOException: Incompatible event log version: Avro-Binary
        at org.apache.hadoop.mapreduce.jobhistory.EventReader.<init>(EventReader.java:71)
        at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:139)
        at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:347)
        at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:101)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.loadJob(HistoryFileManager.java:450)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.loadJob(CachedHistoryStorage.java:180)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.access$000(CachedHistoryStorage.java:52)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:103)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage$1.load(CachedHistoryStorage.java:100)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
        at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
        at com.google.common.cache.LocalCache$LocalManualCache.getUnchecked(LocalCache.java:4834)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:193)
        at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:217)
        at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.requireJob(AppController.java:381)
        at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.job(AppController.java:108)
        at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.job(HsController.java:104)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
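
The kind of logging requested here can be sketched as follows. This is a minimal, self-contained illustration and not the actual Hadoop code or patch: the class HistoryLoadLoggingSketch, the HistoryParser interface, the loadHistory method and the example path are all hypothetical, and java.util.logging stands in for whatever logger the JobHistoryServer uses. The point is only that the caught exception is passed to the logger before it is rethrown, so the root cause reaches the log file.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class HistoryLoadLoggingSketch {
    private static final Logger LOG =
            Logger.getLogger(HistoryLoadLoggingSketch.class.getName());

    /** Hypothetical stand-in for the history file parser; not a Hadoop API. */
    interface HistoryParser {
        String parse(String path) throws IOException;
    }

    /**
     * Loads a history file. On failure, logs the exception (with its stack
     * trace) before rethrowing, so the root cause is not silently dropped.
     */
    static String loadHistory(String path, HistoryParser parser) throws IOException {
        try {
            return parser.parse(path);
        } catch (IOException e) {
            // Passing the exception as the last argument makes the logger
            // write its message and stack trace, not just our own text.
            LOG.log(Level.WARNING, "Could not load history file " + path, e);
            throw new IOException("Could not load history file " + path, e);
        }
    }

    public static void main(String[] args) {
        // Simulate the version mismatch reported in this issue.
        HistoryParser failing = p -> {
            throw new IOException("Incompatible event log version: Avro-Binary");
        };
        try {
            // Hypothetical path, for illustration only.
            loadHistory("hdfs://NameNode:8020/path/to/job.jhist", failing);
        } catch (IOException e) {
            System.out.println("cause: " + e.getCause().getMessage());
        }
    }
}
```

Keeping the original exception as the cause of the rethrown one means callers further up the stack can still inspect the real error, while the WARN entry records it even if they do not.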



> JobHistoryServer should log exception when loading/parsing history file failed
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7191
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7191
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org