Posted to issues@eagle.apache.org by "Jayesh (JIRA)" <ji...@apache.org> on 2017/07/20 17:53:07 UTC
[jira] [Updated] (EAGLE-920) mr failed job trouble shooting
[ https://issues.apache.org/jira/browse/EAGLE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jayesh updated EAGLE-920:
-------------------------
Fix Version/s: v0.5.1 (was: v0.5.0)
> mr failed job trouble shooting
> ------------------------------
>
> Key: EAGLE-920
> URL: https://issues.apache.org/jira/browse/EAGLE-920
> Project: Eagle
> Issue Type: Improvement
> Components: App::Job Performance Monitor
> Affects Versions: v0.5.0
> Reporter: wujinhu
> Assignee: wujinhu
> Fix For: v0.5.1
>
>
> To troubleshoot a failed MapReduce (MR) job, we follow the steps below.
> 1. Get the job's error category distribution via the API:
> query=TaskAttemptErrorCategoryService[@site="sandbox" and @jobId="job_1486726244016_162594"]<@errorCategory>{count}
> 2. Get the mapping from error category to error message, along with the list of failed task attempts:
> query=JobErrorMappingService[@site="sandbox" and @jobId="job_1486726244016_162594" and @errorCategory="java.lang.RuntimeException"]
> 3. Drill into a single task attempt:
> query=TaskAttemptExecutionService[@site="sandbox" and @taskAttemptId="attempt_1486726244016_162594_m_002451_1"]
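
The three queries above share one shape: a service name, a bracketed list of @tag="value" filters, and an optional <@field>{function} aggregation. The following is a minimal Python sketch that assembles those query strings and URL-encodes them. The helper names (build_condition, build_query, build_url) and BASE_URL are hypothetical and not part of Eagle; in particular, the host, port, and endpoint path are assumptions and should be replaced with whatever the actual deployment exposes.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint, not taken from the issue. Replace with
# the query endpoint of your own Eagle deployment.
BASE_URL = "http://localhost:9090/rest/entities"


def build_condition(**tags):
    """Render tag filters as: @k1="v1" and @k2="v2" (keyword order is kept)."""
    return " and ".join(f'@{key}="{value}"' for key, value in tags.items())


def build_query(service, aggregate=None, **tags):
    """Build Service[filters], optionally followed by <@field>{function}."""
    query = f"{service}[{build_condition(**tags)}]"
    if aggregate is not None:
        field, func = aggregate
        query += f"<@{field}>{{{func}}}"
    return query


def build_url(query, base_url=BASE_URL):
    """Attach the query as a URL-encoded 'query' parameter."""
    return f"{base_url}?{urlencode({'query': query})}"


JOB_ID = "job_1486726244016_162594"

# Step 1: error category distribution for the job.
q1 = build_query(
    "TaskAttemptErrorCategoryService",
    aggregate=("errorCategory", "count"),
    site="sandbox",
    jobId=JOB_ID,
)

# Step 2: error messages and failed task attempts for one category.
q2 = build_query(
    "JobErrorMappingService",
    site="sandbox",
    jobId=JOB_ID,
    errorCategory="java.lang.RuntimeException",
)

# Step 3: details of a single task attempt.
q3 = build_query(
    "TaskAttemptExecutionService",
    site="sandbox",
    taskAttemptId="attempt_1486726244016_162594_m_002451_1",
)

if __name__ == "__main__":
    for q in (q1, q2, q3):
        print(build_url(q))
```

The sketch only builds strings and sends no requests, so authentication and any additional parameters the service may require (such as paging or time range) are left out and would need to be added for real use.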
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)