Posted to dev@gobblin.apache.org by "Issac Buenrostro (JIRA)" <ji...@apache.org> on 2017/11/07 23:27:00 UTC

[jira] [Resolved] (GOBBLIN-273) Add failure monitoring

     [ https://issues.apache.org/jira/browse/GOBBLIN-273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Issac Buenrostro resolved GOBBLIN-273.
--------------------------------------
    Resolution: Fixed

Issue resolved by pull request #2125
[https://github.com/apache/incubator-gobblin/pull/2125]

> Add failure monitoring
> ----------------------
>
>                 Key: GOBBLIN-273
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-273
>             Project: Apache Gobblin
>          Issue Type: Task
>          Components: gobblin-core
>            Reporter: Zhixiong Chen
>            Assignee: Zhixiong Chen
>
> When a job fails and leaves a very long log, it is hard to dig through the log and find the cause of the failure. This change plugs a reporter into the Gobblin Metrics architecture to collect job failure events into a file. Every job now gets task-level and dataset-level failure events reported out of the box.
> h3. `MetricContext#submitFailureEvent`
> When a failure event needs to be reported, it should be submitted with this method, which wraps the event in a `FailureEventNotification`.
> h3. `FileFailureEventReporter`
> Reports all failure events to a file. Each job has its own report folder.
> h3. Configurations
> To enable job failure reporting, the following configurations are required:
> {code:java}
> # Metrics must be enabled for failure events to be reported
> metrics.enabled=true
> # Defaults to the local file system
> fs.uri=<file system uri>
> failure.log.dir=<root folder of all jobs' failure reports>
> {code}
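
For readers who want a concrete picture of the flow described above (a failure event is wrapped in a notification, and a file-based reporter appends it under a per-job folder), here is a minimal, self-contained sketch. It is not Gobblin's code: the class names FailureReportSketch, FailureNotification and FileReporter, the report file name failures.log and the tab-separated line format are all hypothetical. Only the failure.log.dir key comes from the issue. The sketch writes to the local file system and ignores fs.uri and metrics.enabled.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Properties;

/** Hypothetical, simplified model of the reporting flow; not Gobblin's actual classes. */
public class FailureReportSketch {

    /** Illustrative stand-in for the notification that wraps a failure event. */
    static final class FailureNotification {
        final String jobId;
        final String scope;   // assumed values: "task" or "dataset"
        final String message;

        FailureNotification(String jobId, String scope, String message) {
            this.jobId = jobId;
            this.scope = scope;
            this.message = message;
        }

        /** Assumed line format: scope, a tab, then the message. */
        String toLine() {
            return scope + "\t" + message;
        }
    }

    /** Illustrative file reporter: one folder per job under the configured root. */
    static final class FileReporter {
        private final Path rootDir;

        FileReporter(Properties props) {
            // "failure.log.dir" is the key named in the issue.
            this.rootDir = Paths.get(props.getProperty("failure.log.dir"));
        }

        /** Appends the notification to the job's report file and returns that file. */
        Path report(FailureNotification n) throws IOException {
            Path jobDir = rootDir.resolve(n.jobId);
            Files.createDirectories(jobDir);
            Path file = jobDir.resolve("failures.log"); // hypothetical file name
            byte[] line = (n.toLine() + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
            Files.write(file, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return file;
        }
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // A temporary directory stands in for the real report root.
        props.setProperty("failure.log.dir",
                Files.createTempDirectory("failure-reports").toString());

        FileReporter reporter = new FileReporter(props);
        Path file = reporter.report(
                new FailureNotification("job_1", "task", "Task 3 failed: connection reset"));
        reporter.report(
                new FailureNotification("job_1", "dataset", "Dataset /data/a failed to publish"));

        // Both events for job_1 end up in the same per-job file.
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
    }
}
```

Keeping one folder per job means the failures of a single run can be read without scanning the full job log, which is the motivation given in the issue.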



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)