Posted to dev@gobblin.apache.org by "Hung Tran (JIRA)" <ji...@apache.org> on 2018/05/03 20:04:00 UTC

[jira] [Resolved] (GOBBLIN-484) Propagate fork exception to task commit

     [ https://issues.apache.org/jira/browse/GOBBLIN-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran resolved GOBBLIN-484.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.13.0

Issue resolved by pull request #2354
[https://github.com/apache/incubator-gobblin/pull/2354]

> Propagate fork exception to task commit
> ---------------------------------------
>
>                 Key: GOBBLIN-484
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-484
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Kuai Yu
>            Assignee: Kuai Yu
>            Priority: Major
>             Fix For: 0.13.0
>
>
> >>> Today, if an exception occurs at the task level, it is not propagated to the commit phase. As a result, in fork.commit we only see a generic exception like this:
> 2018/04/30 08:03:19.369 ERROR [Task] [Task-committing-pool-0] [gobblin-cluster-worker] [DYNAMICS-CONTACT-438563007_1525075320170] Task task_DYNAMICS-CONTACT-438563007_1525075320170_0 failed
> org.apache.gobblin.runtime.ForkException: Fork branches [0] failed for task task_DYNAMICS-CONTACT-438563007_1525075320170_0
> at org.apache.gobblin.runtime.Task.commit(Task.java:884)
> at org.apache.gobblin.runtime.GobblinMultiTaskAttempt$1$1.call(GobblinMultiTaskAttempt.java:167)
> at org.apache.gobblin.runtime.GobblinMultiTaskAttempt$1$1.call(GobblinMultiTaskAttempt.java:162)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> >>> However, the root cause occurred earlier, before the commit phase: in the task run() stage, some records failed to process:
> 2018/04/30 08:03:19.352 ERROR [Task] [TaskExecutor-1] [gobblin-cluster-worker] [DYNAMICS-CONTACT-438563007_1525075320170] Processing record incurs an unexpected exception:
> java.lang.IllegalStateException: Fork 0 of task task_DYNAMICS-CONTACT-438563007_1525075320170_0 has failed and is no longer running
> at org.apache.gobblin.runtime.fork.Fork.putRecord(Fork.java:285)
> at org.apache.gobblin.runtime.Task.processRecord(Task.java:778)
> at org.apache.gobblin.runtime.Task.runSynchronousModel(Task.java:459)
> at org.apache.gobblin.runtime.Task.run(Task.java:341)
> at org.apache.gobblin.runtime.TaskExecutor$TrackingTask.run(TaskExecutor.java:443)
> at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2018/04/30 08:03:19.353 ERROR [Task] [TaskExecutor-1] [gobblin-cluster-worker] [DYNAMICS-CONTACT-438563007_1525075320170] Task task_DYNAMICS-CONTACT-438563007_1525075320170_0 failed
> java.lang.RuntimeException
> at org.apache.gobblin.runtime.Task.runSynchronousModel(Task.java:464)
> at org.apache.gobblin.runtime.Task.run(Task.java:341)
> at org.apache.gobblin.runtime.TaskExecutor$TrackingTask.run(TaskExecutor.java:443)
> at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2018/04/30 08:03:19.368 INFO [com_2792] [TaskState
> >>> Looking further into the problem, we find that it is caused by a record-processing timeout in the Espresso writer:
> 2018/04/30 08:03:19.348 ERROR [Fork-0] [ForkExecutor-0] [gobblin-cluster-worker] [DYNAMICS-CONTACT-438563007_1525075320170] Fork 0 of task task_DYNAMICS-CONTACT-438563007_1525075320170_0 failed to process data records
> java.io.IOException: java.util.concurrent.ExecutionException: org.apache.gobblin.exception.NonTransientException: Irrecoverable failure on async write
> at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:143)
> at org.apache.gobblin.writer.RetryWriter.writeEnvelope(RetryWriter.java:123)
> at org.apache.gobblin.runtime.fork.Fork.processRecord(Fork.java:492)
> at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecord(AsynchronousFork.java:103)
> at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecords(AsynchronousFork.java:86)
> at org.apache.gobblin.runtime.fork.Fork.run(Fork.java:238)
> at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: org.apache.gobblin.exception.NonTransientException: Irrecoverable failure on async write
> at ligobblin.shaded.com.github.rholder.retry.Retryer$ExceptionAttempt.<init>(Retryer.java:254)
> at ligobblin.shaded.com.github.rholder.retry.Retryer.call(Retryer.java:163)
> at ligobblin.shaded.com.github.rholder.retry.Retryer$RetryerCallable.call(Retryer.java:318)
> at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:141)
> ... 11 more
> Caused by: org.apache.gobblin.exception.NonTransientException: Irrecoverable failure on async write
> at org.apache.gobblin.writer.AsyncWriterManager.maybeThrow(AsyncWriterManager.java:309)
> at org.apache.gobblin.writer.AsyncWriterManager.write(AsyncWriterManager.java:271)
> at org.apache.gobblin.writer.AsyncWriterManager.writeEnvelope(AsyncWriterManager.java:259)
> at org.apache.gobblin.writer.CloseOnFlushWriterWrapper.writeEnvelope(CloseOnFlushWriterWrapper.java:93)
> at org.apache.gobblin.instrumented.writer.InstrumentedDataWriterDecorator.writeEnvelope(InstrumentedDataWriterDecorator.java:75)
> at org.apache.gobblin.writer.PartitionedDataWriter.writeEnvelope(PartitionedDataWriter.java:161)
> at org.apache.gobblin.writer.ThrottleWriter.writeEnvelope(ThrottleWriter.java:131)
> at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:118)
> at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:115)
> at ligobblin.shaded.com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
> at ligobblin.shaded.com.github.rholder.retry.Retryer.call(Retryer.java:160)
> ... 13 more
> Caused by: java.lang.RuntimeException: java.io.IOException: java.util.concurrent.TimeoutException
> at org.apache.gobblin.proxies.EspressoProxy.getRecordsPerGetRequest(EspressoProxy.java:199)
> at org.apache.gobblin.proxies.EspressoProxy.get(EspressoProxy.java:216)
> at org.apache.gobblin.writer.http.espresso.EspressoWriter.changeExist(EspressoWriter.java:81)
> at org.apache.gobblin.writer.http.espresso.EspressoMultiputWriter$1.call(EspressoMultiputWriter.java:89)
> at org.apache.gobblin.writer.http.espresso.EspressoMultiputWriter$1.call(EspressoMultiputWriter.java:86)
> ... 4 more
> Caused by: java.io.IOException: java.util.concurrent.TimeoutException
> at com.linkedin.espresso.client.r2d2impl.R2D2EspressoClient.execute(R2D2EspressoClient.java:560)
> at org.apache.gobblin.proxies.EspressoProxy.getRecordsPerGetRequest(EspressoProxy.java:162)
> ... 8 more
>  
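
The three traces above show the same failure at three depths. Only the last one, logged by the fork thread, names the timeout. A minimal sketch of the idea in the issue title follows: the fork records the exception that made it fail, and the commit step attaches that exception as the cause of the error it throws. This is not the change made in pull request #2354. All class and method names here (SketchFork, SketchForkException, commit, rootCause) are hypothetical and only illustrate the pattern.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class ForkFailurePropagation {

    /** Hypothetical stand-in for a fork; it remembers the first exception that made it fail. */
    static class SketchFork {
        private final int index;
        private final AtomicReference<Throwable> failure = new AtomicReference<>();

        SketchFork(int index) {
            this.index = index;
        }

        /** Runs the fork body and records any runtime failure instead of only logging it. */
        void run(Runnable body) {
            try {
                body.run();
            } catch (RuntimeException e) {
                // Keep only the first failure; later ones are usually consequences of it.
                failure.compareAndSet(null, e);
            }
        }

        boolean isSucceeded() {
            return failure.get() == null;
        }

        Throwable getFailure() {
            return failure.get();
        }

        int getIndex() {
            return index;
        }
    }

    /** Hypothetical commit-time exception that carries the fork's original failure as its cause. */
    static class SketchForkException extends Exception {
        SketchForkException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Commit step: if the fork failed, throw with the recorded failure attached as the cause. */
    static void commit(String taskId, SketchFork fork) throws SketchForkException {
        if (!fork.isSucceeded()) {
            throw new SketchForkException(
                "Fork branches [" + fork.getIndex() + "] failed for task " + taskId,
                fork.getFailure());
        }
    }

    /** Walks the cause chain down to its innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        SketchFork fork = new SketchFork(0);
        // Simulate a writer timeout wrapped the way the trace above wraps it.
        fork.run(() -> {
            throw new RuntimeException(new IOException(new TimeoutException("write timed out")));
        });
        try {
            commit("task_example_0", fork);
        } catch (SketchForkException e) {
            System.out.println(e.getMessage());
            System.out.println("Root cause: " + rootCause(e).getClass().getName());
        }
    }
}
```

With the cause attached, the stack trace logged at commit time would include the "Caused by:" chain down to the timeout, so the commit-phase error alone is enough to diagnose the failure.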


