You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Oleg Kuznetsov (JIRA)" <ji...@apache.org> on 2019/03/17 16:15:00 UTC

[jira] [Commented] (KAFKA-2480) Handle non-CopycatExceptions from SinkTasks

    [ https://issues.apache.org/jira/browse/KAFKA-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794524#comment-16794524 ] 

Oleg Kuznetsov commented on KAFKA-2480:
---------------------------------------

[~gwenshap] [~ewencp]
Looks like the way it was implemented does not guarantee actual waiting will happen.

 

The code:
{code:java}
//timeout The time, in milliseconds, spent waiting in poll if data is not available in the buffer.
//* If 0, returns immediately with any records that are available currently in the buffer, else returns empty.
//* Must not be negative.

consumer.poll(timeoutMs)
{code}
does not have to wait *timeout* ms to return, if there are records in the topic available for consumption.

 

Now client code cannot rely on this, for example, trying to meet SLA accessing an external storage.

I propose to treat it as business-logic waiting request, where client code expects at least *timeoutMs* to wait before return.

> Handle non-CopycatExceptions from SinkTasks
> -------------------------------------------
>
>                 Key: KAFKA-2480
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2480
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: KafkaConnect
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Ewen Cheslack-Postava
>            Priority: Major
>             Fix For: 0.9.0.0
>
>
> Currently we catch Throwable in WorkerSinkTask, but we just log the exception. This can lead to data loss because it indicates the messages in the {{put(records)}} call probably were not handled properly. We need to decide what the policy for handling these types of exceptions should be -- try repeating the same records again, risking duplication? or skip them, risking loss? or kill the task immediately and require intervention since it's unclear what happened?
> SourceTasks don't have the same concern -- they can throw other exceptions and as long as we catch them, it is up to the connector to ensure that it does not lose data as a result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)