Posted to jira@kafka.apache.org by "Ewen Cheslack-Postava (JIRA)" <ji...@apache.org> on 2018/02/23 04:19:00 UTC

[jira] [Commented] (KAFKA-6551) Unbounded queues in WorkerSourceTask cause OutOfMemoryError

    [ https://issues.apache.org/jira/browse/KAFKA-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373906#comment-16373906 ] 

Ewen Cheslack-Postava commented on KAFKA-6551:
----------------------------------------------

Seems reasonable. This should only be an issue if producing to the topic is failing and a large backlog builds up, but it's a very good point that the queue should be bounded, at least roughly, and that we should pause poll()ing until the backlog clears. It's a bit hard to say what the right metric for the bound is, since the queue holds onto entire records. Maybe the number of records will work in practice, simply because you can set it to a reasonable default and never think about it again while still not hitting any OOMs. But any large messages could make that assumption fail.
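
To make the choice of metric concrete, here is a minimal sketch of a budget that can count either records or bytes. OutstandingBudget, tryReserve and release are hypothetical names for illustration, not part of Kafka Connect, and the sketch assumes the caller knows each record's approximate size.

```java
import java.util.concurrent.Semaphore;

// Hypothetical helper, NOT part of Kafka Connect: tracks how much "cost"
// is currently in flight and refuses new reservations past a fixed capacity.
final class OutstandingBudget {
    private final Semaphore permits;
    private final int capacity;

    OutstandingBudget(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.permits = new Semaphore(capacity);
    }

    // A cost of 1 per record gives a record-count bound; passing the record's
    // approximate size in bytes gives a rough memory bound instead.
    // Costs are clamped to the capacity so that one oversized record can
    // still be admitted when nothing else is in flight.
    private int clamp(int cost) {
        return Math.max(1, Math.min(cost, capacity));
    }

    // Non-blocking: returns false when the budget is exhausted, which the
    // caller can treat as "skip poll() for now".
    boolean tryReserve(int cost) {
        return permits.tryAcquire(clamp(cost));
    }

    // Call once the record has been acknowledged, with the same cost
    // that was passed to tryReserve.
    void release(int cost) {
        permits.release(clamp(cost));
    }

    int available() {
        return permits.availablePermits();
    }
}
```

With a cost of 1 this behaves like a plain record limit. With a byte cost, a few large messages use up the budget sooner, which addresses the case where a count-based default would fail.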

> Unbounded queues in WorkerSourceTask cause OutOfMemoryError
> -----------------------------------------------------------
>
>                 Key: KAFKA-6551
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6551
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Gunnar Morling
>            Priority: Major
>
> A Debezium user reported an {{OutOfMemoryError}} to us, with over 50,000 messages in the {{WorkerSourceTask#outstandingMessages}} map.
> This map is unbounded and I can't see any way of "rate limiting" that would control how many records are added to it. Growth can only be limited indirectly by reducing the offset flush interval, but as connectors can return large numbers of messages in a single {{poll()}} call, that's not sufficient in all cases. Note that the user reported this issue while snapshotting a database, i.e. a high number of records arrived in a very short period of time.
> To solve the problem I'd suggest making this map backpressure-aware and thus preventing its indefinite growth, so that no further records are polled from the connector until messages have been removed from the map again.
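
The backpressure idea in the description can be sketched as a poll loop that checks the outstanding map before asking the connector for more records. This is an illustration only, not the actual WorkerSourceTask code: BackpressureLoop, pollOnce and ack are hypothetical names, and the Supplier stands in for the connector's poll().

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, NOT the real WorkerSourceTask: polling is skipped
// while the map of unacknowledged records is at or above a limit.
final class BackpressureLoop<R> {
    private final Map<Long, R> outstanding = new LinkedHashMap<>();
    private final int maxOutstanding;
    private long nextId = 0;
    private int peak = 0;

    BackpressureLoop(int maxOutstanding) {
        this.maxOutstanding = maxOutstanding;
    }

    // Polls the connector only when below the limit.
    // Returns false when the poll was skipped because of backpressure.
    synchronized boolean pollOnce(Supplier<List<R>> connectorPoll) {
        if (outstanding.size() >= maxOutstanding) {
            return false;
        }
        for (R record : connectorPoll.get()) {
            outstanding.put(nextId++, record);
        }
        peak = Math.max(peak, outstanding.size());
        return true;
    }

    // Stand-in for the producer callback: removes the n oldest records.
    synchronized void ack(int n) {
        Iterator<Long> it = outstanding.keySet().iterator();
        for (int i = 0; i < n && it.hasNext(); i++) {
            it.next();
            it.remove();
        }
    }

    synchronized int size() {
        return outstanding.size();
    }

    synchronized int peak() {
        return peak;
    }
}
```

Because a single poll can return a whole batch, the bound here is only approximate: the map can grow to at most maxOutstanding - 1 plus the largest batch size. A stricter limit would require splitting or buffering batches.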



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)