Posted to jira@kafka.apache.org by "Justine Olshan (JIRA)" <ji...@apache.org> on 2019/05/30 16:19:00 UTC
[jira] [Assigned] (KAFKA-8448) Too many kafka.log.Log instances (Memory Leak)
[ https://issues.apache.org/jira/browse/KAFKA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justine Olshan reassigned KAFKA-8448:
-------------------------------------
Assignee: Justine Olshan
> Too many kafka.log.Log instances (Memory Leak)
> ----------------------------------------------
>
> Key: KAFKA-8448
> URL: https://issues.apache.org/jira/browse/KAFKA-8448
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 2.2.0
> Environment: Red Hat 4.4.7-16, java version "1.8.0_152", kafka_2.12-2.2.0
> Reporter: Juan Olivares
> Assignee: Justine Olshan
> Priority: Major
>
> We have a custom Kafka health check which creates a topic, adds some ACLs (read/write on the topic and group), produces and consumes a single message, and then quickly removes the topic and all the related ACLs it created. We close the consumer involved, but not the producer.
> We have observed that the number of {{kafka.log.Log}} instances keeps growing, while there is no evidence of topics being leaked, either when running {{/opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --describe}} or when looking at the disk directory where topics are stored.
> After looking at the heap dump we have observed the following:
> - None of the {{kafka.log.Log}} references ({{currentLogs}} and {{logsToBeDeleted}}) in {{kafka.log.LogManager}} is holding the large number of {{kafka.log.Log}} instances.
> - The only reference preventing {{kafka.log.Log}} from being garbage collected seems to be {{java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue}}, which contains scheduled tasks created with the name {{PeriodicProducerExpirationCheck}}.
> I can see in the code that for every {{kafka.log.Log}} a task with this name is scheduled.
> {code:java}
> scheduler.schedule(name = "PeriodicProducerExpirationCheck", fun = () => {
>   lock synchronized {
>     producerStateManager.removeExpiredProducers(time.milliseconds)
>   }
> }, period = producerIdExpirationCheckIntervalMs, delay = producerIdExpirationCheckIntervalMs, unit = TimeUnit.MILLISECONDS)
> {code}
> However, it seems those tasks are never unscheduled/cancelled.
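The retention pattern described above can be reproduced outside Kafka with a plain ScheduledThreadPoolExecutor. The following is only a minimal sketch under stated assumptions, not Kafka's code or its fix: FakeLog is a hypothetical stand-in for kafka.log.Log, and the class and method names are invented for illustration. It shows that a periodic task stays in the executor's work queue (and keeps whatever it captures reachable) until its ScheduledFuture is cancelled, and that with setRemoveOnCancelPolicy(true) a cancelled task is dropped from the queue immediately.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PeriodicTaskLeakSketch {

    // Hypothetical stand-in for kafka.log.Log: any object captured by a periodic task.
    static final class FakeLog {
        final byte[] state = new byte[1024];

        void removeExpiredProducers() {
            // No-op: only the reference held by the scheduled task matters here.
        }
    }

    // Schedules one periodic task per FakeLog and returns the resulting queue size.
    // When cancelOnClose is true, the task is cancelled right away, the way a
    // "close" hook could do when the owning object is no longer needed.
    static int queuedAfterScheduling(ScheduledThreadPoolExecutor executor,
                                     int count,
                                     boolean cancelOnClose) {
        for (int i = 0; i < count; i++) {
            FakeLog log = new FakeLog();
            ScheduledFuture<?> future = executor.scheduleAtFixedRate(
                    log::removeExpiredProducers, 10, 10, TimeUnit.MINUTES);
            if (cancelOnClose) {
                // Do not interrupt a run in progress; just stop future runs.
                future.cancel(false);
            }
        }
        return executor.getQueue().size();
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        // Without this policy, cancelled tasks linger in the queue until their delay elapses.
        executor.setRemoveOnCancelPolicy(true);
        try {
            int leaked = queuedAfterScheduling(executor, 1000, false);
            System.out.println("queued tasks, never cancelled: " + leaked);
            executor.getQueue().clear();

            int cancelled = queuedAfterScheduling(executor, 1000, true);
            System.out.println("queued tasks, cancelled on close: " + cancelled);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In this sketch the uncancelled run leaves one queued task per FakeLog, while the cancelled run leaves the queue empty. Whether Kafka's own scheduler exposes the returned future, and where cancellation would belong in kafka.log.Log, is not established by this report.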
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)