Posted to dev@kafka.apache.org by "Vinoth Chandar (JIRA)" <ji...@apache.org> on 2015/09/25 18:45:04 UTC

[jira] [Commented] (KAFKA-2580) Kafka Broker keeps file handles open for all log files (even if its not written to/read from)

    [ https://issues.apache.org/jira/browse/KAFKA-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908298#comment-14908298 ] 

Vinoth Chandar commented on KAFKA-2580:
---------------------------------------

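More context on how we determined this. On two clusters, we compared the broker's open file descriptor count with the number of .log and .index files on disk: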

{code}
vinoth@kafka-agg:~$ sudo ls -l /proc/<broker-pid>/fd | wc -l
50820
vinoth@kafka-agg:~$ ls -R /var/kafka-spool/data | grep -e ".log" -e ".index" | wc -l
97242
vinoth@kafka-agg:~$ ls -R /var/kafka-spool/data | grep -e ".index" | wc -l
48456
vinoth@kafka-agg:~$ ls -R /var/kafka-spool/data | grep -e ".log" | wc -l
48788


vinoth@kafka-changelog-cluster:~$ sudo ls -l /proc/<broker-pid>/fd | wc -l
59128
vinoth@kafka-changelog-cluster:~$ ls -R /var/kafka-spool/data | grep -e ".log" -e ".index" | wc -l
117548
vinoth@kafka-changelog-cluster:~$ ls -R /var/kafka-spool/data | grep  -e ".index" | wc -l 
58774
vinoth@kafka-changelog-cluster:~$ ls -R /var/kafka-spool/data | grep  -e ".log" | wc -l
58774
{code}
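
For anyone who wants to repeat this comparison, here is a rough sketch of a stricter version of the same check. It is not what was run above, and the function names are hypothetical. It differs from the commands above in three ways: it matches file names that end in exactly .log or .index (the unescaped dot in grep ".log" matches any character), it counts only descriptors that resolve to a path under the data directory, and it avoids the extra "total" line that ls -l adds to the count. It assumes a Linux /proc filesystem.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not part of Kafka or any tool.

# Count regular files ending in exactly ".log" or ".index" under a directory.
count_segment_files() {
  local data_dir="$1"
  find "$data_dir" -type f \( -name '*.log' -o -name '*.index' \) | wc -l | tr -d ' '
}

# Count descriptors of process $1 that resolve to a path under directory $2.
# Reading another user's /proc/<pid>/fd usually requires root.
count_open_under() {
  local pid="$1" data_dir="$2" fd target n=0
  for fd in /proc/"$pid"/fd/*; do
    # Skip entries that vanish or cannot be read while we iterate.
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
      "$data_dir"/*) n=$((n + 1)) ;;
    esac
  done
  echo "$n"
}
```

After sourcing the functions as root, count_segment_files /var/kafka-spool/data and count_open_under <broker-pid> /var/kafka-spool/data give the two numbers to compare.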

> Kafka Broker keeps file handles open for all log files (even if its not written to/read from)
> ---------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2580
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2580
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.1
>            Reporter: Vinoth Chandar
>
> We noticed this in one of our clusters where we stage logs for a longer period of time. It appears that the Kafka broker keeps file handles open even for inactive files (ones not being written to or read from). In fact, there are threads on this going back to 2013: http://grokbase.com/t/kafka/users/132p65qwcn/keeping-logs-forever
> Needless to say, this is a problem: it forces us to either artificially bump up ulimit (it's already at 100K) or expand the cluster (even though we have sufficient IO and everything else).
> Filing this ticket since I couldn't find anything similar. Very interested to know if there are plans to address this (given that Samza's changelog topic is meant to be a persistent, large-state use case).


