Posted to users@kafka.apache.org by Bharath Srinivasan <bh...@gmail.com> on 2016/08/25 21:19:41 UTC

Kafka 0.8.2.2 - CLOSE_WAITS on broker

Hello:

We are running a data pipeline application stack on Kafka 0.8.2.2 in
production. We have been seeing frequent, intermittent buildups of
connections in CLOSE_WAIT on our Kafka brokers, and they fill up the file
handles quickly. By the time the open file count reaches around 40K, the
node becomes unresponsive and we see huge GC pauses. The only way out has
been to restart the node. When the nodes are working fine, the open file
count per node stays around 6K during peak load and 3K on average.
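
For anyone wanting to track the open file count described above, here is a
minimal sketch in Python. It assumes a Linux host with /proc mounted and
permission to read the target process's entries (same user or root). The
function name fd_usage is hypothetical, not part of Kafka or any library.

```python
import os

def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit) for a Linux process.

    soft_limit is None when the limit is reported as "unlimited"
    or when the "Max open files" row is not found.
    """
    # Each entry in /proc/<pid>/fd is one open descriptor (files, sockets, pipes).
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: "Max open files  <soft>  <hard>  files"
                soft = line.split()[3]
                soft_limit = None if soft == "unlimited" else int(soft)
                break
    return open_fds, soft_limit
```

Calling fd_usage(4692) for a broker PID and logging the result periodically
would show how quickly the count climbs toward the limit before the node
stops responding.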

Configurations:
- 5 broker cluster (Single node spec: 24 core processors, 250 GB RAM, 256GB
SSD)
- 20 topics and 1100 partitions across all topics
- Replication factor of 3
- Java based KafkaProducer and high level consumers
(ZookeeperConsumerConnector)
- GC params { -Xmx32G -Xms4G -server -XX:MetaspaceSize=96m -XX:+UseG1GC
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35
-XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50
-XX:MaxMetaspaceFreeRatio=80 }

Any pointers here? Appreciate your help.

Thanks,
Bharath

Re: Kafka 0.8.2.2 - CLOSE_WAITS on broker

Posted by Bharath Srinivasan <bh...@gmail.com>.
Java / OS info:
----------
java.specification.version = 1.8
java.vendor = Oracle Corporation
java.version = 1.8.0_45
Oracle Linux Server release 6.7
kernel version 2.6.32-573.18.1.el6.x86_64

Redacted LSOF
---------------------

~46K Close Waits
------------------
java    4692 kafka 2618u  IPv6          264581081       0t0       TCP
XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host1:33089 (CLOSE_WAIT)
java    4692 kafka 2619u  IPv6          264581082       0t0       TCP
XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host2:37371 (CLOSE_WAIT)
java    4692 kafka 2621u  IPv6          264600187       0t0       TCP
XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host3:40788 (CLOSE_WAIT)


475 Established connections
----------------------------
java    4692 kafka *427u  IPv6          282382725       0t0       TCP
XX-XXXX-kafka01:54099->XX-XXXX-host1:eforward (ESTABLISHED)
java    4692 kafka *639u  IPv6          282426735       0t0       TCP
XX-XXXX-kafka01:36157->XX-XXXX-kafka01:59964 (ESTABLISHED)
java    4692 kafka *860u  IPv6          282480072       0t0       TCP
XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host2:50547 (ESTABLISHED)
java    4692 kafka *507u  IPv6          282481853       0t0       TCP
XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host3:45096 (ESTABLISHED)

~3K
----------------------------
java    4692 kafka 2367u   REG              253,3 104857335 141033710
/XXX/kafka/LOG/__consumer_offsets-10/00000000000035177234.log

~1.5K
----------------------------
java    4692 kafka  mem    REG              253,3  10485760 141297356
/XXX/kafka/LOG/TOPIC-1-9/00000000000000028243.index

~1.5K
----------------------------
java    4692 kafka  818u   REG              253,3   2548089 141297556
/XXX/kafka/LOG/TOPIC-1-2-76/00000000000000146894.log
java    4692 kafka  819u   REG              253,3         0 141165545
/XXX/kafka/LOG/TOPIC-2-2-11/00000000000000000000.log
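
The per-state totals above can be reproduced from raw lsof output with a
small script. This is only a sketch in Python: the function name
summarize_lsof is hypothetical, and it assumes the lsof text contains
entries ending in "->peer:port (STATE)" as in the excerpt above. It also
groups the CLOSE_WAIT sockets by remote host, which may help narrow down
which clients are involved.

```python
import re
from collections import Counter

# Matches the tail of an lsof TCP entry: "->peer_host:peer_port (STATE)".
_CONN_RE = re.compile(r"->(\S+):([^\s:]+)\s+\(([A-Z0-9_]+)\)")

def summarize_lsof(lsof_text):
    """Return (states, close_wait_peers) as two Counters.

    states counts connections per TCP state; close_wait_peers counts
    CLOSE_WAIT connections per remote host.
    """
    states = Counter()
    close_wait_peers = Counter()
    # Search the whole text rather than line by line, so entries that
    # were wrapped across two lines (as in the mail) still match.
    for peer, _port, state in _CONN_RE.findall(lsof_text):
        states[state] += 1
        if state == "CLOSE_WAIT":
            close_wait_peers[peer] += 1
    return states, close_wait_peers
```

The input could come from something like `lsof -nP -p 4692` saved to a
file and read into a string; close_wait_peers.most_common(10) then lists
the remote hosts holding the most CLOSE_WAIT sockets.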



On Fri, Aug 26, 2016 at 6:37 AM, Jaikiran Pai <ja...@gmail.com>
wrote:

> Which Java vendor and version are you using in runtime? Also what OS is
> this? Can you get the lsof output (on Linux) and paste the output of that
> to some place (like gist) to show us what descriptors are open etc...
>
> -Jaikiran

Re: Kafka 0.8.2.2 - CLOSE_WAITS on broker

Posted by Jaikiran Pai <ja...@gmail.com>.
Which Java vendor and version are you using at runtime? Also, what OS is
this? Can you get the lsof output (on Linux) and paste it somewhere
(like a gist) to show us which descriptors are open?

-Jaikiran
