You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Bruno Cadonna (Jira)" <ji...@apache.org> on 2020/08/13 10:01:00 UTC

[jira] [Updated] (KAFKA-10396) Overall memory of container keep on growing due to kafka stream / rocksdb and OOM killed once limit reached

     [ https://issues.apache.org/jira/browse/KAFKA-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Cadonna updated KAFKA-10396:
----------------------------------
    Component/s:     (was: KafkaConnect)

> Overall memory of container keep on growing due to kafka stream / rocksdb and OOM killed once limit reached
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10396
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10396
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.1, 2.5.0
>            Reporter: Vagesh Mathapati
>            Priority: Critical
>
> We are observing that overall memory of our container keep on growing and never came down.
> After analysis find out that rocksdbjni.so is keep on allocating 64M chunks of memory off-heap and never releases back. This causes OOM kill after memory reaches configured limit.
> We use Kafka stream and globalktable for our many kafka topics.
> Below is our environment
>  * Kubernetes cluster
>  * openjdk 11.0.7 2020-04-14 LTS
>  * OpenJDK Runtime Environment Zulu11.39+16-SA (build 11.0.7+10-LTS)
>  * OpenJDK 64-Bit Server VM Zulu11.39+16-SA (build 11.0.7+10-LTS, mixed mode)
>  * Springboot 2.3
>  * spring-kafka-2.5.0
>  * kafka-streams-2.5.0
>  * kafka-streams-avro-serde-5.4.0
>  * rocksdbjni-5.18.3
> Observed same result with kafka 2.3 version.
> Below is the snippet of our analysis
> from pmap output we took addresses from these 64M allocations (RSS)
> Address Kbytes RSS Dirty Mode Mapping
> 00007f3ce8000000 65536 65532 65532 rw--- [ anon ]
> 00007f3cf4000000 65536 65536 65536 rw--- [ anon ]
> 00007f3d64000000 65536 65536 65536 rw--- [ anon ]
> We tried to match with memory allocation logs enabled with the help of Azul systems team.
> @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ff7ca0
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _ZN7rocksdb15BlockBasedTable3GetERKNS_11ReadOptionsERKNS_5SliceEPNS_10GetContextEPKNS_14SliceTransformEb+0x894)[0x7f3e1c898fd4] - 0x7f3ce8ff9780
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ff9750
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ff97c0
>  @ /tmp/librocksdbjni6564497922441568920.so:_Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ffccf0
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ffcd10
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0xfa)[0x7f3e1c65d5da] - 0x7f3ce8ffccf0
>  @ /tmp/librocksdbjni6564497922441568920.so:
> _Z18rocksdb_get_helperP7JNIEnv_PN7rocksdb2DBERKNS1_11ReadOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayii+0x261)[0x7f3e1c65d741] - 0x7f3ce8ffcd10
> We also identified that content on this 64M is just 0s and no any data present in it.
> I tried to tune the rocksDB configuratino as mentioned but it did not helped. [https://docs.confluent.io/current/streams/developer-guide/config-streams.html#streams-developer-guide-rocksdb-config]
>  
> Please let me know if you need any more details



--
This message was sent by Atlassian Jira
(v8.3.4#803005)