Posted to dev@kafka.apache.org by "Davor Poldrugo (JIRA)" <ji...@apache.org> on 2016/11/28 20:23:58 UTC

[jira] [Created] (KAFKA-4455) Commit during rebalance does not close RocksDB which later causes: org.rocksdb.RocksDBException: IO error: lock .../LOCK: No locks available

Davor Poldrugo created KAFKA-4455:
-------------------------------------

             Summary: Commit during rebalance does not close RocksDB which later causes: org.rocksdb.RocksDBException: IO error: lock .../LOCK: No locks available
                 Key: KAFKA-4455
                 URL: https://issues.apache.org/jira/browse/KAFKA-4455
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 0.10.1.0
         Environment: Kafka Streams was running on CentOS. I observed that after some time the locks were released even if the JVM process wasn't restarted, so I guess CentOS has some lock-cleaning policy.
            Reporter: Davor Poldrugo


h2. Problem description
From time to time, a rebalance in Kafka Streams causes the commit to throw CommitFailedException. When this exception is thrown, the tasks and processors are not closed. If a processor contains a state store (RocksDB), the RocksDB instance is not closed, so its LOCK file is not released at the OS level. When the Kafka Streams app later tries to reopen the tasks and their respective processors and state stores, {{org.rocksdb.RocksDBException: IO error: lock .../LOCK: No locks available}} is thrown. If the JVM process is restarted, the locks are released.

h2. Additional info
I have been running 3 Kafka Streams instances on separate machines with {{num.stream.threads=1}}, each with its own state directory. Other Kafka Streams apps were running, but they had separate directories for state stores.

h2. Stacktrace
[^RocksDBException_IO-error_stacktrace.txt] 

h2. Suggested solution
To avoid restarting the JVM, modify Kafka Streams to close tasks when the commit fails, which releases their resources, in this case the filesystem LOCK files.
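
The idea can be sketched in plain Java, independent of the Kafka Streams code base. This is only an illustration of the close-on-failure pattern, not the code from the linked commit: the {{Task}} interface, {{FailingTask}}, and {{commitThenClose}} are hypothetical names made up for this example, and a generic {{Exception}} stands in for CommitFailedException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CloseOnCommitFailure {

    /** Hypothetical stand-in for a stream task that owns a state store. */
    interface Task {
        void commit() throws Exception;
        void close();
    }

    /** Hypothetical task whose commit always fails, simulating a commit during a rebalance. */
    static class FailingTask implements Task {
        boolean closed = false;

        @Override
        public void commit() throws Exception {
            throw new Exception("commit failed during rebalance (simulated)");
        }

        @Override
        public void close() {
            // In the real scenario this is where the state store,
            // and with it the LOCK file, would be released.
            closed = true;
        }
    }

    /**
     * Commits every task, but closes each one even when its commit throws.
     * Returns the collected commit failures so the caller can still report them.
     */
    static List<Exception> commitThenClose(List<? extends Task> tasks) {
        List<Exception> failures = new ArrayList<>();
        for (Task task : tasks) {
            try {
                task.commit();
            } catch (Exception e) {
                failures.add(e);
            } finally {
                task.close(); // runs on both the success and the failure path
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        FailingTask task = new FailingTask();
        List<Exception> failures = commitThenClose(Collections.singletonList(task));
        System.out.println("failures=" + failures.size() + ", closed=" + task.closed);
        // prints "failures=1, closed=true"
    }
}
```

The point of the {{finally}} block is that a failed commit no longer skips the close step, so the resources held by the task are released without a process restart.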

h2. Possible solution code
Branch: https://github.com/dpoldrugo/kafka/commits/infobip-fork
Commit: [BUGFIX: When commit fails during rebalance - release resources|https://github.com/dpoldrugo/kafka/commit/af0d16fc5f8629ab0583c94edf3dbf41158b73f3]

h2. Note
This could be related to these issues: KAFKA-3708 and KAFKA-3938.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)