You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "A. Sophie Blee-Goldman (Jira)" <ji...@apache.org> on 2021/03/25 02:23:00 UTC

[jira] [Resolved] (KAFKA-7213) NullPointerException during state restoration in kafka streams

     [ https://issues.apache.org/jira/browse/KAFKA-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

A. Sophie Blee-Goldman resolved KAFKA-7213.
-------------------------------------------
    Resolution: Fixed

I think we can close this as the version is quite old now and much refactoring of the task management code has occurred since then, with no reports of NPEs -- please reopen if you encounter this issue again on more recent versions

> NullPointerException during state restoration in kafka streams
> --------------------------------------------------------------
>
>                 Key: KAFKA-7213
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7213
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.0.0
>            Reporter: Abhishek Agarwal
>            Priority: Major
>
> I had written a custom state store which has a batch restoration callback registered. What I have observed, when multiple consumer instances are restarted, the application keeps failing with NullPointerException. The stack trace is 
> {noformat}
> java.lang.NullPointerException: null
> 	at org.apache.kafka.streams.state.internals.RocksDBStore.putAll(RocksDBStore.java:351) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.state.internals.RocksDBSlotKeyValueBytesStore.putAll(RocksDBSlotKeyValueBytesStore.java:100) ~[streams-core-1.0.0.297.jar:?]
> 	at org.apache.kafka.streams.state.internals.RocksDBSlotKeyValueBytesStore$SlotKeyValueBatchRestoreCallback.restoreAll(RocksDBSlotKeyValueBytesStore.java:303) ~[streams-core-1.0.0.297.jar:?]
> 	at org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreAll(CompositeRestoreListener.java:89) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:75) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:277) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restorePartition(StoreChangelogReader.java:238) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:83) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:263) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:803) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774) ~[kafka-streams-1.0.0.jar:?]
> 	at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744) ~[kafka-streams-1.0.0.jar:?]
> {noformat}
> The faulty line in question is 
> {noformat}
> db.write(wOptions, batch);
> {noformat}
> in RocksDBStore.java which would mean that db variable is null. Probably the store has been closed and restoration is still being done on it. After going through the code, I think the problem is when state transitions from PARTITIONS_ASSIGNED to PARTITIONS_REVOKED and restoration is still in progress. 
> In such state transition, while the active tasks themselves are closed, the changelog reader is not reset. It tries to restore the tasks that have already been closed, db is null and results in NPE. 
> I will put in a fix to see if that fixes the issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)