Posted to user@flink.apache.org by Sudharsan R <su...@gmail.com> on 2022/05/16 23:33:36 UTC

Checkpoint declined (Null Pointer exception)

Hello,
I have the following situation:
We upgraded our application code on a Flink 1.11.1 cluster. We use RocksDB
as the state backend. The upgrade used a savepoint from the prior app
version. We added a few MapStates to an existing
KeyedProcessWindowFunction. This function previously had a single
ValueState. We also started using a different WindowTrigger function.

At some point, we had to downgrade the application code. We took a
savepoint and restored the old app version using this savepoint. The
restore itself succeeded. However, none of the periodic checkpoints succeed!
Every checkpoint fails on this particular KeyedProcessWindowFunction with a
NullPointerException. This issue looks possibly related (
https://issues.apache.org/jira/browse/FLINK-11094). However, that bug is
old. As far as I understand, this should work. Am I missing something? How
can I debug this?

This is the stack trace:

2022-05-16 08:02:19
java.io.IOException: Could not perform checkpoint 1175 for operator XXX -> (Sink: kinesis-sink-1, Sink: kinesis-sink-2, Sink: kinesis-sink-3) (2/2).
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:863)
    at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:113)
    at org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:198)
    at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:93)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:158)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 1175 for operator XXX -> (Sink: kinesis-sink-1, Sink: kinesis-sink-2, Sink: kinesis-sink-3) (2/2). Failure reason: Checkpoint was declined.
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:215)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:156)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:314)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:614)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:540)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:507)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:266)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:892)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:882)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:850)
    ... 13 more
Caused by: java.lang.NullPointerException
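
For reference, the trace above is a three-level cause chain (IOException, then
CheckpointException, then NullPointerException), and the innermost cause is
shown without any frames. The following is only a minimal plain-Java sketch of
how such a chain could be unwrapped and inspected, for example from custom
logging or a debugger. The class and method names (CauseChain, chain,
rootCause) are hypothetical and are not part of Flink, and RuntimeException
stands in for Flink's CheckpointException so that the sketch has no
dependencies.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: walks Throwable.getCause() links.
final class CauseChain {

    // Returns the throwables from outermost to innermost.
    // Stops if a throwable is seen twice, so a cyclic chain cannot loop forever.
    public static List<Throwable> chain(Throwable t) {
        List<Throwable> out = new ArrayList<>();
        while (t != null && !out.contains(t)) {
            out.add(t);
            t = t.getCause();
        }
        return out;
    }

    // Returns the innermost cause, or null if the argument is null.
    public static Throwable rootCause(Throwable t) {
        List<Throwable> all = chain(t);
        return all.isEmpty() ? null : all.get(all.size() - 1);
    }

    public static void main(String[] args) {
        // Stand-ins that mimic the nesting seen in the trace above.
        Throwable npe = new NullPointerException();
        Throwable declined = new RuntimeException("Could not complete snapshot", npe);
        Throwable top = new IOException("Could not perform checkpoint", declined);

        // Print each level with its frame count; a count of 0 would mean the
        // throwable carries no stack trace at all.
        for (Throwable t : chain(top)) {
            System.out.println(t.getClass().getName()
                    + " frames=" + t.getStackTrace().length);
        }
        System.out.println("root cause: " + rootCause(top).getClass().getName());
    }
}
```

If the root cause really carries zero frames at runtime, one possible
explanation (an assumption, not confirmed for this job) is HotSpot's fast-throw
optimization, which can be turned off with the JVM option
-XX:-OmitStackTraceInFastThrow.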

Thanks
Sudharsan