Posted to dev@flink.apache.org by "Bartłomiej Tartanus (JIRA)" <ji...@apache.org> on 2017/10/30 14:56:00 UTC

[jira] [Created] (FLINK-7949) AsyncWaitOperator is not restarting when queue is full

Bartłomiej Tartanus created FLINK-7949:
------------------------------------------

             Summary: AsyncWaitOperator is not restarting when queue is full
                 Key: FLINK-7949
                 URL: https://issues.apache.org/jira/browse/FLINK-7949
             Project: Flink
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.3.2
            Reporter: Bartłomiej Tartanus


The issue was described here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-was-declined-tasks-not-ready-td16066.html
Issue: AsyncWaitOperator cannot restart properly after a failure (the thread waits forever).

Scenario to reproduce this issue:
1. The queue is full (let's assume that its capacity is N elements).
2. There is a pending element waiting, so the
pendingStreamElementQueueEntry field in AsyncWaitOperator is not null and the
while-loop in the addAsyncBufferEntry method keeps trying to add this element to
the queue (but the element is not added because the queue is full).
3. Now a snapshot is taken: the whole queue of N elements is written
into the ListState in the snapshotState method and, more importantly,
the pendingStreamElementQueueEntry is written to this list too.
4. The process is restarted, so it tries to recover all the elements
and put them back into the queue, but the list of recovered elements holds
N+1 elements and the queue capacity is only N. The process has not started yet, so
it cannot process any element, and the one extra element waits endlessly.
It is never added, and the process never processes anything. Deadlock.
5. The trigger is fired and is indeed discarded because the process is not running
yet.
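
The scenario above can be modelled with a small standalone sketch. This is
not the Flink code: the class RestoreOverflowSketch and its methods snapshot
and restore are hypothetical names, and a plain ArrayBlockingQueue stands in
for the operator's internal queue. It only illustrates why a snapshot of N
queued elements plus one pending element cannot be put back into a queue of
capacity N while nothing is draining it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model of the scenario; none of these names come from Flink.
public class RestoreOverflowSketch {

    /** Models the snapshot: all queued elements plus the pending one, if any. */
    public static List<String> snapshot(BlockingQueue<String> queue, String pendingEntry) {
        List<String> state = new ArrayList<>(queue);
        if (pendingEntry != null) {
            state.add(pendingEntry); // this is what makes the state N+1 elements long
        }
        return state;
    }

    /**
     * Models the restore: refills a fresh queue of the given capacity and
     * returns how many elements did not fit. No consumer is running, so a
     * full queue never drains. The non-blocking offer() returns false at the
     * point where a blocking insert would wait forever.
     */
    public static int restore(List<String> state, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int stuck = 0;
        for (String element : state) {
            if (!queue.offer(element)) {
                stuck++;
            }
        }
        return stuck;
    }

    public static void main(String[] args) {
        int capacity = 3; // N
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        queue.add("e1");
        queue.add("e2");
        queue.add("e3"); // step 1: the queue is full

        // steps 2 and 3: one element is pending and gets snapshotted too
        List<String> state = snapshot(queue, "pending");
        System.out.println("snapshot size = " + state.size()); // prints 4

        // step 4: restoring N+1 elements into a queue of capacity N
        System.out.println("stuck on restore = " + restore(state, capacity)); // prints 1
    }
}
```

In this model the restore reports one element that does not fit. A blocking
insert in the same position would never return, which matches the deadlock
described in step 4.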



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)