You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Shashank Agarwal (JIRA)" <ji...@apache.org> on 2018/01/03 13:16:00 UTC

[jira] [Commented] (FLINK-7760) Restore failing from external checkpointing metadata.

    [ https://issues.apache.org/jira/browse/FLINK-7760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309634#comment-16309634 ] 

Shashank Agarwal commented on FLINK-7760:
-----------------------------------------

Hi [~kkl0u] ,

I have checked again and facing the same issue while restore. So unable to checkpoint in Rocksdb and in fsStateBackend after savepoint during restore facing this issue. 

So unable to restore my state in any case. It's not printing any extra logs in debugging mode also. Please guide I am using CEP, Yarn, HDFS, Scala. Otherwise, i have to use some DB for the state which I don't want.

Thanks

> Restore failing from external checkpointing metadata.
> -----------------------------------------------------
>
>                 Key: FLINK-7760
>                 URL: https://issues.apache.org/jira/browse/FLINK-7760
>             Project: Flink
>          Issue Type: Sub-task
>          Components: CEP, State Backends, Checkpointing
>    Affects Versions: 1.3.2
>         Environment: Yarn, Flink 1.3.2, HDFS,  FsStateBackend
>            Reporter: Shashank Agarwal
>            Priority: Blocker
>             Fix For: 1.4.0, 1.5.0, 1.4.1
>
>
> My job failed due to failure of cassandra. I have enabled ExternalizedCheckpoints. But when job tried to restore from that checkpoint it's failing continuously with following error.
> {code:java}
> 2017-10-04 09:39:20,611 INFO  org.apache.flink.runtime.taskmanager.Task                     - KeyedCEPPatternOperator -> Map (1/2) (8ff7913f820ead571c8b54ccc6b16045) switched from RUNNING to FAILED.
> java.lang.IllegalStateException: Could not initialize keyed state backend.
> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:321)
> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:217)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:676)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:663)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.StreamCorruptedException: invalid type code: 00
> 	at java.io.ObjectInputStream$BlockDataInputStream.readBlockHeader(ObjectInputStream.java:2519)
> 	at java.io.ObjectInputStream$BlockDataInputStream.refill(ObjectInputStream.java:2553)
> 	at java.io.ObjectInputStream$BlockDataInputStream.skipBlockData(ObjectInputStream.java:2455)
> 	at java.io.ObjectInputStream.skipCustomData(ObjectInputStream.java:1951)
> 	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1621)
> 	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
> 	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
> 	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
> 	at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeCondition(NFA.java:1211)
> 	at org.apache.flink.cep.nfa.NFA$NFASerializer.deserializeStates(NFA.java:1169)
> 	at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:957)
> 	at org.apache.flink.cep.nfa.NFA$NFASerializer.deserialize(NFA.java:852)
> 	at org.apache.flink.runtime.state.heap.StateTableByKeyGroupReaders$StateTableByKeyGroupReaderV2V3.readMappingsInKeyGroup(StateTableByKeyGroupReaders.java:132)
> 	at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restorePartitionedState(HeapKeyedStateBackend.java:518)
> 	at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restore(HeapKeyedStateBackend.java:397)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(StreamTask.java:772)
> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:311)
> 	... 6 more
> {code}
> I have tried to start new job also after failure with parameter {code:java} -s [checkpoint meta data path]{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)