Posted to commits@heron.apache.org by GitBox <gi...@apache.org> on 2019/03/22 02:11:12 UTC

[GitHub] [incubator-heron] simingweng opened a new issue #3222: Saving and restoring Checkpoint to/from via DistributedLog API does not work

simingweng opened a new issue #3222: Saving and restoring Checkpoint to/from via DistributedLog API does not work
URL: https://github.com/apache/incubator-heron/issues/3222
 
 
   When saving a checkpoint into BookKeeper via the DistributedLog API, the following exception is thrown if the serialized checkpoint is larger than 4096 bytes:
   
   `org.apache.heron.spi.statefulstorage.StatefulStorageException: Failed to read checkpoint from /meter-readings-topology/1553116574587664199-1553116774/spr-bolt_1
   	at org.apache.heron.statefulstorage.dlog.DlogStorage.restoreCheckpoint(DlogStorage.java:171)
   	at org.apache.heron.ckptmgr.CheckpointManagerServer.handleGetInstanceStateRequest(CheckpointManagerServer.java:313)
   	at org.apache.heron.ckptmgr.CheckpointManagerServer.onRequest(CheckpointManagerServer.java:111)
   	at org.apache.heron.common.network.HeronServer.handlePacket(HeronServer.java:212)
   	at org.apache.heron.common.network.HeronServer.handleRead(HeronServer.java:169)
   	at org.apache.heron.common.basics.NIOLooper.handleSelectedKeys(NIOLooper.java:116)
   	at org.apache.heron.common.basics.NIOLooper.access$000(NIOLooper.java:38)
   	at org.apache.heron.common.basics.NIOLooper$1.run(NIOLooper.java:51)
   	at org.apache.heron.common.basics.WakeableLooper.executeTasksOnWakeup(WakeableLooper.java:191)
   	at org.apache.heron.common.basics.WakeableLooper.runOnce(WakeableLooper.java:110)
   	at org.apache.heron.common.basics.WakeableLooper.loop(WakeableLooper.java:100)
   	at org.apache.heron.ckptmgr.CheckpointManager.startAndLoop(CheckpointManager.java:183)
   	at org.apache.heron.ckptmgr.CheckpointManager.main(CheckpointManager.java:276)
   Caused by: org.apache.heron.shaded.com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field.  This could mean either that the input has been truncated or that an embedded message misreported its own length.
   	at org.apache.heron.shaded.com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:86)
   	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.readRawBytesSlowPathRemainingChunks(CodedInputStream.java:2937)
   	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.readBytesSlowPath(CodedInputStream.java:2972)
   	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.readBytes(CodedInputStream.java:2382)
   	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.<init>(CheckpointManager.java:9270)
   	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.<init>(CheckpointManager.java:9219)
   	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint$1.parsePartialFrom(CheckpointManager.java:9979)
   	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint$1.parsePartialFrom(CheckpointManager.java:9974)
   	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:221)
   	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:239)
   	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:244)
   	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
   	at org.apache.heron.shaded.com.google.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)
   	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.parseFrom(CheckpointManager.java:9540)
   	at org.apache.heron.statefulstorage.dlog.DlogStorage.restoreCheckpoint(DlogStorage.java:169)
   	... 12 more`
   
   And then, when restoring a checkpoint from BookKeeper via the DistributedLog API, the following exception is thrown:
   
   `[2019-03-21 21:10:02 +0000] [STDERR] stderr: Exception in thread "main"   
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: java.lang.IllegalStateException: InputStream#read(byte[]) returned invalid result: 0
   The InputStream implementation is buggy.  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2786)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.isAtEnd(CodedInputStream.java:2700)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.CodedInputStream$StreamDecoder.readTag(CodedInputStream.java:2051)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.<init>(CheckpointManager.java:9250)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.<init>(CheckpointManager.java:9219)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint$1.parsePartialFrom(CheckpointManager.java:9979)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint$1.parsePartialFrom(CheckpointManager.java:9974)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:221)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:239)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:244)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.shaded.com.google.protobuf.GeneratedMessageV3.parseWithIOException(GeneratedMessageV3.java:311)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.proto.ckptmgr.CheckpointManager$InstanceStateCheckpoint.parseFrom(CheckpointManager.java:9540)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.statefulstorage.dlog.DlogStorage.restoreCheckpoint(DlogStorage.java:173)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.ckptmgr.CheckpointManagerServer.handleGetInstanceStateRequest(CheckpointManagerServer.java:313)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.ckptmgr.CheckpointManagerServer.onRequest(CheckpointManagerServer.java:111)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.network.HeronServer.handlePacket(HeronServer.java:212)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.network.HeronServer.handleRead(HeronServer.java:169)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.NIOLooper.handleSelectedKeys(NIOLooper.java:116)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.NIOLooper.access$000(NIOLooper.java:38)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.NIOLooper$1.run(NIOLooper.java:51)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.WakeableLooper.executeTasksOnWakeup(WakeableLooper.java:191)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.WakeableLooper.runOnce(WakeableLooper.java:110)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.common.basics.WakeableLooper.loop(WakeableLooper.java:100)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.ckptmgr.CheckpointManager.startAndLoop(CheckpointManager.java:183)  
   [2019-03-21 21:10:02 +0000] [STDERR] stderr: 	at org.apache.heron.ckptmgr.CheckpointManager.main(CheckpointManager.java:276) `
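
   For reference, the second trace shows protobuf's `CodedInputStream` rejecting a result of 0 from `InputStream#read(byte[])`. The `InputStream` contract only allows 0 when the requested length is 0; otherwise the call has to return at least one byte or -1 at end of stream. Below is a minimal sketch of a stream that honors this while stitching several records together. It is an illustration only: `RecordStreamSketch`, `RecordInputStream` and `readAll` are hypothetical names, not Heron or DistributedLog classes, and the idea that a record boundary is where the 0 comes from is an assumption that this report does not confirm.

   ```java
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.util.Arrays;
   import java.util.Iterator;
   import java.util.List;

   /** Hypothetical stand-in for a stream that reads one checkpoint stored as several records. */
   public class RecordStreamSketch {

     /** Serves the bytes of consecutive records as one continuous stream. */
     static final class RecordInputStream extends InputStream {
       private final Iterator<byte[]> records;
       private byte[] current = new byte[0];
       private int pos = 0;

       RecordInputStream(List<byte[]> records) {
         this.records = records.iterator();
       }

       /** Advances past exhausted or empty records; returns false once nothing is left. */
       private boolean ensureData() {
         while (pos >= current.length) {
           if (!records.hasNext()) {
             return false;
           }
           current = records.next();
           pos = 0;
         }
         return true;
       }

       @Override
       public int read() {
         return ensureData() ? (current[pos++] & 0xFF) : -1;
       }

       @Override
       public int read(byte[] b, int off, int len) {
         if (len == 0) {
           return 0; // the only case in which returning 0 is allowed
         }
         if (!ensureData()) {
           return -1; // real end of stream
         }
         int n = Math.min(len, current.length - pos);
         System.arraycopy(current, pos, b, off, n);
         pos += n;
         return n; // at least 1 here, even right after a record boundary
       }
     }

     /** Drains a stream and fails loudly on the zero-length read that protobuf rejects. */
     static byte[] readAll(InputStream in) throws IOException {
       ByteArrayOutputStream out = new ByteArrayOutputStream();
       byte[] buf = new byte[4096];
       int n;
       while ((n = in.read(buf)) != -1) {
         if (n == 0) {
           throw new IllegalStateException("read(byte[]) returned 0");
         }
         out.write(buf, 0, n);
       }
       return out.toByteArray();
     }

     public static void main(String[] args) throws IOException {
       byte[] first = new byte[4096];
       byte[] second = new byte[1000];
       Arrays.fill(first, (byte) 1);
       Arrays.fill(second, (byte) 2);
       // An empty record in the middle must not surface as a 0-byte read.
       byte[] all = readAll(new RecordInputStream(Arrays.asList(first, new byte[0], second)));
       System.out.println(all.length); // prints 5096
     }
   }
   ```

   A stream that instead returned 0, or -1, as soon as its current record was exhausted would be consistent with both traces above, but confirming that requires inspecting the stream that `DlogStorage.restoreCheckpoint` actually reads from.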
