Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/04/12 16:41:40 UTC

[GitHub] [flink] zhijiangW commented on a change in pull request #11687: [FLINK-16536][network][checkpointing] Implement InputChannel state recovery for unaligned checkpoint

zhijiangW commented on a change in pull request #11687: [FLINK-16536][network][checkpointing] Implement InputChannel state recovery for unaligned checkpoint
URL: https://github.com/apache/flink/pull/11687#discussion_r407222896
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/RemoteInputChannel.java
 ##########
 @@ -149,16 +154,58 @@ void assignExclusiveSegments() throws IOException {
 		}
 	}
 
+	/**
+	 * Reads the channel state data executed by netty thread, so it can make use of almost all the
+	 * existing processes to avoid bringing additional race conditions with task thread. Also it can
+	 * avoid introducing another thread pool to handle this work to make things more complex.
+	 */
+	private void readInputChannelState() throws IOException {
+		while (true) {
+			Buffer buffer;
+			synchronized (bufferQueue) {
+				buffer = bufferQueue.takeBuffer();
+				if (buffer == null) {
+					if (isReleased()) {
+						return;
+					}
+
+					buffer = inputGate.getBufferPool().requestBuffer();
+					if (buffer != null) {
+						bufferQueue.addFloatingBuffer(buffer);
+						continue;
+					} else {
+						inputGate.getBufferProvider().addBufferListener(this);
+						isWaitingForStateBuffers = true;
+						return;
+					}
+				}
+			}
+
+			ChannelStateReader.ReadResult result = inputGate.stateReader.readInputData(channelInfo, buffer);
 
 Review comment:
   Yes, `readInputChannelState` can be executed concurrently by multiple netty threads for different channels, by design. In general I expect task processing to be faster than reading state, so a single thread might not fill buffers quickly enough to keep the task thread fed. Every channel also has its own exclusive buffers, which can be used in parallel to speed up the recovery process.
   
   I overlooked the `NotThreadSafe` annotation on `ChannelStateReaderImpl`. Since every input channel handle actually generates a separate `ChannelStateStreamReader` and its own stream, I had assumed that the state of a single input channel should not be read by multiple threads, but that the states of different channels can be read by different threads concurrently. I will confirm with Roman whether there are other limitations.
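
   To make the assumption above concrete, here is a minimal, self-contained sketch of per-channel thread confinement. It is not the code in this PR: `PerChannelRecoverySketch`, `SingleChannelReader`, and `recoverAll` are hypothetical names, and a plain `ExecutorService` stands in for the netty threads. Each reader is deliberately not thread-safe, but it is only ever touched by the one task that owns its channel, while different channels are recovered in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerChannelRecoverySketch {

    /** Hypothetical stand-in for a per-channel stream reader; deliberately not thread-safe. */
    static final class SingleChannelReader {
        // Plain field without synchronization: safe only while one thread owns this reader.
        private int remainingBuffers;

        SingleChannelReader(int buffersToRecover) {
            this.remainingBuffers = buffersToRecover;
        }

        /** Consumes one buffer of state and returns true if more state remains. */
        boolean readNext() {
            remainingBuffers--;
            return remainingBuffers > 0;
        }
    }

    /**
     * Recovers all channels on a small pool and returns the number of reads per channel.
     * Assumes buffersPerChannel >= 1. Each reader is used by exactly one task.
     */
    static Map<Integer, Integer> recoverAll(int channels, int buffersPerChannel) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, Math.min(channels, 4)));
        Map<Integer, Integer> buffersRead = new ConcurrentHashMap<>();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int channel = 0; channel < channels; channel++) {
                final int id = channel;
                // One reader per channel, never shared between tasks.
                final SingleChannelReader reader = new SingleChannelReader(buffersPerChannel);
                pending.add(pool.submit(() -> {
                    int count = 0;
                    boolean more;
                    do {
                        more = reader.readNext();
                        count++;
                    } while (more);
                    // Only the shared result map needs to be thread-safe.
                    buffersRead.put(id, count);
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // propagate any failure from the worker
            }
        } finally {
            pool.shutdown();
        }
        return buffersRead;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(recoverAll(4, 3));
    }
}
```

   The point of the sketch is only that the shared structure (the result map) is the sole place that needs synchronization. Whether the real reader has other shared state that breaks this assumption is exactly what still needs to be confirmed.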
