Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/10/16 04:44:00 UTC

[jira] [Commented] (FLINK-10356) Add sanity checks to SpillingAdaptiveSpanningRecordDeserializer

    [ https://issues.apache.org/jira/browse/FLINK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651107#comment-16651107 ] 

ASF GitHub Bot commented on FLINK-10356:
----------------------------------------

zhijiangW commented on a change in pull request #6705: [FLINK-10356][network] add sanity checks to SpillingAdaptiveSpanningRecordDeserializer
URL: https://github.com/apache/flink/pull/6705#discussion_r225389817
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpillingAdaptiveSpanningRecordDeserializer.java
 ##########
 @@ -549,21 +584,53 @@ private void addNextChunkFromMemorySegment(MemorySegment segment, int offset, in
 				}
 				else {
 					spillingChannel.close();
+					spillingChannel = null;
 
-					BufferedInputStream inStream = new BufferedInputStream(new FileInputStream(spillFile), 2 * 1024 * 1024);
+					BufferedInputStream inStream =
+						new BufferedInputStream(
+							new FileInputStream(checkNotNull(spillFile)),
+							2 * 1024 * 1024);
 					this.spillFileReader = new DataInputViewStreamWrapper(inStream);
 				}
 			}
 		}
 
-		private void moveRemainderToNonSpanningDeserializer(NonSpanningWrapper deserializer) {
+		private void moveRemainderToNonSpanningDeserializer(NonSpanningWrapper deserializer) throws IOException {
+			Optional<String> deserializationError = getDeserializationError(0);
+			if (deserializationError.isPresent()) {
+				throw new IOException(deserializationError.get());
+			}
+
 			deserializer.clear();
 
 			if (leftOverData != null) {
 				deserializer.initializeFromMemorySegment(leftOverData, leftOverStart, leftOverLimit);
 			}
 		}
 
+		private Optional<String> getDeserializationError(int addToReadBytes) {
 
 Review comment:
   I think this method should have a comment to make it easier to understand, especially the meaning of the `addToReadBytes` parameter.



> Add sanity checks to SpillingAdaptiveSpanningRecordDeserializer
> ---------------------------------------------------------------
>
>                 Key: FLINK-10356
>                 URL: https://issues.apache.org/jira/browse/FLINK-10356
>             Project: Flink
>          Issue Type: Improvement
>          Components: Network
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.6.0, 1.6.1, 1.7.0, 1.5.4
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>            Priority: Major
>              Labels: pull-request-available
>
> {{SpillingAdaptiveSpanningRecordDeserializer}} has no consistency checks for how it is called or for whether serializers behave properly, e.g. that they read only as many bytes as are available/promised for that record. At least these checks should be added:
>  # Check that buffers have not been read from yet before adding them (this is an invariant that {{SpillingAdaptiveSpanningRecordDeserializer}} relies on and, from what I can see, it is currently followed).
>  # Check that after deserialization, we actually consumed {{recordLength}} bytes
>  ** If not, in the spanning deserializer, we currently simply skip the remaining bytes.
>  ** But in the non-spanning deserializer, we currently continue from the wrong offset.
>  # Protect against {{setNextBuffer}} being called before draining all available records
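
The second check in the list above (verifying that exactly {{recordLength}} bytes were consumed) can be illustrated with a minimal, self-contained sketch. This is not the code from the pull request: the class `RecordLengthCheck`, the `RecordReader` interface, and the methods `checkConsumed` and `readRecord` are hypothetical names chosen for illustration, and the sketch uses plain `java.io` streams instead of Flink's memory segments and data views.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Optional;

/** Illustrative only: all names in this class are hypothetical, not Flink APIs. */
final class RecordLengthCheck {

    /** Stand-in for a serializer's deserialization step. */
    interface RecordReader<T> {
        T read(DataInputStream in) throws IOException;
    }

    /**
     * Returns an error message if the number of consumed bytes differs from
     * the promised record length, or an empty Optional if they match.
     */
    static Optional<String> checkConsumed(int recordLength, int bytesConsumed) {
        if (bytesConsumed == recordLength) {
            return Optional.empty();
        }
        return Optional.of("Serializer consumed " + bytesConsumed
                + " bytes, but the record length is " + recordLength + " bytes.");
    }

    /**
     * Reads one record from buffer[offset, offset + recordLength) and fails
     * with an IOException if the reader did not consume exactly recordLength
     * bytes. Assumes offset + recordLength does not exceed buffer.length.
     */
    static <T> T readRecord(byte[] buffer, int offset, int recordLength,
                            RecordReader<T> reader) throws IOException {
        // Restrict the stream to the bytes that belong to this record.
        ByteArrayInputStream bytes =
                new ByteArrayInputStream(buffer, offset, recordLength);
        T record = reader.read(new DataInputStream(bytes));

        // available() reports the bytes of the record that were left unread.
        int consumed = recordLength - bytes.available();
        Optional<String> error = checkConsumed(recordLength, consumed);
        if (error.isPresent()) {
            throw new IOException(error.get());
        }
        return record;
    }
}
```

In this sketch a reader that consumes too few bytes is reported through `checkConsumed`, while a reader that tries to consume more than the record holds fails earlier with an `EOFException` from `DataInputStream`, because the stream is limited to the record's bytes. Returning `Optional<String>` and letting the caller decide whether to throw mirrors the shape of the `getDeserializationError` call visible in the diff, but the actual conditions that method checks are not shown in this message.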


