Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/05 20:03:20 UTC

[GitHub] [iceberg] SinghAsDev commented on a change in pull request #3813: [S3FileIO] Add capability to perform checksum validations using S3 eTags

SinghAsDev commented on a change in pull request #3813:
URL: https://github.com/apache/iceberg/pull/3813#discussion_r779102577



##########
File path: aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java
##########
@@ -216,43 +247,43 @@ private void uploadParts() {
       return;
     }
 
-    stagingFiles.stream()
+    stagingFilesWithETags.stream()
         // do not upload the file currently being written
-        .filter(f -> closed || !f.equals(currentStagingFile))
+        .filter(f -> closed || !f.file().equals(currentStagingFile))
         // do not upload any files that have already been processed
-        .filter(Predicates.not(multiPartMap::containsKey))
+        .filter(Predicates.not(f -> multiPartMap.containsKey(f.file())))
         .forEach(f -> {
           UploadPartRequest.Builder requestBuilder = UploadPartRequest.builder()
               .bucket(location.bucket())
               .key(location.key())
               .uploadId(multipartUploadId)
-              .partNumber(stagingFiles.indexOf(f) + 1)
-              .contentLength(f.length());
+              .partNumber(stagingFilesWithETags.indexOf(f) + 1)
+              .contentLength(f.file().length());
 
           S3RequestUtil.configureEncryption(awsProperties, requestBuilder);
 
           UploadPartRequest uploadRequest = requestBuilder.build();
 
           CompletableFuture<CompletedPart> future = CompletableFuture.supplyAsync(
               () -> {
-                UploadPartResponse response = s3.uploadPart(uploadRequest, RequestBody.fromFile(f));
+                UploadPartResponse response = s3.uploadPart(uploadRequest, RequestBody.fromFile(f.file()));
+                checkEtag(f.eTag(), response.eTag());

Review comment:
       Yeah, I initially did that as well. However, I could not find a way to add a reliable test: tests would still succeed when wrong MD5 checksums were added to the request, likely because of the S3 mock. Given that, and the fact that the current approach allows for a better error message, I would propose doing explicit checks here. If you have a strong preference on this, though, I can change it. Let me know.
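
       For illustration, an explicit check of this kind could look roughly like the sketch below. This is not the code from the PR: the class name ETagCheck, the md5Hex helper, and the body of checkEtag are hypothetical, and the sketch assumes the part eTag is the hex-encoded MD5 of the uploaded bytes wrapped in double quotes, which may not hold for every encryption setting.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch; names and behavior are assumptions, not the PR's implementation.
final class ETagCheck {

  // Returns the lowercase hex-encoded MD5 digest of the given bytes.
  static String md5Hex(byte[] data) {
    try {
      MessageDigest digest = MessageDigest.getInstance("MD5");
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest(data)) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 is not available", e);
    }
  }

  // Compares a locally computed digest with the eTag returned by the service.
  // Assumption: the returned eTag is the hex MD5 surrounded by double quotes.
  static void checkEtag(String expectedMd5Hex, String returnedETag) {
    String normalized = returnedETag == null ? null : returnedETag.replace("\"", "");
    if (!expectedMd5Hex.equalsIgnoreCase(normalized)) {
      // An explicit check lets the caller say exactly what went wrong.
      throw new IllegalStateException(String.format(
          "Checksum mismatch for uploaded part: expected eTag %s but received %s",
          expectedMd5Hex, returnedETag));
    }
  }

  public static void main(String[] args) {
    byte[] part = "hello".getBytes(StandardCharsets.UTF_8);
    String expected = md5Hex(part);

    // A matching eTag passes silently.
    checkEtag(expected, "\"" + expected + "\"");

    // A mismatching eTag fails with a descriptive message.
    try {
      checkEtag(expected, "\"deadbeef\"");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

       Because the comparison happens on the client, a unit test can drive it with a fabricated eTag and does not rely on the S3 mock rejecting a bad checksum.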




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



For additional commands, e-mail: issues-help@iceberg.apache.org