Posted to common-dev@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2022/06/17 09:26:00 UTC

[jira] [Resolved] (HADOOP-18298) Hadoop AWS | Staging committer Multipartupload not completing on minio

     [ https://issues.apache.org/jira/browse/HADOOP-18298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18298.
-------------------------------------
    Resolution: Invalid

> Hadoop AWS | Staging committer Multipartupload not completing on minio
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-18298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18298
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.1
>         Environment: minio
>            Reporter: Ayush Goyal
>            Priority: Major
>
> In the Hadoop AWS staging committer (org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter), the committer uploads files from the local filesystem to S3 in commitTaskInternal, which calls uploadFileToPendingCommit of CommitOperations to upload each file using a multipart upload.
>  
> A multipart upload consists of three steps:
> 1) Initiate the multipart upload.
> 2) Split the file into parts and upload the parts.
> 3) Merge all the parts of the file and complete the multipart upload.
>  
> The implementation of uploadFileToPendingCommit covers the first two steps but not the third. As a result, the parts are uploaded but never merged, so at the end of the job there are no files in the destination directory.
>  
> S3 logs before implementing the 3rd step:
>  
> {code:java}
> 2022-05-30T13:49:31:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/part-00000-ce0a965f-622a-4950-bb4b-550470883134-c000-b552fb34-6156-4aa8-9085-679ad14fab6e.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               8.677ms      ↑ 137 B ↓ 724 B
> 2022-05-30T13:49:31:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/part-00000-ce0a965f-622a-4950-bb4b-550470883134-c000-b552fb34-6156-4aa8-9085-679ad14fab6e.snappy.parquet?uploadId=f3beae8e-3001-48be-9bc4-306b71940e50&partNumber=1  240b:c1d1:123:664f:c5d2:2::                443.156ms    ↑ 51 KiB ↓ 325 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F_SUCCESS%2F&fetch-owner=false  240b:c1d1:123:664f:c5d2:2::                3.414ms      ↑ 137 B ↓ 646 B
> 2022-05-30T13:49:32:000 [200 OK] s3.PutObject localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/_SUCCESS 240b:c1d1:123:664f:c5d2:2::                52.734ms     ↑ 8.7 KiB ↓ 380 B
> 2022-05-30T13:49:32:000 [200 OK] s3.DeleteMultipleObjects localhost:9000/minio-feature-testing/?delete  240b:c1d1:123:664f:c5d2:2::                73.954ms     ↑ 350 B ↓ 432 B
> 2022-05-30T13:49:32:000 [404 Not Found] s3.HeadObject localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/_temporary 240b:c1d1:123:664f:c5d2:2::                2.658ms      ↑ 137 B ↓ 291 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F_temporary%2F&fetch-owner=false  240b:c1d1:123:664f:c5d2:2::                 4.807ms      ↑ 137 B ↓ 648 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListMultipartUploads localhost:9000/minio-feature-testing/?uploads&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F  240b:c0e0:102:553e:b4c2:2::               1.081ms      ↑ 137 B ↓ 776 B
> 2022-05-30T13:49:32:000 [404 Not Found] s3.HeadObject localhost:9000/minio-feature-testing/spark-job/processed/output-parquet-staging-7/.spark-staging-ce0a965f-622a-4950-bb4b-550470883134 240b:c1d1:123:664f:c5d2:2::                 5.68ms       ↑ 137 B ↓ 291 B
> 2022-05-30T13:49:32:000 [200 OK] s3.ListObjectsV2 localhost:9000/minio-feature-testing/?list-type=2&delimiter=%2F&max-keys=2&prefix=spark-job%2Fprocessed%2Foutput-parquet-staging-7%2F.spark-staging-ce0a965f-622a-4950-bb4b-550470883134%2F&fetch-owner=false  240b:c1d1:123:664f:c5d2:2::              2.452ms      ↑ 137 B ↓ 689 B
>   {code}
> Here, after s3.PutObjectPart there is no CompleteMultipartUpload call for the 3rd step.
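
One way to check a trace like this mechanically is to collect the uploadId values seen on s3.PutObjectPart lines and subtract those seen on s3.CompleteMultipartUpload lines. The following is only a rough sketch: the class name TraceScan, the method name uncompleted, and the abbreviated sample lines are made up for illustration, and the regular expression assumes upload IDs look like the UUID-style values in the trace lines quoted in this issue.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds upload IDs with parts but no completion in a trace. */
public class TraceScan {

    // Assumes IDs are hex-and-dash strings carried in an uploadId query parameter.
    private static final Pattern UPLOAD_ID =
        Pattern.compile("[?&]uploadId=([0-9a-fA-F-]+)");

    /** Returns upload IDs that appear on PutObjectPart lines but never on a completion line. */
    static Set<String> uncompleted(List<String> lines) {
        Set<String> withParts = new LinkedHashSet<>();
        Set<String> completed = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = UPLOAD_ID.matcher(line);
            if (!m.find()) {
                continue; // not a per-upload call (e.g. NewMultipartUpload, ListObjectsV2)
            }
            if (line.contains("s3.PutObjectPart")) {
                withParts.add(m.group(1));
            } else if (line.contains("s3.CompleteMultipartUpload")) {
                completed.add(m.group(1));
            }
        }
        withParts.removeAll(completed);
        return withParts;
    }

    public static void main(String[] args) {
        // Abbreviated, illustrative lines; the host and key are placeholders.
        List<String> lines = Arrays.asList(
            "[200 OK] s3.NewMultipartUpload host/bucket/key?uploads",
            "[200 OK] s3.PutObjectPart host/bucket/key"
                + "?uploadId=f3beae8e-3001-48be-9bc4-306b71940e50&partNumber=1");
        System.out.println("uploads with parts but no completion: " + uncompleted(lines));
    }
}
```

The scan only sees what is in the captured window, so an ID reported here may simply have been completed after the trace excerpt ends.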
>  
> S3 logs after implementing the 3rd step:
>  
> {code:java}
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.116ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.416ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               8.506ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.815ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D30/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               10.09ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.851ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.006ms      ↑ 137 B ↓ 750 B
> 2022-06-17T10:56:12:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               9.217ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=7da87f0a-f8ff-4f9c-b877-b2fdd18d3c5f&partNumber=1  240b:c1d1:123:664f:c5d2:2::               817.474ms    ↑ 52 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=782769d0-43f1-43b8-aae0-54ac4c8c6603&partNumber=1  240b:c1d1:123:664f:c5d2:2::               818.363ms    ↑ 85 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=2c509073-e2b6-4d0a-a65a-bb4f154a432c&partNumber=1  240b:c1d1:123:664f:c5d2:2::               819.765ms    ↑ 54 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=c7e09609-6193-4d41-bc05-4020291725e4&partNumber=1  240b:c1d1:123:664f:c5d2:2::               818.782ms    ↑ 55 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=3bb4278e-455a-4dc4-af01-ed3227430590&partNumber=1  240b:c1d1:123:664f:c5d2:2::               817.97ms     ↑ 51 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=8fe799e3-c712-43b7-a074-a2359232de07&partNumber=1  240b:c1d1:123:664f:c5d2:2::               819.183ms    ↑ 80 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=c2e1477b-5457-4cbe-8fdb-4e80eaca63fe&partNumber=1  240b:c1d1:123:664f:c5d2:2::               818.126ms    ↑ 53 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.PutObjectPart localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D30/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=992167c8-fbde-4a0d-bd4d-5ce7ddd51a87&partNumber=1  240b:c1d1:123:664f:c5d2:2::               818.176ms    ↑ 56 KiB ↓ 325 B
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D15/quarter%3D45/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=7da87f0a-f8ff-4f9c-b877-b2fdd18d3c5f  240b:c1d1:123:664f:c5d2:2::               632.761ms    ↑ 272 B ↓ 1.1 KiB
> 2022-06-17T10:56:13:000 [200 OK] s3.NewMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploads  240b:c1d1:123:664f:c5d2:2::               6.231ms      ↑ 137 B ↓ 751 B
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D16/quarter%3D15/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=3bb4278e-455a-4dc4-af01-ed3227430590  240b:c1d1:123:664f:c5d2:2::               697.946ms    ↑ 272 B ↓ 1.1 KiB
> 2022-06-17T10:56:12:000 [200 OK] s3.CompleteMultipartUpload localhost:9000/minio-feature-testing/spark-job/pm-processed/output-parquet-staging-39/day%3D23/hour%3D17/quarter%3D0/part-00004-d0b529ca-112f-43f2-a7dd-44de4db6aa7f-dffa7213-d492-48f9-9e6a-fb08bc81ceeb.c000.snappy.parquet?uploadId=2c509073-e2b6-4d0a-a65a-bb4f154a432c  240b:c1d1:123:664f:c5d2:2::               714.377ms    ↑ 272 B ↓ 1.1 KiB
>  {code}
>  
>  
> What needs to be implemented:
>  
> After the uploadPart calls, once all upload IDs have been added to commitData, innerCommit should be called.
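
The three multipart steps described in the report can be illustrated with a toy in-memory model. This is a minimal sketch and not the Hadoop or AWS SDK API: the class MultipartSketch, its Store type, and the methods initiate, uploadPart, and complete are hypothetical names chosen for illustration. It only demonstrates the general S3 behaviour that uploaded parts do not become a visible object until the upload is completed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

/** Toy in-memory model of an S3-style multipart upload (illustration only). */
public class MultipartSketch {

    /** Hypothetical store: pending uploads are tracked apart from visible objects. */
    static class Store {
        private final Map<String, byte[]> objects = new HashMap<>();
        private final Map<String, String> uploadKeys = new HashMap<>();
        private final Map<String, TreeMap<Integer, byte[]>> uploadParts = new HashMap<>();

        /** Step 1: initiate. Returns an upload ID; no object is visible yet. */
        String initiate(String key) {
            String uploadId = UUID.randomUUID().toString();
            uploadKeys.put(uploadId, key);
            uploadParts.put(uploadId, new TreeMap<>());
            return uploadId;
        }

        /** Step 2: store one part under the pending upload. */
        void uploadPart(String uploadId, int partNumber, byte[] data) {
            TreeMap<Integer, byte[]> parts = uploadParts.get(uploadId);
            if (parts == null) {
                throw new IllegalStateException("unknown upload " + uploadId);
            }
            parts.put(partNumber, data.clone());
        }

        /** Step 3: concatenate parts in part-number order and publish the object. */
        void complete(String uploadId) {
            TreeMap<Integer, byte[]> parts = uploadParts.remove(uploadId);
            if (parts == null) {
                throw new IllegalStateException("unknown upload " + uploadId);
            }
            String key = uploadKeys.remove(uploadId);
            ByteArrayOutputStream merged = new ByteArrayOutputStream();
            for (byte[] part : parts.values()) {
                merged.write(part, 0, part.length);
            }
            objects.put(key, merged.toByteArray());
        }

        boolean exists(String key) {
            return objects.containsKey(key);
        }

        byte[] get(String key) {
            return objects.get(key);
        }

        int pendingUploads() {
            return uploadParts.size();
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        String key = "output/part-00000.parquet"; // placeholder key
        String uploadId = store.initiate(key);
        store.uploadPart(uploadId, 1, "first ".getBytes(StandardCharsets.UTF_8));
        store.uploadPart(uploadId, 2, "second".getBytes(StandardCharsets.UTF_8));
        System.out.println("visible before complete: " + store.exists(key));
        store.complete(uploadId);
        System.out.println("visible after complete: " + store.exists(key));
    }
}
```

Whether the third step belongs in task commit or in a later phase is a design question this sketch does not address; it only shows why a listing of the destination stays empty while uploads are still pending.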



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
