Posted to notifications@jclouds.apache.org by Andrew Gaul <no...@github.com> on 2015/06/02 06:41:10 UTC
[jclouds] JCLOUDS-894: Add portable multipart upload (#762)
This unifies the provider multipart upload code paths and removes code
duplication.
You can view, comment on, or merge this pull request online at:
https://github.com/jclouds/jclouds/pull/762
-- Commit Summary --
* JCLOUDS-894: Add portable multipart upload
* JCLOUDS-894: Add portable multipart upload for S3
* JCLOUDS-894: Add portable multipart upload for Azure
* JCLOUDS-894: Add portable multipart upload for Swift
* JCLOUDS-894: Odds and ends
-- File Changes --
M apis/atmos/src/main/java/org/jclouds/atmos/blobstore/AtmosBlobStore.java (7)
M apis/cloudfiles/src/main/java/org/jclouds/cloudfiles/blobstore/CloudFilesBlobStore.java (5)
M apis/openstack-swift/src/main/java/org/jclouds/openstack/swift/v1/blobstore/RegionScopedSwiftBlobStore.java (25)
M apis/openstack-swift/src/test/java/org/jclouds/openstack/swift/v1/blobstore/integration/SwiftBlobIntegrationLiveTest.java (5)
M apis/s3/src/main/java/org/jclouds/s3/blobstore/S3BlobStore.java (14)
M apis/s3/src/main/java/org/jclouds/s3/blobstore/config/S3BlobStoreContextModule.java (6)
D apis/s3/src/main/java/org/jclouds/s3/blobstore/strategy/AsyncMultipartUploadStrategy.java (31)
D apis/s3/src/main/java/org/jclouds/s3/blobstore/strategy/MultipartUploadStrategy.java (28)
D apis/s3/src/main/java/org/jclouds/s3/blobstore/strategy/internal/ParallelMultipartUploadStrategy.java (292)
D apis/s3/src/main/java/org/jclouds/s3/blobstore/strategy/internal/SequentialMultipartUploadStrategy.java (123)
D apis/s3/src/test/java/org/jclouds/s3/blobstore/strategy/internal/SequentialMultipartUploadStrategyMockTest.java (147)
M apis/swift/src/main/java/org/jclouds/openstack/swift/blobstore/SwiftBlobStore.java (5)
M blobstore/src/main/java/org/jclouds/blobstore/internal/BaseBlobStore.java (28)
M providers/aws-s3/src/main/java/org/jclouds/aws/s3/blobstore/AWSS3BlobStore.java (16)
M providers/azureblob/src/main/java/org/jclouds/azureblob/blobstore/AzureBlobStore.java (13)
D providers/azureblob/src/main/java/org/jclouds/azureblob/blobstore/strategy/AzureBlobBlockUploadStrategy.java (82)
D providers/azureblob/src/main/java/org/jclouds/azureblob/blobstore/strategy/MultipartUploadStrategy.java (34)
M providers/azureblob/src/test/java/org/jclouds/azureblob/blobstore/integration/AzureBlobIntegrationLiveTest.java (31)
D providers/azureblob/src/test/java/org/jclouds/azureblob/blobstore/strategy/AzureBlobBlockUploadStrategyTest.java (123)
M providers/hpcloud-objectstorage/src/main/java/org/jclouds/hpcloud/objectstorage/blobstore/HPCloudObjectStorageBlobStore.java (5)
-- Patch Links --
https://github.com/jclouds/jclouds/pull/762.patch
https://github.com/jclouds/jclouds/pull/762.diff
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762
Re: [jclouds] JCLOUDS-894: Add portable multipart upload (#762)
Posted by Timur Alperovich <no...@github.com>.
AWS Java S3 SDK does the following:
    public static long calculateOptimalPartSize(PutObjectRequest putObjectRequest, TransferManagerConfiguration configuration) {
        double contentLength = TransferManagerUtils.getContentLength(putObjectRequest);
        double optimalPartSize = (double)contentLength / (double)MAXIMUM_UPLOAD_PARTS;
        // round up so we don't push the upload over the maximum number of parts
        optimalPartSize = Math.ceil(optimalPartSize);
        return (long)Math.max(optimalPartSize, configuration.getMinimumUploadPartSize());
    }
The AWS SDK defaults to a maximum of 10000 parts, and the default minimum part size is 5MB. So, uploading a 51GB file, for example, would use 10000 parts of roughly 5.1MB each, since the size is rounded up to the next byte rather than the next megabyte.
jclouds could use a similar mechanism. It would probably make sense to expose the configuration parameters so that the default behavior can be changed.
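A minimal sketch of how such a calculation could look with the two limits passed in as parameters. The class and method names (`PartSizeSketch`, `optimalPartSize`, `partCount`) are hypothetical and not part of jclouds or the AWS SDK. The defaults only mirror the figures quoted above, and integer ceiling division stands in for the `double` arithmetic.

```java
public final class PartSizeSketch {
    // Assumed defaults taken from the figures in this comment; not jclouds constants.
    static final long DEFAULT_MIN_PART_SIZE = 5L * 1024 * 1024;
    static final long DEFAULT_MAX_PARTS = 10000;

    // Smallest part size that keeps the upload within maxParts,
    // but never smaller than minPartSize.
    static long optimalPartSize(long contentLength, long minPartSize, long maxParts) {
        if (contentLength < 0 || minPartSize <= 0 || maxParts <= 0) {
            throw new IllegalArgumentException("invalid sizes or limits");
        }
        // Round up so the upload is not pushed over the maximum number of parts.
        long sizeForMaxParts = contentLength / maxParts + (contentLength % maxParts == 0 ? 0 : 1);
        return Math.max(sizeForMaxParts, minPartSize);
    }

    // Number of parts needed to cover contentLength with the given part size.
    static long partCount(long contentLength, long partSize) {
        return contentLength / partSize + (contentLength % partSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long contentLength = 51L * 1024 * 1024 * 1024; // 51 GiB
        long partSize = optimalPartSize(contentLength, DEFAULT_MIN_PART_SIZE, DEFAULT_MAX_PARTS);
        // prints "5476084 bytes per part, 10000 parts"
        System.out.println(partSize + " bytes per part, " + partCount(contentLength, partSize) + " parts");
    }
}
```

Passing the limits as arguments rather than hard-coding them is what would let each provider, or a caller, override the defaults.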
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762#issuecomment-108633453
Re: [jclouds] JCLOUDS-894: Add portable multipart upload (#762)
Posted by Andrew Gaul <no...@github.com>.
> @@ -275,4 +285,20 @@ public String copyBlob(String fromContainer, String fromName, String toContainer
>           Closeables2.closeQuietly(is);
>        }
>     }
> +
> +   // TODO: parallel uploads
> +   @Beta
> +   protected String putMultipartBlob(String container, Blob blob, PutOptions overrides) {
> +      MultipartUpload mpu = initiateMultipartUpload(container, blob.getMetadata());
> +      List<MultipartPart> parts = Lists.newArrayList();
> +      long contentLength = blob.getMetadata().getContentMetadata().getContentLength();
> +      long partSize = getMaximumMultipartPartSize();  // TODO: optimal?
We need a better strategy here -- we should pick a combination of minimum part size, maximum part size, and number of parts. A good combination retransmits less data after a network error and makes better use of the uplink via parallel uploads.
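One way to read this, as a rough sketch rather than what the PR implements: start from the smallest part size the provider allows, so that a failed part costs as little as possible to retry, grow it only as far as the part-count limit forces, and reject blobs that would need parts larger than the maximum. The class `PartPlanSketch`, its fields, and the limits used in `main` are assumptions for illustration only.

```java
public final class PartPlanSketch {
    final long partSize;
    final long partCount;

    private PartPlanSketch(long partSize, long partCount) {
        this.partSize = partSize;
        this.partCount = partCount;
    }

    private static long ceilDiv(long a, long b) {
        return a / b + (a % b == 0 ? 0 : 1);
    }

    // Hypothetical planner: prefers small parts, respects all three limits.
    static PartPlanSketch plan(long contentLength, long minPartSize, long maxPartSize, long maxParts) {
        if (contentLength < 0 || minPartSize <= 0 || maxPartSize < minPartSize || maxParts <= 0) {
            throw new IllegalArgumentException("invalid sizes or limits");
        }
        // Smallest part size that still fits within maxParts.
        long partSize = Math.max(minPartSize, ceilDiv(contentLength, maxParts));
        if (partSize > maxPartSize) {
            throw new IllegalArgumentException("blob too large for the given limits");
        }
        // Even an empty blob is planned as one (empty) part in this sketch.
        long partCount = Math.max(1, ceilDiv(contentLength, partSize));
        return new PartPlanSketch(partSize, partCount);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        PartPlanSketch p = plan(100 * mib, 5 * mib, 5 * 1024 * mib, 10000);
        // prints "5242880 bytes x 20 parts"
        System.out.println(p.partSize + " bytes x " + p.partCount + " parts");
    }
}
```

Smaller parts also give a parallel uploader more independent units to schedule, at the cost of more requests per blob.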
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762/files#r31492595
Re: [jclouds] JCLOUDS-894: Add portable multipart upload (#762)
Posted by Andrew Gaul <no...@github.com>.
@danbroudy @kahing @zack-shoylev This pull request follows on from the earlier one, which exposed the component multipart operations.
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762#issuecomment-107800525
Re: [jclouds] JCLOUDS-894: Add portable multipart upload (#762)
Posted by Timur Alperovich <no...@github.com>.
> @@ -275,4 +285,20 @@ public String copyBlob(String fromContainer, String fromName, String toContainer
>           Closeables2.closeQuietly(is);
>        }
>     }
> +
> +   // TODO: parallel uploads
> +   @Beta
> +   protected String putMultipartBlob(String container, Blob blob, PutOptions overrides) {
> +      MultipartUpload mpu = initiateMultipartUpload(container, blob.getMetadata());
> +      List<MultipartPart> parts = Lists.newArrayList();
> +      long contentLength = blob.getMetadata().getContentMetadata().getContentLength();
> +      long partSize = getMaximumMultipartPartSize();  // TODO: optimal?
AWS Java S3 SDK does the following:
    public static long calculateOptimalPartSize(PutObjectRequest putObjectRequest, TransferManagerConfiguration configuration) {
        double contentLength = TransferManagerUtils.getContentLength(putObjectRequest);
        double optimalPartSize = (double)contentLength / (double)MAXIMUM_UPLOAD_PARTS;
        // round up so we don't push the upload over the maximum number of parts
        optimalPartSize = Math.ceil(optimalPartSize);
        return (long)Math.max(optimalPartSize, configuration.getMinimumUploadPartSize());
    }
The AWS SDK defaults to a maximum of 10000 parts, and the default minimum part size is 5MB. So, uploading a 51GB file, for example, would use 10000 parts of roughly 5.1MB each, since the size is rounded up to the next byte rather than the next megabyte.
jclouds could use a similar mechanism. It would probably make sense to expose the configuration parameters so that the default behavior can be changed.
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762/files#r31677839
Re: [jclouds] JCLOUDS-894: Add portable multipart upload (#762)
Posted by Andrew Gaul <no...@github.com>.
> @@ -275,4 +285,20 @@ public String copyBlob(String fromContainer, String fromName, String toContainer
>           Closeables2.closeQuietly(is);
>        }
>     }
> +
> +   // TODO: parallel uploads
> +   @Beta
> +   protected String putMultipartBlob(String container, Blob blob, PutOptions overrides) {
> +      MultipartUpload mpu = initiateMultipartUpload(container, blob.getMetadata());
> +      List<MultipartPart> parts = Lists.newArrayList();
> +      long contentLength = blob.getMetadata().getContentMetadata().getContentLength();
> +      long partSize = getMaximumMultipartPartSize();  // TODO: optimal?
I reparented the S3 `MultipartUploadSlicingAlgorithm` to core so we have the same algorithm as before.
---
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/762/files#r31872080