Posted to notifications@libcloud.apache.org by GitBox <gi...@apache.org> on 2020/01/03 21:31:37 UTC

[GitHub] [libcloud] c-w commented on issue #1399: Upload large file to Azure Blobs

URL: https://github.com/apache/libcloud/issues/1399#issuecomment-570704353
 
 
   **TL;DR**
   The maximum file size currently supported by the Azure Storage driver is 256 MB. Uploading larger files will require a code change in libcloud.
   
   **Details**
   The Azure Storage driver's implementation of [upload_object_via_stream](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L822-L841) delegates to [_put_object](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L945-L951), which calls through to the generic [_upload_object](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/base.py#L584-L592), which performs a single PUT request against the storage backend. Since [we're using Azure Storage API version 2016-05-31](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L180), the [Put Blob documentation](https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#remarks) states that the maximum blob size that can be uploaded in a single Put Blob request is 256 MB.
   
   To support files larger than 256 MB, the Azure Storage driver would have to implement chunked blob upload via [Put Block](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block) and [Put Block List](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block-list): upload the file in blocks, then commit the uploaded block IDs in a final request. The Azure Storage driver used to implement this chunked upload flow (e.g. see [24f34c9](https://github.com/apache/libcloud/blob/24f34c99c9440523a53e940a346bced551281953/libcloud/storage/drivers/azure_blobs.py#L732-L788)), but since [6e0040d](https://github.com/apache/libcloud/commit/6e0040d8904cacb5dbe88309e9051be08cdc59f9) it no longer supports chunked blob upload.
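To illustrate what the Put Block / Put Block List flow involves, here is a minimal sketch of the client-side bookkeeping only (the actual HTTP requests are omitted; names like `make_block_id` and the 4 MB block size are illustrative, not part of libcloud):

```python
import base64

# Azure's Put Block / Put Block List flow requires the client to split the
# payload into blocks, assign each block a base64 ID of uniform length, and
# finally commit the block IDs with an XML body. This sketch covers only
# that bookkeeping; the HTTP layer is intentionally left out.

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB per block (illustrative choice)


def make_block_id(index):
    # Azure requires all block IDs of a blob to be base64-encoded
    # strings of equal length, so pad the index to a fixed width.
    return base64.b64encode(b"block-%010d" % index).decode("ascii")


def split_into_blocks(data, block_size=BLOCK_SIZE):
    # Yield (block_id, chunk) pairs; each chunk would be sent
    # in its own Put Block request.
    return [
        (make_block_id(i), data[offset:offset + block_size])
        for i, offset in enumerate(range(0, len(data), block_size))
    ]


def build_block_list_xml(block_ids):
    # Body of the final Put Block List request that commits the blocks.
    latest = "".join("<Latest>%s</Latest>" % bid for bid in block_ids)
    return ('<?xml version="1.0" encoding="utf-8"?>'
            "<BlockList>%s</BlockList>" % latest)
```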
   
   I encountered this limitation in several other projects (e.g. https://github.com/ascoderu/opwen-cloudserver/issues/219), so I will try to find some time to work on a fix.
   
   **Work-around**
   If you need to upload large files to Azure Storage via libcloud before the fix mentioned above is implemented, I would suggest the following: the [libcloud S3 driver currently implements chunked upload](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/s3.py#L688-L694), so you could deploy [MinIO](https://github.com/minio/minio) as a [gateway for Azure Storage](https://docs.min.io/docs/minio-gateway-for-azure.html) and use the libcloud S3 driver to talk to the MinIO frontend, which in turn communicates efficiently with the Azure Storage backend. With MinIO [947bc8c](https://github.com/minio/minio/commit/947bc8c7d3b8ad98cdbb6ce0f8dea155df16aadf) or later, this approach works for all types of cloud-based Azure Storage accounts (e.g. Storage, StorageV2, BlobStorage) as well as Azurite and Azure IoT Edge Storage. Once chunked blob upload is implemented in libcloud, you should be able to drop the MinIO indirection and switch back to libcloud's Azure Storage driver with no additional code changes.
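A rough sketch of what the workaround could look like on the libcloud side, assuming a MinIO gateway is already running at `localhost:9000`; the credentials, container, and file names below are placeholders, and the helper functions are illustrative rather than part of libcloud:

```python
# Sketch: point libcloud's S3 driver at a local MinIO frontend so that
# large uploads go through the S3 driver's chunked upload path.
# Endpoint, keys, and names are placeholders for illustration only.


def read_chunks(fileobj, chunk_size):
    # Yield successive chunks from a file-like object until EOF.
    return iter(lambda: fileobj.read(chunk_size), b"")


def make_minio_driver(key, secret, host="localhost", port=9000):
    # Imported lazily so the sketch can be read without libcloud installed.
    from libcloud.storage.providers import get_driver
    from libcloud.storage.types import Provider

    cls = get_driver(Provider.S3)
    # host/port/secure are standard StorageDriver arguments used here to
    # target the MinIO endpoint instead of AWS.
    return cls(key=key, secret=secret, host=host, port=port, secure=False)


def upload_large_file(driver, container_name, object_name, path,
                      chunk_size=5 * 1024 * 1024):
    container = driver.get_container(container_name)
    with open(path, "rb") as f:
        # The S3 driver streams the iterator and performs chunked upload
        # under the hood; MinIO relays it to Azure Storage.
        return driver.upload_object_via_stream(
            iterator=read_chunks(f, chunk_size),
            container=container,
            object_name=object_name,
        )
```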
