Posted to notifications@libcloud.apache.org by pquentin <gi...@git.apache.org> on 2017/10/18 07:33:54 UTC
[GitHub] libcloud pull request #1135: storage: Fix hash computation performance on up...
GitHub user pquentin opened a pull request:
https://github.com/apache/libcloud/pull/1135
storage: Fix hash computation performance on upload
### Description
The storage base driver computes the hash of uploaded files; individual drivers use it to check that the reported hash is correct. Before libcloud 2.x, this was done efficiently: the file was read only once, and `hash.update()` was called on each chunk to avoid keeping the whole file in memory. With the switch to the requests module, both of these optimizations were inadvertently removed. It turns out the important one is `hash.update()`: computing the hash on the whole file in memory is orders of magnitude slower.
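
For illustration, the incremental approach described above can be sketched as follows. This is a minimal, standalone example and not the code from this pull request: the function name `hash_stream`, the chunk size, and the choice of SHA-256 are all assumptions made for the sketch.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=8192, algorithm="sha256"):
    """Hash a binary file-like object chunk by chunk.

    Hypothetical helper: the name, chunk size and default algorithm
    are illustrative assumptions, not libcloud's actual API.
    """
    hasher = hashlib.new(algorithm)
    # Read fixed-size chunks until read() returns b"" (end of stream),
    # feeding each chunk to the hash object as it arrives.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


# The incremental digest matches the digest computed on the whole payload.
payload = b"x" * 100_000
incremental = hash_stream(io.BytesIO(payload))
whole = hashlib.sha256(payload).hexdigest()
assert incremental == whole
```

Because each chunk is passed to `update()` as it is read, only one chunk needs to be in memory at a time, and the same loop could also forward the chunk to the upload so the file is read only once.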
### Status
- done, ready for review
### Checklist (tick everything that applies)
- [x] [Code linting](http://libcloud.readthedocs.org/en/latest/development.html#code-style-guide) (required, can be done after the PR checks)
- [ ] Documentation
- [x] [Tests](http://libcloud.readthedocs.org/en/latest/testing.html)
- [x] [ICLA](http://libcloud.readthedocs.org/en/latest/development.html#contributing-bigger-changes) (required for bigger changes)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/pquentin/libcloud upload-stream-hash
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/libcloud/pull/1135.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1135
----
commit 89ecd6e0346ce7cf4f6f2ec47814168f843cfa27
Author: Quentin Pradet <qu...@clustree.com>
Date: 2017-10-18T07:15:22Z
storage: Fix hash computation performance on upload
The storage base driver computes the hash of uploaded files: individual
drivers use it to ensure the reported hash is correct. Before libcloud
2.x, this was done efficiently: the file was only read once, and we were
using `hash.update()` to avoid keeping the whole file in memory. With
the switch to the requests module, both of these optimizations were
removed inadvertently. It turns out the important one is using
`hash.update()`: computing the hash on the whole file in memory is
orders of magnitude slower.
----
---
[GitHub] libcloud pull request #1135: storage: Fix hash computation performance on up...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/libcloud/pull/1135
---