You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@jclouds.apache.org by Timur Alperovich <no...@github.com> on 2018/10/24 00:42:24 UTC

[jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

S3 uses a different ETag for multipart uploads (MPUs) than regular
objects. The ETag consists of the md5 hash of the concatenated ETags of
individual parts followed by the number of parts (separated by "-").

The patch changes the LocalBlobStore's implementation of
CompleteMultipartUpload to set the S3-style ETag before calling
putBlob() and return that ETag to the caller.

To simplify testing, a new protected method with a default NOOP
implementation is added to the BaseBlobIntegrationTest. It allows
providers to further verify MPUs (i.e. ensuring the correct ETag has
been stored alongside the object).
You can view, comment on, or merge this pull request online at:

  https://github.com/jclouds/jclouds/pull/1251

-- Commit Summary --

  * JCLOUDS-1450: Use S3-style ETags for MPUs.

-- File Changes --

    M apis/filesystem/src/main/java/org/jclouds/filesystem/strategy/internal/FilesystemStorageStrategyImpl.java (50)
    M apis/filesystem/src/test/java/org/jclouds/filesystem/integration/FilesystemBlobIntegrationTest.java (19)
    M blobstore/src/main/java/org/jclouds/blobstore/config/LocalBlobStore.java (16)
    M blobstore/src/main/java/org/jclouds/blobstore/domain/BlobBuilder.java (8)
    M blobstore/src/main/java/org/jclouds/blobstore/domain/internal/BlobBuilderImpl.java (14)
    M blobstore/src/test/java/org/jclouds/blobstore/integration/internal/BaseBlobIntegrationTest.java (5)

-- Patch Links --

https://github.com/jclouds/jclouds/pull/1251.patch
https://github.com/jclouds/jclouds/pull/1251.diff

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



>        try {
          Files.createParentDirs(tmpFile);
-         his = new HashingInputStream(Hashing.md5(), payload.openStream());
-         long actualSize = Files.asByteSink(tmpFile).writeFrom(his);
+         if (isMpu) {
+            inputStream = payload.openStream();
+            eTag = blob.getMetadata().getETag().getBytes();
+         }
+         else {

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910267

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -488,23 +498,37 @@ public String putBlob(final String containerName, final Blob blob) throws IOExce
       String tmpBlobName = blobKey + "-" + UUID.randomUUID();
       File tmpFile = getFileForBlobKey(containerName, tmpBlobName);
       Path tmpPath = tmpFile.toPath();
-      HashingInputStream his = null;
+      boolean isMpu = false;
+      if (blob.getMetadata() != null && blob.getMetadata().getETag() != null)
+         isMpu = blob.getMetadata().getETag().matches(MPU_ETAG_FORMAT);
+      InputStream inputStream = null;
+      byte [] eTag = null;

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910332

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Andrew Gaul <no...@github.com>.
gaul commented on this pull request.



> @@ -878,7 +888,7 @@ public String completeMultipartUpload(MultipartUpload mpu, List<MultipartPart> p
 
       setBlobAccess(mpu.containerName(), mpu.blobName(), mpu.putOptions().getBlobAccess());
 
-      return eTag;
+      return mpuETag;

Sorry I missed that this was party of `completeMultipartUpload`.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227944512

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Andrew Gaul <no...@github.com>.
gaul requested changes on this pull request.

This seems reasonable.  CI may fail for unrelated reasons so please run these locally which I will also do before merging:

```
mvn verify -pl :filesystem
mvn integration-test -pl :filesystem -Plive
```

> @@ -115,6 +116,7 @@
    private static final String XATTR_USER_METADATA_PREFIX = "user.user-metadata.";
    private static final byte[] DIRECTORY_MD5 =
            Hashing.md5().hashBytes(new byte[0]).asBytes();
+   private static final String MPU_ETAG_FORMAT = "^\"[a-f0-9]{32}-\\d+\"$";

Prefer `format = Pattern.compile()` and `format.matcher().matches`.  Also are ^$ necessary since you use `match` instead of `find`?

> @@ -488,23 +498,37 @@ public String putBlob(final String containerName, final Blob blob) throws IOExce
       String tmpBlobName = blobKey + "-" + UUID.randomUUID();
       File tmpFile = getFileForBlobKey(containerName, tmpBlobName);
       Path tmpPath = tmpFile.toPath();
-      HashingInputStream his = null;
+      boolean isMpu = false;
+      if (blob.getMetadata() != null && blob.getMetadata().getETag() != null)
+         isMpu = blob.getMetadata().getETag().matches(MPU_ETAG_FORMAT);
+      InputStream inputStream = null;
+      byte [] eTag = null;

Remove space before `[]`.

>        try {
          Files.createParentDirs(tmpFile);
-         his = new HashingInputStream(Hashing.md5(), payload.openStream());
-         long actualSize = Files.asByteSink(tmpFile).writeFrom(his);
+         if (isMpu) {
+            inputStream = payload.openStream();
+            eTag = blob.getMetadata().getETag().getBytes();
+         }
+         else {

Put `} else {` on a single line.

>        }
+      StringBuilder mpuETagBuilder = new StringBuilder("\"");
+      mpuETagBuilder.append(Hashing.md5().hashString(partHashes.toString(), US_ASCII).toString());
+      mpuETagBuilder.append(String.format("-%d\"", parts.size()));
+      String mpuETag = mpuETagBuilder.toString();

Use the return values from `StringBuilder` and avoid the `String.format` call:

```java
String mpuETag = new StringBuilder("\"")
    .append(Hashing.md5().hashString(partHashes.toString(), US_ASCII).toString())
    .append("-")
    .append(parts.size())
    .toString();
```

Repeated above.

> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {
+                     hashCode = HashCode.fromBytes(buf.array());
+                     eTag = String.format("\"%s\"", hashCode.toString());

`String.format` will actually call `toString` for you, although this may read better:

```java
eTag = "\"" + hashCode + "\"";
```

> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {
+                     hashCode = HashCode.fromBytes(buf.array());
+                     eTag = String.format("\"%s\"", hashCode.toString());
+                  } else {
+                     eTag = new String(etagBytes);

Should this have a second `Charset` parameter?

> @@ -878,7 +888,7 @@ public String completeMultipartUpload(MultipartUpload mpu, List<MultipartPart> p
 
       setBlobAccess(mpu.containerName(), mpu.blobName(), mpu.putOptions().getBlobAccess());
 
-      return eTag;
+      return mpuETag;

Does this return an MPU ETag for single-part uploads?

> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {

Add comments:

```
// single-part upload
// multi-part upload
```

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#pullrequestreview-167697206

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {
+                     hashCode = HashCode.fromBytes(buf.array());
+                     eTag = String.format("\"%s\"", hashCode.toString());
+                  } else {
+                     eTag = new String(etagBytes);

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910171

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {
+                     hashCode = HashCode.fromBytes(buf.array());
+                     eTag = String.format("\"%s\"", hashCode.toString());

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910213

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Andrew Gaul <no...@github.com>.
gaul approved this pull request.





-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#pullrequestreview-168100745

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



>        }
+      StringBuilder mpuETagBuilder = new StringBuilder("\"");
+      mpuETagBuilder.append(Hashing.md5().hashString(partHashes.toString(), US_ASCII).toString());
+      mpuETagBuilder.append(String.format("-%d\"", parts.size()));
+      String mpuETag = mpuETagBuilder.toString();

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910241

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -373,7 +376,13 @@ public Blob getBlob(final String container, final String key) {
                if (attributes.contains(XATTR_CONTENT_MD5)) {
                   ByteBuffer buf = ByteBuffer.allocate(view.size(XATTR_CONTENT_MD5));
                   view.read(XATTR_CONTENT_MD5, buf);
-                  hashCode = HashCode.fromBytes(buf.array());
+                  byte [] etagBytes = buf.array();
+                  if (etagBytes.length == 16) {

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227923384

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -878,7 +888,7 @@ public String completeMultipartUpload(MultipartUpload mpu, List<MultipartPart> p
 
       setBlobAccess(mpu.containerName(), mpu.blobName(), mpu.putOptions().getBlobAccess());
 
-      return eTag;
+      return mpuETag;

This will return an MPU ETag for any multi-part uploads, even if they are composed of one part. This matches the AWS S3 behavior.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910056

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Andrew Gaul <no...@github.com>.
Closed #1251.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#event-1924564258

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Andrew Gaul <no...@github.com>.
Pushed to master as 539a9854c12514959264987d8a18e1cab40f2240.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#issuecomment-432817751

Re: [jclouds/jclouds] JCLOUDS-1450: Use S3-style ETags for MPUs. (#1251)

Posted by Timur Alperovich <no...@github.com>.
timuralp commented on this pull request.



> @@ -115,6 +116,7 @@
    private static final String XATTR_USER_METADATA_PREFIX = "user.user-metadata.";
    private static final byte[] DIRECTORY_MD5 =
            Hashing.md5().hashBytes(new byte[0]).asBytes();
+   private static final String MPU_ETAG_FORMAT = "^\"[a-f0-9]{32}-\\d+\"$";

Done

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1251#discussion_r227910358