Posted to user@jclouds.apache.org by jo...@gmail.com on 2018/04/11 20:20:02 UTC

putBlob fails to send proper content when using payload(InputStream)

I've got a sample program below that uploads a payload via an InputStream (the ByteBufferBackedInputStream shown below; I've also tried it with a FileInputStream). I can't use a ByteSource because it doesn't support sourcing from direct memory buffers.

```
import static org.jclouds.Constants.PROPERTY_STRIP_EXPECT_HEADER;
import static org.jclouds.s3.reference.S3Constants.PROPERTY_S3_SERVICE_PATH;
import static org.jclouds.s3.reference.S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS;

import java.io.InputStream;
import java.net.URI;
import java.nio.ByteBuffer;
import java.time.Instant;
import java.util.Properties;
import java.util.Random;

import javax.annotation.Nonnull;

import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStore;
import org.jclouds.blobstore.BlobStoreContext;
import org.jclouds.blobstore.domain.Blob;

import com.google.common.hash.HashCode;
import com.google.common.hash.Hashing;

public class App
{
    public static void main( String[] args ) throws Exception
    {
        Properties overrides = new Properties();
        overrides.setProperty(PROPERTY_S3_VIRTUAL_HOST_BUCKETS, "false");
        overrides.setProperty(PROPERTY_STRIP_EXPECT_HEADER, "true");

        String endpoint = "http://localhost:9444/s3";
        if (endpoint != null) {
            URI uri = URI.create(endpoint);
            String path = uri.getPath();
            if (path != null && !path.isEmpty()) {
                overrides.setProperty(PROPERTY_S3_SERVICE_PATH, path);          // <-- "/s3" required
            }
        }
        BlobStoreContext context = ContextBuilder.newBuilder("aws-s3")
                .endpoint(endpoint)                                             // <-- "/s3" required
                .credentials("key", "secret")
                .overrides(overrides)
                .buildView(BlobStoreContext.class);

        BlobStore blobStore = context.getBlobStore();

        ByteBuffer content = ByteBuffer.allocate(1024 * 1024);
        new Random(Instant.now().getEpochSecond()).nextBytes(content.array());

        ByteBuffer tmp = content.duplicate();
        HashCode md5 = Hashing.md5().hashBytes(tmp);

        try (ByteBufferBackedInputStream is = new ByteBufferBackedInputStream(content)) {
            Blob blob = blobStore.blobBuilder("myblob")
                        .payload(is)
//                        .payload(content.array())
                        .contentLength(content.remaining())
                        .contentMD5(md5)
                        .build();
            blobStore.putBlob("ninja-autod71326c6-b718-46d2-bbd9-5c259ddd3bba", blob);
        }
        context.close();
    }

    /** Minimal read-only InputStream view over a ByteBuffer; reads advance the buffer's position. */
    private static class ByteBufferBackedInputStream extends InputStream {

        private final ByteBuffer buf;

        public ByteBufferBackedInputStream(ByteBuffer buf) {
            this.buf = buf;
        }

        @Override
        public int read() {
            if (!buf.hasRemaining()) {
                return -1;
            }
            return buf.get() & 0xFF;
        }

        @Override
        public int read(@Nonnull byte[] bytes, int off, int len) {
            if (!buf.hasRemaining()) {
                return -1;
            }
            int lenToUse = Math.min(len, buf.remaining());
            buf.get(bytes, off, lenToUse);
            return lenToUse;
        }
    }
}
```

The error returned by the s3ninja server is 400 Bad Request, with the reason being that the MD5 checksums do not match (the one I calculated with Hashing.md5() vs. the one the server calculated).

I've tried the same request with the same content using payload(byte[]) and it works: comment out the payload(is) line, uncomment the payload(content.array()) line, and re-run.
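
In other words, the working variant is nothing more than the commented-out line from the sample above:

```
            Blob blob = blobStore.blobBuilder("myblob")
                        .payload(content.array())             // byte[] payload - works
//                        .payload(is)                        // InputStream payload - fails
                        .contentLength(content.remaining())
                        .contentMD5(md5)
                        .build();
```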

```
Exception in thread "main" org.jclouds.http.HttpResponseException: command: PUT http://172.29.83.131:9444/s3/ninja-autod71326c6-b718-46d2-bbd9-5c259ddd3bba/byHashV1/0001000000000040000020DB64B417264C5CE81AC0F85D855F6088D717463C5C7CC7574AA19CAD7B45C1BC HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [
<html>
<head>
<title>Error - 400</title>
</head>
<body>
<h1>Bad Request</h1>
<hr />
<pre>Invalid MD5 checksum (Input: PaLqlfshFBMNUqyw+NT54A==, Expected: 4HWv1+2bHosUPzq8OUeMmQ==)</pre>
<hr />
<p>
..
</body>
</html>]
	at org.jclouds.aws.handlers.ParseAWSErrorFromXmlContent.handleError(ParseAWSErrorFromXmlContent.java:82)
	at org.jclouds.http.handlers.DelegatingErrorHandler.handleError(DelegatingErrorHandler.java:65)
	at org.jclouds.http.internal.BaseHttpCommandExecutorService.shouldContinue(BaseHttpCommandExecutorService.java:138)
	at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:107)
	at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
	at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
	at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
	at org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
	at org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
	at com.sun.proxy.$Proxy48.putObject(Unknown Source)
	at org.jclouds.s3.blobstore.S3BlobStore.putBlob(S3BlobStore.java:271)
	at org.jclouds.aws.s3.blobstore.AWSS3BlobStore.putBlob(AWSS3BlobStore.java:85)
	at org.jclouds.s3.blobstore.S3BlobStore.putBlob(S3BlobStore.java:248)
	at test.App.main(App.java:98)

Process finished with exit code 1
```

Thanks in advance,
John Calcote

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by Andrew Gaul <ga...@apache.org>.
On Fri, Apr 13, 2018 at 09:29:19PM -0000, john.calcote@gmail.com wrote:
> Just tried out s3proxy - it's exactly what we're looking for. I tried setting it up to use aws-v4 authorization against a filesystem backend like this:
>
> $ cat s3proxy.conf 
> s3proxy.authorization=aws-v4
> s3proxy.endpoint=http://0.0.0.0:8080
> s3proxy.identity=identity
> s3proxy.credential=secret
> jclouds.provider=filesystem
> jclouds.filesystem.basedir=/tmp/s3proxy
> 
> It seemed to work, but s3proxy can't find the container (test-bucket) when I do a PUT that works against Amazon. This came back from the jclouds client when configured to use the aws-s3 provider (I added an /etc/hosts entry for the vhost-buckets issue, which worked like a charm):

S3Proxy interpreted your PUT object operation as a PUT bucket operation;
I suspect that you need to set s3proxy.virtual-host as documented here:

https://github.com/gaul/s3proxy/blob/master/src/main/java/org/gaul/s3proxy/S3ProxyConstants.java#L50
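
For example, something like this in s3proxy.conf (the host value here is just a guess
based on the endpoint in your trace, test-bucket.jmc-dev:8080):

s3proxy.virtual-host=jmc-dev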

If this does not help, could you follow up with a GitHub issue at
https://github.com/gaul/s3proxy since this does not relate to the
jclouds backend?

-- 
Andrew Gaul
http://gaul.org/

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by jo...@gmail.com.
> > I notice you filed a related bug against s3ninja recently -- you may
> > want to try S3Proxy[1] instead which has a more complete implementation
> > and actually uses jclouds as its backend.

Hi Andrew,

Just tried out s3proxy - it's exactly what we're looking for. I tried setting it up to use aws-v4 authorization against a filesystem backend like this:

$ cat s3proxy.conf 
s3proxy.authorization=aws-v4
s3proxy.endpoint=http://0.0.0.0:8080
s3proxy.identity=identity
s3proxy.credential=secret
jclouds.provider=filesystem
jclouds.filesystem.basedir=/tmp/s3proxy

It seemed to work, but s3proxy can't find the container (test-bucket) when I do a PUT that works against Amazon. This came back from the jclouds client when configured to use the aws-s3 provider (I added an /etc/hosts entry for the vhost-buckets issue, which worked like a charm):

Exception in thread "main" org.jclouds.http.HttpResponseException: command: PUT http://test-bucket.jmc-dev:8080/myblob HTTP/1.1 failed with response: HTTP/1.1 500 Unexpected character 'N' (code 78) in prolog; expected '<'? at [row,col {unknown-source}]: [1,1]; content: [<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body>
<h2>HTTP ERROR: 500</h2>
<p>Problem accessing /myblob. Reason:
<pre>    Unexpected character &apos;N&apos; (code 78) in prolog; expected &apos;&lt;&apos;
 at [row,col {unknown-source}]: [1,1]</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>

This came out on the s3proxy console:

com.fasterxml.jackson.core.JsonParseException: Unexpected character 'N' (code 78) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]
	at com.fasterxml.jackson.dataformat.xml.util.StaxUtil.throwAsParseException(StaxUtil.java:37)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._initializeXmlReader(XmlFactory.java:657)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:536)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:29)
	at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:820)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3058)
	at org.gaul.s3proxy.S3ProxyHandler.handleContainerCreate(S3ProxyHandler.java:1207)
	at org.gaul.s3proxy.S3ProxyHandler.doHandle(S3ProxyHandler.java:707)
	at org.gaul.s3proxy.S3ProxyHandlerJetty.handle(S3ProxyHandlerJetty.java:70)
	at org.gaul.shaded.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.gaul.shaded.org.eclipse.jetty.server.Server.handle(Server.java:499)
	at org.gaul.shaded.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
	at org.gaul.shaded.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:258)
	at org.gaul.shaded.org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
	at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.gaul.shaded.org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character 'N' (code 78) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]
	at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:653)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2133)
	at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1181)
	at com.fasterxml.jackson.dataformat.xml.XmlFactory._initializeXmlReader(XmlFactory.java:653)
	... 15 common frames omitted

The 'N' that it says it's seeing could be coming from my PUT data: "Now is the time for all good men..."

Any thoughts?

John

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by John Calcote <jo...@gmail.com>.
Thanks Andrew. Very helpful information. I'll definitely look at s3proxy.

John

On Fri, Apr 13, 2018, 12:06 AM Andrew Gaul <ga...@apache.org> wrote:

> On Thu, Apr 12, 2018 at 01:41:38PM -0000, john.calcote@gmail.com wrote:
> > Additional information on this issue: I've discovered by virtue of a
> > wireshark session that jclouds client is NOT sending chunked
> > transfer-encoding, but rather aws-chunked content-encoding. Can anyone tell
> > me why this is necessary, since A) it accomplishes the same thing that
> > chunked transfer-encoding does (except that it's not compatible with most
> > web servers' built-in ability to handle chunked encoding) and B) we're
> > sending the content-length header?
>
> aws-s3 uses V4 signing while s3 uses V2 signing.  V4 uses a chunked
> encoding to sign the payload as well as the headers while V2 signs only
> the headers.  V4 uses the AWS encoding because of the signatures it
> attaches.  I believe you can Guice override the signer type in s3 to get
> the same behavior as aws-s3.  If you are using a local S3 clone and not
> AWS itself you really should use the s3 provider since aws-s3 just
> overrides endpoints and regions.
>
> I notice you filed a related bug against s3ninja recently -- you may
> want to try S3Proxy[1] instead which has a more complete implementation
> and actually uses jclouds as its backend.
>
> [1] https://github.com/gaul/s3proxy
>
> --
> Andrew Gaul
> http://gaul.org/
>

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by Andrew Gaul <ga...@apache.org>.
On Thu, Apr 12, 2018 at 01:41:38PM -0000, john.calcote@gmail.com wrote:
> Additional information on this issue: I've discovered by virtue of a wireshark session that jclouds client is NOT sending chunked transfer-encoding, but rather aws-chunked content-encoding. Can anyone tell me why this is necessary, since A) it accomplishes the same thing that chunked transfer-encoding does (except that it's not compatible with most web servers' built-in ability to handle chunked encoding) and B) we're sending the content-length header?

aws-s3 uses V4 signing while s3 uses V2 signing.  V4 uses a chunked
encoding to sign the payload as well as the headers while V2 signs only
the headers.  V4 uses the AWS encoding because of the signatures it
attaches.  I believe you can Guice override the signer type in s3 to get
the same behavior as aws-s3.  If you are using a local S3 clone and not
AWS itself you really should use the s3 provider since aws-s3 just
overrides endpoints and regions.
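
A minimal sketch of that last suggestion, reusing the endpoint, credentials, and
overrides from your earlier sample (nothing here is specific to S3Proxy or s3ninja):

```
        BlobStoreContext context = ContextBuilder.newBuilder("s3")   // generic s3 api, V2 signing
                .endpoint("http://localhost:9444/s3")
                .credentials("key", "secret")
                .overrides(overrides)
                .buildView(BlobStoreContext.class);
```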

I notice you filed a related bug against s3ninja recently -- you may
want to try S3Proxy[1] instead which has a more complete implementation
and actually uses jclouds as its backend.

[1] https://github.com/gaul/s3proxy

-- 
Andrew Gaul
http://gaul.org/

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by jo...@gmail.com.
Additional information on this issue: I've discovered by virtue of a wireshark session that jclouds client is NOT sending chunked transfer-encoding, but rather aws-chunked content-encoding. Can anyone tell me why this is necessary, since A) it accomplishes the same thing that chunked transfer-encoding does (except that it's not compatible with most web servers' built-in ability to handle chunked encoding) and B) we're sending the content-length header?
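
For reference, the aws-chunked body format (per the AWS SigV4 streaming upload docs) looks
roughly like this, with sizes and signatures elided:

```
Content-Encoding: aws-chunked
x-amz-decoded-content-length: <original content length>
x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD

<chunk size in hex>;chunk-signature=<signature>\r\n
<chunk data>\r\n
...
0;chunk-signature=<signature>\r\n
\r\n
```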

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by jo...@gmail.com.

On 2018/04/11 20:28:55, john.calcote@gmail.com <jo...@gmail.com> wrote: 
> I just found out that if I use the "s3" provider type rather than the "aws-s3" provider, it works. I set a breakpoint in the read(byte[], ...) method of my ByteBufferBackedInputStream and I can see that the difference appears to be that the "s3" provider does a straight upload, while the "aws-s3" provider uses a chunked form of upload.
> 

We always know our content-length in advance. Is there a way to disable chunked encoding when using aws-s3?

Re: putBlob fails to send proper content when using payload(InputStream)

Posted by jo...@gmail.com.
I just found out that if I use the "s3" provider type rather than the "aws-s3" provider, it works. I set a breakpoint in the read(byte[], ...) method of my ByteBufferBackedInputStream and I can see that the difference appears to be that the "s3" provider does a straight upload, while the "aws-s3" provider uses a chunked form of upload.