Posted to hdfs-user@hadoop.apache.org by Denis Haskin <de...@haskinferguson.net> on 2010/06/10 20:15:11 UTC

Using S3 native filesystem? (s3n://)

I'm having trouble getting the S3 native filesystem working, trying to use
it from my local desktop.  Following http://wiki.apache.org/hadoop/AmazonS3,
I have my hdfs-site.xml as:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>s3n://dwh-hdfs-test</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>..access key...</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>...secret access key, with / encoded as %2F..</value>
  </property>
</configuration>

"bin/hadoop namenode -format" works; I end up with this structure (as seen
by s3cmd):

2010-06-10 13:11         4   s3://dwh-hdfs-test/tmp/hadoop-dhaskin/mapred/system/jobtracker.info
2010-06-10 13:11         0   s3://dwh-hdfs-test/tmp/hadoop-dhaskin/mapred/system_$folder$
2010-06-10 13:10         0   s3://dwh-hdfs-test/tmp/hadoop-dhaskin/mapred_$folder$
2010-06-10 13:10         0   s3://dwh-hdfs-test/tmp/hadoop-dhaskin_$folder$
2010-06-10 13:10         0   s3://dwh-hdfs-test/tmp_$folder$

but when I try something like "bin/hadoop fs -ls" I get various errors.
Part of the problem is that it's unclear what format the path parameter
should take. See what I tried, and the resulting errors, below at [1].

Also, when I start the jobtracker, it fails with:
2010-06-10 13:34:08,521 FATAL org.apache.hadoop.mapred.JobTracker:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException:
S3 HEAD request failed for '/tmp%2Fhadoop-dhaskin%2Fmapred%2Fsystem' -
ResponseCode=403, ResponseMessage=Forbidden

Suggestions?  Thanks...

dwh


[1] bin/hadoop fs -ls -- various attempts:

dwhsix:hadoop-0.20.2 dhaskin$ bin/hadoop fs -ls /
ls: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error
Message: <?xml version="1.0"
encoding="UTF-8"?><Error><Code>SignatureDoesNotMatch</Code><Message>The
request signature we calculated does not match the signature you provided.
Check your key and signing method.</Message><StringToSignBytes>47 45 54 0a
0a 0a 54 68 75 2c 20 31 30 20 4a 75 6e 20 32 30 31 30 20 31 37 3a 33 30 3a
34 30 20 47 4d 54 0a 2f 64 77 68 2d 68 64 66 73 2d 74 65 73 74
2f</StringToSignBytes><RequestId>098E7E853DB93FD9</RequestId><HostId>Y8w3ccqhj4VvWY6Ma17o5HF+8cWK3r1kiIywxAsrwSsuR2DqFpxx4+2+9Xgnn+9i</HostId><SignatureProvided>w8vS4vJjm5MpkLpDbnuVcByruxw=</SignatureProvided><StringToSign>GETThu,
10 Jun 2010 17:30:40
GMT/dwh-hdfs-test/</StringToSign><AWSAccessKeyId>AKIAI5UTNXFARAYFZAXQ</AWSAccessKeyId></Error>

dwhsix:hadoop-0.20.2 dhaskin$ bin/hadoop fs -ls s3n://dwh-hdfs-test
ls: Path must be absolute: s3n://dwh-hdfs-test
Usage: java FsShell [-ls <path>]

dwhsix:hadoop-0.20.2 dhaskin$ bin/hadoop fs -ls s3n://dwh-hdfs-test/
ls: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error
Message: <?xml version="1.0"
encoding="UTF-8"?><Error><Code>SignatureDoesNotMatch</Code><Message>The
request signature we calculated does not match the signature you provided.
Check your key and signing method.</Message><StringToSignBytes>47 45 54 0a
0a 0a 54 68 75 2c 20 31 30 20 4a 75 6e 20 32 30 31 30 20 31 37 3a 33 30 3a
35 36 20 47 4d 54 0a 2f 64 77 68 2d 68 64 66 73 2d 74 65 73 74
2f</StringToSignBytes><RequestId>2F7004189009A56A</RequestId><HostId>IQzsA5849ZnBaqGuAsFvTIt78u9oDRaBvrY5Xwg5exf85H+7/aAejxK33QPLXCue</HostId><SignatureProvided>ryv+zgWT3YXPbqPcMNr0F4dcWKM=</SignatureProvided><StringToSign>GETThu,
10 Jun 2010 17:30:56
GMT/dwh-hdfs-test/</StringToSign><AWSAccessKeyId>AKIAI5UTNXFARAYFZAXQ</AWSAccessKeyId></Error>

dwhsix:hadoop-0.20.2 dhaskin$ bin/hadoop fs -ls s3n://dwh-hdfs-test/tmp
ls: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/tmp'
- ResponseCode=403, ResponseMessage=Forbidden

dwhsix:hadoop-0.20.2 dhaskin$ bin/hadoop fs -ls /tmp
ls: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/tmp'
- ResponseCode=403, ResponseMessage=Forbidden

Re: Using S3 native filesystem? (s3n://)

Posted by Denis Haskin <de...@haskinferguson.net>.
Is the doc at http://wiki.apache.org/hadoop/AmazonS3 wrong, then?
"Put the following in conf/hadoop-site.xml to set the default
filesystem to be the S3 block filesystem"

dwh




Re: Using S3 native filesystem? (s3n://)

Posted by Owen O'Malley <om...@apache.org>.
On Jun 15, 2010, at 8:51 AM, Denis Haskin wrote:

> Actually, it still seems that "hadoop fs" is incorrectly using
> core-site.xml instead of hdfs-site.xml.  If I put my fs.default.name
> etc. only in hdfs-site.xml, "hadoop fs" doesn't find it, but it does
> if it's in core-site.xml.
>
> I assume this is a bug?

Actually, no. The default file system (fs.default.name) can be any
filesystem, including s3n. Therefore the configuration for it lives in
common (core-site.xml), rather than hdfs (hdfs-site.xml).
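
As a rough sketch (in 0.20 the old hadoop-site.xml was split into
core-site.xml, hdfs-site.xml, and mapred-site.xml, so the wiki's advice
now maps to core-site.xml), reusing the bucket and property names from
earlier in this thread, core-site.xml would look something like:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3n://dwh-hdfs-test</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>...access key...</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>...secret access key...</value>
  </property>
</configuration>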

-- Owen

Re: Using S3 native filesystem? (s3n://)

Posted by Denis Haskin <de...@haskinferguson.net>.
Actually, it still seems that "hadoop fs" is incorrectly using
core-site.xml instead of hdfs-site.xml.  If I put my fs.default.name
etc. only in hdfs-site.xml, "hadoop fs" doesn't find it, but it does
if it's in core-site.xml.

I assume this is a bug?

dwh




Re: Using S3 native filesystem? (s3n://)

Posted by Denis Haskin <de...@haskinferguson.net>.
Resolved this; http://wiki.apache.org/hadoop/AmazonS3 is a little unclear.

The page says "Note that since the secret access key can contain
slashes, you must remember to escape them by replacing each slash /
with the string %2F."

This appears to apply ONLY when the secret access key is embedded in
the fs.default.name, not when it's specified separately via
fs.s3n.awsSecretAccessKey.  I haven't fully tested that, but once I
un-escaped it (in the awsSecretAccessKey), it works.
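
In other words, as best I can tell: the %2F escaping belongs only in the
URI form, where the key is embedded in the filesystem URI itself, e.g.

  s3n://ACCESS_KEY:SECRET%2FWITH%2FSLASHES%2FESCAPED@dwh-hdfs-test

(ACCESS_KEY and SECRET being placeholders), while the standalone
property takes the key verbatim:

<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>...secret access key, with / left as-is...</value>
</property>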

Also, I don't know why I was trying to do a "namenode -format" in this
case; it's not necessary.
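
For reference, once the key is right, listing works with an absolute
path (either against the bucket URI or relative to the default fs), e.g.:

  bin/hadoop fs -ls s3n://dwh-hdfs-test/
  bin/hadoop fs -ls /tmp

A bare "s3n://dwh-hdfs-test" without the trailing slash still fails with
"Path must be absolute", as in my original attempts.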

dwh


