Posted to dev@hbase.apache.org by John Roberts <jo...@yahoo.com> on 2009/12/15 03:49:50 UTC

HBase: using S3 for storage

I'm running HBase version 0.20.2 and am trying to get my HBase server
to use S3 for storage instead of the local file system.  I tried
following the instructions here but could not get it to work:

http://developer.amazonwebservices.com/connect/thread.jspa?messageID=139683

My HBase version does not ship with a hadoop-site.xml file, so I created one in the conf directory with the following properties:

<configuration> 
<property> 
  <name>fs.default.name</name> 
  <value>s3://hbase</value> 
</property> 

<property> 
  <name>fs.s3.awsAccessKeyId</name> 
  <value>id</value> 
</property> 

<property> 
  <name>fs.s3.awsSecretAccessKey</name> 
  <value>secret</value> 
</property> 

</configuration> 

I also updated the hbase.rootdir property with the S3 URL, as per the reference above.  When I ran the hbase shell and tried to put a value into a table, I got a deep stack trace with no mention of S3.
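For context, the hbase.rootdir change I made in conf/hbase-site.xml follows the same pattern as the reference (the bucket name below is a placeholder, not my real one):

```xml
<property>
  <name>hbase.rootdir</name>
  <!-- placeholder bucket name; substitute your own -->
  <value>s3://mybucket/hbase</value>
</property>
```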

Has anyone gotten HBase to use S3?  If so, could you send me the config changes you made to get it working?  Thanks!

John



      

Re: HBase: using S3 for storage

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I wrote an email 1-2 months ago to the hbase-user mailing list that
summarizes the pros and cons of using S3 instead of HDFS. The title is
"On storing HBase data in AWS S3".

J-D


Re: HBase: using S3 for storage

Posted by Andrew Purtell <ap...@apache.org>.
Hi John,

Thanks for writing us with the updates. I'm glad you noticed the errors about missing classes, etc., because the pastebin only showed that the master did not start, so we would have needed to dig deeper into the logs (time consuming).

Please let me know about your experiences using S3 to back the HBase rootdir. We haven't looked into this because we were not aware of any users actively trying this out. 

    - Andy





Re: HBase: using S3 for storage

Posted by John Roberts <jo...@yahoo.com>.
Never mind - I set my hbase.rootdir to s3://net.montrix.test.s3.amazonaws.com:80/ and it worked and I can see files being written to my net.montrix.test.s3.amazonaws.com bucket in S3.   
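For the record, as an hbase-site.xml property, the setting that worked is:

```xml
<property>
  <name>hbase.rootdir</name>
  <value>s3://net.montrix.test.s3.amazonaws.com:80/</value>
</property>
```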

- John





Re: HBase: using S3 for storage

Posted by John Roberts <jo...@yahoo.com>.
There were additional error messages in my master log file indicating that I was missing some jars.  I downloaded jets3t-0.7.1.jar and commons-codec-1.4.jar and set the JETS3T_HOME variable in my hbase-env.sh file.  That got me to the point where it is now trying to use S3.  Now I get the errors below in my master log file.  At this point the only question seems to be exactly what to set my hbase.rootdir property to.  My S3 account has buckets "net.montrix.test" as well as "net.montrix.test.s3.amazonaws.com".  I tried setting my hbase.rootdir value to this:

s3://net.montrix.test.s3.amazonaws.com:80/tmp/hbase-jroberts/hbase

The location of my HBase root dir on my local file system is /tmp/hbase-jroberts/hbase.  That resulted in the error below.  So either my hbase.rootdir value is wrong, or perhaps the fs.default.name property in my hadoop-site.xml is wrong?  I have it set to s3://hbase.

John

2009-12-15 06:18:52,917 INFO org.apache.hadoop.hbase.master.HMaster: My address is localhost.localdomain:60000
2009-12-15 06:18:54,696 ERROR org.apache.hadoop.hbase.master.HMaster: Can not start master
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/%2Ftmp%2Fhbase-jroberts%2Fhbase' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>net.montrix.test.s3.amazonaws.com</BucketName><RequestId>E7E72017C69AB6DF</RequestId><HostId>LHSezOrfx3LrWI+IWQ1Icbz0/FRndFDsyQWIn3Oaru1ui6JXfq9Zfz1tgfUET7TG</HostId></Error>
        at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:156)
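For reference, the jar/JETS3T_HOME setup mentioned at the top of this message was roughly the following in conf/hbase-env.sh (the install paths here are examples, not my exact ones):

```sh
# example locations; adjust to wherever the jars were unpacked
export JETS3T_HOME=/opt/jets3t-0.7.1
export HBASE_CLASSPATH=$JETS3T_HOME/jars/jets3t-0.7.1.jar:/opt/commons-codec-1.4.jar
```

My understanding is that the authority part of an s3:// URI is treated as the bucket name, which matches the BucketName reported in the error above.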






Re: HBase: using S3 for storage

Posted by John Roberts <jo...@yahoo.com>.
The stack trace is here: http://pastebin.ca/1715521

I set my hbase.rootdir value to the following:

s3://net.montrix.test.s3.amazonaws.com:80/

Note that the net.montrix.test bucket exists in my S3 account.  Thanks for looking at this.

John





Re: HBase: using S3 for storage

Posted by Andrew Purtell <ap...@apache.org>.
Hi John,

Can you pastebin that stack trace?

   - Andy



