Posted to user@hbase.apache.org by Michael Dagaev <mi...@gmail.com> on 2009/02/04 17:21:08 UTC

Backup Again

Hi, all

I read HBASE-974 and HBASE-643, which were mentioned on the list,
but what do you think about copying tables from the production
cluster to a backup HBase cluster? I guess we do need big iron
for such a backup cluster.

I understand that the copy can be implemented with MapReduce (MR),
but for now we can implement it as a simple sequential script
that scans the tables of the production HBase and writes the data
to the backup HBase.

Does it make sense?

Thank you for your cooperation,
M.
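
A minimal sketch of the sequential copy described above. It does not use
the real HBase client API: InMemoryCluster, list_tables, scan and put are
hypothetical stand-ins, chosen only to show the scan-then-write loop. A
real script would replace them with calls to whichever HBase client is in
use.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, bytes]      # column name -> cell value
Table = Dict[bytes, Row]    # row key -> row


class InMemoryCluster:
    """Hypothetical stand-in for an HBase client; not the real HBase API."""

    def __init__(self) -> None:
        self.tables: Dict[str, Table] = {}

    def list_tables(self) -> List[str]:
        return sorted(self.tables)

    def scan(self, table: str) -> Iterator[Tuple[bytes, Row]]:
        # Yield rows in row-key order, the order a table scan would return.
        for key in sorted(self.tables.get(table, {})):
            yield key, dict(self.tables[table][key])

    def put(self, table: str, key: bytes, row: Row) -> None:
        # Create the table and row on demand, then merge in the cells.
        self.tables.setdefault(table, {}).setdefault(key, {}).update(row)


def copy_tables(source: InMemoryCluster, backup: InMemoryCluster) -> int:
    """Copy every row of every table, one at a time; return rows copied."""
    copied = 0
    for table in source.list_tables():
        for key, row in source.scan(table):
            backup.put(table, key, row)
            copied += 1
    return copied
```

The loop is single-threaded, so copy time grows with the total number of
rows. That is the trade-off against an MR job mentioned above.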

Re: Backup Again

Posted by Michael Dagaev <mi...@gmail.com>.
Looks like we can work around the problem
just by changing the "hbase.rootdir" value used as a key in the map.

M.
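
A rough illustration of why this workaround helps, assuming (as the
message above describes) that the client caches connections in a map keyed
by the "hbase.rootdir" value. This is not the actual HBase code: Connection
and get_connection are hypothetical names, and the host names and the
"hbase.master" property are used only as example settings.

```python
from typing import Dict


class Connection:
    """Hypothetical connection object that remembers its configuration."""

    def __init__(self, conf: Dict[str, str]) -> None:
        self.conf = dict(conf)


# Cache of open connections, keyed by the "hbase.rootdir" value only.
_CONNECTIONS: Dict[str, Connection] = {}


def get_connection(conf: Dict[str, str]) -> Connection:
    """Return the cached connection for this rootdir, creating it if needed."""
    key = conf["hbase.rootdir"]
    if key not in _CONNECTIONS:
        _CONNECTIONS[key] = Connection(conf)
    return _CONNECTIONS[key]


# Example configurations (all values are made up for illustration).
prod_conf = {
    "hbase.rootdir": "hdfs://prod-nn:9000/hbase",
    "hbase.master": "prod-master:60000",
}
backup_conf = {
    "hbase.rootdir": "hdfs://backup-nn:9000/hbase",
    "hbase.master": "backup-master:60000",
}
```

With a cache like this, two configurations that share an "hbase.rootdir"
value get the same connection even if their other settings differ. Giving
the backup configuration a distinct value yields a second, independent
connection in the same JVM.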

On Wed, Feb 4, 2009 at 7:00 PM, Chris K Wensel <ch...@wensel.net> wrote:
> I think that would be slightly more troublesome than just passing an url
> around. most of the heavy lifting is done inside TableInput/OutputFormat
> (which is part of HBase). You need to pass reasonable properties all the way
> down.
>
> I think the Streamy guys might be touching on some of this, this week.
>
> ckw

Re: Backup Again

Posted by Chris K Wensel <ch...@wensel.net>.
I think that would be slightly more troublesome than just passing a URL
around. Most of the heavy lifting is done inside TableInputFormat and
TableOutputFormat (which are part of HBase). You need to pass reasonable
properties all the way down.

I think the Streamy guys might be touching on some of this, this week.

ckw

On Feb 4, 2009, at 8:53 AM, Michael Dagaev wrote:

>> Currently no. but we would love you to patch that in. If you clone the repo
>> and get it working, I'll merge it back from your repo.
>
> Currently, I would like just to work around that problem :)
> What about loading the Hbase client classes with different class loaders?
>
> M.

--
Chris K Wensel
chris@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/


Re: Backup Again

Posted by Michael Dagaev <mi...@gmail.com>.
> Currently no. but we would love you to patch that in. If you clone the repo
> and get it working, I'll merge it back from your repo.

For now, I would just like to work around that problem :)
What about loading the HBase client classes with different class loaders?

M.

Re: Backup Again

Posted by Chris K Wensel <ch...@wensel.net>.
Currently no, but we would love for you to patch that in. If you clone the
repo and get it working, I'll merge it back from your repo.

I'm thinking it is as simple as handing the URL to HBaseTap. Thoughts?

ckw

On Feb 4, 2009, at 8:38 AM, Michael Dagaev wrote:

> Thanks, Chris
>
> BTW, is it possible to run a few HBase clients in a single JVM?
>
> On Wed, Feb 4, 2009 at 6:28 PM, Chris K Wensel <ch...@wensel.net> wrote:
>> Hey Michael
>>
>> You could probably use Cascading to migrate data between HBase clusters.
>> http://wiki.apache.org/hadoop/Hbase/Cascading
>>
>> But the code currently doesn't support multiple HBase cluster clients in a
>> single JVM, but I'm sure it can be coded in quickly. (the code is hosted at
>> github, so is easily cloned and patched).
>>
>> A benefit of using Cascading would be the ability to put in quality checks,
>> or filter data very easily.
>>
>> I've already heard of users starting to migrate from HBase to a RDBMS using
>> Cascading, and also between Hypertable and Aster Data. Hopefully those
>> adapters will leak out for the rest of us to use.
>>
>> ckw
>>
>> --
>> Chris K Wensel
>> chris@wensel.net
>> http://www.cascading.org/
>> http://www.scaleunlimited.com/
>>
>>

--
Chris K Wensel
chris@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/


Re: Backup Again

Posted by Michael Dagaev <mi...@gmail.com>.
Thanks, Chris

BTW, is it possible to run a few HBase clients in a single JVM?

On Wed, Feb 4, 2009 at 6:28 PM, Chris K Wensel <ch...@wensel.net> wrote:
> Hey Michael
>
> You could probably use Cascading to migrate data between HBase clusters.
> http://wiki.apache.org/hadoop/Hbase/Cascading
>
> But the code currently doesn't support multiple HBase cluster clients in a
> single JVM, but I'm sure it can be coded in quickly. (the code is hosted at
> github, so is easily cloned and patched).
>
> A benefit of using Cascading would be the ability to put in quality checks,
> or filter data very easily.
>
> I've already heard of users starting to migrate from HBase to a RDBMS using
> Cascading, and also between Hypertable and Aster Data. Hopefully those
> adapters will leak out for the rest of us to use.
>
> ckw
>
> --
> Chris K Wensel
> chris@wensel.net
> http://www.cascading.org/
> http://www.scaleunlimited.com/
>
>

Re: Backup Again

Posted by Chris K Wensel <ch...@wensel.net>.
Hey Michael

You could probably use Cascading to migrate data between HBase clusters.
http://wiki.apache.org/hadoop/Hbase/Cascading

The code currently doesn't support multiple HBase cluster clients in a
single JVM, but I'm sure that can be coded in quickly. (The code is hosted
on GitHub, so it is easily cloned and patched.)

A benefit of using Cascading would be the ability to put in quality  
checks, or filter data very easily.
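
As a rough illustration of that benefit (plain Python, not Cascading code),
a migration step can run each record through a list of checks and route
failures to a reject pile instead of the target store. The names migrate
and has_required_columns and the column names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Tuple[bytes, Dict[str, bytes]]   # (row key, columns)
Check = Callable[[Record], bool]


def has_required_columns(required: List[str]) -> Check:
    """Build a check that passes only if every required column is non-empty."""
    def check(record: Record) -> bool:
        _, row = record
        return all(row.get(col) for col in required)
    return check


def migrate(records: Iterable[Record], checks: List[Check],
            sink: List[Record], rejects: List[Record]) -> None:
    """Write records that pass every check to sink; collect the rest."""
    for record in records:
        if all(check(record) for check in checks):
            sink.append(record)
        else:
            rejects.append(record)
```

Keeping the rejected records, rather than silently dropping them, makes it
possible to inspect what failed after the migration finishes.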

I've already heard of users starting to migrate from HBase to an RDBMS
using Cascading, and also between Hypertable and Aster Data. Hopefully
those adapters will leak out for the rest of us to use.

ckw

--
Chris K Wensel
chris@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/