Posted to user@hbase.apache.org by N Dm <ni...@gmail.com> on 2013/06/05 00:01:09 UTC

Replication is on columnfamily level or table level?

hi, folks,

<hbase 0.94.3>

From reading several documents, I have always had the impression that
"replication works at the table-column-family level". However, when I
set up a table with two column families and tried to replicate them to two
different slave clusters, the whole table was replicated to both. Is this a
bug? Thanks.

Here are the simple steps to recreate it.

*Environment: *
Replication Master: hdtest014
Replication Slave 1: hdtest017
Replication Slave 2: hdtest009

*Create table* on the master and the two slaves: create 't2_dn','cf1','cf2'

*Set up replication on Master* (hdtest014), so that:
Master> list_peers
 PEER_ID CLUSTER_KEY STATE
 1 hdtest017.svl.ibm.com:2181:/hbase ENABLED
 2 hdtest009.svl.ibm.com:2181:/hbase ENABLED
Master> describe 't2_dn'
DESCRIPTION                                                                  ENABLED
 {NAME => 't2_dn', FAMILIES => [                                             true
  {*NAME => 'cf1', REPLICATION_SCOPE => '1'*, KEEP_DELETED_CELLS => 'false',
   COMPRESSION => 'NONE', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true',
   MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false',
   BLOOMFILTER => 'NONE', TTL => '2147483647', VERSIONS => '3',
   BLOCKSIZE => '65536'},
  {*NAME => 'cf2', REPLICATION_SCOPE => '2'*, KEEP_DELETED_CELLS => 'false',
   COMPRESSION => 'NONE', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true',
   MIN_VERSIONS => '0', DATA_BLOCK_ENCODING => 'NONE', IN_MEMORY => 'false',
   BLOOMFILTER => 'NONE', TTL => '2147483647', VERSIONS => '3',
   BLOCKSIZE => '65536'}]}

1 row(s) in 0.0250 seconds

*Put rows into t2_dn on Master*
put 't2_dn','row1','cf1:q1','val1cf1fromMaster'
put 't2_dn','row1','cf2:q1','val1cf2fromMaster'
put 't2_dn','row2','cf1:q1','val2cf1fromMaster'
put 't2_dn','row3','cf2:q1','val3cf2fromMaster'

*I expected cf1 to be replicated to slave 1 and cf2 to slave 2. Instead, all
three clusters returned:*
scan 't2_dn'
ROW    COLUMN+CELL
 row1  column=cf1:q1, timestamp=1370382328358, value=val1cf1fromMaster
 row1  column=cf2:q1, timestamp=1370382334303, value=val1cf2fromMaster
 row2  column=cf1:q1, timestamp=1370382351716, value=val2cf1fromMaster
 row3  column=cf2:q1, timestamp=1370382367724, value=val3cf2fromMaster
3 row(s) in 0.0160 seconds

Many thanks

Demai

Re: Replication is on columnfamily level or table level?

Posted by Anoop John <an...@gmail.com>.
There is no support for replicating one CF to one cluster and another CF to
a different cluster. In fact, you cannot specify which peers a given CF is
replicated to. If the scope is set to 1, the CF is replicated to all
peers.

See the issue HBASE-5002
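
The rule described above can be illustrated with a small toy model. This is
only a sketch in plain Python, not HBase code: the function name
target_peers and the peers dictionary are hypothetical, and treating every
non-zero scope like 1 is an assumption based on the original poster's
observation that the family with scope '2' still reached every peer.

```python
# Toy model of the behaviour described in this thread; this is NOT HBase code.
LOCAL, GLOBAL = 0, 1  # the two scope values mentioned in the thread


def target_peers(scope, peers):
    """Return the peer ids an edit is shipped to, given its family's scope.

    `peers` maps peer id -> enabled flag. Any non-zero scope is treated
    like GLOBAL (an assumption taken from the poster's test result, where
    scope '2' was still replicated to every peer).
    """
    if scope == LOCAL:
        return []
    # There is no per-family peer list: every enabled peer gets the edit.
    return sorted(pid for pid, enabled in peers.items() if enabled)


peers = {"1": True, "2": True}   # peer 1 = hdtest017, peer 2 = hdtest009
families = {"cf1": 1, "cf2": 2}  # REPLICATION_SCOPE values from describe

for cf, scope in families.items():
    print(cf, "->", target_peers(scope, peers))
# cf1 -> ['1', '2']
# cf2 -> ['1', '2']
```

In this model the scope value only decides whether a family is replicated at
all, never where it goes, which matches the scan results in the original
post.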

-Anoop-

On Wed, Jun 5, 2013 at 6:04 PM, Shahab Yunus <sh...@gmail.com> wrote:

> Anoop, can you please clarify a bit?
>
> So we can specify replication at the CF level, but scope 2 is not
> supported, right? And we can replicate one CF to one slave cluster and
> the other CF to another slave cluster, right? Thanks.
>
> Regards,
> Shahab

Re: Replication is on columnfamily level or table level?

Posted by Shahab Yunus <sh...@gmail.com>.
Anoop, can you please clarify a bit?

So we can specify replication at the CF level, but scope 2 is not
supported, right? And we can replicate one CF to one slave cluster and
the other CF to another slave cluster, right? Thanks.

Regards,
Shahab


On Wed, Jun 5, 2013 at 12:37 AM, Anoop John <an...@gmail.com> wrote:

> Yes, replication can be specified at the CF level. You used
> HCD#setScope(), right?
>
> > S => '3', BLOCKSIZE => '65536'}, {*NAME => 'cf2', REPLICATION_SCOPE => '2'*,
>
> You set the scope to 2? It seems you want one CF replicated to one cluster
> and another CF to a different cluster. I don't think that is supported even
> now. You can see in the HCD code that there are two constants for scope, 0
> and 1, where 1 means replicate and 0 means do not replicate.
>
> -Anoop-

Re: Replication is on columnfamily level or table level?

Posted by Anoop John <an...@gmail.com>.
Yes, replication can be specified at the CF level. You used
HCD#setScope(), right?

> S => '3', BLOCKSIZE => '65536'}, {*NAME => 'cf2', REPLICATION_SCOPE => '2'*,

You set the scope to 2? It seems you want one CF replicated to one cluster
and another CF to a different cluster. I don't think that is supported even
now. You can see in the HCD code that there are two constants for scope, 0
and 1, where 1 means replicate and 0 means do not replicate.
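
A quick way to spot a scope value other than those two constants is to scan
the describe output for it. The following is a minimal sketch in plain
Python, not part of HBase: the helper names family_scopes and
unexpected_scopes are hypothetical, and the pattern assumes the attribute
order shown in this thread (NAME immediately followed by REPLICATION_SCOPE),
which may differ in other versions.

```python
import re

VALID_SCOPES = {0, 1}  # 0 = not replicated, 1 = replicated (per this thread)


def family_scopes(describe_text):
    """Map each column family name to its REPLICATION_SCOPE as an int."""
    # Drop '*' emphasis marks and rejoin wrapped lines before matching.
    flat = " ".join(describe_text.replace("*", "").split())
    # Assumes NAME is immediately followed by REPLICATION_SCOPE.
    pattern = r"\{NAME => '([^']+)', REPLICATION_SCOPE => '(\d+)'"
    return {name: int(scope) for name, scope in re.findall(pattern, flat)}


def unexpected_scopes(scopes):
    """Return the families whose scope is neither 0 nor 1."""
    return sorted(cf for cf, s in scopes.items() if s not in VALID_SCOPES)


# Shortened copy of the describe output from the original post.
sample = """{NAME => 't2_dn', FAMILIES => [{NAME => 'cf1', REPLICATION_SCOPE => '1',
KEEP_DELETED_CELLS => 'false'}, {NAME => 'cf2', REPLICATION_SCOPE =>
'2', KEEP_DELETED_CELLS => 'false'}]}"""

scopes = family_scopes(sample)
print(scopes)                     # {'cf1': 1, 'cf2': 2}
print(unexpected_scopes(scopes))  # ['cf2']
```

The table-level entry ('t2_dn') is skipped because its NAME is followed by
FAMILIES rather than REPLICATION_SCOPE.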

-Anoop-

On Wed, Jun 5, 2013 at 3:31 AM, N Dm <ni...@gmail.com> wrote:

> hi, folks,
>
> <hbase 0.94.3>
>
> From reading several documents, I have always had the impression that
> "replication works at the table-column-family level". However, when I
> set up a table with two column families and tried to replicate them to two
> different slave clusters, the whole table was replicated to both. Is this a
> bug? Thanks.