Posted to user@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2011/03/11 19:17:00 UTC

Re: HBase replication

A scan can scan as many families as you have.
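
For instance, a minimal shell sketch, assuming the "test" table with
families f1 and f2 from the earlier message in this thread. I believe
COLUMNS accepts bare family names in the 0.90 shell, but check
help 'scan' on your version:

```
# Assumes table 'test' with families 'f1' and 'f2' already exists.
# One scan returns the cells of both families for each row.
scan 'test', {COLUMNS => ['f1', 'f2']}
```

Each row then comes back with its f1 and f2 cells together, so the merge
can happen on the client as the rows are read.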

J-D

On Fri, Mar 11, 2011 at 7:56 AM, Mark Kerzner <ma...@gmail.com> wrote:
> J-D,
> when I read from two families, I cannot combine them in one scan, so I would
> have to do two scans, correct?
> Thank you,
> Mark
>
> On Thu, Feb 24, 2011 at 2:13 PM, Mark Kerzner <ma...@gmail.com> wrote:
>>
>> Thanks, J-D, that's the best answer.
>> Merci!
>> Mark
>>
>> On Thu, Feb 24, 2011 at 1:34 PM, Jean-Daniel Cryans <jd...@apache.org>
>> wrote:
>>>
>>> Ah ok so really just master-master... well it's possible to do it in
>>> 0.90 as long as a family that's replicated from one cluster isn't
>>> replicated when inserted in the other. That means you would have to
>>> use 2 column families and merge the results.
>>>
>>> Let's say you have table "test", on cluster 1 you would create it like
>>> this:
>>>
>>> create "test", {NAME => 'f1', REPLICATION_SCOPE => '1'}, {NAME => 'f2'}
>>>
>>> Then on cluster 2:
>>>
>>> create "test", {NAME => 'f1'}, {NAME => 'f2', REPLICATION_SCOPE => '1'}
>>>
>>> When you write, always write to 1 family (and that family is different
>>> depending on the cluster you're on). When you read, always get the
>>> data from both families.
>>>
>>> Another option is to write yourself to both clusters (async or not).
>>>
>>> J-D
>>>
>>> On Thu, Feb 24, 2011 at 10:28 AM, Mark Kerzner <ke...@shmsoft.com>
>>> wrote:
>>> > Yes, J-D, your understanding of my understanding is correct.
>>> > The two would actually be the same if all new records in one HBase could
>>> > be copied to the other HBase through log shipping. That is assuming that
>>> > the two databases never get records with the same row key. I did not mean
>>> > to synchronize after the fact, but only as the records are being written,
>>> > from the very beginning.
>>> > If we agree on the question, what would be the solution?
>>> > Thank you,
>>> > Mark
>>> >
>>> > On Thu, Feb 24, 2011 at 12:14 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>>> >>
>>> >> What you describe is more like the rsync tool, which isn't what HBase
>>> >> replication is doing at all. Replication works with log shipping and
>>> >> only copies data when it reads it from a log; there's no proactive
>>> >> thread that checks for differences between two clusters and copies
>>> >> the missing pieces.
>>> >>
>>> >> Is my understanding of your understanding of replication correct?
>>> >>
>>> >> J-D
>>> >>
>>> >> On Thu, Feb 24, 2011 at 8:00 AM, Mark Kerzner <ke...@shmsoft.com>
>>> >> wrote:
>>> >> >
>>> >> > Hi,
>>> >> >
>>> >> > I have two HBases running in separate clusters, and ideally I would
>>> >> > like to synchronize them: records not found in one should be copied
>>> >> > over to the other, and vice versa.
>>> >> >
>>> >> > Now, I do know that there is master-slave replication in 0.89 already,
>>> >> > but that master-master is experimental from 0.90 on, with some of it
>>> >> > left for 0.92, and that it will only copy tables from one HBase that
>>> >> > are not present in another.
>>> >> >
>>> >> > Are there any other approaches that can get me as close to this kind
>>> >> > of synchronization as possible?
>>> >> >
>>> >> > Thank you,
>>> >> > Mark
>>> >> >
>>> >> >
>>> >
>>> >
>>
>
>