You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/03/30 03:57:50 UTC

[jira] Created: (HBASE-1295) Federated HBase

Federated HBase
---------------

Key: HBASE-1295
URL: https://issues.apache.org/jira/browse/HBASE-1295
Project: Hadoop HBase
Issue Type: New Feature
Reporter: Andrew Purtell

HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box.

Consider if rows, columns, or even cells could be scoped: local, or global.

Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.

This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user wants the transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [jira] Commented: (HBASE-1295) Federated HBase

Posted by Chad Walters <Ch...@microsoft.com>.

Ryan,

It might be better to respond to JIRA tickets in the ticket. That way the conversation about the issues in the ticket is kept in one place in case the issue stalls out and is picked up months later.

Chad


On 4/24/09 8:22 PM, "Ryan Rawson" <ry...@gmail.com> wrote:

We could use ZooKeeper paths as a way for replication endpoints to know
about each other.

On Fri, Apr 24, 2009 at 8:16 PM, Andrew Purtell (JIRA) <ji...@apache.org>wrote:

>
>    [
> https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702647#action_12702647]
>
> Andrew Purtell commented on HBASE-1295:
> ---------------------------------------
>
> bq. In the case a cluster is down, I guess that would mean the other
> clusters would have to keep all the WALs until the it is up again. At that
> moment, it may receive tons of WALs right?
>
> Yes the effect of a partition and extended outage is a buildup of WALs on
> the peer clusters, and then a lot of backlog. Let me think about this case
> and post a revised slide deck.
>
> bq. Also I was wondering, if you want to add a new cluster, would the way
> to go be replicating by hand (MR or else) all the data to the other cluster
> then telling somehow that the clusters have a new peer?
>
> I was anticipating that the cluster would be advertised as a peer, somehow,
> and then replication would then start. The replicators should add tables and
> column families to their local schema on demand as the cells are received,
> maybe additionally also ask the peer about column family details as
> necessary. Whether or not to bring over existing data would be a
> deployment/application concern I think and could be handed by a MR
> export-import job.
>
> > Federated HBase
> > ---------------
> >
> >                 Key: HBASE-1295
> >                 URL: https://issues.apache.org/jira/browse/HBASE-1295
> >             Project: Hadoop HBase
> >          Issue Type: New Feature
> >            Reporter: Andrew Purtell
> >         Attachments: hbase_repl.1.pdf
> >
> >
> > HBase should consider supporting a federated deployment where someone
> might have terascale (or beyond) clusters in more than one geography and
> would want the system to handle replication between the clusters/regions. It
> would be sweet if HBase had something on the roadmap to sync between
> replicas out of the box.
> > Consider if rows, columns, or even cells could be scoped: local, or
> global.
> > Then, consider a background task on each cluster that replicates new
> globally scoped edits to peer clusters. The HBase/Bigtable data model has
> convenient features (timestamps, multiversioning) such that simple exchange
> of globally scoped cells would be conflict free and would "just work".
> Implementation effort here would be in producing an efficient mechanism for
> collecting up edits from all the HRS and transmitting the edits over the
> network to peers where they would then be split out to the HRS there.
> Holding on to the edit trace and tracking it until the remote commits
> succeed would also be necessary. So, HLog is probably the right place to set
> up the tee. This would be filtered log shipping, basically.
> > This proposal does not consider transactional tables. For transactional
> tables, enforcement of global mutation commit ordering would come into the
> picture if the user  wants the  transaction to span the federation. This
> should be an optional feature even with transactional tables themselves
> being optional because of how slow it would be.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>

Re: [jira] Commented: (HBASE-1295) Federated HBase

Posted by Ryan Rawson <ry...@gmail.com>.

We could use ZooKeeper paths as a way for replication endpoints to know
about each other.

On Fri, Apr 24, 2009 at 8:16 PM, Andrew Purtell (JIRA) <ji...@apache.org>wrote:

>
>    [
> https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702647#action_12702647]
>
> Andrew Purtell commented on HBASE-1295:
> ---------------------------------------
>
> bq. In the case a cluster is down, I guess that would mean the other
> clusters would have to keep all the WALs until the it is up again. At that
> moment, it may receive tons of WALs right?
>
> Yes the effect of a partition and extended outage is a buildup of WALs on
> the peer clusters, and then a lot of backlog. Let me think about this case
> and post a revised slide deck.
>
> bq. Also I was wondering, if you want to add a new cluster, would the way
> to go be replicating by hand (MR or else) all the data to the other cluster
> then telling somehow that the clusters have a new peer?
>
> I was anticipating that the cluster would be advertised as a peer, somehow,
> and then replication would then start. The replicators should add tables and
> column families to their local schema on demand as the cells are received,
> maybe additionally also ask the peer about column family details as
> necessary. Whether or not to bring over existing data would be a
> deployment/application concern I think and could be handed by a MR
> export-import job.
>
> > Federated HBase
> > ---------------
> >
> >                 Key: HBASE-1295
> >                 URL: https://issues.apache.org/jira/browse/HBASE-1295
> >             Project: Hadoop HBase
> >          Issue Type: New Feature
> >            Reporter: Andrew Purtell
> >         Attachments: hbase_repl.1.pdf
> >
> >
> > HBase should consider supporting a federated deployment where someone
> might have terascale (or beyond) clusters in more than one geography and
> would want the system to handle replication between the clusters/regions. It
> would be sweet if HBase had something on the roadmap to sync between
> replicas out of the box.
> > Consider if rows, columns, or even cells could be scoped: local, or
> global.
> > Then, consider a background task on each cluster that replicates new
> globally scoped edits to peer clusters. The HBase/Bigtable data model has
> convenient features (timestamps, multiversioning) such that simple exchange
> of globally scoped cells would be conflict free and would "just work".
> Implementation effort here would be in producing an efficient mechanism for
> collecting up edits from all the HRS and transmitting the edits over the
> network to peers where they would then be split out to the HRS there.
> Holding on to the edit trace and tracking it until the remote commits
> succeed would also be necessary. So, HLog is probably the right place to set
> up the tee. This would be filtered log shipping, basically.
> > This proposal does not consider transactional tables. For transactional
> tables, enforcement of global mutation commit ordering would come into the
> picture if the user  wants the  transaction to span the federation. This
> should be an optional feature even with transactional tables themselves
> being optional because of how slow it would be.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>

[jira] Updated: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment:     (was: hbase_repl.1.pdf)

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708090#action_12708090 ] 

Billy Pearson commented on HBASE-1295:
--------------------------------------

ok I agree now I reread the pdf the only down side is the lag will be the log roll period for the most part but I can live with that if others can.

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment: hbase_repl.2.odp
                hbase_repl.2.pdf

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724315#action_12724315 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

bq. I assume to get a new peer online we would have to have some kind of a read-lock with a flush or some kind of catchup mode that will export all data before x timestamp 

In my opinion, no. Edits are propagated from HLogs. Existing data would not be replicated, therefore. It would be an application specific consideration, and could be accomplished perhaps of forwarding of existing data at the application level via mapreduce transfer job or a background value fetch-and-refresh strategy.

bq. Also will this be a write one place read anywhere replication or write anywhere read anywhere replication

Write anywhere read anywhere

bq. Will edits be able to write to any site/cluster and get replication to all the peers?

Yes.
But I know that jgray at least would like for some administrative control over the propagation details. This could be accomplished via local settings stored in each cluster's peer table, supplied as parameters to ADD PEER commands.


> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702411#action_12702411 ] 

Jean-Daniel Cryans commented on HBASE-1295:
-------------------------------------------

Glad to see that you already did some work on this issue Andrew, once 0.20 is out I'm sure people will look for a feature like that. Regards the slides:

- I like the idea of having nominated gateways.

- Using the WALs seems the way to go, especially the way you describe how it should be done.

- In the case a cluster is down, I guess that would mean the other clusters would have to keep all the WALs until the it is up again. At that moment, it may receive tons of WALs right?

Also I was wondering, if you want to add a new cluster, would the way to go be replicating by hand (MR or else) all the data to the other cluster then telling somehow that the clusters have a new peer?

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.1.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Issue Comment Edited: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702647#action_12702647 ] 

Andrew Purtell edited comment on HBASE-1295 at 4/24/09 8:21 PM:
----------------------------------------------------------------

bq. In the case a cluster is down, I guess that would mean the other clusters would have to keep all the WALs until the it is up again. At that moment, it may receive tons of WALs right? 

Yes the effect of a partition and extended outage is a buildup of WALs on the peer clusters, and then a lot of backlog. Let me think about this case and post a revised slide deck.

bq. Also I was wondering, if you want to add a new cluster, would the way to go be replicating by hand (MR or else) all the data to the other cluster then telling somehow that the clusters have a new peer?

I was anticipating that the cluster would be advertised as a peer, somehow, and then replication would then start. The replicators should add tables and column families to their local schema on demand as the cells are received, maybe additionally also ask the peer about schema details as necessary. Updates to HTD and HCD attributes can be considered another type of edit. Whether or not to bring over existing data would be a deployment/application concern I think and could be handed by a MR export-import job. 

      was (Author: apurtell):
    bq. In the case a cluster is down, I guess that would mean the other clusters would have to keep all the WALs until the it is up again. At that moment, it may receive tons of WALs right? 

Yes the effect of a partition and extended outage is a buildup of WALs on the peer clusters, and then a lot of backlog. Let me think about this case and post a revised slide deck.

bq. Also I was wondering, if you want to add a new cluster, would the way to go be replicating by hand (MR or else) all the data to the other cluster then telling somehow that the clusters have a new peer?

I was anticipating that the cluster would be advertised as a peer, somehow, and then replication would then start. The replicators should add tables and column families to their local schema on demand as the cells are received, maybe additionally also ask the peer about column family details as necessary. Whether or not to bring over existing data would be a deployment/application concern I think and could be handed by a MR export-import job. 
  
> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.1.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Multi data center replication

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727041#action_12727041 ] 

stack commented on HBASE-1295:
------------------------------

TODO: Move of in-memory parameter from table to column family (from HTD to HCD).

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment: hbase_repl.3.pdf
                hbase_repl.3.odp

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment:     (was: hbase_repl.2.pdf)

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719293#action_12719293 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

I'm going to iterate on this again. Will put up new materials in a day or two.

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment:     (was: hbase_repl.2.odp)

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Fix Version/s: 0.21.0
          Summary: Multi data center replication  (was: Federated HBase)

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774544#action_12774544 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

Break out replication into contrib/. 

Put hooks into core for holding onto logs after roll, notify watchers via some mechanism (e.g. ZK), and only allow log delete after all watchers acknowledge it. Everything else, move out.

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Multi data center replication

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724302#action_12724302 ] 

Billy Pearson commented on HBASE-1295:
--------------------------------------

I assume to get a new peer online we would have to have some kind of a read-lock with a flush or 
some kind of catchup mode that will export all data before x timestamp

Might include something in the pdf on idea you are thanking of on deploying a new peer

Also will this be a write one place read anywhere replication or write anywhere read anywhere replication
Will edits be able to write to any site/cluster and get replication to all the peers?

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1295:
-------------------------

    Comment: was deleted

(was: TODO: Move of in-memory parameter from table to column family (from HTD to HCD).)

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Attachment: hbase_repl.1.pdf

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.1.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707924#action_12707924 ] 

Billy Pearson commented on HBASE-1295:
--------------------------------------

I was thanking on this there is some other thing to consider like table splits will the regions be the same on both because there is no guarantee the compactions will happen at the same time or the split will find the same mid key.

I would thank the master would be the idea process to pull logs a pass to peer master then it can split the logs in to regions and pass the edits on to the servers hosting the regions.
I would like to see Sequential process of the edits to the peer so everything is in the same order and that's the way we store the wal's now.

I am not sure what the current status of appends on hdfs right now but if we had that 100% working the master could just remember where in the wal it read up to and pull every x secs to see if there are any updates then we would not have to worry about waiting for a log to roll which could be a while in some cases. Waiting for a log to roll for the updates to get pushed to the peers seams like the wrong way to go with this but might be the only way we have now if append is not working right in hdfs.

As for a first sync for the peers would be hugh saving if we could do a rolling read only mode on the regions and flush the memcache and copy the needed files unlock the region and start the transfer to the peer this would allow one by one copy of the regions to the remote and  it would only be depending on the site-site bandwidth as the bottleneck in the mean time the peer could be holding edits and waiting for all regions to get copied and then start the replay of the logs skipping any edit that is older the the time stamp of the copy. I thank that could be written in the hfile now I thank as meta data.

Just some suggestions and/or other thoughts



> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708085#action_12708085 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

@Billy:

I don't follow what you are saying about splits. It won't matter where a table is split. The replication process does not care about such details. It would send edits from the WALs to peers to be applied as if some local HRS is receiving local batchupdates. 

Also, according to the proposal, the master would not be involved in replication. The proposal considers more than one HRS -- self-elected via ZK -- working in a fault tolerant way to forward edits sent by all the other HRS on to the peer cluster. There is no SPOF in the replication process.

Also, I disagree that waiting for HLog roll is the wrong way to go. There is no reason a log roll cannot happen once per minute or every five minutes or whatever the configured replication period is. Then, we do not care if append or sync is properly implemented in the underlying FS. Given the state of how those issues are progressing in HDFS, we may have a working replication process before HDFS has a working append. 

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703389#action_12703389 ] 

ryan rawson commented on HBASE-1295:
------------------------------------

We could use ZooKeeper paths as a way for replication endpoints to know about each other.

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.1.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708627#action_12708627 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

See https://issues.apache.org/jira/browse/HBASE-1411?focusedCommentId=12708624&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12708624

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1295) Multi data center replication

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1295:
----------------------------------

    Comment: was deleted

(was: I'm going to iterate on this again. Will put up new materials in a day or two.)

> Multi data center replication
> -----------------------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>             Fix For: 0.21.0
>
>         Attachments: hbase_repl.3.odp, hbase_repl.3.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704213#action_12704213 ] 

Jonathan Gray commented on HBASE-1295:
--------------------------------------

This looks great Andrew!  Some comments...

- What do we do when there are two identical keys in a KeyValue (row, family, column, timestamp) but different values?  That's actually going to be possible in 0.20 since you can manually set the stamp, will certainly be possible with multi-master replication.  I'm not sure how it's handled now.  Would depend on logic in both memcache insertion and more importantly compaction, and then how it's handled when reading.
- Everything is now just a KeyValue, so that would be what we send to replicas.
- Thoughts on network partitioning?  I'm assuming you're referring to partitioning of replica clusters from one another, not within a cluster right?  If so, I guess you'd hang on to WALs as long as you could, eventually a replicated cluster would go into some secondary mode of needing a full sync (when other cluster(s) could no longer hold all WALs, or should we assume hdfs will not fill and just flush, so can always resync with WALs?).  (note: handling of intra-cluster partitions is virtually impossible because of our strong consistency)
- Regarding SCOPE and setting things as local or replicated.  What do you suspect the precision/resolution of this would be?  Could i have some tables being replicated to some clusters, other tables to others, some to both?
- Would replicas of tables _always_ require identical family settings?  For example, I have a cluster of 5 nodes with lots of memory, I want to just replicate a single high-volume, high-read table from my primary large cluster.  But in the small cluster I want to set a TTL of 1 day and also set as in-memory.  This is kind of advanced and special but the ability to do things like that would be very cool, could definitely see us doing something like it were it possible.

I've got a good bit of experience with database replication, did some work in the postgres world on WAL shipping.  Let me know how I can help your effort.

I agree on your assessment regarding consistency, etc.  It is clear we should be doing an eventual consistency model for replication.  This is one of my favorite topics!

One thing that's a bit special is this would make an HBase cluster of clusters a "read-your-writes"-style eventual consistency distribution model (with our strong consistency, partitioned distribution within each individual cluster a la read-your-writes).  That makes a huge difference for us, internally, on many of our data systems.  This may be obvious as we're just talking about replication here, but something to keep in mind.

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1295) Federated HBase

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702647#action_12702647 ] 

Andrew Purtell commented on HBASE-1295:
---------------------------------------

bq. In the case a cluster is down, I guess that would mean the other clusters would have to keep all the WALs until the it is up again. At that moment, it may receive tons of WALs right? 

Yes the effect of a partition and extended outage is a buildup of WALs on the peer clusters, and then a lot of backlog. Let me think about this case and post a revised slide deck.

bq. Also I was wondering, if you want to add a new cluster, would the way to go be replicating by hand (MR or else) all the data to the other cluster then telling somehow that the clusters have a new peer?

I was anticipating that the cluster would be advertised as a peer, somehow, and then replication would then start. The replicators should add tables and column families to their local schema on demand as the cells are received, maybe additionally also ask the peer about column family details as necessary. Whether or not to bring over existing data would be a deployment/application concern I think and could be handed by a MR export-import job. 

> Federated HBase
> ---------------
>
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.1.pdf
>
>
> HBase should consider supporting a federated deployment where someone might have terascale (or beyond) clusters in more than one geography and would want the system to handle replication between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box. 
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that simple exchange of globally scoped cells would be conflict free and would "just work". Implementation effort here would be in producing an efficient mechanism for collecting up edits from all the HRS and transmitting the edits over the network to peers where they would then be split out to the HRS there. Holding on to the edit trace and tracking it until the remote commits succeed would also be necessary. So, HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.  
> This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user  wants the  transaction to span the federation. This should be an optional feature even with transactional tables themselves being optional because of how slow it would be.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.