Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/02/09 03:17:28 UTC

[jira] Created: (HBASE-2197) Start replication from a point in time

Start replication from a point in time
--------------------------------------

                 Key: HBASE-2197
                 URL: https://issues.apache.org/jira/browse/HBASE-2197
             Project: Hadoop HBase
          Issue Type: Sub-task
            Reporter: Jean-Daniel Cryans




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2197:
--------------------------------------

      Description: One way to set up a cluster for replication is to distcp all the files and then start replication. We need a way to make sure we don't miss any edits: a process that reads old log files from a defined point in time, sends them to a specific slave cluster, and then catches up with normal replication.
    Fix Version/s: 0.21.0
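
A rough sketch of such a catch-up process, written against the later 0.90-era WAL classes (HLog, HLogKey, WALEdit): those classes, the archive path, the command-line arguments, and the shortcut of applying edits through the client API instead of the replication RPC are all assumptions made here for illustration, not something this issue specifies.

{code}
// Sketch only: read archived HLogs and replay every edit written at or after a
// given point in time against a slave cluster. A real tool would also have to
// carry deletes and ship edits through the replication RPC; this replays puts.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.regionserver.wal.HLog;
import org.apache.hadoop.hbase.util.Bytes;

public class ReplayOldLogs {
  public static void main(String[] args) throws IOException {
    long cutoff = Long.parseLong(args[0]);   // point in time to start from
    Path archiveDir = new Path(args[1]);     // e.g. /hbase/.oldlogs on the master
    String table = args[2];                  // table to replay
    String slaveQuorum = args[3];            // slave cluster's ZooKeeper quorum

    Configuration masterConf = HBaseConfiguration.create();
    Configuration slaveConf = HBaseConfiguration.create();
    slaveConf.set("hbase.zookeeper.quorum", slaveQuorum);

    FileSystem fs = FileSystem.get(masterConf);
    HTable slaveTable = new HTable(slaveConf, table);
    for (FileStatus log : fs.listStatus(archiveDir)) {
      HLog.Reader reader = HLog.getReader(fs, log.getPath(), masterConf);
      HLog.Entry entry;
      while ((entry = reader.next()) != null) {
        // Skip edits older than the cutoff or belonging to other tables.
        if (entry.getKey().getWriteTime() < cutoff ||
            !Bytes.equals(entry.getKey().getTablename(), Bytes.toBytes(table))) {
          continue;
        }
        for (KeyValue kv : entry.getEdit().getKeyValues()) {
          Put put = new Put(kv.getRow());
          // Keep the original timestamp so the slave ends up identical.
          put.add(kv.getFamily(), kv.getQualifier(), kv.getTimestamp(), kv.getValue());
          slaveTable.put(put);
        }
      }
      reader.close();
    }
    slaveTable.flushCommits();
  }
}
{code}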

> Start replication from a point in time
> --------------------------------------
>
>                 Key: HBASE-2197
>                 URL: https://issues.apache.org/jira/browse/HBASE-2197
>             Project: Hadoop HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>
> One way to set up a cluster for replication is to distcp all the files and then start replication. We need a way to make sure we don't miss any edits: a process that reads old log files from a defined point in time, sends them to a specific slave cluster, and then catches up with normal replication.



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832105#action_12832105 ] 

Jean-Daniel Cryans commented on HBASE-2197:
-------------------------------------------

bq. My expectation as a user is that I would turn on replication and then do a MR job to transfer already stored data from the source cluster to the new peer.

Good idea, I feel dumb for not thinking of it. So we could ship a MR job that takes a table as input and outputs to a peer cluster, is that how you see it?

bq. That's a bug. Let's file a jira for it and fix it.

I think it's HBASE-1485.



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832081#action_12832081 ] 

Andrew Purtell commented on HBASE-2197:
---------------------------------------

My expectation as a user is that I would turn on replication and then do a MR job to transfer already stored data from the source cluster to the new peer. So this is iterating over and streaming values, not shipping files. That way I would not miss any new edits, because they are already replicating even as I bring over the values persisted before replication started, with timestamps preserved. But I would also have my clocks NTP-synchronized, and I would not be manually adjusting timestamps.
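
A rough sketch, under stated assumptions, of what such a map-only MR job could look like with the 0.90-era client and mapreduce APIs: scan everything persisted before replication was turned on and write it to the peer with the original timestamps. The copy.peer.* keys and the command-line arguments are invented for this example, the mapper opens the peer table itself rather than going through an output format, and this is only an illustration, not the job the thread later files as HBASE-2221.

{code}
// Minimal map-only sketch: copy rows written before replication was enabled to
// the peer cluster, preserving timestamps. "copy.peer.*" keys are made up for
// this example; they are not HBase configuration properties.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class CopyExistingData {

  // Writes every cell it sees to the peer table, preserving timestamps.
  static class CopyMapper extends TableMapper<NullWritable, NullWritable> {
    private HTable peer;

    @Override
    protected void setup(Context context) throws IOException {
      Configuration conf = context.getConfiguration();
      Configuration peerConf = HBaseConfiguration.create(conf);
      peerConf.set("hbase.zookeeper.quorum", conf.get("copy.peer.quorum"));
      peer = new HTable(peerConf, conf.get("copy.peer.table"));
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException {
      Put put = new Put(row.get());
      for (KeyValue kv : value.raw()) {
        put.add(kv.getFamily(), kv.getQualifier(), kv.getTimestamp(), kv.getValue());
      }
      peer.put(put);
    }

    @Override
    protected void cleanup(Context context) throws IOException {
      peer.flushCommits();
    }
  }

  public static void main(String[] args) throws Exception {
    String table = args[0];
    Configuration conf = HBaseConfiguration.create();
    conf.set("copy.peer.quorum", args[1]);  // peer cluster's ZooKeeper quorum
    conf.set("copy.peer.table", table);     // copy into a table of the same name

    // Only the values persisted before replication started; anything newer is
    // already being shipped by normal replication.
    Scan scan = new Scan();
    scan.setMaxVersions();
    scan.setTimeRange(0, Long.parseLong(args[2]));

    Job job = new Job(conf, "copy-existing-data");
    job.setJarByClass(CopyExistingData.class);
    TableMapReduceUtil.initTableMapperJob(table, scan, CopyMapper.class,
        NullWritable.class, NullWritable.class, job);
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}

Invoked with the table name, the peer's ZooKeeper quorum, and the time replication was enabled, it streams the pre-replication values over while normal replication handles everything newer.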



[jira] Assigned: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-2197:
-----------------------------------------

    Assignee: Jean-Daniel Cryans



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833119#action_12833119 ] 

Jean-Daniel Cryans commented on HBASE-2197:
-------------------------------------------

I opened HBASE-2221 to develop the MR job that Andrew and I discussed.



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831764#action_12831764 ] 

Jean-Daniel Cryans commented on HBASE-2197:
-------------------------------------------

Basically I see two ways of solving this problem:

- The master cluster itself reads the old edits and streams them to the slave cluster, using a distributed queue to track the files to process.
- The master cluster does a distcp-like operation to ship all the files to the slave, which is then responsible for replaying them.

The second solution can be implemented very easily by doing a few operations sequentially: first the user runs a jruby script on the master cluster that moves all the files to the slave cluster one by one, then runs another script on the slave cluster that applies all the edits from those files, again sequentially.

An improvement would be to copy (or move) the desired files on the master cluster to a different folder and start a real distcp to the slave cluster, then start a MapReduce job on the slave cluster to process all those files. It's still manual, but much faster.

The next problem to tackle is how to switch from shipping files to streaming the edits. An easy solution would be to start replication as soon as the initial distcp of all the data is done, and then do a second copy of all the files to reapply. There are two issues with that:

- How do we make sure that, when we start the second distcp, all the files we need are in the HLog archive folder? Some may still be "active" in a region server's .logs folder, and we could miss them. A workaround would be to wait long enough for everything to be archived (see the sketch after this list).
- Currently random reads have the problem where they can return older versions of the data if a timestamp in the memstore is older than data in the store files. That can easily happen here, because we stream newer edits before applying the old ones. This only matters for users who want to serve out of the slave cluster as soon as possible, and they will also have to deal with missing data (which is still in transit), so it could even be a non-issue.
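
The archival check mentioned in the first point could look something like this rough sketch: poll the region servers' active log directories until nothing created before the cutoff is left, at which point everything we need is in the archive folder and the second distcp can start. The /hbase/.logs layout and the convention that a log file name ends with its creation time in milliseconds are assumptions here, not a stable interface.

{code}
// Sketch of the "wait until everything is archived" workaround, to run before
// kicking off the second distcp. Paths and the file-name convention are
// assumptions for illustration.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WaitForLogArchival {
  public static void main(String[] args) throws IOException, InterruptedException {
    long cutoff = Long.parseLong(args[0]);  // edits before this time must be archived
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);
    Path activeLogs = new Path("/hbase/.logs");  // per-region-server active logs

    while (true) {
      boolean pending = false;
      for (FileStatus serverDir : fs.listStatus(activeLogs)) {
        for (FileStatus log : fs.listStatus(serverDir.getPath())) {
          String name = log.getPath().getName();
          long created;
          try {
            created = Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
          } catch (NumberFormatException e) {
            continue;  // not a log file we recognize, skip it
          }
          // A log created before the cutoff may still hold edits we need;
          // it has not been rolled and moved to the archive folder yet.
          if (created < cutoff) {
            pending = true;
          }
        }
      }
      if (!pending) {
        break;
      }
      Thread.sleep(60 * 1000);  // check again in a minute
    }
    System.out.println("All logs from before " + cutoff + " have been archived.");
  }
}
{code}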



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832154#action_12832154 ] 

Jean-Daniel Cryans commented on HBASE-2197:
-------------------------------------------

@Andrew

That's my thought. It would be easily configurable with a peer address and the time from which to replicate, and it could use the replication logic (which will be refactored out) to replicate only the right column families. That sounds badass.



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831823#action_12831823 ] 

ryan rawson commented on HBASE-2197:
------------------------------------

I am not generally a fan of depending on files and region layout.  We need a system where we can bring up a cluster immediately, get new edits flowing in, and then pull in the rest of the data as fast as possible.  I think the latter process should be out of band, i.e. let's keep the replication core in the HRS really small, then add value on and around it, possibly with some HRS hooks if necessary.

As for the random read issue, we should seriously consider removing that code.



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832087#action_12832087 ] 

Andrew Purtell commented on HBASE-2197:
---------------------------------------

bq. Currently random reads have the problem where they can return older versions of the data if a timestamp in the memstore is older than data in the store files.

That's a bug. Let's file a jira for it and fix it. 



[jira] Commented: (HBASE-2197) Start replication from a point in time

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832151#action_12832151 ] 

Andrew Purtell commented on HBASE-2197:
---------------------------------------

{quote}
bq. My expectation as a user is that I would turn on replication and then do a MR job to transfer already stored data from the source cluster to the new peer.
So we could ship a MR job that takes a table as input and outputs to a peer cluster, is that how you see it?
{quote}

Something generic we can provide would be helpful. A TOF (TableOutputFormat) that supports some notion of a remote table? 
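
One possible shape for that, sketched under assumptions: an output format that writes every Put it receives to an HTable opened against a peer cluster named in the job configuration. The class name and the copy.peer.* keys are hypothetical, and this is not an output format HBase provides; it only illustrates the "remote table" notion.

{code}
// Hypothetical sketch of a "remote table" output format: it reuses
// NullOutputFormat's no-op committer and writes every Put it receives to an
// HTable opened against the peer cluster. The copy.peer.quorum and
// copy.peer.table keys are invented for this example.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class RemoteTableOutputFormat
    extends NullOutputFormat<ImmutableBytesWritable, Put> {

  @Override
  public RecordWriter<ImmutableBytesWritable, Put> getRecordWriter(TaskAttemptContext context) {
    final Configuration conf = context.getConfiguration();
    final Configuration peerConf = HBaseConfiguration.create(conf);
    peerConf.set("hbase.zookeeper.quorum", conf.get("copy.peer.quorum"));

    return new RecordWriter<ImmutableBytesWritable, Put>() {
      private HTable peer;  // opened lazily so we can throw IOException from write()

      @Override
      public void write(ImmutableBytesWritable key, Put value) throws IOException {
        if (peer == null) {
          peer = new HTable(peerConf, conf.get("copy.peer.table"));
        }
        peer.put(value);
      }

      @Override
      public void close(TaskAttemptContext context) throws IOException {
        if (peer != null) {
          peer.flushCommits();
        }
      }
    };
  }
}
{code}

A driver would then set this class with job.setOutputFormatClass(), pass the two copy.peer.* properties, and have its mapper emit (row, Put) pairs; restricting the copy to the replicated column families could reuse the refactored replication logic mentioned above.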

