Posted to dev@hbase.apache.org by "Tim Sell (JIRA)" <ji...@apache.org> on 2008/07/30 18:09:31 UTC

[jira] Commented: (HBASE-787) Postgresql to HBase table replication.

    [ https://issues.apache.org/jira/browse/HBASE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618408#action_12618408 ] 

Tim Sell commented on HBASE-787:
--------------------------------

Ack.. I forgot to mention...
When it does a bootstrap, it has two options for doing so: a copy to file, or a select of the whole table. If you comment out the file the tables are to be dumped to, it will do a select.

The drawback of the copy is that its output is tab separated, so data that itself contains tabs will be parsed incorrectly, but it's a lot faster if you know your data is safe.
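
To make the tab problem concrete, here is a minimal sketch. It assumes the dump holds raw, unescaped tabs and is parsed by splitting each line on the tab character; split_row is a hypothetical helper for illustration, not code taken from hbrep.

```python
# Hypothetical helper (not part of hbrep): split one line of a
# tab-separated dump and verify the number of fields.
def split_row(line, expected_columns):
    fields = line.rstrip("\n").split("\t")
    if len(fields) != expected_columns:
        # A raw tab inside a value shows up as an extra field.
        raise ValueError(
            "expected %d fields, got %d" % (expected_columns, len(fields))
        )
    return fields


clean = "1\talice\thello world\n"
dirty = "2\tbob\thello\tworld\n"  # third column contains a raw tab

print(split_row(clean, 3))
try:
    split_row(dirty, 3)
except ValueError as err:
    print("bad row: %s" % err)
```

A field-count check like this only catches the problem; it cannot recover the original value, which is why the select path is the safer choice when the data may contain tabs.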

> Postgresql to HBase table replication.
> --------------------------------------
>
>                 Key: HBASE-787
>                 URL: https://issues.apache.org/jira/browse/HBASE-787
>             Project: Hadoop HBase
>          Issue Type: New Feature
>    Affects Versions: 0.2.0
>            Reporter: Tim Sell
>            Priority: Minor
>             Fix For: 0.2.0
>
>         Attachments: hbrep-2008.07.30.tar.gz
>
>
> It is useful to have an easy way to replicate data from Postgresql tables to HBase tables.
> I made a simple python tool which does this, called hbrep.
> hbrep is a tool for replicating data from postgresql tables to hbase tables.
> Dependencies:
>  - python 2.4
>  - hbase 0.2.0
>  - skytools 2.1.7
>  - postgresql
>  
> It has two main functions.
>  - bootstrap, which bootstraps all the data from specified columns of a table
>  - play, which processes incoming insert, update and delete events and applies them to hbase.
> Example usage:
> install triggers:
>   ./hbrep.py hbrep.ini install schema1.table1 schema2.table2
> now that future updates are queuing, bootstrap the tables.
>   ./hbrep.py hbrep.ini bootstrap schema1.table1 schema2.table2
> start pgq ticker (this is part of skytools, it manages event queues and sends the events to registered consumers).
>   pgqadm.py pgq.ini ticker
> play our queue consumer to replicate events
>   ./hbrep.py hbrep.ini play schema1.table1 schema2.table2
> more details in the readme.
> feedback and improvements appreciated.
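
As a rough illustration of what the play step in the quoted description does, the sketch below applies insert, update and delete events to a row store. It is an assumption-laden toy: apply_event is a hypothetical name, a plain dict stands in for the HBase table, and none of it reflects the actual hbrep or skytools code.

```python
# Hypothetical sketch (not hbrep code): a dict of row key -> column dict
# stands in for an HBase table.
def apply_event(table, op, key, columns=None):
    if op == "insert":
        # New row: store a copy of the supplied columns.
        table[key] = dict(columns or {})
    elif op == "update":
        # Existing row: overwrite only the columns that changed.
        table.setdefault(key, {}).update(columns or {})
    elif op == "delete":
        # Remove the row; ignore keys that are already gone.
        table.pop(key, None)
    else:
        raise ValueError("unknown event type: %r" % (op,))
    return table


store = {}
apply_event(store, "insert", "row1", {"name": "a", "city": "x"})
apply_event(store, "update", "row1", {"city": "y"})
apply_event(store, "insert", "row2", {"name": "b"})
apply_event(store, "delete", "row2")
```

Events have to be applied in queue order for the result to match the source table, which is why the triggers are installed before the bootstrap in the usage above.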

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.