Posted to dev@hbase.apache.org by "Tim Sell (JIRA)" <ji...@apache.org> on 2008/07/30 18:09:31 UTC
[jira] Commented: (HBASE-787) Postgresql to HBase table replication.
[ https://issues.apache.org/jira/browse/HBASE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618408#action_12618408 ]
Tim Sell commented on HBASE-787:
--------------------------------
Ack.. I forgot to mention...
When it does a bootstrap, it has two options for doing so: a copy to file, or a select of the whole table. If you comment out the file for the tables to be dumped to, it will do a select.
The drawback of the copy is that its output is tab-separated, so data that contains tabs will be parsed incorrectly, but it's a lot faster if you know your data is safe.
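
To illustrate the tab problem, here is a minimal sketch of a guard for the copy path. It is not taken from hbrep; the function name `split_row` and the sample rows are made up for illustration, and it assumes the dump file holds raw, unescaped tabs as described above. It splits each line on tabs and refuses any row whose field count does not match the number of columns being bootstrapped.

```python
def split_row(line, expected_columns):
    """Split one tab-separated line; raise if the field count is off.

    Hypothetical helper, not part of hbrep. A field that itself contains
    a tab produces extra fields, which is how the problem shows up.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != expected_columns:
        raise ValueError(
            "expected %d fields, got %d: %r"
            % (expected_columns, len(fields), line)
        )
    return fields


clean = "42\tAlice\tLondon\n"
dirty = "43\tBob\tNew\tYork\n"  # the city value itself contains a tab

clean_fields = split_row(clean, 3)

try:
    split_row(dirty, 3)
    dirty_rejected = False
except ValueError:
    dirty_rejected = True
```

A check like this only catches rows where the extra tab changes the field count; it cannot tell which field the tab belonged to, so falling back to the select path for such tables is still the safer choice.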
> Postgresql to HBase table replication.
> --------------------------------------
>
> Key: HBASE-787
> URL: https://issues.apache.org/jira/browse/HBASE-787
> Project: Hadoop HBase
> Issue Type: New Feature
> Affects Versions: 0.2.0
> Reporter: Tim Sell
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: hbrep-2008.07.30.tar.gz
>
>
> It is useful to have an easy way to replicate data from Postgresql tables to HBase tables.
> I made a simple python tool which does this, called hbrep.
> hbrep is a tool for replicating data from postgresql tables to hbase tables.
> Dependencies:
> - python 2.4
> - hbase 0.2.0
> - skytools 2.1.7
> - postgresql
>
> It has two main functions.
> - bootstrap, which bootstraps all the data from specified columns of a table
> - play, which processes incoming insert, update and delete events and applies them to hbase.
> Example usage:
> install triggers:
> ./hbrep.py hbrep.ini install schema1.table1 schema2.table2
> now that future updates are queuing, bootstrap the tables.
> ./hbrep.py hbrep.ini bootstrap schema1.table1 schema2.table2
> start pgq ticker (this is part of skytools, it manages event queues and sends the events to registered consumers).
> pgqadm.py pgq.ini ticker
> play our queue consumer to replicate events
> ./hbrep.py hbrep.ini play schema1.table1 schema2.table2
> more details in the readme.
> feedback and improvements appreciated.
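
As a rough picture of what the play step described in the issue does, the sketch below applies insert, update and delete events to an in-memory dict standing in for an HBase table. Everything here is an assumption made for illustration: the `apply_event` name, the `"I"`/`"U"`/`"D"` event codes, the tuple layout and the `info:` column names are not hbrep's actual code or the pgq event format.

```python
def apply_event(table, op, key, columns=None):
    """Apply one change event to `table` (row key -> {column: value}).

    Hypothetical event shape: op is "I" (insert), "U" (update) or
    "D" (delete); `columns` holds the changed column values.
    """
    if op in ("I", "U"):
        # Inserts and updates both write the given columns into the row.
        row = table.setdefault(key, {})
        row.update(columns or {})
    elif op == "D":
        # Deleting a row that is already gone is treated as a no-op.
        table.pop(key, None)
    else:
        raise ValueError("unknown event type: %r" % (op,))


table = {}
events = [
    ("I", "1", {"info:name": "Alice"}),
    ("U", "1", {"info:city": "London"}),
    ("I", "2", {"info:name": "Bob"}),
    ("D", "2", None),
]
for op, key, columns in events:
    apply_event(table, op, key, columns)
```

Treating inserts and updates as the same write, and deletes of missing rows as no-ops, means replaying an event twice leaves the table unchanged, which is convenient if a batch of events is ever redelivered.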