Posted to dev@rocketmq.apache.org by Von Gosling <vo...@apache.org> on 2018/06/04 06:15:29 UTC

[GSoC]Apache Beam and HBase Integration with RocketMQ

Hi,

I would like to split this topic from the AMQP project; we can discuss it in the current thread :-)

Best Regards,
Von Gosling



> On June 4, 2018, at 11:30, Xin Wang <da...@gmail.com> wrote:
> 
> You can just follow your plan, starting the work for integrating RocketMQ


Re: [GSoC]Apache Beam and HBase Integration with RocketMQ

Posted by Sergio Esteves <sr...@gmail.com>.
Hi,

Do you have any suggestions on how to track HBase updates in a way that is
less intrusive than accessing the WAL on HDFS directly?

Thanks.

On Thu, Jun 7, 2018 at 3:33 PM, Sergio Esteves <sr...@gmail.com> wrote:

> OK, I am starting with the connector, which will rely on an abstract event
> listener. I can also work on the Beam integration in parallel; just let me
> know how you want to proceed.
>
> Best,
> Sergio.
>
> On Thu, Jun 7, 2018 at 10:19 AM, Von Gosling <vo...@apache.org>
> wrote:
>
>> IMO, co-processors are seldom used in production, considering their heavy
>> cost and security problems. For simplicity, you could design and implement
>> the connector for HBase.
>>
>> BTW, do we plan to raise the priority of Beam Integration?
>>
>>
>> Best Regards,
>> Von Gosling
>>
>> On June 5, 2018, at 21:41, Sergio Esteves <sr...@gmail.com> wrote:
>>
>> I have been studying the MySQL plugin that keeps a database in sync with
>> a RocketMQ topic.
>> The first way that comes to mind to listen for HBase updates on a table
>> is to use co-processors (akin to BigTable Observers):
>> https://blogs.apache.org/hbase/entry/coprocessor_introduction
>> With co-processors it is possible to insert hooks for data manipulation
>> events (e.g., put, delete). It is also possible to intercept writes to
>> the WAL (the Write-Ahead Log, which records updates to a table).
>> The co-processors reside on the server side and are distributed
>> automatically alongside HBase region servers. Custom co-processor
>> implementations require changes to the HBase configuration. Without
>> co-processors, it would be possible to read the WAL directly on HDFS
>> (provided WAL writing is not disabled).
>>
>> These are the two main ways (co-processors or reading the WAL directly) I
>> can see to listen for HBase table updates. Using co-processors seems the
>> most natural way to go: for example, implementing a WALObserver that keeps
>> track of the last position replicated to the RocketMQ topic. Do you
>> agree?
>>
>>
>>
>

Re: [GSoC]Apache Beam and HBase Integration with RocketMQ

Posted by Sergio Esteves <sr...@gmail.com>.
OK, I am starting with the connector, which will rely on an abstract event
listener. I can also work on the Beam integration in parallel; just let me
know how you want to proceed.
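
To make the "abstract event listener" idea a bit more concrete, here is a
minimal sketch in plain Java. It is only an illustration under assumptions:
RowChange, ChangeListener, MessageSink and ForwardingListener are hypothetical
names invented for this example, the topic name is made up, and MessageSink
merely stands in for whatever RocketMQ producer the connector ends up using.
No HBase or RocketMQ API is called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical change event; the fields are illustrative, not an HBase type.
final class RowChange {
    enum Kind { PUT, DELETE }

    final Kind kind;
    final String table;
    final String rowKey;

    RowChange(Kind kind, String table, String rowKey) {
        this.kind = kind;
        this.table = table;
        this.rowKey = rowKey;
    }
}

// The connector depends only on this abstraction, so the source of events
// (co-processor hook, WAL reader, or something else) can be swapped later.
abstract class ChangeListener {
    abstract void onChange(RowChange change);
}

// Assumed stand-in for a message producer; not a RocketMQ interface.
interface MessageSink {
    void send(String topic, String body);
}

// One possible listener: forwards every change to a single topic.
final class ForwardingListener extends ChangeListener {
    private final MessageSink sink;
    private final String topic;

    ForwardingListener(MessageSink sink, String topic) {
        this.sink = sink;
        this.topic = topic;
    }

    @Override
    void onChange(RowChange change) {
        sink.send(topic, change.kind + ":" + change.table + ":" + change.rowKey);
    }
}

final class ListenerSketch {
    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        // The sink here just records messages in memory.
        ChangeListener listener =
            new ForwardingListener((topic, body) -> sent.add(topic + " <- " + body),
                                   "hbase-changes");
        listener.onChange(new RowChange(RowChange.Kind.PUT, "users", "row-1"));
        listener.onChange(new RowChange(RowChange.Kind.DELETE, "users", "row-2"));
        sent.forEach(System.out::println);
    }
}
```

The point of the abstract class is that the decision about how events are
captured can stay open while the rest of the connector is written.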

Best,
Sergio.

On Thu, Jun 7, 2018 at 10:19 AM, Von Gosling <vo...@apache.org> wrote:

> IMO, co-processors are seldom used in production, considering their heavy
> cost and security problems. For simplicity, you could design and implement
> the connector for HBase.
>
> BTW, do we plan to raise the priority of Beam Integration?
>
>
> Best Regards,
> Von Gosling
>
> On June 5, 2018, at 21:41, Sergio Esteves <sr...@gmail.com> wrote:
>
> I have been studying the MySQL plugin that keeps a database in sync with
> a RocketMQ topic.
> The first way that comes to mind to listen for HBase updates on a table
> is to use co-processors (akin to BigTable Observers):
> https://blogs.apache.org/hbase/entry/coprocessor_introduction
> With co-processors it is possible to insert hooks for data manipulation
> events (e.g., put, delete). It is also possible to intercept writes to
> the WAL (the Write-Ahead Log, which records updates to a table).
> The co-processors reside on the server side and are distributed
> automatically alongside HBase region servers. Custom co-processor
> implementations require changes to the HBase configuration. Without
> co-processors, it would be possible to read the WAL directly on HDFS
> (provided WAL writing is not disabled).
>
> These are the two main ways (co-processors or reading the WAL directly) I
> can see to listen for HBase table updates. Using co-processors seems the
> most natural way to go: for example, implementing a WALObserver that keeps
> track of the last position replicated to the RocketMQ topic. Do you
> agree?
>
>
>

Re: [GSoC]Apache Beam and HBase Integration with RocketMQ

Posted by Von Gosling <vo...@apache.org>.
IMO, co-processors are seldom used in production, considering their heavy cost and security problems. For simplicity, you could design and implement the connector for HBase.

BTW, do we plan to raise the priority of Beam Integration?


Best Regards,
Von Gosling

> On June 5, 2018, at 21:41, Sergio Esteves <sr...@gmail.com> wrote:
> 
> I have been studying the MySQL plugin that keeps a database in sync with a RocketMQ topic.
> The first way that comes to mind to listen for HBase updates on a table is to use co-processors (akin to BigTable Observers): https://blogs.apache.org/hbase/entry/coprocessor_introduction
> With co-processors it is possible to insert hooks for data manipulation events (e.g., put, delete). It is also possible to intercept writes to the WAL (the Write-Ahead Log, which records updates to a table). The co-processors reside on the server side and are distributed automatically alongside HBase region servers. Custom co-processor implementations require changes to the HBase configuration. Without co-processors, it would be possible to read the WAL directly on HDFS (provided WAL writing is not disabled).
> 
> These are the two main ways (co-processors or reading the WAL directly) I can see to listen for HBase table updates. Using co-processors seems the most natural way to go: for example, implementing a WALObserver that keeps track of the last position replicated to the RocketMQ topic. Do you agree?


Re: [GSoC]Apache Beam and HBase Integration with RocketMQ

Posted by Sergio Esteves <sr...@gmail.com>.
Hi,

The architecture overview of the replicator would be something like
this: https://goo.gl/FfnL4S

Shall we discuss the design of this plugin on Hangouts (or similar)?

Thanks!



On Tue, Jun 5, 2018 at 2:41 PM, Sergio Esteves <sr...@gmail.com> wrote:

> Hi,
>
> I have been studying the MySQL plugin that keeps a database in sync with
> a RocketMQ topic.
> The first way that comes to mind to listen for HBase updates on a table
> is to use co-processors (akin to BigTable Observers):
> https://blogs.apache.org/hbase/entry/coprocessor_introduction
> With co-processors it is possible to insert hooks for data manipulation
> events (e.g., put, delete). It is also possible to intercept writes to
> the WAL (the Write-Ahead Log, which records updates to a table).
> The co-processors reside on the server side and are distributed
> automatically alongside HBase region servers. Custom co-processor
> implementations require changes to the HBase configuration. Without
> co-processors, it would be possible to read the WAL directly on HDFS
> (provided WAL writing is not disabled).
>
> These are the two main ways (co-processors or reading the WAL directly) I
> can see to listen for HBase table updates. Using co-processors seems the
> most natural way to go: for example, implementing a WALObserver that keeps
> track of the last position replicated to the RocketMQ topic. Do you
> agree?
>
> Thanks!
>
>
>
> On Mon, Jun 4, 2018 at 7:15 AM, Von Gosling <vo...@apache.org> wrote:
>
>> Hi,
>>
>> I would like to split this topic from the AMQP project; we can discuss it
>> in the current thread :-)
>>
>> Best Regards,
>> Von Gosling
>>
>>
>>
>> On June 4, 2018, at 11:30, Xin Wang <da...@gmail.com> wrote:
>>
>> You can just follow your plan, starting the work for integrating RocketMQ
>>
>>
>>
>

Re: [GSoC]Apache Beam and HBase Integration with RocketMQ

Posted by Sergio Esteves <sr...@gmail.com>.
Hi,

I have been studying the MySQL plugin that keeps a database in sync with a
RocketMQ topic.
The first way that comes to mind to listen for HBase updates on a table is
to use co-processors (akin to BigTable Observers):
https://blogs.apache.org/hbase/entry/coprocessor_introduction
With co-processors it is possible to insert hooks for data manipulation
events (e.g., put, delete). It is also possible to intercept writes to the
WAL (the Write-Ahead Log, which records updates to a table).
The co-processors reside on the server side and are distributed
automatically alongside HBase region servers. Custom co-processor
implementations require changes to the HBase configuration. Without
co-processors, it would be possible to read the WAL directly on HDFS
(provided WAL writing is not disabled).

These are the two main ways (co-processors or reading the WAL directly) I
can see to listen for HBase table updates. Using co-processors seems the
most natural way to go: for example, implementing a WALObserver that keeps
track of the last position replicated to the RocketMQ topic. Do you
agree?
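
As a rough illustration of the "keep track of the last replicated position"
part, here is a minimal sketch in plain Java. It deliberately does not use the
HBase WALObserver API or a RocketMQ producer: LogEntry and
ReplicationCheckpoint are hypothetical names, the sequence id is an assumed
stand-in for a WAL position, and entries are assumed to arrive in ascending
sequence order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one WAL entry; not an HBase type.
final class LogEntry {
    final long sequenceId;
    final String payload;

    LogEntry(long sequenceId, String payload) {
        this.sequenceId = sequenceId;
        this.payload = payload;
    }
}

// Remembers the highest sequence id already forwarded, so entries that are
// seen again (for example after a restart) are skipped.
final class ReplicationCheckpoint {
    private long lastSeq = -1L;

    // Returns the payloads forwarded in this pass. A real connector would
    // publish each payload to the topic where it is added to the list.
    List<String> replicate(List<LogEntry> entries) {
        List<String> forwarded = new ArrayList<>();
        for (LogEntry entry : entries) {
            if (entry.sequenceId <= lastSeq) {
                continue; // already replicated
            }
            forwarded.add(entry.payload);
            lastSeq = entry.sequenceId;
        }
        return forwarded;
    }

    long lastSeq() {
        return lastSeq;
    }
}

final class CheckpointSketch {
    public static void main(String[] args) {
        ReplicationCheckpoint checkpoint = new ReplicationCheckpoint();
        System.out.println(checkpoint.replicate(Arrays.asList(
            new LogEntry(1, "put row-1"), new LogEntry(2, "put row-2"))));
        // Entry 2 is seen again here and is skipped.
        System.out.println(checkpoint.replicate(Arrays.asList(
            new LogEntry(2, "put row-2"), new LogEntry(3, "delete row-1"))));
    }
}
```

In this sketch the checkpoint lives only in memory; a real replicator would
need to persist it somewhere to survive restarts, which is left open here.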

Thanks!



On Mon, Jun 4, 2018 at 7:15 AM, Von Gosling <vo...@apache.org> wrote:

> Hi,
>
> I would like to split this topic from the AMQP project; we can discuss it
> in the current thread :-)
>
> Best Regards,
> Von Gosling
>
>
>
> On June 4, 2018, at 11:30, Xin Wang <da...@gmail.com> wrote:
>
> You can just follow your plan, starting the work for integrating RocketMQ
>
>
>