Posted to user@hbase.apache.org by Ismaël Mejía <ie...@gmail.com> on 2017/02/22 11:27:39 UTC

Mini announcement HBase connector for Apache Beam

Hello,

Over the last few weeks I have been working on a connector to use HBase in
Apache Beam (the evolution of Google’s Dataflow model), and it was finally
merged today.

https://github.com/apache/beam/tree/master/sdks/java/io/hbase

This message is a call for anyone interested in using HBase with Apache Beam
to try it and report any issues or additional needs not covered by the
current version of the IO. I will be more than happy to address them.

Of course, for the more hardcore HBase contributors, I wouldn’t mind if
some of you guys could take a look and give me feedback on things I can
improve (or, even better, contribute your ideas and improvements to our
community).

Finally, I want to thank you guys for your work on HBase. It was really nice
to use your APIs, and HBaseTestingUtility was a life-saver for testing my
implementation.

Thanks,
Ismaël Mejía

P.S. I don’t know if Solomon Duskis still follows this mailing list, but
just in case: thanks a lot, Solomon. Your ideas for Google’s Cloud Dataflow
had a profound influence on my implementation.

Re: Mini announcement HBase connector for Apache Beam

Posted by Ismaël Mejía <ie...@gmail.com>.
Oh yes, we had to disable it because it was getting stuck (it seemed to run
forever). I still have not found the cause, but at least it now happens on
both Jenkins and locally (before, I hit this bug only 'sometimes' locally),
so it is at least consistently reproducible.

For more details, including the execution log, see
https://issues.apache.org/jira/browse/BEAM-1550

I will keep you updated if I make progress, and of course ideas or
patches are welcome if you have a clearer idea of how to solve it.

Thanks for the interest btw,
Ismaël


On Sat, Feb 25, 2017 at 6:36 PM, Ted Yu <yu...@gmail.com> wrote:

> I saw that HBaseIOTest was disabled.
>
> Just curious, was the test flaky?

Re: Mini announcement HBase connector for Apache Beam

Posted by Ted Yu <yu...@gmail.com>.
I saw that HBaseIOTest was disabled.

Just curious, was the test flaky?


Re: Mini announcement HBase connector for Apache Beam

Posted by Ismaël Mejía <ie...@gmail.com>.
I will for sure. Thanks, Ted.


On Thu, Feb 23, 2017 at 4:00 PM, Ted Yu <yu...@gmail.com> wrote:

> Ismaël:
> Can you post your future questions on the mailing list?
>
> Thanks

Re: Mini announcement HBase connector for Apache Beam

Posted by Ted Yu <yu...@gmail.com>.
Ismaël:
Can you post your future questions on the mailing list?

Thanks


Re: Mini announcement HBase connector for Apache Beam

Posted by Ismaël Mejía <ie...@gmail.com>.
Solomon, it is so great that you answered, thanks a lot. You saved me with the
idea of using Protobuf serialization for the Coders. It was quite tricky to
achieve an API similar to what you guys have in Bigtable, but that helped me
a lot, and I think that after looking at what you did I was able to design an
API that is natural for both HBase users and Beam/Bigtable users.

Thanks again,
Ismaël

P.S. I will eventually jump onto the dev list to ask some pending questions
about future work (subsplits and dynamic work rebalancing), so maybe I will
be bothering you or others on the mailing list again.


On Wed, Feb 22, 2017 at 8:07 PM, Solomon Duskis <sd...@gmail.com> wrote:

> That's an amazing accomplishment, and a great collaboration!  I'm happy and
> humbled that you were able to use the work we did over the last couple of
> years.

Re: Mini announcement HBase connector for Apache Beam

Posted by Solomon Duskis <sd...@gmail.com>.
That's an amazing accomplishment, and a great collaboration!  I'm happy and
humbled that you were able to use the work we did over the last couple of
years.
