Posted to user@hbase.apache.org by David Koch <og...@googlemail.com> on 2012/07/17 22:39:21 UTC

Hbase joins using MultiTableInputCollection [HBASE-3996]

Hello,

I came across this ticket for multiple table scans and their use in
Map/Reduce jobs:

https://issues.apache.org/jira/browse/HBASE-3996
https://reviews.apache.org/r/4411/diff/7/

A patch is now available, and the comments mention that the functionality
could be useful for doing joins as part of a Map/Reduce job. Could someone
briefly explain how this works? I am interested in doing joins between
2 tables on rowkeys.

If I append both tables to the newly added MultiTableInputCollection
instance and use that in a Map/Reduce job, would map(<rowkey>, <value>)
be called only once per unique <rowkey>, with <value> containing 2 value
sets if the key was found in both tables?

If there are any practical examples of doing joins on HBase tables, I'd
appreciate a link. Also, I am using HBase client 0.90.6-cdh3u4. Is the
patch applicable to this version of HBase at all?

Thank you,

/David
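
Note on the join question above: a minimal sketch, in plain Java with no
HBase or Hadoop dependency, of the reduce-side join pattern commonly used
for joining two inputs on a shared key. It assumes (this is not confirmed
anywhere in the thread, nor taken from the HBASE-3996 patch) that rows from
each table reach the map step separately, so each map call sees one row
from one table, and the pairing happens in the reduce step after grouping
by row key. All names here (RowKeyJoinSketch, Tagged, mapTable, reduce,
join, the table labels "A" and "B") are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RowKeyJoinSketch {

    /** One map-output record: the source table's label plus the row's value. */
    static final class Tagged {
        final String table;
        final String value;

        Tagged(String table, String value) {
            this.table = table;
            this.value = value;
        }
    }

    /** "Map" step: visits each row of ONE table, tags it, and groups it by row key. */
    static void mapTable(String table, Map<String, String> rows,
                         Map<String, List<Tagged>> shuffle) {
        for (Map.Entry<String, String> row : rows.entrySet()) {
            shuffle.computeIfAbsent(row.getKey(), k -> new ArrayList<>())
                   .add(new Tagged(table, row.getValue()));
        }
    }

    /** "Reduce" step: for each row key, emits a joined value only if both tables had the key. */
    static Map<String, String> reduce(Map<String, List<Tagged>> shuffle,
                                      String leftTable, String rightTable) {
        Map<String, String> joined = new TreeMap<>();
        for (Map.Entry<String, List<Tagged>> group : shuffle.entrySet()) {
            String left = null;
            String right = null;
            for (Tagged t : group.getValue()) {
                if (t.table.equals(leftTable)) {
                    left = t.value;
                } else if (t.table.equals(rightTable)) {
                    right = t.value;
                }
            }
            if (left != null && right != null) {
                joined.put(group.getKey(), left + "|" + right);
            }
        }
        return joined;
    }

    /** Inner join of two in-memory "tables" (row key -> value) on row key. */
    public static Map<String, String> join(Map<String, String> a, Map<String, String> b) {
        Map<String, List<Tagged>> shuffle = new TreeMap<>();
        mapTable("A", a, shuffle);
        mapTable("B", b, shuffle);
        return reduce(shuffle, "A", "B");
    }

    public static void main(String[] args) {
        Map<String, String> users = new LinkedHashMap<>();
        users.put("row1", "alice");
        users.put("row2", "bob");

        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("row2", "order-17");
        orders.put("row3", "order-42");

        // Only row2 exists in both tables.
        System.out.println(join(users, orders)); // prints {row2=bob|order-17}
    }
}
```

In a real job the tag would typically come from whatever identifies the
source table of the current input split, and the in-memory shuffle map
would be replaced by the framework's own grouping of map output by key.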

Re: Hbase joins using MultiTableInputCollection [HBASE-3996]

Posted by David Koch <og...@googlemail.com>.
Hi Ted,

Thank you for your reply. You are right, the ticket has not been closed
yet. At this point, I am mainly trying to understand how
MultiTableInputCollection can be used to do joins between HBase tables,
if possible with an example.

Thanks,

/David

On Tue, Jul 17, 2012 at 11:07 PM, Ted Yu <yu...@gmail.com> wrote:

> If my memory is correct, there are a few items that Stack pointed out which
> are still outstanding for HBASE-3996
> (https://issues.apache.org/jira/browse/HBASE-3996).
>
> Cheers

Re: Hbase joins using MultiTableInputCollection [HBASE-3996]

Posted by Ted Yu <yu...@gmail.com>.
If my memory is correct, there are a few items that Stack pointed out which
are still outstanding for HBASE-3996
(https://issues.apache.org/jira/browse/HBASE-3996).

Cheers
