Posted to user@hbase.apache.org by Marc Limotte <ms...@gmail.com> on 2009/06/16 18:49:20 UTC

Data Access Patterns on top of HBase

I'm wondering what sorts of data access patterns (if any) people are using
on top of HBase?

For example, is a DAO pattern applicable?  It seemed reasonable at first, but
based on my limited experience, I see a couple of mismatches:

1) Generally, if the DAO is backed by an RDBMS, all the columns are stored
together, so it's no problem to read them all in and create a light Data
Transfer Object.  In HBase, however, it probably only makes sense to read in
complete column families rather than the complete "row".  On the other hand,
some lazy-loading variations might be appropriate.

2) Also, again based on my limited HBase experience, it seems more common to
want to operate on ranges of data (either a scan or an update) rather than
on individual entities.
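
To make the "lazy variation" in point 1 concrete, here is a minimal sketch,
assuming a hypothetical in-memory table in place of a real HBase client.
FAKE_TABLE, fetch_family, and LazyEntity are illustrative names, not part of
any HBase API. The entity reads a column family only when it is first asked
for, then caches it:

```python
from typing import Callable, Dict

FamilyData = Dict[str, bytes]

# Hypothetical in-memory stand-in for a table:
# row key -> column family -> qualifier -> value.
FAKE_TABLE: Dict[str, Dict[str, FamilyData]] = {
    "user1": {
        "attribute": {"name": b"Alice", "city": b"Boston"},
        "visits": {"2009-06-15": b"3", "2009-06-16": b"5"},
    }
}

fetch_log = []  # records every (row, family) read so the laziness is visible


def fetch_family(row_key: str, family: str) -> FamilyData:
    """Stand-in for a single-family read against the store (an assumption)."""
    fetch_log.append((row_key, family))
    return dict(FAKE_TABLE[row_key].get(family, {}))


class LazyEntity:
    """Loads each column family on first access and caches the result."""

    def __init__(self, row_key: str, fetch: Callable[[str, str], FamilyData]):
        self.row_key = row_key
        self._fetch = fetch
        self._families: Dict[str, FamilyData] = {}

    def family(self, name: str) -> FamilyData:
        if name not in self._families:
            self._families[name] = self._fetch(self.row_key, name)
        return self._families[name]


user = LazyEntity("user1", fetch_family)
name = user.family("attribute")["name"]  # first access triggers one read
user.family("attribute")                 # second access is served from the cache
```

The "visits" family is never read unless something asks for it, which is the
property a whole-row Data Transfer Object would lose.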


I suppose I could make a DAO pattern work, but I'm really looking for some
feedback on people's real world experiences... has this pattern proved
useful for people?  Are there other patterns that are more applicable?

Marc

Re: Data Access Patterns on top of HBase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Marc,

At openplaces.org we use an ActiveRecord-like component to access HBase
rows, so I guess that fits the DAO pattern. One thing that differs from a
typical RDBMS is that you can treat column families as lists or maps, or you
can have accessors on individual qualifiers inside a family. For example, we
always have an "attribute:" family that holds your typical class properties.
The other families are then used to hold other kinds of "special" data.
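
A rough sketch of that idea, and not the actual openplaces.org component:
the Record class, its method names, and the sample "tags" family below are
all hypothetical. Qualifiers in the "attribute" family are exposed as plain
properties, while any other family can be viewed as a map or as a list:

```python
from typing import Dict, List

Families = Dict[str, Dict[str, str]]


class Record:
    """ActiveRecord-like wrapper over one row's column families."""

    def __init__(self, row_key: str, families: Families):
        self.row_key = row_key
        self._families = families

    def __getattr__(self, name: str) -> str:
        # Called only when normal attribute lookup fails, so real fields
        # such as row_key never reach this method.
        attrs = self.__dict__.get("_families", {}).get("attribute", {})
        if name in attrs:
            return attrs[name]
        raise AttributeError(name)

    def family_as_map(self, family: str) -> Dict[str, str]:
        """View a family as qualifier -> value."""
        return dict(self._families.get(family, {}))

    def family_as_list(self, family: str) -> List[str]:
        """View a family as a list of values, ordered by qualifier."""
        cells = self._families.get(family, {})
        return [cells[q] for q in sorted(cells)]


place = Record(
    "place1",
    {
        "attribute": {"name": "Montreal", "country": "Canada"},
        "tags": {"b": "city", "a": "canada"},
    },
)
```

Here place.name reads from the "attribute" family, and
place.family_as_list("tags") treats the "tags" family as an ordered list.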

When we operate on ranges of data in MapReduce jobs, we still "wrap"
RowResults into domain objects.
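
As an illustration only, wrapping each row of a range scan could look like
the sketch below. It uses a plain dictionary as the table; scan, Place, and
places_in_range are hypothetical names, and the scan mimics HBase's
convention of an inclusive start row and an exclusive stop row:

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple

Families = Dict[str, Dict[str, str]]


@dataclass
class Place:
    """Domain object built from one scanned row."""
    row_key: str
    name: str


def scan(table: Dict[str, Families], start: str, stop: str) -> Iterator[Tuple[str, Families]]:
    """Yield rows in key order, start inclusive and stop exclusive."""
    for key in sorted(table):
        if start <= key < stop:
            yield key, table[key]


def places_in_range(table: Dict[str, Families], start: str, stop: str) -> Iterator[Place]:
    """Wrap each raw row from the scan in a Place as it streams by."""
    for key, families in scan(table, start, stop):
        yield Place(row_key=key, name=families["attribute"]["name"])


TABLE = {
    "place:003": {"attribute": {"name": "Toronto"}},
    "place:001": {"attribute": {"name": "Montreal"}},
    "place:002": {"attribute": {"name": "Quebec"}},
}

places = list(places_in_range(TABLE, "place:001", "place:003"))
```

Because places_in_range is a generator, only one row is wrapped at a time,
which suits jobs that walk a large key range.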

I hope this helps,

J-D
