Posted to user@hbase.apache.org by John Lilley <jo...@redpoint.net> on 2014/07/08 20:16:47 UTC

Metadata conventions/tools

Greetings!  We are an ISV of ETL/DI/DQ software and want to support connections to HBase for "classic tabular" data like one would store in an RDBMS.  To that end, I am trying to better understand how people typically use HBase to store this type of data.  Hive appears to wrap HBase and provide a metadata layer, and users on @hadoop have commented that Phoenix and Lingual are also used, as well as various home-grown solutions.  But we're really interested in the most common uses and conventions.  Can you comment on what "most people" actually use in production?  We are not really bleeding-edge (other than running in Hadoop, which I suppose makes us bleeding edge), in the sense that our customers tend to come from a server/RDBMS world and are very comfortable with that paradigm.
Thanks,
John


Re: Metadata conventions/tools

Posted by anil gupta <an...@gmail.com>.
Hi John,

If you want to access your HBase data with SQL-like (RDBMS) queries, then
IMO Phoenix is a good option.
I say so because:
1. Phoenix is a top-level Apache project.
2. Phoenix is JDBC compliant.
3. Phoenix is built specifically for HBase.
4. You will get good community support.

Also, compared to Hive, Phoenix will provide a much more seamless
experience with HBase. IMO, the Hive-HBase integration only does very
basic stuff.

We have been using Phoenix for around a year, and the SQL interface turns
out to be pretty handy, as it reduces the learning curve for people
migrating from the RDBMS world to the NoSQL world.
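
For what it's worth, since Phoenix is JDBC compliant, a standard java.sql
client works against it. A minimal sketch follows; the ZooKeeper quorum,
table, and column names are placeholders, and it assumes the Phoenix client
jar is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PhoenixSketch {
    // Phoenix JDBC URLs take the form jdbc:phoenix:<ZooKeeper quorum>.
    static String jdbcUrl(String zkQuorum) {
        return "jdbc:phoenix:" + zkQuorum;
    }

    public static void main(String[] args) throws SQLException {
        // "zk1,zk2,zk3" is a placeholder for a real ZooKeeper quorum;
        // the table and columns are made up for illustration.
        try (Connection conn = DriverManager.getConnection(jdbcUrl("zk1,zk2,zk3"));
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS users "
                       + "(id BIGINT PRIMARY KEY, name VARCHAR)");
            // Phoenix uses UPSERT rather than INSERT.
            stmt.executeUpdate("UPSERT INTO users VALUES (1, 'alice')");
            conn.commit(); // Phoenix connections do not autocommit by default
            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM users")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1) + "\t" + rs.getString(2));
                }
            }
        }
    }
}
```

To people coming from an RDBMS, this looks like any other JDBC driver,
which is exactly why the learning curve is small.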

Obviously, I didn't provide a whole lot of options. Maybe other people
can share their experience with other software. Hope this helps.

~Anil



On Tue, Jul 8, 2014 at 3:04 PM, Esteban Gutierrez <es...@cloudera.com>
wrote:

> Hello John,
>
> I don't think there is a one-size-fits-all answer in HBase for data
> serialization. From my experience I've seen that users choose a serializer
> that fits their needs (Avro, Kiji, Gora, etc.) or they use
> Bytes.toBytes.  So what you need to do is to make sure that in your
> application the user can specify an arbitrary serializer/deserializer.
>
> cheers,
> esteban.
>
> --
> Cloudera, Inc.
>
>
>
> On Tue, Jul 8, 2014 at 11:16 AM, John Lilley <jo...@redpoint.net>
> wrote:
>
> > Greetings!  We are an ISV of ETL/DI/DQ software and desire to support
> > connections to HBase for "classic tabular" data like one would store in
> an
> > RDBMS.  To that end, I am trying to better understand how people
> typically
> > use HBase to store this type of data.  Hive appears to wrap HBase and
> > provide a meta-data layer, and users on @hadoop have commented that
> Phoenix
> > and Lingual are also used, as well as various home-grown solutions.  But
> > we're really interested in the most common uses and conventions.  Can you
> > comment on what "most people" actually use in production?  We are not
> > really bleeding-edge (other than running in Hadoop, which I suppose makes
> > us bleeding edge), in the sense that our customers tend to come from a
> > server/RDBMS world and are very comfortable with that paradigm.
> > Thanks,
> > John
> >
> >
>



-- 
Thanks & Regards,
Anil Gupta

Re: Metadata conventions/tools

Posted by Esteban Gutierrez <es...@cloudera.com>.
Hello John,

I don't think there is a one-size-fits-all answer in HBase for data
serialization. From my experience I've seen that users choose a serializer
that fits their needs (Avro, Kiji, Gora, etc.) or they use
Bytes.toBytes.  So what you need to do is to make sure that in your
application the user can specify an arbitrary serializer/deserializer.

cheers,
esteban.

--
Cloudera, Inc.
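

The pluggable serializer/deserializer Esteban describes can be sketched as
a small interface. All names below are hypothetical; the real Bytes.toBytes
lives in org.apache.hadoop.hbase.util, but this sketch uses only the JDK so
it stands alone:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical pluggable serializer interface: the application codes
// against this, and the user picks the implementation (Bytes.toBytes
// semantics, Avro, etc.) per column.
interface CellSerializer<T> {
    byte[] serialize(T value);
    T deserialize(byte[] bytes);
}

// Big-endian encoding of a long, similar in spirit to Bytes.toBytes(long).
class LongSerializer implements CellSerializer<Long> {
    public byte[] serialize(Long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }
    public Long deserialize(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}

// UTF-8 encoding of a string, similar in spirit to Bytes.toBytes(String).
class StringSerializer implements CellSerializer<String> {
    public byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }
    public String deserialize(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

An ETL tool could then map each source column to one of these serializers
before writing the resulting byte[] values into HBase cells.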



On Tue, Jul 8, 2014 at 11:16 AM, John Lilley <jo...@redpoint.net>
wrote:

> Greetings!  We are an ISV of ETL/DI/DQ software and desire to support
> connections to HBase for "classic tabular" data like one would store in an
> RDBMS.  To that end, I am trying to better understand how people typically
> use HBase to store this type of data.  Hive appears to wrap HBase and
> provide a meta-data layer, and users on @hadoop have commented that Phoenix
> and Lingual are also used, as well as various home-grown solutions.  But
> we're really interested in the most common uses and conventions.  Can you
> comment on what "most people" actually use in production?  We are not
> really bleeding-edge (other than running in Hadoop, which I suppose makes
> us bleeding edge), in the sense that our customers tend to come from a
> server/RDBMS world and are very comfortable with that paradigm.
> Thanks,
> John
>
>