Posted to user@hbase.apache.org by Dalia Sobhy <da...@hotmail.com> on 2012/01/25 16:01:08 UTC

Important Question

Dear all,
I am developing an API for medical use, i.e. hospital admissions and everything about patients, so transactions, queries, and real-time data are all important here.
Both real-time and analytical processing are a must.
Which best suits my application: HBase, Hive, or another method?
Please reply quickly because this is critical. Thanks a million ;)

Re: Important Question

Posted by Ioan Eugen Stan <st...@gmail.com>.
On 25.01.2012 18:30, Dalia Sobhy wrote:
> So what about HBQL?
> And if I had complex queries, would I get stuck with HBase?

HBQL seems to be unmaintained. Its last update appears to have been in
January 2011, a year ago.

>
> Also can anyone provide me with examples of a table in RDBMS transformed into hbase, realtime query and analytical processing..
>

-- 
Ioan Eugen Stan
http://ieugen.blogspot.com

RE: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
So maybe you are all right; I found HBase really complex.
So what are the other alternatives, given that I am already using Hadoop as my backend system?

Kindly check Apixio, a similar medical system that adopts Hadoop, and reply.

Because this concerns part of my thesis.

Thanks all for your sincere help :)

> From: doug.meil@explorysmedical.com
> To: user@hbase.apache.org
> Subject: Re: Important Question
> Date: Wed, 25 Jan 2012 21:23:13 +0000
> 
> 
> Hi there-
> 
> As someone who works with medical data I take such analysis very
> seriously, but according to the World Health Organization there were 608
> cases of measles reported in Egypt in 2011 (page 82).  Granted, these are
> probably incidence and not prevalence statistics, but the order of
> magnitude of data in your use-case is relatively small.
> 
> www.who.int/whosis/whostat/EN_WHS2011_Full.pdf
> 
> 
> Should you be considering something like MySQL?  Or Microsoft Access?  Or
> a spreadsheet?
> 
> One of the things that the overview points out...
> 
> http://hbase.apache.org/book.html#arch.overview
> 
> 
> ... is that HBase is really useful when you have a *lot* of data, but is
> also serious overkill and over-complexity if you don't.  I'm saying this
> because I'd like to support your epidemiological research, and also
> because I'd like to prevent you from having a bad HBase experience
> especially when the use-case doesn't seem to warrant it.
> 
> Doug
> 

Re: Important Question

Posted by Stephen Boesch <ja...@gmail.com>.
Dalia,
 your requirements appear to be transaction oriented, so OLTP systems
(i.e. regular relational databases) are more likely to be suitable than a
Hive/Hadoop-based solution.  Hive is aimed more at business intelligence and
certainly involves latencies that, given your mention of 'realtime', would
likely not be acceptable for your application.

stephenb

2012/1/25 Dalia Sobhy <da...@hotmail.com>

>  I will explain to u more Mike.
>
> I am building a Software Oriented Architecture, I want my API to provide
> some services such as Add/Delete Patients, Search for a patient by name/ID,
> count the number of people who are suffering from measles in Alexandria
> Egypt.
>
> Something like that so I am wondering which best suits my API ??
>

RE: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
What about Pig?
Please check this and tell me your opinions:
http://hstreaming.com/docs/developer-guide/pig/

> To: dalia.mohsobhy@hotmail.com
> CC: user@hbase.apache.org
> Subject: RE: Important Question
> From: mspreitz@us.ibm.com
> Date: Wed, 25 Jan 2012 16:28:42 -0500
> 
> A bit more grist for our mill: what transaction rate do you need to 
> support?  Are you concerned with a lookup or aggregation query "correctly" 
> including a record that is being concurrently updated?
> 
> Thanks,
> Mike

RE: Important Question

Posted by Mike Spreitzer <ms...@us.ibm.com>.
A bit more grist for our mill: what transaction rate do you need to 
support?  Are you concerned with a lookup or aggregation query "correctly" 
including a record that is being concurrently updated?

Thanks,
Mike

RE: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
Yes that's right!!

> To: dalia.mohsobhy@hotmail.com
> CC: user@hbase.apache.org
> Subject: RE: Important Question
> From: mspreitz@us.ibm.com
> Date: Wed, 25 Jan 2012 16:04:57 -0500
> 
> Just a couple more questions.  Your data will all be in one place, this is 
> not a federated architecture, right?  How much data are we talking about? 
> It sounds like you want to find/create/update/delete individual records 
> and do simple aggregations over records identified by a conjunction of 
> predicates on fields; is that right?
> 
> Thanks,
> Mike (not on the hive mailing list)

Re: Important Question

Posted by Ulrich Staudinger <us...@activequant.com>.
Hey everybody,

with the risk of being flamed and bbqued...
to be absolutely honest, I think the NoSQL approach and with it HBase and
all other alternatives don't fit your use case at all. You have a complex
domain model, where it is very likely that you will want to search through
your domain space by all possible attributes of your domain model.

For example, patient has had diseases, prescriptions, etc. So, to make
access into your data space fast, you want to have indices. You want to
have sorting ascending and descending by all attributes.

And preferably you don't want to have to think about building indexing
logic yourself.

Above all, you want to have  referential integrity in your data space -
patient data is not like wall messages where it really doesn't matter that
much if one in a million is lost because something went awry. So
transactions should be supported.
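
To make the points about indices, referential integrity, and transactions concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made up for illustration; nothing here comes from a real patient schema, and SQLite stands in for whatever relational database would actually be used.

```python
import sqlite3

# In-memory database; schema and names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE patient (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE TABLE diagnosis (
        id         INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(id) ON DELETE CASCADE,
        disease    TEXT NOT NULL
    );
    -- Indices so lookups by name or disease do not scan the whole table.
    CREATE INDEX idx_patient_name ON patient(name);
    CREATE INDEX idx_diagnosis_disease ON diagnosis(disease);
""")


def add_patient(conn, name, city, diseases):
    """Insert a patient and their diagnoses in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO patient (name, city) VALUES (?, ?)", (name, city)
        )
        pid = cur.lastrowid
        conn.executemany(
            "INSERT INTO diagnosis (patient_id, disease) VALUES (?, ?)",
            [(pid, d) for d in diseases],
        )
    return pid


def delete_patient(conn, pid):
    """Delete a patient; the foreign key cascade removes their diagnoses."""
    with conn:
        conn.execute("DELETE FROM patient WHERE id = ?", (pid,))


def count_cases(conn, disease, city):
    """Count distinct patients in a city with a given disease."""
    row = conn.execute(
        """SELECT COUNT(DISTINCT p.id)
           FROM patient p JOIN diagnosis d ON d.patient_id = p.id
           WHERE d.disease = ? AND p.city = ?""",
        (disease, city),
    ).fetchone()
    return row[0]


a = add_patient(conn, "Patient A", "Alexandria", ["measles"])
b = add_patient(conn, "Patient B", "Alexandria", ["measles", "influenza"])
c = add_patient(conn, "Patient C", "Cairo", ["measles"])

measles_in_alexandria = count_cases(conn, "measles", "Alexandria")
delete_patient(conn, b)
after_delete = count_cases(conn, "measles", "Alexandria")
orphans = conn.execute(
    "SELECT COUNT(*) FROM diagnosis WHERE patient_id = ?", (b,)
).fetchone()[0]
```

The add/delete/search/count operations from the question each map to a single statement or a short transaction here, and the database, not the application, keeps the diagnosis rows consistent with the patient rows.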

On top of that, your patient data (not counting MRI scans or CT scans) is
probably not going to be more than 10 MB per patient (if at all) - with 1
million users, you would have something like 10 terabytes of data. With
proper partitioning, you can easily manage that within an average database.
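
The size estimate is simple enough to check in a few lines. This is only a back-of-the-envelope sketch: the 10 MB and 1 million figures are the guesses from the paragraph above, decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) are an assumption, and the function name is made up.

```python
MB = 10**6   # decimal megabyte, in bytes (assumed unit convention)
TB = 10**12  # decimal terabyte, in bytes


def estimate_total_tb(per_patient_mb, patients):
    """Total storage in terabytes for a given per-patient size and patient count."""
    total_bytes = per_patient_mb * MB * patients
    return total_bytes / TB


total_tb = estimate_total_tb(10, 1_000_000)
```

Changing either input shows how quickly the total moves: halving the per-patient size or the patient count halves the estimate.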

But I may be wrong, and I look forward to hearing other opinions.


cheers,
ulrich





On Wed, Jan 25, 2012 at 10:04 PM, Mike Spreitzer <ms...@us.ibm.com>wrote:

> Just a couple more questions.  Your data will all be in one place, this is
> not a federated architecture, right?  How much data are we talking about?
> It sounds like you want to find/create/update/delete individual records
> and do simple aggregations over records identified by a conjunction of
> predicates on fields; is that right?
>
> Thanks,
> Mike (not on the hive mailing list)




-- 
Ulrich Staudinger

http://www.activequant.com
Connect online: https://www.xing.com/profile/Ulrich_Staudinger

RE: Important Question

Posted by Mike Spreitzer <ms...@us.ibm.com>.
Just a couple more questions.  Your data will all be in one place, this is 
not a federated architecture, right?  How much data are we talking about? 
It sounds like you want to find/create/update/delete individual records 
and do simple aggregations over records identified by a conjunction of 
predicates on fields; is that right?

Thanks,
Mike (not on the hive mailing list)

Re: Important Question

Posted by Doug Meil <do...@explorysmedical.com>.
Hi there-

As someone who works with medical data I take such analysis very
seriously, but according to the World Health Organization there were 608
cases of measles reported in Egypt in 2011 (page 82).  Granted, these are
probably incidence and not prevalence statistics, but the order of
magnitude of data in your use-case is relatively small.

www.who.int/whosis/whostat/EN_WHS2011_Full.pdf


Should you be considering something like MySQL?  Or Microsoft Access?  Or
a spreadsheet?

One of the things that the overview points out...

http://hbase.apache.org/book.html#arch.overview


... is that HBase is really useful when you have a *lot* of data, but is
also serious overkill and over-complexity if you don't.  I'm saying this
because I'd like to support your epidemiological research, and also
because I'd like to prevent you from having a bad HBase experience
especially when the use-case doesn't seem to warrant it.

Doug

On 1/25/12 3:56 PM, "Dalia Sobhy" <da...@hotmail.com> wrote:

>
>I will explain to u more Mike.
>I am building a Software Oriented Architecture, I want my API to provide
>some services such as Add/Delete Patients, Search for a patient by
>name/ID, count the number of people who are suffering from measles in
>Alexandria Egypt.
>Something like that so I am wondering which best suits my API ??
>



RE: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
I will explain more, Mike.
I am building a Software Oriented Architecture. I want my API to provide services such as adding/deleting patients, searching for a patient by name or ID, and counting the number of people suffering from measles in Alexandria, Egypt.
Something like that, so I am wondering which best suits my API?

> To: dalia.mohsobhy@hotmail.com
> CC: user@hbase.apache.org; user@hive.apache.org
> Subject: Re: Important Question
> From: mspreitz@us.ibm.com
> Date: Wed, 25 Jan 2012 12:05:39 -0500
> 
> BTW, what do you mean by "realtime"?  Do you mean you want to run some 
> non-trivial query quickly enough for some sort of interactive use?  Can 
> you give us a feel for the sort of queries that interest you?
> 
> Thanks,
> Mike

Re: Important Question

Posted by Mike Spreitzer <ms...@us.ibm.com>.
BTW, what do you mean by "realtime"?  Do you mean you want to run some 
non-trivial query quickly enough for some sort of interactive use?  Can 
you give us a feel for the sort of queries that interest you?

Thanks,
Mike



From:   Dalia Sobhy <da...@hotmail.com>
To:     "user@hbase.apache.org" <us...@hbase.apache.org>
Cc:     "user@hive.apache.org" <us...@hive.apache.org>, 
"user@hbase.apache.org" <us...@hbase.apache.org>
Date:   01/25/2012 11:34 AM
Subject:        Re: Important Question



So what about HBQL??
And if i had complex queries would i get stuck with HBase?

Also can anyone provide me with examples of a table in RDBMS transformed 
into hbase, realtime query and analytical processing..

Sent from my iPhone


Re: Important Question

Posted by Bejoy Ks <be...@yahoo.com>.
Hi Dalia
    A sample DDL would be something like this.

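-- Map the existing HBase table "employee" onto a Hive external table.
-- The Hive columns pair up, in order, with the entries in hbase.columns.mapping.
CREATE EXTERNAL TABLE employee (key string, no string, name string, address map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,no:NUM,name:FIRST,address:")
TBLPROPERTIES ("hbase.table.name" = "employee");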

Points to note:
- If you map an entire HBase Column Family to a Hive column, the Hive column's data type must be a map.
- If the mapping is to an HBase CF:Qualifier, you can use non-collection data types such as STRING.
- The order of the entries in hbase.columns.mapping must correspond to the Hive column order.
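
The three rules above can be spelled out as a small check. This is an illustrative Python sketch only: the function pair_columns is hypothetical, it is not how Hive itself parses the property, and it covers just the cases in the sample DDL (the row key, CF:Qualifier entries, and whole-family entries ending in a colon).

```python
def pair_columns(hive_columns, mapping):
    """Pair Hive (name, type) columns with hbase.columns.mapping entries by position."""
    entries = [e.strip() for e in mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError("mapping must have one entry per Hive column")
    pairs = []
    for (name, hive_type), entry in zip(hive_columns, entries):
        if entry == ":key":
            kind = "row key"
        else:
            family, _, qualifier = entry.partition(":")
            if qualifier == "":
                # "family:" with no qualifier means the whole column family
                kind = "whole column family"
                if not hive_type.lower().startswith("map<"):
                    raise ValueError(f"{name}: a whole column family needs a map type")
            else:
                kind = "single column"
        pairs.append((name, entry, kind))
    return pairs


# Columns and mapping copied from the sample DDL above.
hive_columns = [
    ("key", "string"),
    ("no", "string"),
    ("name", "string"),
    ("address", "map<string,string>"),
]
mapping = ":key,no:NUM,name:FIRST,address:"
pairs = pair_columns(hive_columns, mapping)
```

Reading the result row by row shows the positional pairing: the first Hive column takes the row key, the next two take single qualifiers, and the map column takes the whole address family.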

For complete reference 
https://cwiki.apache.org/Hive/hbaseintegration.html

Hope it helps!...

Regards
Bejoy.K.S



________________________________
 From: Dalia Sobhy <da...@hotmail.com>
To: user@hive.apache.org 
Sent: Thursday, February 2, 2012 3:17 PM
Subject: RE: Important Question
 

 
Hiii Bejoy,

Can you provide me with a simple example of how to mount Hbase table into a Hive Table ??

Thanks,



RE: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
Hi Bejoy,
Can you provide me with a simple example of how to mount an HBase table as a Hive table?
Thanks,


Re: Important Question

Posted by Bejoy Ks <be...@yahoo.com>.
Hi Dalia
    If by complex queries you mean joins across multiple tables and so on, HBase doesn't support joins. To achieve what a multi-table join would give you in an RDBMS, you should, based on your requirements, choose suitable Column Families and Qualifiers in a single HBase table to accommodate those multiple RDBMS tables. I haven't played much with HBQL, but if you are developing an API you can rely on the HBase Java API internally for storage and retrieval of records. In HBase, query (retrieval) time largely depends on how you design the Row Key and Column Families (HBase stores each CF together, and Row Keys are sorted and distributed across regions). If you want SQL-like querying for an HBase table, you have to mount it as a corresponding Hive table.
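
Why the row key matters so much can be seen in a toy model. The sketch below is plain Python, not the HBase API: SortedTable is a made-up stand-in that keeps rows sorted by key, and the "city|disease|patient id" key layout and the "info"/"visit" column families are assumptions chosen for illustration.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []  # row keys, always sorted
        self._rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, cells):
        """Insert or update the cells of one row."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(cells)

    def scan_prefix(self, prefix):
        """Yield (row_key, cells) for keys starting with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later key can match
            yield key, self._rows[key]


table = SortedTable()
# One denormalized row per (city, disease, patient): data that would sit in
# separate RDBMS tables is folded into column families of a single row.
table.put("Alexandria|measles|0001", {"info:name": "Patient A", "visit:ward": "3"})
table.put("Cairo|measles|0002", {"info:name": "Patient C"})
table.put("Alexandria|measles|0003", {"info:name": "Patient B"})
table.put("Alexandria|influenza|0003", {"info:name": "Patient B"})

measles_keys = [key for key, _ in table.scan_prefix("Alexandria|measles|")]
```

With this key, counting measles cases in Alexandria is one contiguous range read. A lookup by patient name gets no help from the key at all and would need a different key design or a second table, which is the trade-off to weigh when choosing the row key.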

    In my personal experience I have used HBase tables for real-time data storage and retrieval in a Hadoop enterprise application. Scheduled MapReduce jobs ran during off-peak hours and dumped the required data (formatted and filtered) from the HBase table into HDFS, and from there Hive consumed the data for analytical purposes. We had a good number of analytical jobs and didn't want to choke the HBase servers during peak hours, so the mining and analytics part was moved completely to Hive.

Regards
Bejoy.K.S



________________________________
 From: Dalia Sobhy <da...@hotmail.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org> 
Cc: "user@hive.apache.org" <us...@hive.apache.org>; "user@hbase.apache.org" <us...@hbase.apache.org> 
Sent: Wednesday, January 25, 2012 10:00 PM
Subject: Re: Important Question
 
So what about HBQL??
And if i had complex queries would i get stuck with HBase?

Also can anyone provide me with examples of a table in RDBMS transformed into hbase, realtime query and analytical processing..

Sent from my iPhone

On 2012-01-25, at 6:15 PM, bejoy_ks@yahoo.com wrote:

> Real Time.. Definitely not hive. Go in for HBase, but don't expect Hbase to be as flexible as RDBMS. You need to choose your Row Key and Column Families wisely as per your requirements.
> For data mining and analytics you can mount Hive table  over corresponding Hbase table and play on with SQL like queries.
> 
> 
> 
> Regards
> Bejoy K S
> 
> -----Original Message-----
> From: Dalia Sobhy <da...@hotmail.com>
> Date: Wed, 25 Jan 2012 17:01:08 
> To: <us...@hbase.apache.org>; <us...@hive.apache.org>
> Reply-To: user@hive.apache.org
> Subject: Important Question
> 
> 
> Dear all,
> I am developing an API for medical use i.e Hospital admissions and all about patients, thus transactions and queries and realtime data is important here...
> Therefore both real-time and analytical processing is a must..
> Therefore which best suits my application Hbase or Hive or another method ??
> Please reply quickly bec this is critical thxxx a million ;)                        

Re: Important Question

Posted by Rohit Kelkar <ro...@gmail.com>.
Dalia,
You mentioned real time. Which of your use cases are real time, and what's
an acceptable response time for them?
You may want to try a combination of SQL and NoSQL: NoSQL to store
your data for analytics purposes and SQL for real time. I am
assuming that your analytics needs would be based on a huge amount of
historical data that does not depend on the data required in
real time. It would be very helpful if you could describe a typical
analytics use case and a typical real-time use case that you want
handled.

- Rohit Kelkar

On Thu, Jan 26, 2012 at 3:43 PM, Dalia Sobhy <da...@hotmail.com> wrote:
> Hii Doug,
>
> How can i talk to you for the Explorsys may be it suits my application ??
>
> Contact me asap..
>
> Sent from my iPhone
>
> On 2012-01-25, at 6:45 PM, "Doug Meil" <do...@explorysmedical.com> wrote:
>
>>
>> Because you specifically cited the medical domain in your question, I
>> think you might want talk to Explorys (disclaimer:  I work there).
>>
>>
>> Otherwise, you probably want to look at the HBase book.
>>
>>
>> On 1/25/12 11:30 AM, "Dalia Sobhy" <da...@hotmail.com> wrote:
>>
>>> So what about HBQL??
>>> And if i had complex queries would i get stuck with HBase?
>>>
>>> Also can anyone provide me with examples of a table in RDBMS transformed
>>> into hbase, realtime query and analytical processing..
>>>
>>> Sent from my iPhone
>>>
>>> On 2012-01-25, at 6:15 PM, bejoy_ks@yahoo.com wrote:
>>>
>>>> Real Time.. Definitely not hive. Go in for HBase, but don't expect
>>>> Hbase to be as flexible as RDBMS. You need to choose your Row Key and
>>>> Column Families wisely as per your requirements.
>>>> For data mining and analytics you can mount Hive table  over
>>>> corresponding Hbase table and play on with SQL like queries.
>>>>
>>>>
>>>>
>>>> Regards
>>>> Bejoy K S
>>>>
>>>> -----Original Message-----
>>>> From: Dalia Sobhy <da...@hotmail.com>
>>>> Date: Wed, 25 Jan 2012 17:01:08
>>>> To: <us...@hbase.apache.org>; <us...@hive.apache.org>
>>>> Reply-To: user@hive.apache.org
>>>> Subject: Important Question
>>>>
>>>>
>>>> Dear all,
>>>> I am developing an API for medical use i.e Hospital admissions and all
>>>> about patients, thus transactions and queries and realtime data is
>>>> important here...
>>>> Therefore both real-time and analytical processing is a must..
>>>> Therefore which best suits my application Hbase or Hive or another
>>>> method ??
>>>> Please reply quickly bec this is critical thxxx a million ;)
>>>>
>>>
>>
>>

Re: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
Hi Doug,

How can I talk to you about Explorys? Maybe it suits my application.

Please contact me ASAP.

Sent from my iPhone

On 2012-01-25, at 6:45 PM, "Doug Meil" <do...@explorysmedical.com> wrote:

> 
> Because you specifically cited the medical domain in your question, I
> think you might want talk to Explorys (disclaimer:  I work there).
> 
> 
> Otherwise, you probably want to look at the HBase book.
> 
> 
> On 1/25/12 11:30 AM, "Dalia Sobhy" <da...@hotmail.com> wrote:
> 
>> So what about HBQL??
>> And if i had complex queries would i get stuck with HBase?
>> 
>> Also can anyone provide me with examples of a table in RDBMS transformed
>> into hbase, realtime query and analytical processing..
>> 
>> Sent from my iPhone
>> 
>> On 2012-01-25, at 6:15 PM, bejoy_ks@yahoo.com wrote:
>> 
>>> Real Time.. Definitely not hive. Go in for HBase, but don't expect
>>> Hbase to be as flexible as RDBMS. You need to choose your Row Key and
>>> Column Families wisely as per your requirements.
>>> For data mining and analytics you can mount Hive table  over
>>> corresponding Hbase table and play on with SQL like queries.
>>> 
>>> 
>>> 
>>> Regards
>>> Bejoy K S
>>> 
>>> -----Original Message-----
>>> From: Dalia Sobhy <da...@hotmail.com>
>>> Date: Wed, 25 Jan 2012 17:01:08
>>> To: <us...@hbase.apache.org>; <us...@hive.apache.org>
>>> Reply-To: user@hive.apache.org
>>> Subject: Important Question
>>> 
>>> 
>>> Dear all,
>>> I am developing an API for medical use i.e Hospital admissions and all
>>> about patients, thus transactions and queries and realtime data is
>>> important here...
>>> Therefore both real-time and analytical processing is a must..
>>> Therefore which best suits my application Hbase or Hive or another
>>> method ??
>>> Please reply quickly bec this is critical thxxx a million ;)
>>> 
>> 
> 
> 

Re: Important Question

Posted by Doug Meil <do...@explorysmedical.com>.
Because you specifically cited the medical domain in your question, I
think you might want to talk to Explorys (disclaimer: I work there).


Otherwise, you probably want to look at the HBase book.


On 1/25/12 11:30 AM, "Dalia Sobhy" <da...@hotmail.com> wrote:

>So what about HBQL??
>And if i had complex queries would i get stuck with HBase?
>
>Also can anyone provide me with examples of a table in RDBMS transformed
>into hbase, realtime query and analytical processing..
>
>Sent from my iPhone
>
>On 2012-01-25, at 6:15 PM, bejoy_ks@yahoo.com wrote:
>
>> Real Time.. Definitely not hive. Go in for HBase, but don't expect
>>Hbase to be as flexible as RDBMS. You need to choose your Row Key and
>>Column Families wisely as per your requirements.
>> For data mining and analytics you can mount Hive table  over
>>corresponding Hbase table and play on with SQL like queries.
>> 
>> 
>> 
>> Regards
>> Bejoy K S
>> 
>> -----Original Message-----
>> From: Dalia Sobhy <da...@hotmail.com>
>> Date: Wed, 25 Jan 2012 17:01:08
>> To: <us...@hbase.apache.org>; <us...@hive.apache.org>
>> Reply-To: user@hive.apache.org
>> Subject: Important Question
>> 
>> 
>> Dear all,
>> I am developing an API for medical use i.e Hospital admissions and all
>>about patients, thus transactions and queries and realtime data is
>>important here...
>> Therefore both real-time and analytical processing is a must..
>> Therefore which best suits my application Hbase or Hive or another
>>method ??
>> Please reply quickly bec this is critical thxxx a million ;)
>>             
>



Re: Important Question

Posted by Dalia Sobhy <da...@hotmail.com>.
So what about HBQL?
And if I had complex queries, would I get stuck with HBase?

Also, can anyone provide me with examples of an RDBMS table transformed into HBase, with real-time queries and analytical processing?

Sent from my iPhone

On 2012-01-25, at 6:15 PM, bejoy_ks@yahoo.com wrote:

> Real Time.. Definitely not hive. Go in for HBase, but don't expect Hbase to be as flexible as RDBMS. You need to choose your Row Key and Column Families wisely as per your requirements.
> For data mining and analytics you can mount Hive table  over corresponding Hbase table and play on with SQL like queries.
> 
> 
> 
> Regards
> Bejoy K S
> 
> -----Original Message-----
> From: Dalia Sobhy <da...@hotmail.com>
> Date: Wed, 25 Jan 2012 17:01:08 
> To: <us...@hbase.apache.org>; <us...@hive.apache.org>
> Reply-To: user@hive.apache.org
> Subject: Important Question
> 
> 
> Dear all,
> I am developing an API for medical use i.e Hospital admissions and all about patients, thus transactions and queries and realtime data is important here...
> Therefore both real-time and analytical processing is a must..
> Therefore which best suits my application Hbase or Hive or another method ??
> Please reply quickly bec this is critical thxxx a million ;)                         

Re: Important Question

Posted by be...@yahoo.com.
Real time: definitely not Hive. Go for HBase, but don't expect HBase to be as flexible as an RDBMS. You need to choose your row key and column families wisely, as per your requirements.
For data mining and analytics you can mount a Hive table over the corresponding HBase table and run SQL-like queries.



Regards
Bejoy K S

-----Original Message-----
From: Dalia Sobhy <da...@hotmail.com>
Date: Wed, 25 Jan 2012 17:01:08 
To: <us...@hbase.apache.org>; <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Important Question


Dear all,
I am developing an API for medical use i.e Hospital admissions and all about patients, thus transactions and queries and realtime data is important here...
Therefore both real-time and analytical processing is a must..
Therefore which best suits my application Hbase or Hive or another method ??
Please reply quickly bec this is critical thxxx a million ;) 		 	   		  

Re: Important Question

Posted by Ioan Eugen Stan <st...@gmail.com>.
On 25.01.2012 17:01, Dalia Sobhy wrote:
>
> Dear all,
> I am developing an API for medical use i.e Hospital admissions and all about patients, thus transactions and queries and realtime data is important here...
> Therefore both real-time and analytical processing is a must..
> Therefore which best suits my application Hbase or Hive or another method ??
> Please reply quickly bec this is critical thxxx a million ;) 		 	   		

HBase does real time. Hive is more batch-oriented.
Please read each project's description.

http://hive.apache.org/
http://hbase.apache.org/

-- 
Ioan Eugen Stan
http://ieugen.blogspot.com