Posted to user@hbase.apache.org by grashmi13 <ra...@rsystems.com> on 2012/07/11 21:16:09 UTC

Hbase Schema

Hi,

In RDBMS we have multiple DB schemas\oracle user instances.

Similarly, can we have multiple DB schemas in HBase? If yes, can we have
multiple schemas on one hadoop-hbase cluster? If multiple schemas are
possible, how can we define them? Using configuration or programmatically?

Q2: Can we have the same column family name in multiple tables? If yes, does
it impact performance to have the same column family name in multiple tables?

Q3: Sequential keys improve read performance and random keys improve write
performance. Which way should one go?

Q4: What are best practices to improve hadoop+hbase performance?

Q5: When one program is deleting a table while another program is accessing
a row of that table, what would the impact be? Can we have some sort of lock
while reading or while deleting a table?

Q6: As everything in the application is in byte form, what would happen if
the HBase DB and the application use different character sets? Can we sync
both to one particular character set by configuration or programmatically?
-- 
View this message in context: http://old.nabble.com/Hbase-Schema-tp34147582p34147582.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: Hbase Schema

Posted by Doug Meil <do...@explorysmedical.com>.

re:  Q2

Yes, you can have the same column family (CF) name in different tables.
Column family names are embedded in each KeyValue.

See:  http://hbase.apache.org/book.html#regions.arch  for more detail

re:  Q3

It depends on what you need.  A common pattern is using composite keys
where the lead portion represents some natural grouping of data (e.g., a
userid) but that is also hashed to provide distribution across the cluster.
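
A minimal sketch of the composite-key pattern described above, using only
the Python standard library. The function name composite_row_key, the
4-byte MD5 prefix, the zero-byte separator, and the date suffix are all
illustrative assumptions, not part of any HBase API; a real schema would
pick its own hash, prefix length, and trailing fields.

```python
import hashlib


def composite_row_key(user_id: str, suffix: str, prefix_len: int = 4) -> bytes:
    """Build a row key of the form: hash(user_id)[:prefix_len] + user_id + 0x00 + suffix.

    Hypothetical helper for illustration only. The hashed lead portion
    spreads different users across the key space, while all rows for one
    user still share the same prefix and therefore sort next to each other.
    """
    uid = user_id.encode("utf-8")
    # Assumption: a short MD5 prefix is enough to distribute the keys.
    prefix = hashlib.md5(uid).digest()[:prefix_len]
    # Assumption: a zero byte separates the user id from the trailing field.
    return prefix + uid + b"\x00" + suffix.encode("utf-8")


if __name__ == "__main__":
    for day in ("2012-07-11", "2012-07-12"):
        print(composite_row_key("user42", day).hex())
```

Rows for the same userid share the hashed prefix, so a scan over that
prefix still reads them together, while different userids land in
unrelated parts of the key space instead of one hot region.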

re:  Q4

Read the RefGuide! 

http://hbase.apache.org/book.html



