Posted to user@hbase.apache.org by Aryeh Berkowitz <ar...@iswcorp.com> on 2009/12/17 14:48:34 UTC

HBase Schema

I have a question about schema creation. If my understanding is correct, HBase is a NoSQL database, which Wikipedia describes as "Data stores that may not require fixed table schemas, and usually avoid join operations." Coming from a relational background, I find it challenging to wrap my mind around this, but I think I understand it. For example, if I have a person and want to store information about all the cars that he owns, in a relational database I would make two tables, a person table and a cars table, something like this:
Person Table

ID  Name   Address
1   Frank  1234 Main St.

Cars Table

Person_ID  Year  Make    Model
1          2001  Toyota  Camry
1          2004  Mazda   Protégé
1          2003  Nissan  Sentra


In HBase, I'm thinking of putting everything in one row of one table, since columns can be created on the fly:

Person & Cars

Person Family         Cars Family
Name   Address        Car1:Year  Car1:Make  Car1:Model  Car2:Year  Car2:Make  Car2:Model  Car3:Year  Car3:Make  Car3:Model
Frank  1234 Main St.  2001       Toyota     Camry       2004       Mazda      Protégé     2003       Nissan     Sentra


This type of model seems a little harder for programmers to work with, though. Am I on the right track? I would be very interested to hear your thoughts on this matter.
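
To make the single-row idea concrete, here is a rough sketch of how the layout might look to a programmer. It uses plain Python dictionaries as a stand-in for the table and does not call the HBase client API at all. The names `table`, `put`, and `cars_for` are made up for illustration, and treating "Car1:Year" as a qualifier inside a "cars" family is my assumption about how the columns above would map onto families.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for one table (not the HBase API):
# row key -> column family -> qualifier -> value.
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, qualifier, value):
    """Store one cell; the qualifier does not have to be declared in advance."""
    table[row_key][family][qualifier] = value


def cars_for(row_key):
    """Regroup the flat 'CarN:Field' qualifiers into one dict per car."""
    cars = defaultdict(dict)
    for qualifier, value in table[row_key]["cars"].items():
        car_id, field = qualifier.split(":", 1)
        cars[car_id][field] = value
    # Plain string sort: "Car10" would sort before "Car2".
    return [cars[car_id] for car_id in sorted(cars)]


# Frank's row, keyed by the ID from the relational example.
put("1", "person", "Name", "Frank")
put("1", "person", "Address", "1234 Main St.")

owned = [
    ("2001", "Toyota", "Camry"),
    ("2004", "Mazda", "Protégé"),
    ("2003", "Nissan", "Sentra"),
]
for n, (year, make, model) in enumerate(owned, start=1):
    put("1", "cars", f"Car{n}:Year", year)
    put("1", "cars", f"Car{n}:Make", make)
    put("1", "cars", f"Car{n}:Model", model)

frank_cars = cars_for("1")
for car in frank_cars:
    print(car["Year"], car["Make"], car["Model"])
```

The extra work shows up in `cars_for`: the application has to parse the column names to rebuild each car, which is the part that a join would have done in the relational version.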

Thanks!

RE: HBase Schema

Posted by Aryeh Berkowitz <ar...@iswcorp.com>.
Sorry about the format; it was supposed to come out as a table.

-----Original Message-----
From: Aryeh Berkowitz [mailto:aryeh@iswcorp.com] 
Sent: Thursday, December 17, 2009 8:49 AM
To: hbase-user@hadoop.apache.org
Subject: HBase Schema
