
Posted to user@hbase.apache.org by Gaurav Vashishth <va...@gmail.com> on 2010/01/20 10:43:57 UTC

HBase: Designing Table structure

To store high-volume live data, I have been thinking about how to design the
table structure, and two options came to mind:

1. Can we force the key range that a region server will serve? For example,
if a row with a particular ID comes in, it must go to region server x. If
that is possible, then I could have just one table in the cluster and
dedicated region servers for those ranges.

2. Another approach is to have multiple tables; I would manage this in my
code and insert the records into the appropriate tables.

The table will have only one column family, with fewer than 20 qualifiers.

Please help: what am I missing, or am I totally way off?

Thanks,

Gaurav
-- 
View this message in context: http://old.nabble.com/HBase%3A-Designing-Table-structure-tp27239038p27239038.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: HBase: Designing Table structure

Posted by Gaurav Vashishth <va...@gmail.com>.

Yes, the incoming stream will have a good spread of symbols.

- Gaurav


stack-3 wrote:
> 
> On Wed, Jan 20, 2010 at 7:54 AM, Gaurav Vashishth
> <va...@gmail.com> wrote:
> 
>>
>> > OK, I can go with this approach, but now a little about generating the
>> > hash key values. Basically, for one particular symbol we get a record
>> > from the servers and then keep getting stream updates for that same
>> > symbol; there would be many symbols. Right now I have made the symbol
>> > name the row key, and users will also use this symbol to get the records
>> > from the database.
>> >
>>
> 
> This should be good for spreading the write load across the cluster
> presuming there is a good spread of symbols in the incoming write stream.
> St.Ack
> P.S. Thanks for adding the pointer to context -- previous emails
> 
> 


Re: HBase: Designing Table structure

Posted by stack <st...@duboce.net>.

On Wed, Jan 20, 2010 at 7:54 AM, Gaurav Vashishth <va...@gmail.com> wrote:

>
> > OK, I can go with this approach, but now a little about generating the
> > hash key values. Basically, for one particular symbol we get a record
> > from the servers and then keep getting stream updates for that same
> > symbol; there would be many symbols. Right now I have made the symbol
> > name the row key, and users will also use this symbol to get the records
> > from the database.
> >
>

This should be good for spreading the write load across the cluster
presuming there is a good spread of symbols in the incoming write stream.
St.Ack
P.S. Thanks for adding the pointer to context -- previous emails
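
The "good spread of symbols" condition above can be checked on a sample of the incoming stream before settling on the key. The following is only a rough sketch under assumptions: the split points and the sample symbols are made up for illustration, and the helper names `region_for` and `writes_per_region` are hypothetical, not part of any HBase API. It treats each region as a key range whose start key is inclusive, and it compares plain ASCII strings, which matches byte-wise ordering only for ASCII keys.

```python
from bisect import bisect_right
from collections import Counter


def region_for(row_key, split_points):
    """Return the index of the key range holding row_key.

    split_points must be sorted. A key equal to a split point belongs to
    the range that starts at that split point (start key inclusive).
    """
    return bisect_right(split_points, row_key)


def writes_per_region(row_keys, split_points):
    """Count how many of the given writes land in each key range."""
    counts = Counter(region_for(key, split_points) for key in row_keys)
    return [counts.get(i, 0) for i in range(len(split_points) + 1)]


# Hypothetical region boundaries and a made-up sample of the write stream.
split_points = ["G", "N", "T"]
stream = ["AAPL", "GOOG", "MSFT", "ORCL", "TSLA", "AAPL", "IBM", "XOM"]

for index, count in enumerate(writes_per_region(stream, split_points)):
    print(f"range {index}: {count} writes")
```

If one range collects most of the writes in a realistic sample, the symbol alone is probably not spreading the load, and a different key design is worth considering.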

Re: HBase: Designing Table structure

Posted by Gaurav Vashishth <va...@gmail.com>.


stack-3 wrote:
> 
> On Wed, Jan 20, 2010 at 1:43 AM, Gaurav Vashishth
> <va...@gmail.com> wrote:
> 
>>
>> To store high-volume live data, I have been thinking about how to design
>> the table structure, and two options came to mind:
>>
> 
> Because the incoming stream rate is so high, you want it spread across all
> servers?
> 
Yes, the rate is very high, around 50K records/sec. I want to spread it across
servers so that we can minimize the lock time, since users will also be reading
the data simultaneously. I might be mixing in RDBMS concepts here.
> 
> 
> 
>>
>> 1. Can we force the key range that a region server will serve? For example,
>> if a row with a particular ID comes in, it must go to region server x. If
>> that is possible, then I could have just one table in the cluster and
>> dedicated region servers for those ranges.
>>
> Why do you want to do the above?  What happens if the server dies?
> 
I thought of this as a way to minimize the lock time on tables. We will also
have replicas of this for exceptional cases.
> 
>> 2. Another approach is to have multiple tables; I would manage this in my
>> code and insert the records into the appropriate tables.
>>
>> The table will have only one column family, with fewer than 20 qualifiers.
>>
> 
> 
> You could do this.
> 
> Or have one table and design the key so that the incoming stream of writes
> is spread across the cluster, e.g. by hashing key values?
> 
> How are you going to access the data?  That's also an important input when
> designing keys.
> 
OK, I can go with this approach, but now a little about generating the hash
key values. Basically, for one particular symbol we get a record from the
servers and then keep getting stream updates for that same symbol; there
would be many symbols. Right now I have made the symbol name the row key,
and users will also use this symbol to get the records from the database.
> 
> 
> St.Ack
> 
>>
>> Please help: what am I missing, or am I totally way off?
>>
>> Thanks,
>>
>> Gaurav
>> --
>> View this message in context:
>> http://old.nabble.com/HBase%3A-Designing-Table-structure-tp27239038p27239038.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
> 
> 


Re: HBase: Designing Table structure

Posted by stack <st...@duboce.net>.

On Wed, Jan 20, 2010 at 1:43 AM, Gaurav Vashishth <va...@gmail.com> wrote:

>
> To store high-volume live data, I have been thinking about how to design
> the table structure, and two options came to mind:
>

Because the incoming stream rate is so high, you want it spread across all
servers?


>
> 1. Can we force the key range that a region server will serve? For example,
> if a row with a particular ID comes in, it must go to region server x. If
> that is possible, then I could have just one table in the cluster and
> dedicated region servers for those ranges.
>
Why do you want to do the above?  What happens if the server dies?



> 2. Another approach is to have multiple tables; I would manage this in my
> code and insert the records into the appropriate tables.
>
> The table will have only one column family, with fewer than 20 qualifiers.
>


You could do this.

Or have one table and design the key so that the incoming stream of writes
is spread across the cluster, e.g. by hashing key values?

How are you going to access the data?  That's also an important input when
designing keys.
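
One way to read the "hash key values" suggestion is to prefix each row key with a short hash of the natural key. The sketch below is an illustration under assumptions, not code from this thread: the function name `hashed_row_key`, the choice of MD5, the 4-character prefix, and the sample symbols are all hypothetical. Because the prefix is derived from the natural key alone, a reader who knows that key can recompute the full row key for a point lookup. The trade-off is that rows no longer sort by the natural key, so a scan over a range of natural keys stops being contiguous, which is why the access pattern matters.

```python
import hashlib

# Hypothetical choice: number of hex characters of the hash to prepend.
PREFIX_LEN = 4


def hashed_row_key(natural_key: str) -> str:
    """Build a row key as '<hash prefix>-<natural key>'.

    The prefix depends only on the natural key, so writers and readers
    compute the same row key independently.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    return f"{digest[:PREFIX_LEN]}-{natural_key}"


# Made-up sample keys; neighbouring natural keys get unrelated prefixes.
for key in ["AAPL", "AAPM", "GOOG"]:
    print(key, "->", hashed_row_key(key))
```

The natural key is kept as a suffix so the original value can still be read back from the row key without a separate lookup.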

St.Ack

>
> Please help: what am I missing, or am I totally way off?
>
> Thanks,
>
> Gaurav
> --
> View this message in context:
> http://old.nabble.com/HBase%3A-Designing-Table-structure-tp27239038p27239038.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>