Posted to user@hbase.apache.org by Florin P <fl...@yahoo.com> on 2011/07/01 16:09:30 UTC

Some questions about HBase Architecture

Hello!
  I've read the HBase architecture chapter of the book
http://hbase.apache.org/book.html#architecture (HBA)
and compared it with HBase: The Definitive Guide (HBDG)
http://ofps.oreilly.com/titles/9781449396107/architecture.html
Some questions came up:
1. How many MemStores can a Region have?
  HBDG: "A HRegion also has a MemStore"
  HBA: "A Store hosts a MemStore". A Store corresponds to a column family for a table for a given region.
2. How many HLog instances are created per RegionServer?
   HBDG:"A HRegion also has [...] a HLog instance"
   HBA: "[...]and there is one HLog instance per RegionServer. "
3. After reading the HBA, I've concluded (please correct me if I'm wrong) that these are the relationships:
   a) A RegionServer has one HLog instance
   b) A RegionServer has one .META. table that holds meta information about many HTables
   c) An HTable can be split into many Regions
   d) An HTable can have many column families
   e) A column family has one Store
   f) A Store can have zero or more HFile instances
   g) A Store has one MemStore
   h) A column family can have zero or many columns
I look forward to your opinions and answers, and please add anything I have missed.
Thank you.
 Regards,
  Florin


Re: Some questions about HBase Architecture

Posted by Lars George <la...@gmail.com>.
BTW, here is the current version: http://ofps.oreilly.com/titles/9781449396107/architecture.html#archstorage

Please add feedback online too :)



Re: Some questions about HBase Architecture

Posted by Lars George <la...@gmail.com>.
Hi Florin,

Note that the version you read was very old. I have updated that chapter over
the last 3-4 days. Answers inline...

> 1. How many MemStores can a Region have?
>  HBDG: "A HRegion also has a MemStore"
>  HBA: "A Store hosts a MemStore". A Store corresponds to a column family
> for a table for a given region.
>

Each Store has a MemStore.


> 2. How many HLog instances are created per RegionServer?
>   HBDG:"A HRegion also has [...] a HLog instance"
>   HBA: "[...]and there is one HLog instance per RegionServer. "
>

The HRegion holds a reference to the shared HLog. I had already updated that
in the newer version.


> 3. After reading the HBA, I've concluded (please correct me if I'm wrong)
> that these are the relationships:
>   a) A RegionServer has one HLog instance
>

Yes, and it is shared with the HRegion instances.


>   b) A RegionServer has one .META. table that holds meta information about
> many HTables
>

Well, one region server "hosts" the .META. region. Basically, .META. is
just another table, but it is treated specially. It only has one region (for
now), and that region lives on one server. Which server that is, is arbitrary.


>   c) An HTable can be split into many Regions
>

Yes.


>   d) An HTable can have many column families
>

Yes.


>   e) A column family has one Store
>

Yes.


>   f) A Store can have zero or more HFile instances
>

Yes, wrapped into StoreFile instances.


>   g) A Store has one MemStore
>

It has one, yes.


>   h) A column family can have zero or many columns
>

Well, the CF is just schema. The actual columns come from the data stored in
each row. So if you define a CF, a directory is created for it on disk, but
if you do not store any data, then no data resides on disk.
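
To make the relationships in a) through h) concrete, here is a minimal Java
sketch of the object graph. All class names (ArchitectureSketch, Log,
FlushedFile, Store, Region, RegionServer) are hypothetical stand-ins made up
for illustration and are not the real HBase classes. The sketch only assumes
what is stated above: one log per region server shared by its regions, one
store per column family per region, and one in-memory buffer plus zero or
more flushed files per store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model for illustration only; NOT the real HBase classes.
final class ArchitectureSketch {

    /** Stand-in for the HLog: one per region server, shared by its regions. */
    static final class Log {
        final List<String> edits = new ArrayList<>();
    }

    /** Stand-in for a StoreFile wrapping an HFile: a sorted snapshot of cells. */
    static final class FlushedFile {
        final Map<String, String> cells;
        FlushedFile(Map<String, String> cells) {
            this.cells = new TreeMap<>(cells); // copy, so later clears do not affect it
        }
    }

    /** One store per column family per region: one MemStore, zero or more files. */
    static final class Store {
        final TreeMap<String, String> memStore = new TreeMap<>();
        final List<FlushedFile> files = new ArrayList<>();

        /** Moves the buffered cells into a new file; does nothing when empty. */
        void flush() {
            if (memStore.isEmpty()) return;
            files.add(new FlushedFile(memStore));
            memStore.clear();
        }
    }

    /** A region holds a reference to the shared log and one store per family. */
    static final class Region {
        final Log sharedLog;
        final Map<String, Store> storesByFamily = new LinkedHashMap<>();

        Region(Log sharedLog, List<String> families) {
            this.sharedLog = sharedLog;
            for (String family : families) storesByFamily.put(family, new Store());
        }

        /** Families are fixed by the schema; qualifiers (columns) come from the data. */
        void put(String row, String family, String qualifier, String value) {
            Store store = storesByFamily.get(family);
            if (store == null) {
                throw new IllegalArgumentException("unknown column family: " + family);
            }
            sharedLog.edits.add(row + "/" + family + ":" + qualifier + "=" + value);
            store.memStore.put(row + "/" + qualifier, value);
        }
    }

    /** A region server owns exactly one log and hands it to every region it opens. */
    static final class RegionServer {
        final Log log = new Log();
        final List<Region> regions = new ArrayList<>();

        Region openRegion(List<String> families) {
            Region region = new Region(log, families);
            regions.add(region);
            return region;
        }
    }

    public static void main(String[] args) {
        RegionServer server = new RegionServer();
        Region r1 = server.openRegion(Arrays.asList("info", "data"));
        Region r2 = server.openRegion(Arrays.asList("info", "data"));

        r1.put("row1", "info", "name", "Florin");
        r2.put("row9", "data", "payload", "abc");

        System.out.println("regions share one log: " + (r1.sharedLog == r2.sharedLog));
        System.out.println("edits in shared log: " + server.log.edits.size());

        Store info = r1.storesByFamily.get("info");
        System.out.println("files before flush: " + info.files.size());
        info.flush();
        System.out.println("files after flush: " + info.files.size());
    }
}
```

Note that a store for a family that never receives data keeps an empty buffer
and an empty file list, which mirrors the point in h): defining the column
family alone does not put any data on disk.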


> I look forward to your opinions and answers, and please add anything I
> have missed.
>

Hope that helps. Sorry for leaving the old info in that section. I had
added a disclaimer a few days ago saying that I am still overhauling this
chapter. It was written for 0.20.x! But as I said, I just finished the
overhaul and added much more to it.

Cheers,
Lars



> Thank you.
>  Regards,
>   Florin
>
>

Re: Some questions about HBase Architecture

Posted by Stack <st...@duboce.net>.
@Lars:

Looks like Florin has some feedback for you!

St.Ack


Re: Some questions about HBase Architecture

Posted by Ted Yu <yu...@gmail.com>.
Since Doug maintains HBA actively, you can trust what it says.

For item 3, most of your observations are correct except for b):
Only one RegionServer hosts the .META. table at any given time.
