Posted to user@hbase.apache.org by Yossi Ittach <yo...@gmail.com> on 2008/10/27 14:33:56 UTC

HBase RegionServer and Hadoop : Basic Question

Hi All

I've looked at http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture but I
couldn't get a sure answer. This is a basic question, but it will clarify
a lot of things for me, so your answer is appreciated :)

When I enter 256 MB of data into a table, is the data itself stored on
Hadoop DFS (256 MB), while the RegionServer *MapFile* (in the memcache)
contains the reference to the location of the row "war and peace"?

Does a RegionSplit occur when the MapFile is bigger than 256 MB (default), or
when the *content* stored at a specific Hadoop DFS location is greater than
256 MB?

Thanks!

Vale et me ama
Yossi

RE: HBase RegionServer and Hadoop : Basic Question

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
Answers in-line below:
> -----Original Message-----
> From: Yossi Ittach [mailto:yossale@gmail.com]
> Sent: Monday, October 27, 2008 6:34 AM
> To: hbase-user@hadoop.apache.org
> Subject: HBase RegionServer and Hadoop : Basic Question
>
> Hi All
>
> I've looked at http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture but I
> couldn't get a sure answer. This is a basic question, but it will clarify
> a lot of things for me, so your answer is appreciated :)
>
> When I enter 256 MB of data into a table, is the data itself stored on
> Hadoop DFS (256 MB), while the RegionServer *MapFile* (in the memcache)
> contains the reference to the location of the row "war and peace"?

No. When an update is received, it is first written to a write-ahead log
(HLog) and is then stored in the memcache. When the memcache reaches a
configurable size (default is 64 MB), it is written to a MapFile.
When the number of MapFiles exceeds a threshold, they are compacted into
one or two MapFiles.
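
To make that sequence concrete, here is a toy Python model of the write
path for a single column family. It is only a sketch, not HBase code: the
class name ToyStore, its fields, and the compaction threshold of 3 are
assumptions made for illustration, and compaction is simplified to always
produce one file.

```python
from dataclasses import dataclass, field

MB = 1024 * 1024


@dataclass
class ToyStore:
    """Toy model of one column family's write path (hypothetical, not HBase code)."""

    flush_size: int = 64 * MB       # memcache flush size quoted in the reply
    compaction_threshold: int = 3   # assumed; the reply only says "a threshold"
    hlog: list = field(default_factory=list)      # write-ahead log entries
    memcache: list = field(default_factory=list)  # updates buffered in memory
    memcache_bytes: int = 0
    mapfiles: list = field(default_factory=list)  # sizes of flushed files, in bytes

    def put(self, row: str, value: bytes) -> None:
        self.hlog.append((row, value))       # 1. write to the log first
        self.memcache.append((row, value))   # 2. then buffer in memory
        self.memcache_bytes += len(value)
        if self.memcache_bytes >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        # 3. the full memcache becomes one new MapFile
        self.mapfiles.append(self.memcache_bytes)
        self.memcache.clear()
        self.memcache_bytes = 0
        if len(self.mapfiles) > self.compaction_threshold:
            self.compact()

    def compact(self) -> None:
        # 4. merge the accumulated MapFiles (simplified: always into one)
        self.mapfiles = [sum(self.mapfiles)]
```

Note that in this model the data itself ends up in the MapFiles; the
memcache only holds updates that have not been flushed yet.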

> Does a RegionSplit occur when the MapFile is bigger than 256 MB (default),
> or when the *content* stored at a specific Hadoop DFS location is greater
> than 256 MB?

A region split occurs when the aggregate MapFile size for a single
column family exceeds 256 MB (also configurable).
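
The rule can be sketched as a small check. This is only an illustration
under assumptions: the function name needs_split and the dictionary layout
are hypothetical, and real on-disk sizes will differ from the raw data
size.

```python
MB = 1024 * 1024


def needs_split(mapfile_sizes_by_family, max_filesize=256 * MB):
    """Return True if any single column family's MapFiles together exceed max_filesize.

    mapfile_sizes_by_family maps a column family name to a list of
    MapFile sizes in bytes (hypothetical layout for this sketch).
    """
    return any(
        sum(sizes) > max_filesize
        for sizes in mapfile_sizes_by_family.values()
    )


# Four 64 MB flushes total exactly 256 MB, which does not exceed the limit.
four_flushes = needs_split({"content": [64 * MB] * 4})        # False
# A fifth flush pushes the family's aggregate to 320 MB.
five_flushes = needs_split({"content": [64 * MB] * 5})        # True
# Sizes are summed per column family, not across the whole region.
two_families = needs_split({"a": [200 * MB], "b": [100 * MB]})  # False
```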
