Posted to user@hbase.apache.org by Mridul Muralidharan <mr...@yahoo-inc.com> on 2010/01/08 22:40:52 UTC

FW: Read op question

A colleague is unable to send this mail to the list, so I am proxying it.
Thanks in advance for the responses!


Regards,
Mridul


---


Hi,
I'm trying to better understand the flow of the client read operation in 
HBase.  I've been looking at a combination of the HBase documents, Lars 
George's summary (very nice), the javadocs, and the BigTable paper.

My understanding from Lars and the documentation is that a given record 
maps to a single HRegion and a single Store on that HRegion.  Writes to 
that record are buffered in the MemStore.  When a MemStore is full, it 
is flushed to HDFS as an HFile.

My understanding is that if the record is updated multiple times, these 
updates may be stored in different HFiles.  The BigTable paper mentions 
this specifically, and I infer this from the HBase documentation too.

So my question is, what indexing is present on an HRegion to support a 
read of a single record?  Aside from looking in the MemStore, how do you 
know what HFiles to read?  On opening an HFile, do you scan the whole thing?

Thanks for any details, including pointers to class names.

-Adam

Re: FW: Read op question

Posted by stack <st...@duboce.net>.
On Fri, Jan 8, 2010 at 1:40 PM, Mridul Muralidharan
<mr...@yahoo-inc.com> wrote:

> So my question is, what indexing is present on an HRegion to support a read
> of a single record?  Aside from looking in the MemStore, how do you know
> what HFiles to read?  On opening an HFile, do you scan the whole thing?
>

HFiles have an index.  They are like the SSTable files in the Bigtable paper.
See
http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/io/hfile/HFile.html
for a bit of doc on the hfile format, etc.  They are made of blocks that are
by default 64k in size.  The index, stored at the tail of the hfile, has the
offset of each block and the key that starts that block.  All hfiles are
opened and kept open.  On open, their metadata, including the index, is read
into memory.  A lookup for a particular cell will look in the memstore, and
then in each hfile.  If the wanted cell is outside of the start/end key of the
hfile (start and end keys are part of the metadata), we'll skip the file and
move on to the next.  Otherwise we'll find where to start seeking by doing a
lookup in the index.  We'll find the exact key (not usual) or the key just
before, and then seek and read in the 64k block.  We'll move through the block
until we find (or not) the wanted key.  TODO: add bloomfilters on hfiles so we
can avoid the seek if the wanted key is not present in the file.

St.Ack
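
For readers following the thread, the index lookup described in the reply
above can be sketched roughly as follows.  This is only an illustration under
stated assumptions, not HBase's actual code: the class HFileLookupSketch, the
FileMeta type, and the findBlock method are hypothetical names, keys are
modeled as plain Strings rather than HBase's byte-array keys, and the block
index is modeled as a sorted array of block start keys held in memory.

```java
import java.util.Arrays;

// Hypothetical sketch of the per-file lookup described in the thread.
// None of these names come from the HBase code base.
final class HFileLookupSketch {

    // Stand-in for the metadata read into memory when a file is opened:
    // the start key and offset of each block, plus the file's last key.
    static final class FileMeta {
        final String[] blockStartKeys; // sorted; entry i starts block i
        final long[] blockOffsets;     // byte offset of block i in the file
        final String lastKey;          // last key stored in the file

        FileMeta(String[] blockStartKeys, long[] blockOffsets, String lastKey) {
            this.blockStartKeys = blockStartKeys;
            this.blockOffsets = blockOffsets;
            this.lastKey = lastKey;
        }

        String firstKey() {
            return blockStartKeys[0];
        }
    }

    // Returns the index of the block that could contain `key`, or -1 when
    // the key falls outside the file's start/end keys (the file is skipped).
    static int findBlock(FileMeta f, String key) {
        if (key.compareTo(f.firstKey()) < 0 || key.compareTo(f.lastKey) > 0) {
            return -1;
        }
        int pos = Arrays.binarySearch(f.blockStartKeys, key);
        if (pos >= 0) {
            return pos; // exact match on a block's start key (not usual)
        }
        // binarySearch returns -(insertionPoint) - 1 when the key is absent;
        // the block to read is the one starting just before the key.
        return -pos - 2;
    }

    public static void main(String[] args) {
        FileMeta f = new FileMeta(
                new String[] {"a", "h", "p"},
                new long[] {0L, 65536L, 131072L},
                "z");
        for (String key : new String[] {"k", "h", "zz"}) {
            int block = findBlock(f, key);
            if (block < 0) {
                System.out.println(key + ": outside file range, skip file");
            } else {
                System.out.println(key + ": read block " + block
                        + " at offset " + f.blockOffsets[block]);
            }
        }
    }
}
```

After findBlock picks a block, the reader would still have to seek to that
offset, read the block, and walk through it to find (or not) the wanted key,
as described in the reply.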