Posted to dev@hbase.apache.org by Cosmin Lehene <cl...@adobe.com> on 2008/09/17 14:08:29 UTC

Hbase reads an entire 64MB HDFS block over network when reading a single value?

Hi,

Does HBase read the entire file over the network (from HDFS) when doing a get
operation, or is it able to read just a smaller data segment?

I got down to HStoreFile's HBaseReader, which does a MapFile.open, but I can't
really figure out what happens next...



Thanks,
Cosmin


Re: Hbase reads an entire 64MB HDFS block over network when reading a single value?

Posted by stack <st...@duboce.net>.
No.  To get a random entry from a store file, the MapFile index is
used to seek to the location of the asked-for key in the remote file
in the filesystem.  See the MapFile#get function in Hadoop.  DFSClient
manages the work.  There are certainly inefficiencies when the
fetched value is mere bytes but the HDFS block size is 64MB, but it is not
the case that the full block is pulled client-side to extract the wanted
values (nor is the complete file pulled).
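
To make the index-then-seek idea concrete, here is a minimal Python sketch.
It is a toy model under stated assumptions, not Hadoop's MapFile code: the
record layout, the INDEX_INTERVAL value, and the names write_records,
CountingReader and get are all hypothetical. It only shows how a sparse
in-memory index lets a lookup seek close to the key and read a handful of
records instead of the whole file.

```python
import bisect
import io
import struct

INDEX_INTERVAL = 4  # hypothetical: index every 4th key


def write_records(records):
    """Serialize sorted (key, value) byte pairs.

    Returns the data bytes plus a sparse index: a list of (key, offset)
    entries, one for every INDEX_INTERVAL-th record.
    """
    buf = io.BytesIO()
    index = []
    for i, (key, value) in enumerate(records):
        if i % INDEX_INTERVAL == 0:
            index.append((key, buf.tell()))
        # Each record: 4-byte key length, 4-byte value length, key, value.
        buf.write(struct.pack(">II", len(key), len(value)))
        buf.write(key)
        buf.write(value)
    return buf.getvalue(), index


class CountingReader:
    """Stand-in for a remote file: counts the bytes actually read."""

    def __init__(self, data):
        self._f = io.BytesIO(data)
        self.bytes_read = 0

    def seek(self, offset):
        self._f.seek(offset)

    def read(self, n):
        chunk = self._f.read(n)
        self.bytes_read += len(chunk)
        return chunk


def get(reader, index, key):
    """Return the value stored for key, or None if it is absent."""
    index_keys = [k for k, _ in index]
    # Find the last indexed key that is <= the wanted key.
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first record
    reader.seek(index[slot][1])
    # Scan forward from the indexed position until we reach or pass the key.
    while True:
        header = reader.read(8)
        if len(header) < 8:
            return None  # end of data
        key_len, value_len = struct.unpack(">II", header)
        current_key = reader.read(key_len)
        value = reader.read(value_len)
        if current_key == key:
            return value
        if current_key > key:
            return None  # passed the spot where the key would be


# Build 1000 sorted records and look one up.
records = [(b"row%04d" % i, b"value-%04d" % i) for i in range(1000)]
data, index = write_records(records)

reader = CountingReader(data)
found = get(reader, index, b"row0503")
print(found, reader.bytes_read, "of", len(data), "bytes read")
```

In this model a lookup reads at most INDEX_INTERVAL records after the seek,
so the cost depends on the index spacing rather than on the file or block
size.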

It's a bit of a hairy ride trying to hold on once you break below the surface
of the (H)DFS client. If you are about to deep-dive and want some company,
feel free to keep the discussion going up here on this list.

Yours,
St.Ack
