Posted to hdfs-user@hadoop.apache.org by Robert Dyer <ps...@gmail.com> on 2012/08/29 06:20:28 UTC

HBase and MapReduce data locality

I have been reading up on HBase and my understanding is that the
physical files on HDFS are split first by region and then by
column family.

Thus each column family has its own physical file (on a per-region basis).

If I run a MapReduce task that uses HBase as input, wouldn't this
imply that if the task reads from more than 1 column family, the data
for that row might not be (entirely) local to the task?

Is there a way to tell HDFS to keep the blocks of each region's column
families together?
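
For concreteness, here is roughly what I mean, as a sketch using the
TableMapper API (the table name "mytable" and the families cf1/cf2 are
just placeholders):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
  import org.apache.hadoop.hbase.mapreduce.TableMapper;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.mapreduce.Job;

  public class TwoFamilyScanJob {

    // Each map() call sees one row's Result, which can carry cells from
    // both families even though they live in separate store files.
    static class RowMapper extends TableMapper<NullWritable, NullWritable> {
      @Override
      protected void map(ImmutableBytesWritable row, Result value,
                         Context context) {
        // e.g. value.getValue(Bytes.toBytes("cf1"), qualifier) ...
      }
    }

    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      Job job = new Job(conf, "two-family-scan");
      job.setJarByClass(TwoFamilyScanJob.class);

      Scan scan = new Scan();
      scan.addFamily(Bytes.toBytes("cf1"));  // placeholder family names
      scan.addFamily(Bytes.toBytes("cf2"));
      scan.setCaching(500);        // fewer RPC round trips per task
      scan.setCacheBlocks(false);  // don't churn the block cache

      // Splits (and so task placement) follow region locations.
      TableMapReduceUtil.initTableMapperJob("mytable", scan, RowMapper.class,
          NullWritable.class, NullWritable.class, job);
      job.setNumReduceTasks(0);
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }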

Re: HBase and MapReduce data locality

Posted by Robert Dyer <rd...@iastate.edu>.
Ah thanks for that link.  I missed it while browsing the docs.  The
link from there to this blog post

  http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html

really answers my questions! :-)

On Wed, Aug 29, 2012 at 2:38 AM, N Keywal <nk...@gmail.com> wrote:
> Inline. Just a set of "you're right" :-).
> It's documented here:
> http://hbase.apache.org/book.html#regions.arch.locality
>
> On Wed, Aug 29, 2012 at 8:06 AM, Robert Dyer <rd...@iastate.edu> wrote:
>>
>> Ok, but does that imply that only 1 of your compute nodes is guaranteed
>> to have all of the data for any given row?  The blocks will replicate,
>> but they don't necessarily all replicate to the same nodes, right?
>
>
> Right.
>
>>
>> So if I have, say, 2 column families (cf1, cf2) and there are 2 physical
>> files on HDFS for those (per region), then those files are created
>> on one datanode (dn1), which will have all blocks local to that node.
>
>
> Yes. Nit: datanodes don't "see" files, only blocks. But the logic remains
> the same.
>
>>
>> Once it replicates those blocks 2 more times by default, isn't it
>> possible the blocks for cf1 will go to dn2, dn3 while the blocks for
>> cf2 go to dn4, dn5?
>
>
> Yes, it's possible (and even likely).

Re: HBase and MapReduce data locality

Posted by N Keywal <nk...@gmail.com>.
Inline. Just a set of "you're right" :-).
It's documented here:
http://hbase.apache.org/book.html#regions.arch.locality

On Wed, Aug 29, 2012 at 8:06 AM, Robert Dyer <rd...@iastate.edu> wrote:

> Ok, but does that imply that only 1 of your compute nodes is guaranteed
> to have all of the data for any given row?  The blocks will replicate,
> but they don't necessarily all replicate to the same nodes, right?
>

Right.


> So if I have, say, 2 column families (cf1, cf2) and there are 2 physical
> files on HDFS for those (per region), then those files are created
> on one datanode (dn1), which will have all blocks local to that node.
>

Yes. Nit: datanodes don't "see" files, only blocks. But the logic remains
the same.


> Once it replicates those blocks 2 more times by default, isn't it
> possible the blocks for cf1 will go to dn2, dn3 while the blocks for
> cf2 go to dn4, dn5?
>

Yes, it's possible (and even likely).
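
You can check this on a live cluster. Here is a rough sketch against the
plain HDFS API; pass it the paths of one store file per family (e.g. the
files under /hbase/<table>/<region>/cf1 and the matching cf2 directory;
the exact paths depend on your layout) and compare the host lists block
by block:

  import java.util.Arrays;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ReplicaHosts {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      for (String arg : args) {
        FileStatus st = fs.getFileStatus(new Path(arg));
        // One BlockLocation per block; getHosts() lists the datanodes
        // holding a replica of that block.
        for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
          System.out.println(arg + " @" + b.getOffset() + " -> "
              + Arrays.toString(b.getHosts()));
        }
      }
    }
  }

Typically the two files share only the regionserver's local node (if
that); the other replicas are placed independently.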

Re: HBase and MapReduce data locality

Posted by Robert Dyer <rd...@iastate.edu>.
Ok, but does that imply that only 1 of your compute nodes is guaranteed
to have all of the data for any given row?  The blocks will replicate,
but they don't necessarily all replicate to the same nodes, right?

So if I have, say, 2 column families (cf1, cf2) and there are 2 physical
files on HDFS for those (per region), then those files are created
on one datanode (dn1), which will have all blocks local to that node.

Once it replicates those blocks 2 more times by default, isn't it
possible the blocks for cf1 will go to dn2, dn3 while the blocks for
cf2 go to dn4, dn5?

On Wed, Aug 29, 2012 at 12:47 AM, N Keywal <nk...@gmail.com> wrote:
> Hi,
>
> Locations are per block (a file is a set of blocks; a block is replicated on
> multiple HDFS datanodes).
> We have locality in HBase because HDFS datanodes are deployed on the same
> box as the HBase regionserver, and HDFS writes one replica of the blocks on
> the datanode on the same machine as the client (i.e., the regionserver, from
> HDFS's point of view).
>
> N.
>
>
>
> On Wed, Aug 29, 2012 at 6:20 AM, Robert Dyer <ps...@gmail.com> wrote:
>>
>> I have been reading up on HBase and my understanding is that the
>> physical files on HDFS are split first by region and then by
>> column family.
>>
>> Thus each column family has its own physical file (on a per-region basis).
>>
>> If I run a MapReduce task that uses HBase as input, wouldn't this
>> imply that if the task reads from more than 1 column family, the data
>> for that row might not be (entirely) local to the task?
>>
>> Is there a way to tell HDFS to keep the blocks of each region's column
>> families together?
>
>



-- 

Robert Dyer
rdyer@iastate.edu

Re: HBase and MapReduce data locality

Posted by N Keywal <nk...@gmail.com>.
Hi,

Locations are per block (a file is a set of blocks; a block is replicated
on multiple HDFS datanodes).
We have locality in HBase because HDFS datanodes are deployed on the same
box as the HBase regionserver, and HDFS writes one replica of the blocks on
the datanode on the same machine as the client (i.e., the regionserver,
from HDFS's point of view).
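
You can verify that first-replica behaviour from any box that runs a
datanode with a quick probe like this (a minimal sketch; the path is
arbitrary and nothing here is HBase-specific):

  import java.net.InetAddress;
  import java.util.Arrays;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.BlockLocation;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class LocalReplicaProbe {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      Path probe = new Path("/tmp/locality-probe");

      // Write from this machine; if a datanode runs here, HDFS puts the
      // first replica of every block on it.
      FSDataOutputStream out = fs.create(probe, true);
      out.write(new byte[4 * 1024 * 1024]);
      out.close();

      String me = InetAddress.getLocalHost().getHostName();
      FileStatus st = fs.getFileStatus(probe);
      for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
        // Hostname matching may need the FQDN depending on your setup.
        boolean local = Arrays.asList(b.getHosts()).contains(me);
        System.out.println(Arrays.toString(b.getHosts())
            + (local ? "   <- one replica is local" : ""));
      }
      fs.delete(probe, false);
    }
  }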

N.


On Wed, Aug 29, 2012 at 6:20 AM, Robert Dyer <ps...@gmail.com> wrote:

> I have been reading up on HBase and my understanding is that the
> physical files on HDFS are split first by region and then by
> column family.
>
> Thus each column family has its own physical file (on a per-region basis).
>
> If I run a MapReduce task that uses HBase as input, wouldn't this
> imply that if the task reads from more than 1 column family, the data
> for that row might not be (entirely) local to the task?
>
> Is there a way to tell HDFS to keep the blocks of each region's column
> families together?
>
