Posted to user@hbase.apache.org by "Pamecha, Abhishek" <ap...@ebay.com> on 2013/02/15 21:37:10 UTC

queries and MR jobs

Hi

Is there a way to partition HDFS [replication factor, say 3] or route requests to specific RS nodes so that:

One set of nodes serves operations like put and get,
Another set of nodes runs MR jobs on the same replicated data set,
And those two sets don't share any nodes?

I mean, if we are replicating and are not equally worried about consistency across all replicas, can we allocate different jobs to different replicas based on each replica's consistency tuning?

I understand that HDFS interleaves replicated blocks across nodes, so we don't have cookie-cutter isolated replicas. That is what makes this question interesting. :)

An underlying question is how a node, out of the 3 holding a block's replicas, gets chosen for a specific request [put/get] or an MR job.

Thanks,
Abhishek
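
[Editorial note: for the read side of the underlying question, HDFS itself decides which replica serves a reader: the NameNode returns the block's locations ordered by network distance to the reader, and the closest one wins. The following is a toy sketch of that distance ranking, not the actual HDFS code; all host and rack names are invented.]

```java
import java.util.Comparator;
import java.util.List;

public class ReplicaChoice {
    // Toy model of how an HDFS reader picks among the 3 replicas of a
    // block: smallest network distance wins (same host < same rack < off-rack).
    static int distance(String readerHost, String readerRack,
                        String replicaHost, String replicaRack) {
        if (readerHost.equals(replicaHost)) return 0; // node-local
        if (readerRack.equals(replicaRack)) return 2; // rack-local
        return 4;                                     // off-rack
    }

    // replicas: [host, rack] pairs; returns the host the reader would use.
    static String pick(String readerHost, String readerRack,
                       List<String[]> replicas) {
        return replicas.stream()
                .min(Comparator.comparingInt(
                        r -> distance(readerHost, readerRack, r[0], r[1])))
                .orElseThrow()[0];
    }
}
```

An MR task scheduled on a node that holds a replica thus reads locally, which is why map tasks and RS reads naturally land on the same nodes rather than on disjoint replica sets.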


Re: queries and MR jobs

Posted by Anoop John <an...@gmail.com>.
HBase data is ultimately persisted in HDFS, where it is replicated across different nodes. But each region of an HBase table is served by exactly one RS, so for any operation on that region, a client needs to contact that one RS only.

-Anoop-
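
[Editorial note: Anoop's point, that each region is pinned to exactly one region server, means client routing reduces to a range lookup over region start keys, with no choice among HDFS replicas at the HBase RPC level. A minimal self-contained sketch of that lookup; the region boundaries and server names are invented for illustration.]

```java
import java.util.TreeMap;

public class RegionLookup {
    // Hypothetical region start-key -> region server map, as a client's
    // cached copy of the meta table might hold it.
    private final TreeMap<String, String> regions = new TreeMap<>();

    public RegionLookup() {
        regions.put("",  "rs1.example.com"); // region ["",  "g")
        regions.put("g", "rs2.example.com"); // region ["g", "p")
        regions.put("p", "rs3.example.com"); // region ["p", +inf)
    }

    // Every get/put for rowKey must go to this one server: the greatest
    // region start key that is <= rowKey identifies the hosting region.
    public String serverFor(String rowKey) {
        return regions.floorEntry(rowKey).getValue();
    }
}
```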


Re: queries and MR jobs

Posted by Ted Yu <yu...@gmail.com>.
Currently there is no way of doing what you requested.

If you're concerned with locality, HBASE-4755 'HBase based block placement
in DFS' may be of interest to you.

Cheers
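
[Editorial note: the idea behind HBASE-4755 can be sketched as follows. With default placement, each block's replicas land on the writer's node plus other scattered nodes, so no clean "replica set" exists to route MR jobs to; with favored nodes, all replicas of a region's files are pinned to a fixed trio of servers, preserving locality across RS failover. This is a toy simulation with invented node names, not the actual placement policy code.]

```java
import java.util.ArrayList;
import java.util.List;

public class BlockPlacement {
    // Toy model of HDFS replica placement with replication factor 3.
    static List<String> place(String writerNode, List<String> cluster,
                              List<String> favoredNodes) {
        if (favoredNodes != null && favoredNodes.size() >= 3) {
            // HBASE-4755-style hint: pin all replicas of a region's
            // blocks to the same three nodes.
            return favoredNodes.subList(0, 3);
        }
        // Default-style: writer's own node first, then any two other
        // nodes, so replicas of different blocks scatter.
        List<String> targets = new ArrayList<>();
        targets.add(writerNode);
        for (String n : cluster) {
            if (targets.size() == 3) break;
            if (!n.equals(writerNode)) targets.add(n);
        }
        return targets;
    }
}
```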
