Posted to user@hbase.apache.org by Peter Haidinyak <ph...@local.com> on 2011/01/12 19:04:38 UTC

Region Server on Data Node

Hi,
  This might be a really dumb question, but do you need to run a region server on a machine that is being used as a Hadoop data node? If not, what are the performance penalties?

Thanks

-Pete

Re: Region Server on Data Node

Posted by Jean-Daniel Cryans <jd...@apache.org>.
If the file is already major compacted (there's a flag for that in the
HFile) and it's the only file, then it won't be major compacted again.
If it always stays like that and the region never moves (and the
balancer never moves any of those blocks), then yes, it could be that
the region never becomes local. But that's a lot of ifs.

J-D
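
[Editor's sketch] The rule J-D describes can be modelled in a few lines of Python. This is a simplified illustration under stated assumptions, not HBase's actual compaction code: the names `StoreFile`, `major_compacted`, `should_major_compact`, and `DAY_MS` are hypothetical, and the 24-hour period is taken from the "one major compaction per day" remark in this thread.

```python
from dataclasses import dataclass

# Assumed period: the thread mentions roughly one major compaction per day.
DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class StoreFile:
    """Hypothetical stand-in for one store file of a region."""
    name: str
    major_compacted: bool  # models the "major compacted" flag in the HFile
    written_at_ms: int     # when the file was written


def should_major_compact(files, now_ms, period_ms=DAY_MS):
    """Return True if this simplified model would schedule a major compaction."""
    if not files:
        return False
    # A single file that is already the result of a major compaction
    # is left alone, so it is never rewritten (and never re-localized).
    if len(files) == 1 and files[0].major_compacted:
        return False
    # Otherwise compact once the oldest file is older than the period.
    oldest = min(f.written_at_ms for f in files)
    return now_ms - oldest >= period_ms


if __name__ == "__main__":
    now = 10 * DAY_MS
    idle = [StoreFile("a", True, 0)]
    busy = [StoreFile("a", True, 0), StoreFile("b", False, 9 * DAY_MS)]
    print(should_major_compact(idle, now))  # read-only region: file is skipped
    print(should_major_compact(busy, now))  # flushed data makes it eligible
```

In this model a region that is only read keeps its single compacted file indefinitely, which is the "never becomes local" case discussed above; any new flush makes the store eligible again.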

On Wed, Jan 12, 2011 at 5:04 PM, M. C. Srivas <mc...@gmail.com> wrote:
> Is a region that is never modified still compacted every 24 hrs? If not,
> given that data is typically read more often than written, is there a
> possibility that the region may never become "local"?

Re: Region Server on Data Node

Posted by "M. C. Srivas" <mc...@gmail.com>.
Is a region that is never modified still compacted every 24 hrs? If not,
given that data is typically read more often than written, is there a
possibility that the region may never become "local"?


On Wed, Jan 12, 2011 at 10:22 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> The region server knows nothing about file locality. The magic
> happens between the DFSClient and the Namenode: in HDFS, new files
> get one replica of each block on the local datanode when possible, but
> existing files won't be moved. One thing, though, is that the region
> server compacts files as new ones get flushed, so those rewritten
> files will be local. There's also one major compaction per day (if
> needed), so after roughly 24h the files served by the region server
> should have one replica of each block on the local datanode.
>
> J-D

Re: Region Server on Data Node

Posted by Jean-Daniel Cryans <jd...@apache.org>.
The region server knows nothing about file locality. The magic
happens between the DFSClient and the Namenode: in HDFS, new files
get one replica of each block on the local datanode when possible, but
existing files won't be moved. One thing, though, is that the region
server compacts files as new ones get flushed, so those rewritten
files will be local. There's also one major compaction per day (if
needed), so after roughly 24h the files served by the region server
should have one replica of each block on the local datanode.

J-D
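
[Editor's sketch] To make the locality argument concrete, here is a small Python simulation. It is a toy model under stated assumptions, not HDFS code: real HDFS placement is also rack-aware, and the names `place_block`, `locality`, and the host names are hypothetical. The only rule modelled is the one described above: the first replica goes to the writer's own datanode when the writer runs on one.

```python
import random


def place_block(writer_host, datanodes, replication=3, rng=random):
    """Choose replica hosts for one block in this simplified model."""
    replicas = []
    # First replica is local only if the writer is itself a datanode.
    if writer_host in datanodes:
        replicas.append(writer_host)
    # Remaining replicas go to other datanodes, picked at random here.
    others = [d for d in datanodes if d not in replicas]
    needed = min(replication - len(replicas), len(others))
    replicas.extend(rng.sample(others, needed))
    return replicas


def locality(blocks, host):
    """Fraction of blocks that have at least one replica on `host`."""
    if not blocks:
        return 0.0
    return sum(host in replicas for replicas in blocks) / len(blocks)


if __name__ == "__main__":
    rng = random.Random(0)
    datanodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]

    # Region server on a machine without a datanode: nothing can be local.
    remote = [place_block("rs-only", datanodes, rng=rng) for _ in range(10)]
    print(locality(remote, "rs-only"))

    # Region server co-located with dn1: every file it writes, including
    # files rewritten by a compaction, has a local replica of each block.
    local = [place_block("dn1", datanodes, rng=rng) for _ in range(10)]
    print(locality(local, "dn1"))
```

Under these assumptions, files written before the datanode was added stay where they are, and locality only improves as flushes and compactions rewrite them from the co-located region server.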

On Wed, Jan 12, 2011 at 10:17 AM, Peter Haidinyak <ph...@local.com> wrote:
> Thanks, I thought I had a data node running on that machine, but I didn't. If I set up a data node on the machine, will HBase automagically move its files into Hadoop? (hope, hope).
>
> Thanks again.
>
> -Pete

RE: Region Server on Data Node

Posted by Peter Haidinyak <ph...@local.com>.
Thanks, I thought I had a data node running on that machine, but I didn't. If I set up a data node on the machine, will HBase automagically move its files into Hadoop? (hope, hope).

Thanks again.

-Pete

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Wednesday, January 12, 2011 10:12 AM
To: user@hbase.apache.org
Subject: Re: Region Server on Data Node

You don't have to, but it's best to do it. This will help you
understand why:
http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html

J-D

Re: Region Server on Data Node

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You don't have to, but it's best to do it. This will help you
understand why:
http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html

J-D

On Wed, Jan 12, 2011 at 10:04 AM, Peter Haidinyak <ph...@local.com> wrote:
> Hi,
>  This might be a really dumb question, but do you need to run a region server on a machine that is being used as a Hadoop data node? If not, what are the performance penalties?
>
> Thanks
>
> -Pete
>