Posted to user@hbase.apache.org by Rajeshkumar J <ra...@gmail.com> on 2017/05/26 06:45:53 UTC

region files

Hi,

   We have the region max file size set to 10 GB. Do all the HFiles of a region
reside on the same region server, or are they distributed?

Thanks

Re: region files

Posted by Josh Elser <el...@apache.org>.
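The assumption is that one of those three copies of each HDFS block
comprising your HFiles is stored on the local datanode.

That is what the major compaction process guarantees: it rewrites the region's
HFiles through the hosting region server, and HDFS normally places one replica
of each newly written block on the writer's local datanode.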
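
To make that concrete, here is a minimal Python sketch of why a split carries
the region's location. It is a toy model, not HBase code: `RegionInfo`, `Split`,
`make_splits`, and `pick_node` are hypothetical names, the host names are made
up, and `pick_node` only stands in for the MapReduce scheduler's locality
preference. The location on a split is a hint: a task placed on the region
server's host can read locally whenever a block replica is on that host.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegionInfo:
    """Hypothetical stand-in for a region and the server hosting it."""
    start_key: str
    end_key: str
    server_host: str


@dataclass(frozen=True)
class Split:
    """One input split per region; `location` is only a scheduling hint."""
    start_key: str
    end_key: str
    location: str


def make_splits(regions: List[RegionInfo]) -> List[Split]:
    # One split per region, tagged with the host serving that region.
    return [Split(r.start_key, r.end_key, r.server_host) for r in regions]


def pick_node(split: Split, free_nodes: List[str]) -> Optional[str]:
    # Prefer the hinted host; otherwise fall back to any free node.
    if split.location in free_nodes:
        return split.location
    return free_nodes[0] if free_nodes else None


regions = [
    RegionInfo("", "m", "rs1.example.com"),
    RegionInfo("m", "", "rs2.example.com"),
]
splits = make_splits(regions)
free = ["rs3.example.com", "rs2.example.com"]
placements = [pick_node(s, free) for s in splits]
print(placements)
```

In this toy run the second split lands on its hinted host, while the first
falls back to another node because its region server host has no free slot.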

On 5/26/17 9:59 AM, Rajeshkumar J wrote:
> I have seen in the code that, when creating input splits, the region info is
> also sent with each split. Is there any reason for that, given that the
> HFiles are not necessarily all on that server?

Re: region files

Posted by Rajeshkumar J <ra...@gmail.com>.
I have seen in the code that, when creating input splits, the region info is
also sent with each split. Is there any reason for that, given that the
HFiles are not necessarily all on that server?

On Fri, May 26, 2017 at 7:06 PM, Ted Yu <yu...@gmail.com> wrote:

> Consider running a major compaction, which restores data locality.
>
> Thanks

Re: region files

Posted by Ted Yu <yu...@gmail.com>.
Consider running a major compaction, which restores data locality.
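
As a rough illustration of why that helps, the Python sketch below models a
region whose blocks became mostly remote after the region moved, then
"compacts" it. It assumes the usual HDFS placement behaviour, where a block
written from a host that also runs a datanode gets one replica on that host.
The names `locality` and `major_compact` and the `dn*` hosts are hypothetical;
this is a toy model, not the HBase implementation.

```python
import random
from typing import List, Set


def locality(blocks: List[Set[str]], server: str) -> float:
    """Fraction of blocks with at least one replica on `server`."""
    if not blocks:
        return 0.0
    return sum(server in replicas for replicas in blocks) / len(blocks)


def major_compact(blocks: List[Set[str]], server: str,
                  datanodes: List[str], replication: int = 3,
                  seed: int = 0) -> List[Set[str]]:
    """Model a rewrite of every block by the region server.

    Assumption: the writer's own host receives one replica and the
    remaining replicas go to other datanodes chosen at random.
    """
    rng = random.Random(seed)
    others = [d for d in datanodes if d != server]
    rewritten = []
    for _ in blocks:
        remote = rng.sample(others, replication - 1)
        rewritten.append({server, *remote})
    return rewritten


datanodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]
# The region now lives on dn5, but most blocks were written elsewhere.
before = [{"dn1", "dn2", "dn3"}, {"dn2", "dn3", "dn4"}, {"dn1", "dn3", "dn5"}]
after = major_compact(before, "dn5", datanodes)

print(locality(before, "dn5"))
print(locality(after, "dn5"))
```

On a real cluster, the HBase shell's `major_compact` command triggers the
rewrite for a table or region.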

Thanks

> On May 26, 2017, at 6:08 AM, Rajeshkumar J <ra...@gmail.com> wrote:
> 
> Thanks Ted. If the data blocks of an HFile may not be on the same node as the
> region server, then how is data locality achieved when MapReduce is run over
> HBase tables?

Re: region files

Posted by Rajeshkumar J <ra...@gmail.com>.
Thanks Ted. If the data blocks of an HFile may not be on the same node as the
region server, then how is data locality achieved when MapReduce is run over
HBase tables?



On Fri, May 26, 2017 at 6:15 PM, Ted Yu <yu...@gmail.com> wrote:

> The HFiles of a region are stored on HDFS. By default, HDFS has a replication
> factor of 3.
> If you're not using the read replica feature, any single region is served by
> one region server (however, the data blocks of an HFile may not be on the
> same node as the region server).
>
> Cheers

Re: region files

Posted by Ted Yu <yu...@gmail.com>.
The HFiles of a region are stored on HDFS. By default, HDFS has a replication
factor of 3.
If you're not using the read replica feature, any single region is served by
one region server (however, the data blocks of an HFile may not be on the
same node as the region server).
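
A small Python sketch of that layout may help. The block names, node names,
and the `remote_blocks` helper are hypothetical and only model the idea: each
block has three replica hosts, one server serves the region, and any block
with no replica on that server's host has to be read over the network.

```python
from typing import Dict, List

REPLICATION = 3  # HDFS default replication factor

# Hypothetical layout: each HFile block maps to the datanodes holding a replica.
block_replicas: Dict[str, List[str]] = {
    "hfile-a/blk0": ["node1", "node2", "node3"],
    "hfile-a/blk1": ["node2", "node3", "node4"],
    "hfile-b/blk0": ["node1", "node3", "node4"],
}

# Without read replicas, exactly one region server serves the region.
region_server = "node1"


def remote_blocks(replicas: Dict[str, List[str]], server: str) -> List[str]:
    """Blocks the region server must fetch over the network."""
    return sorted(b for b, hosts in replicas.items() if server not in hosts)


assert all(len(h) == REPLICATION for h in block_replicas.values())
print(remote_blocks(block_replicas, region_server))
```

Here the region is still served entirely by `node1`, even though one of its
blocks has no replica there.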

Cheers

On Thu, May 25, 2017 at 11:45 PM, Rajeshkumar J <rajeshkumarit8292@gmail.com> wrote:

> Hi,
>
>    We have the region max file size set to 10 GB. Do all the HFiles of a region
> reside on the same region server, or are they distributed?
>
> Thanks
>