Posted to dev@hbase.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2013/03/23 16:37:44 UTC

HRegion for HRegionInfo?

Hi,

What's the best and cleanest way to get the HRegion object from the
HRegionInfo one? Is there any utility class doing that?

Thanks,

JM

Re: HRegion for HRegionInfo?

Posted by Ted <yu...@gmail.com>.
Calculating the size of all the store files seems to be an intermediate step.

Are you going to perform some action based on the result?

I am asking because access to HRegion is not provided on the client side, for the reason Enis cited.

If you have a case for a server-side improvement, I'd love to hear about it.

On Mar 29, 2013, at 5:11 AM, Jean-Marc Spaggiari <je...@spaggiari.org> wrote:

> Hi Sean, thanks for the suggestion. I will look into that too.
>
> Hi Enis,
>
> I agree that they are very different. So far I'm using
> getClusterStatus, getServers and getLoad, but I have to load ALL the
> regions for ALL the servers even if I just want one table. Also, to
> "rebuild" the region order, I need to call it on all the servers
> first, which might take a while for very big clusters with very big
> tables. For me (8 RS, 60 regions) it's efficient, but I have no idea
> how long it would take to call HServerLoad load = status.getLoad(server)
> 1000 times. I will measure how long one call takes to see if it's
> efficient.
>
> The initial idea was to scan META, where I can find the region names
> for a specific table in the right order, and build the HRegion objects
> from that. That way, if the table is on just 10 nodes of a 1000-node
> cluster, I don't have to wait for 1000 calls to finish. But I'm not
> able to get the RegionLoad and the HRegion objects from that.
>
> So I will continue with getClusterStatus until I find a better solution.
> 
> JM

Re: HRegion for HRegionInfo?

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Sean, thanks for the suggestion. I will look into that too.

Hi Enis,

I agree that they are very different. So far I'm using
getClusterStatus, getServers and getLoad, but I have to load ALL the
regions for ALL the servers even if I just want one table. Also, to
"rebuild" the region order, I need to call it on all the servers
first, which might take a while for very big clusters with very big
tables. For me (8 RS, 60 regions) it's efficient, but I have no idea
how long it would take to call HServerLoad load = status.getLoad(server)
1000 times. I will measure how long one call takes to see if it's
efficient.
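
A minimal sketch of the per-table filtering step described above, using
plain Java maps as a stand-in for the per-server load data. Everything
here is an assumption for illustration: the class and method names are
hypothetical, each input map plays the role of one server's
"region name -> store file size in MB" data, and region names are assumed
to follow the "<table>,<startKey>,<regionId>.<encodedName>." pattern.
Sorting by region name string only approximates the real region order and
is reasonable for plain ASCII start keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TableRegionSizes {

    /**
     * Keeps only the regions of one table from the per-server maps and
     * returns them sorted by region name.
     * Hypothetical helper, not part of any HBase API.
     */
    static Map<String, Integer> sizesForTable(String table,
                                              List<Map<String, Integer>> perServerLoads) {
        // Table names cannot contain a comma, so "<table>," is an unambiguous prefix:
        // "t1," matches regions of t1 but not regions of t10.
        String prefix = table + ",";
        Map<String, Integer> result = new TreeMap<>();
        for (Map<String, Integer> serverLoad : perServerLoads) {
            for (Map.Entry<String, Integer> e : serverLoad.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    result.put(e.getKey(), e.getValue());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data for two servers.
        Map<String, Integer> server1 = new HashMap<>();
        server1.put("t1,,100.aaa.", 10);
        server1.put("t2,,100.bbb.", 5);
        Map<String, Integer> server2 = new HashMap<>();
        server2.put("t1,m,100.ccc.", 20);

        List<Map<String, Integer>> loads = new ArrayList<>();
        loads.add(server1);
        loads.add(server2);

        for (Map.Entry<String, Integer> e : sizesForTable("t1", loads).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue() + " MB");
        }
    }
}
```

The cost concern raised above is unchanged by this sketch: every server's
data still has to be fetched before the filter can run, so the work grows
with the number of servers and not with the size of the table.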

The initial idea was to scan META, where I can find the region names
for a specific table in the right order, and build the HRegion objects
from that. That way, if the table is on just 10 nodes of a 1000-node
cluster, I don't have to wait for 1000 calls to finish. But I'm not
able to get the RegionLoad and the HRegion objects from that.

So I will continue with getClusterStatus until I find a better solution.

JM

2013/3/29 Enis Söztutar <en...@gmail.com>:
> HRegionInfo and HRegion are very different beasts. HRegion is the main
> data structure for region internals. You won't have access to it from the
> client side. HRegionInfo is just a metadata holder.
>
> Enis

Re: HRegion for HRegionInfo?

Posted by Enis Söztutar <en...@gmail.com>.
HRegionInfo and HRegion are very different beasts. HRegion is the main
data structure for region internals. You won't have access to it from the
client side. HRegionInfo is just a metadata holder.

Enis


On Fri, Mar 29, 2013 at 1:00 AM, Sean Zhong <cl...@gmail.com> wrote:

> Is recursive iteration over the table's HDFS folder an option? The
> performance should be good!

Re: HRegion for HRegionInfo?

Posted by Sean Zhong <cl...@gmail.com>.
Is recursive iteration over the table's HDFS folder an option? The
performance should be good!
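
A rough sketch of that idea, using the local filesystem as a stand-in for
HDFS so that it runs without a cluster. It assumes the 0.94-style layout
<table dir>/<encoded region name>/<column family>/<store file> and skips
entries whose names start with a dot (such as .tmp or .regioninfo). The
class and method names are hypothetical, and it lists a fixed number of
directory levels instead of recursing freely. Against a real cluster the
same walk would presumably go through Hadoop's FileSystem API (for
example listStatus) instead of java.nio.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class RegionDirSizes {

    /** Returns total store file bytes per region directory under tableDir. */
    static Map<String, Long> storeFileBytesPerRegion(Path tableDir) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> regions = Files.list(tableDir)) {
            for (Path region : (Iterable<Path>) regions::iterator) {
                if (isVisibleDir(region)) {
                    sizes.put(region.getFileName().toString(), familyFileBytes(region));
                }
            }
        }
        return sizes;
    }

    /** Sums the regular files found one level below each column family directory. */
    static long familyFileBytes(Path regionDir) throws IOException {
        long total = 0;
        try (Stream<Path> families = Files.list(regionDir)) {
            for (Path family : (Iterable<Path>) families::iterator) {
                if (!isVisibleDir(family)) {
                    continue; // skips .regioninfo, .tmp and similar entries
                }
                try (Stream<Path> files = Files.list(family)) {
                    for (Path file : (Iterable<Path>) files::iterator) {
                        if (Files.isRegularFile(file)) {
                            total += Files.size(file);
                        }
                    }
                }
            }
        }
        return total;
    }

    private static boolean isVisibleDir(Path p) {
        return Files.isDirectory(p) && !p.getFileName().toString().startsWith(".");
    }

    public static void main(String[] args) throws IOException {
        // Usage: java RegionDirSizes /path/to/table/dir
        Path tableDir = java.nio.file.Paths.get(args[0]);
        storeFileBytesPerRegion(tableDir)
                .forEach((region, bytes) -> System.out.println(region + " -> " + bytes + " bytes"));
    }
}
```

The keys in the result are directory names, that is, encoded region
names, so they would still need to be matched against the region names
from META or from an HRegionInfo. Only files already flushed to disk are
counted.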




On Sun, Mar 24, 2013 at 1:24 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Ted,
>
> No JIRA has been opened for that, since I was not sure whether it was
> something required/useful/missing/etc.
>
> But maybe I'm going the wrong way. The idea is, for a given table, I
> want to have the size of all the store files, per region.
>
> So far I'm using getClusterStatus, so I get all the regions for all
> the tables, and then retrieve all the store file sizes.
>
> But in an environment with hundreds of tables and thousands of
> columns, it might take a bit too long to get all the regions for a
> specific table.
>
> So the idea is to scan the META table to get all the regions for the
> table I'm looking for and, from there, to be able to get the HRegion
> object for each of those regions...
>
> JM

Re: HRegion for HRegionInfo?

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Ted,

No JIRA has been opened for that, since I was not sure whether it was
something required/useful/missing/etc.

But maybe I'm going the wrong way. The idea is, for a given table, I
want to have the size of all the store files, per region.

So far I'm using getClusterStatus, so I get all the regions for all
the tables, and then retrieve all the store file sizes.

But in an environment with hundreds of tables and thousands of
columns, it might take a bit too long to get all the regions for a
specific table.

So the idea is to scan the META table to get all the regions for the
table I'm looking for and, from there, to be able to get the HRegion
object for each of those regions...

JM

2013/3/23 Ted Yu <yu...@gmail.com>:
> HRegion is used on the region server side.
> Is this tracking done on the server side?
> If there is a JIRA, giving us the JIRA number would help.
>
> Cheers

Re: HRegion for HRegionInfo?

Posted by Ted Yu <yu...@gmail.com>.
HRegion is used on the region server side.
Is this tracking done on the server side?
If there is a JIRA, giving us the JIRA number would help.

Cheers

On Sat, Mar 23, 2013 at 9:01 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Ted,
>
> Yes, it's for 0.94 and newer.
>
> For a given table, and a given region, I want to get the storeFileSize.
>
> I want to be able to track the storeFileSize per region per table. And
> since I already have the HRegionInfo, I'm wondering if there is a way
> to use it to get the HRegion and call getStorefileSizeMB.
>
> I already found a few ways to get the getStorefileSizeMB for a region,
> but none that is clean and easy using an HRegionInfo parameter.
>
> JM

Re: HRegion for HRegionInfo?

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Ted,

Yes, it's for 0.94 and newer.

For a given table, and a given region, I want to get the storeFileSize.

I want to be able to track the storeFileSize per region per table. And
since I already have the HRegionInfo, I'm wondering if there is a way
to use it to get the HRegion and call getStorefileSizeMB.

I already found a few ways to get the getStorefileSizeMB for a region,
but none that is clean and easy using an HRegionInfo parameter.

JM

2013/3/23 Ted Yu <yu...@gmail.com>:
> Can you clarify your use case?
>
> This is for 0.94 and newer releases, I assume.

Re: HRegion for HRegionInfo?

Posted by Ted Yu <yu...@gmail.com>.
Can you clarify your use case?

This is for 0.94 and newer releases, I assume.

On Sat, Mar 23, 2013 at 8:37 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi,
>
> What's the best and cleanest way to get the HRegion object from the
> HRegionInfo one? Is there any utility class doing that?
>
> Thanks,
>
> JM
>