Posted to user@hbase.apache.org by Bruce Bian <we...@gmail.com> on 2011/12/16 06:50:45 UTC

Is there an easy way to check HFile locality in HDFS?

Hi,
Some disks on one node in my HBase cluster failed. After I mounted new ones
and restarted the regionserver/datanode on that node, its data is no longer
local unless I manually trigger a major_compaction on the table (the datanode
and regionserver share the same physical node).
My question is: is there an easy way to check that every regionserver has a
copy of its regions' data on the same physical node, such as a script or a
command? If not, where can I get the information so I can write one? I know
the region info is stored in the .META. table, but what about the regions'
HFile blocks?

Re: Is there an easy way to check HFile locality in HDFS?

Posted by Andrew Purtell <ap...@apache.org>.
Cool. I'm working on it on the side.
 
   - Andy



----- Original Message -----
> From: Bruce Bian <we...@gmail.com>
> To: "user@hbase.apache.org" <us...@hbase.apache.org>; Andrew Purtell <ap...@apache.org>
> Cc: 
> Sent: Wednesday, December 21, 2011 3:06 AM
> Subject: Re: Is there an easy way to check HFile locality in HDFS?
> 
> Hi Andy,
> That will be really helpful.
> 
> On Sunday, December 18, 2011, Andrew Purtell <ap...@apache.org> wrote:
>>>  From: Doug Meil <do...@explorysmedical.com>
>> 
>>> 
>>>  That would be cool!  I mean, +1.
>>> 
>>>  On 12/17/11 4:02 PM, "Andrew Purtell" <ap...@apache.org> wrote:
>>> 
>>>>  Hmm. Would something like this be useful:
>>>> 
>>>>      org.apache.hadoop.hbase.HFileLocalityChecker [options]
>>>> 
>> 
>> 
>>  See HBASE-5061.
>> 
>>     - Andy
>> 
>> 
> 

Re: Is there an easy way to check HFile locality in HDFS?

Posted by Bruce Bian <we...@gmail.com>.
Hi Andy,
That will be really helpful.

On Sunday, December 18, 2011, Andrew Purtell <ap...@apache.org> wrote:
>> From: Doug Meil <do...@explorysmedical.com>
>
>>
>> That would be cool!  I mean, +1.
>>
>> On 12/17/11 4:02 PM, "Andrew Purtell" <ap...@apache.org>
>> wrote:
>>
>>> Hmm. Would something like this be useful:
>>>
>>>     org.apache.hadoop.hbase.HFileLocalityChecker [options]
>>>
>
>
> See HBASE-5061.
>
>    - Andy
>
>

Re: Is there an easy way to check HFile locality in HDFS?

Posted by Andrew Purtell <ap...@apache.org>.
> From: Doug Meil <do...@explorysmedical.com>

> 
> That would be cool!  I mean, +1.
> 
> On 12/17/11 4:02 PM, "Andrew Purtell" <ap...@apache.org> 
> wrote:
> 
>> Hmm. Would something like this be useful:
>> 
>>     org.apache.hadoop.hbase.HFileLocalityChecker [options]
>> 


See HBASE-5061.

   - Andy


Re: Is there an easy way to check HFile locality in HDFS?

Posted by Doug Meil <do...@explorysmedical.com>.
That would be cool!  I mean, +1.





On 12/17/11 4:02 PM, "Andrew Purtell" <ap...@apache.org> wrote:

>Hmm. Would something like this be useful:
>
>    org.apache.hadoop.hbase.HFileLocalityChecker [options]
>
>    Reports the number of local and nonlocal HFile blocks, and the ratio
>    as a percentage.
>
>    Where options are:
>
>      -f <file>    Analyze a store file
>      -r <region>  Analyze all store files for the region
>      -t <table>   Analyze all store files for regions of the table served
>                   by the local regionserver
>      -h <host>    Consider <host> local, defaults to the local host
>      -v           Verbose operation
>
>
>? Or overkill? Happy to code it up...
>
>
>Best regards,
>
>
>       - Andy
>
>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>(via Tom White)
>
>
>----- Original Message -----
>> From: Stack <st...@duboce.net>
>> To: user@hbase.apache.org
>> Cc: 
>> Sent: Friday, December 16, 2011 3:11 PM
>> Subject: Re: Is there an easy way to check HFile locality in HDFS?
>> 
>> On Thu, Dec 15, 2011 at 9:50 PM, Bruce Bian <we...@gmail.com> wrote:
>>>  Hi,
>>>  some disks of one node in my hbase cluster were broken, and after I mounted
>>>  some new ones and start regionserver/datanode on that node again, there
>>>  can't be data locality anymore unless I trigger a major_compaction on the
>>>  table manually(datanode/regionserver share the same physical node)
>>>  My question is, is there an easy way to check that all the regionservers
>>>  have a copy of its regions on the same physical node,like a script or
>>>  command,or else where to get the information so I can write one? I know
>>>   the region info is stored in the .META. table, how about the region's
>>>  hfile blocks?
>> 
>> 
>> In 0.92, there is a locality metric that tells you how much of the
>> regionserver load is local as a percentage that shows in the
>> regionserver UI.
>> 
>> St.Ack
>> 
>



Re: Is there an easy way to check HFile locality in HDFS?

Posted by Andrew Purtell <ap...@apache.org>.
Hmm. Would something like this be useful:

    org.apache.hadoop.hbase.HFileLocalityChecker [options]

    Reports the number of local and nonlocal HFile blocks, and the ratio
    as a percentage.

    Where options are:

      -f <file>    Analyze a store file
      -r <region>  Analyze all store files for the region
      -t <table>   Analyze all store files for regions of the table served
                   by the local regionserver
      -h <host>    Consider <host> local, defaults to the local host
      -v           Verbose operation


? Or overkill? Happy to code it up...
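
The report described above boils down to a per-block membership test. Below is a minimal sketch in Python of that arithmetic only, not of the proposed HFileLocalityChecker itself. It assumes the replica hostnames for each block have already been obtained somehow; the function name `locality_report` and the example hostnames are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def locality_report(
    block_hosts: Iterable[Sequence[str]], local_host: str
) -> Tuple[int, int, float]:
    """Return (local, nonlocal, percent_local) for a store file's blocks.

    block_hosts holds, per HDFS block, the hostnames of the datanodes that
    store a replica of that block. A block counts as local when local_host
    is one of them.
    """
    local = 0
    remote = 0
    for hosts in block_hosts:
        if local_host in hosts:
            local += 1
        else:
            remote += 1
    total = local + remote
    # An empty file has no blocks; report 0% rather than dividing by zero.
    percent = 100.0 * local / total if total else 0.0
    return local, remote, percent


if __name__ == "__main__":
    # Hypothetical replica placement for a three-block store file.
    blocks = [
        ["rs1.example.com", "rs2.example.com", "rs3.example.com"],
        ["rs2.example.com", "rs3.example.com", "rs4.example.com"],
        ["rs1.example.com", "rs3.example.com", "rs4.example.com"],
    ]
    local, remote, pct = locality_report(blocks, "rs1.example.com")
    print(f"local={local} nonlocal={remote} locality={pct:.1f}%")
```

Passing a different `local_host` corresponds to the proposed -h option: the same block list is evaluated from another node's point of view.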


Best regards,


       - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


----- Original Message -----
> From: Stack <st...@duboce.net>
> To: user@hbase.apache.org
> Cc: 
> Sent: Friday, December 16, 2011 3:11 PM
> Subject: Re: Is there an easy way to check HFile locality in HDFS?
> 
> On Thu, Dec 15, 2011 at 9:50 PM, Bruce Bian <we...@gmail.com> wrote:
>>  Hi,
>>  some disks of one node in my hbase cluster were broken, and after I mounted
>>  some new ones and start regionserver/datanode on that node again, there
>>  can't be data locality anymore unless I trigger a major_compaction on the
>>  table manually(datanode/regionserver share the same physical node)
>>  My question is, is there an easy way to check that all the regionservers
>>  have a copy of its regions on the same physical node,like a script or
>>  command,or else where to get the information so I can write one? I know
>>   the region info is stored in the .META. table, how about the region's
>>  hfile blocks?
> 
> 
> In 0.92, there is a locality metric that tells you how much of the
> regionserver load is local as a percentage that shows in the
> regionserver UI.
> 
> St.Ack
> 

Re: Is there an easy way to check HFile locality in HDFS?

Posted by Stack <st...@duboce.net>.
On Thu, Dec 15, 2011 at 9:50 PM, Bruce Bian <we...@gmail.com> wrote:
> Hi,
> some disks of one node in my hbase cluster were broken, and after I mounted
> some new ones and start regionserver/datanode on that node again, there
> can't be data locality anymore unless I trigger a major_compaction on the
> table manually(datanode/regionserver share the same physical node)
> My question is, is there an easy way to check that all the regionservers
> have a copy of its regions on the same physical node,like a script or
> command,or else where to get the information so I can write one? I know
>  the region info is stored in the .META. table, how about the region's
> hfile blocks?


In 0.92, there is a locality metric that shows in the regionserver UI. It
tells you, as a percentage, how much of the regionserver's load is local.

St.Ack

Re: Is there an easy way to check HFile locality in HDFS?

Posted by Doug Meil <do...@explorysmedical.com>.
Hi there-

There is an example of inspecting the HBase files on HDFS in here...

http://hbase.apache.org/book.html#trouble.namenode

... that should be useful in this case.  You can check the replicas of the
StoreFiles from there.
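
The replica check can also be scripted. Here is a minimal sketch in Python, assuming the output of `hadoop fsck <path> -files -blocks -locations` has been captured as text. The exact layout of that output differs between Hadoop versions, so the sample text, the regular expressions, and the function names below are assumptions for illustration rather than a documented format.

```python
import re
from typing import Dict, List

# Assumed layout of a file header line: "<path> <size> bytes, <n> block(s): ..."
FILE_LINE = re.compile(r"^(/\S+) \d+ bytes, \d+ block\(s\)")
# Assumed layout of a block line: "<index>. <block id> len=... [replicas]"
BLOCK_LINE = re.compile(r"^\d+\. \S*blk_")
# Replica addresses are assumed to appear as "ip:port"; only the IP is captured.
REPLICA = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+")

# Hypothetical fsck-style output for two store files.
SAMPLE = """\
/hbase/t1/r1/cf/f1 134217728 bytes, 2 block(s):  OK
0. blk_1001_1 len=67108864 repl=3 [10.0.0.1:50010, 10.0.0.2:50010, 10.0.0.3:50010]
1. blk_1002_1 len=67108864 repl=3 [10.0.0.2:50010, 10.0.0.3:50010, 10.0.0.4:50010]

/hbase/t1/r1/cf/f2 1024 bytes, 1 block(s):  OK
0. blk_1003_1 len=1024 repl=3 [10.0.0.1:50010, 10.0.0.3:50010, 10.0.0.4:50010]
"""


def replicas_by_file(fsck_text: str) -> Dict[str, List[List[str]]]:
    """Map each file path to a list of per-block replica IP lists."""
    result: Dict[str, List[List[str]]] = {}
    current = None
    for line in fsck_text.splitlines():
        line = line.strip()
        header = FILE_LINE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
        elif current is not None and BLOCK_LINE.match(line):
            result[current].append(REPLICA.findall(line))
    return result


def files_missing_local_replica(fsck_text: str, local_ip: str) -> List[str]:
    """Return files with at least one block that has no replica on local_ip."""
    return [
        path
        for path, blocks in replicas_by_file(fsck_text).items()
        if any(local_ip not in ips for ips in blocks)
    ]


if __name__ == "__main__":
    for path in files_missing_local_replica(SAMPLE, "10.0.0.1"):
        print(path)
```

Run against a region's directory, a non-empty result would point at store files that a major compaction on that node has not yet rewritten locally.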



On 12/16/11 12:50 AM, "Bruce Bian" <we...@gmail.com> wrote:

>Hi,
>some disks of one node in my hbase cluster were broken, and after I mounted
>some new ones and start regionserver/datanode on that node again, there
>can't be data locality anymore unless I trigger a major_compaction on the
>table manually(datanode/regionserver share the same physical node)
>My question is, is there an easy way to check that all the regionservers
>have a copy of its regions on the same physical node,like a script or
>command,or else where to get the information so I can write one? I know
> the region info is stored in the .META. table, how about the region's
>hfile blocks?