Posted to user@hbase.apache.org by Jan Lukavský <ja...@firma.seznam.cz> on 2014/04/30 12:07:19 UTC

Region statistics created during major compaction

Hi all,

I have a general idea I'd like to consult. A short description of the 
problem we are facing: during mapreduce jobs run over an HBase cluster, 
we very often see large disparities in the run times of different map 
tasks (some tasks finish in minutes or even seconds, while others may 
take hours). This causes the job to run inefficiently and the whole 
cluster to be underutilized - reducers have to wait until all the map 
tasks finish, at least before starting the sort phase. The number of 
long-running map tasks is usually low, so the whole cluster basically 
waits until a few machines finish their work. We tried to get around 
this by sampling the regions and creating statistics (one statistic per 
mapreduce job), which we then used to tune the input format splits so 
that the distribution of running times is more even. This seems to work 
(although at the moment it may cause some issues with data locality, 
which we think we can solve).
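The rebalancing step described above could be sketched roughly as 
follows - given an estimated weight per region (sampled bytes or row 
counts), group consecutive regions into splits of roughly equal total 
weight. This is an illustrative sketch only; the class and method names 
are made up and are not part of any HBase API:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitBalancer {

    /**
     * Groups consecutive region weights into at most numSplits contiguous
     * groups of roughly equal total weight; returns the start index of
     * each group. A hypothetical stand-in for the split-tuning step.
     */
    public static List<Integer> balance(long[] regionWeights, int numSplits) {
        long total = 0;
        for (long w : regionWeights) total += w;
        long target = Math.max(1, total / numSplits);

        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        long acc = 0;
        for (int i = 0; i < regionWeights.length; i++) {
            acc += regionWeights[i];
            // Close the current group once it reaches the target weight,
            // unless we already have enough groups or no regions remain.
            if (acc >= target && starts.size() < numSplits
                    && i + 1 < regionWeights.length) {
                starts.add(i + 1);
                acc = 0;
            }
        }
        return starts;
    }
}
```

With weights {10, 1, 1, 10} and two splits, this yields groups {10, 1} 
and {1, 10}, i.e. two map tasks of equal estimated work instead of one 
task per region.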

Now, the question is: would it be possible to calculate some statistics 
during major compactions and store them in the region directory on HDFS? 
By these statistics I mean values stored for some reasonable ranges of 
rows (so that each region would contain on the order of hundreds of 
these ranges):
  * total number of rows in the range
  * total number of KeyValues
  * amount of data stored on disk

These statistics could be calculated per column family and subsequently 
used in an InputFormat to tune the splits to match an even distribution 
as closely as possible.
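To make the idea concrete, here is a hedged sketch of how such per-range 
statistics could drive split selection inside a single region: given the 
ranges' end keys and their on-disk byte counts (as a major compaction 
could record them), pick cut keys so each resulting split holds roughly 
total/numSplits bytes. The types and names are illustrative only, not an 
existing HBase interface:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeStatsSplitter {

    /**
     * endKeys[i] closes the i-th statistics range and bytes[i] is its
     * on-disk weight. Returns up to numSplits - 1 cut keys, each chosen
     * at a range boundary once the accumulated weight reaches the target.
     */
    public static List<String> chooseCutKeys(String[] endKeys, long[] bytes,
                                             int numSplits) {
        long total = 0;
        for (long b : bytes) total += b;
        long target = Math.max(1, total / numSplits);

        List<String> cuts = new ArrayList<>();
        long acc = 0;
        for (int i = 0; i < endKeys.length && cuts.size() < numSplits - 1; i++) {
            acc += bytes[i];
            if (acc >= target) {   // close a split at this range boundary
                cuts.add(endKeys[i]);
                acc = 0;
            }
        }
        return cuts;
    }
}
```

Because cuts always land on statistics-range boundaries, the finer the 
ranges recorded at compaction time, the closer the splits get to an even 
byte distribution.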

Is anyone else interested in this? Does anyone have another solution to 
the problem I have described? I know we could, say, manually split the 
regions that take a long time to process, but first, these regions are 
job-specific (different jobs have different regions that take a long 
time to process), and second, I'm ideally looking for an automated solution.

Thanks for any reply,
  Jan


Re: Region statistics created during major compaction

Posted by Jan Lukavský <ja...@firma.seznam.cz>.
Hi Ted,

hmm, the stripe compaction seems to solve the problem, provided some 
statistics were available (ideally over RPC on the HMaster) for each 
stripe (sub-region range). In my original idea, the ranges would have 
to be determined by some heuristic (a guess could probably be made 
based on the file sizes of all HFiles in the region directory on HDFS).
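That file-size heuristic could look something like the following - with 
no per-range statistics available, guess how many sub-ranges a region 
deserves from the aggregate size of its HFiles, assuming data is spread 
roughly uniformly across the region's key space. The names are made up 
for illustration:

```java
public class RegionRangeGuess {

    /**
     * Number of sub-ranges to carve out of a region, given the summed
     * size of its HFiles and a desired number of bytes per sub-range
     * (ceiling division, with at least one sub-range per region).
     */
    public static int guessSubRanges(long totalHFileBytes,
                                     long targetBytesPerRange) {
        if (totalHFileBytes <= 0) return 1;
        return (int) Math.max(1,
            (totalHFileBytes + targetBytesPerRange - 1) / targetBytesPerRange);
    }
}
```

So a 3.5 GB region with a 1 GB target would be probed at four sub-range 
boundaries, while a near-empty region stays a single split.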

Jan

On 04/30/2014 12:26 PM, Ted Yu wrote:
> Interesting idea.
>
> How would the ranges of rows be determined per region ?
>
> This reminds me of stripe compaction feature which is in 0.98
> See HBASE-7667
>
> Cheers
>


-- 

Jan Lukavský
Development Team Lead
Seznam.cz, a.s.
Radlická 3494/10
15000, Praha 5

jan.lukavsky@firma.seznam.cz
http://www.seznam.cz



Re: Region statistics created during major compaction

Posted by Ted Yu <yu...@gmail.com>.
Interesting idea. 

How would the ranges of rows be determined per region ?

This reminds me of stripe compaction feature which is in 0.98
See HBASE-7667

Cheers

On Apr 30, 2014, at 3:07 AM, Jan Lukavský <ja...@firma.seznam.cz> wrote:
