Posted to user@hbase.apache.org by Shushant Arora <sh...@gmail.com> on 2016/05/06 01:36:06 UTC

hbase doubts

1. Why is a single file per region better for read performance than
multiple files? Why can't multiple threads read multiple files and give
better performance?

2. Does an HBase regionserver have a single thread for compactions and
splits across all the regions it holds? Why wouldn't one thread per region
work better than sequential compactions/splits for all regions in a
regionserver?

3. Why does HBase flush and compact all the memstores of all the column
families of a table at the same time, irrespective of their size, when even
one memstore reaches the threshold?

Thanks
Shushant

Re: hbase doubts

Posted by Ted Yu <yu...@gmail.com>.
For #2, see the following in CompactSplitThread - there is a config
parameter for merge threads as well:

  // Configuration key for the large compaction threads.
  public final static String LARGE_COMPACTION_THREADS =
      "hbase.regionserver.thread.compaction.large";
  public final static int LARGE_COMPACTION_THREADS_DEFAULT = 1;

  // Configuration key for the small compaction threads.
  public final static String SMALL_COMPACTION_THREADS =
      "hbase.regionserver.thread.compaction.small";
  public final static int SMALL_COMPACTION_THREADS_DEFAULT = 1;

  // Configuration key for split threads
  public final static String SPLIT_THREADS =
      "hbase.regionserver.thread.split";
  public final static int SPLIT_THREADS_DEFAULT = 1;
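
To illustrate how keys like these could translate into worker pools, here is a minimal sketch. It is not the actual CompactSplitThread code: the class name CompactionPoolSketch, the helpers threadCount and newPool, the clamp to at least one thread, and the use of a plain Map in place of HBase's Configuration are all assumptions made for illustration. Only the key names and defaults are taken from the excerpt above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CompactionPoolSketch {
    // Keys and defaults copied from the CompactSplitThread excerpt.
    static final String LARGE_COMPACTION_THREADS =
        "hbase.regionserver.thread.compaction.large";
    static final int LARGE_COMPACTION_THREADS_DEFAULT = 1;
    static final String SMALL_COMPACTION_THREADS =
        "hbase.regionserver.thread.compaction.small";
    static final int SMALL_COMPACTION_THREADS_DEFAULT = 1;
    static final String SPLIT_THREADS = "hbase.regionserver.thread.split";
    static final int SPLIT_THREADS_DEFAULT = 1;

    // Hypothetical helper: a plain Map stands in for HBase's Configuration.
    // Clamping to at least 1 is an assumption of this sketch.
    static int threadCount(Map<String, String> conf, String key, int defaultValue) {
        String raw = conf.get(key);
        int n = (raw == null) ? defaultValue : Integer.parseInt(raw.trim());
        return Math.max(1, n);
    }

    // Hypothetical helper: one fixed-size pool per kind of work, shared by
    // all regions on the server. Requests beyond the pool size wait in the queue.
    static ThreadPoolExecutor newPool(Map<String, String> conf, String key, int defaultValue) {
        int n = threadCount(conf, key, defaultValue);
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Example override: allow four small compactions to run concurrently.
        conf.put(SMALL_COMPACTION_THREADS, "4");

        ThreadPoolExecutor small =
            newPool(conf, SMALL_COMPACTION_THREADS, SMALL_COMPACTION_THREADS_DEFAULT);
        ThreadPoolExecutor large =
            newPool(conf, LARGE_COMPACTION_THREADS, LARGE_COMPACTION_THREADS_DEFAULT);
        ThreadPoolExecutor splits =
            newPool(conf, SPLIT_THREADS, SPLIT_THREADS_DEFAULT);

        System.out.println("small compaction threads: " + small.getCorePoolSize());
        System.out.println("large compaction threads: " + large.getCorePoolSize());
        System.out.println("split threads: " + splits.getCorePoolSize());

        small.shutdown();
        large.shutdown();
        splits.shutdown();
    }
}
```

With the defaults of 1, each kind of work runs one request at a time for the whole regionserver; raising a value lets that many requests from different regions run in parallel.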

Re: hbase doubts

Posted by Ted Yu <yu...@gmail.com>.
For #3, we already have the following in the 1.1 release:

HBASE-10201 Port 'Make flush decisions per column family' to trunk
