Posted to user@hbase.apache.org by Saad Mufti <sa...@gmail.com> on 2016/04/27 17:27:34 UTC

HBase Write Performance Under Auto-Split

Hi,

Does anyone have experience with HBase write performance under auto-split
conditions? Our keyspace is randomized, so all regions start auto-splitting
at roughly the same time. Early on, the 1024 regions we started with all
decided to split within an hour or so of each other; now that we're up to
6000 regions, the process seems to be spread over 12 hours or more as they
slowly reach their size thresholds.

During this time, our writes, which go through a shared BufferedMutator,
suffer: writes time out and the underlying AsyncProcess thread pool seems
to fill up. That means callers to our service see their response times
shoot up as they spend time trying to drain the buffer and submit mutations
to the thread pool, so overall system response time suffers and we can't
keep up with our input load.

Are there any guidelines on the size of the BufferedMutator to use? We are
even considering running performance tests without the BufferedMutator to
see if it is buying us anything. Currently we have it sized pretty large,
at around 50 MB, but maybe having it that big is not a good idea.
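
For concreteness, here is a minimal sketch of how we could set a
per-mutator buffer size via BufferedMutatorParams in the HBase 1.x client
API (the table name, column names, and the 8 MB figure are placeholders
for experimentation, not our real values):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class MutatorSizingSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf)) {
      // Cap the client-side write buffer explicitly instead of relying
      // on the hbase.client.write.buffer default.
      BufferedMutatorParams params =
          new BufferedMutatorParams(TableName.valueOf("my_table"))
              .writeBufferSize(8L * 1024 * 1024); // 8 MB test value
      try (BufferedMutator mutator = conn.getBufferedMutator(params)) {
        Put put = new Put(Bytes.toBytes("row-1"));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
            Bytes.toBytes("value"));
        mutator.mutate(put); // buffered, sent when the buffer fills
        mutator.flush();     // or force the buffered mutations out now
      }
    }
  }
}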

Any help/advice would be most appreciated.

Thanks.

----
Saad

Re: HBase Write Performance Under Auto-Split

Posted by Saad Mufti <sa...@gmail.com>.
Thanks for the feedback. We already disabled automatic major compaction;
it looks like we'll have to do the same for auto-splitting.
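
For anyone searching the archives later: disabling time-based major
compactions can be done cluster-wide in hbase-site.xml, or per table along
these lines (an untested sketch against the HBase 1.x Admin API; "my_table"
is a placeholder):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DisableMajorCompactionSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("my_table");
      HTableDescriptor desc = admin.getTableDescriptor(table);
      // A value of 0 turns off periodic (time-based) major compactions
      // for this table; they can still be triggered manually.
      desc.setConfiguration("hbase.hregion.majorcompaction", "0");
      admin.modifyTable(table, desc);
    }
  }
}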

----
Saad


On Wed, Apr 27, 2016 at 3:26 PM, Vladimir Rodionov <vl...@gmail.com>
wrote:

> Every split results in major compactions for both daughter regions.
> Concurrent major compactions across a cluster are bad.
> I recommend setting DisabledRegionSplitPolicy on your table(s) and running
> splits manually - that way you control what gets split and when.
> The same is true for major compactions: disable periodic major compactions
> and run them manually.
>
> -Vlad

Re: HBase Write Performance Under Auto-Split

Posted by Vladimir Rodionov <vl...@gmail.com>.
Every split results in major compactions for both daughter regions.
Concurrent major compactions across a cluster are bad.
I recommend setting DisabledRegionSplitPolicy on your table(s) and running
splits manually - that way you control what gets split and when.
The same is true for major compactions: disable periodic major compactions
and run them manually.
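
Roughly like this, against the 1.x Admin API (table name and split point
are placeholders; treat it as a sketch, not production code):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class ManualSplitSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("my_table");

      // Disable automatic splitting for the table.
      HTableDescriptor desc = admin.getTableDescriptor(table);
      desc.setRegionSplitPolicyClassName(
          "org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy");
      admin.modifyTable(table, desc);

      // Later, during a quiet window, split at a row key you choose.
      admin.split(table, Bytes.toBytes("chosen-split-point"));

      // Major compaction of the table, again on your own schedule.
      admin.majorCompact(table);
    }
  }
}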

-Vlad
