Posted to dev@hbase.apache.org by 范范欣欣 <li...@gmail.com> on 2017/04/11 05:58:20 UTC

A new Multiple-Type Queue idea to handle multiple workloads

Currently, HBASE-11355 separates the single call queue into multiple
queues (a get call queue, a write call queue, and a scan call queue), and
each queue type can be given a fixed number of handlers. This helps in some
outages, because it keeps all read requests or all write requests from
running out of handler threads.

However, several problems remain:

1. Workloads of the same request type (get/write/scan) can still interfere
with each other. Consider the following scenarios:

(1) Client-1 and client-2 both send write requests. Client-1 writes large
objects (100KB records) and client-2 writes small objects (1KB records).
Client-1 will occupy all handler threads of the write-queue and reduce
client-2's throughput.

(2) Client-3 and client-4 both send get requests. Client-3 reads data from
many HFiles (all search keys are equally popular), so its read latency is
high. Client-4 requires no I/O resources (say, its data is cached).
Client-3 will occupy all handler threads of the get-queue and increase
client-4's read latency.

2. An administrator cannot easily increase or decrease the number of
handlers for a given queue.
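
To put rough numbers on scenario (2): by Little's law, the average number
of busy handlers equals the request rate times the service time. The sketch
below uses made-up rates, latencies, and a made-up pool size (none of them
come from measurements in this thread) to show how a slow client alone can
demand more handlers than a shared get-queue has.

```python
def busy_handlers(requests_per_sec, service_time_sec):
    """Little's law: average handlers kept busy = arrival rate * service time."""
    return requests_per_sec * service_time_sec


# Hypothetical numbers, chosen only for illustration.
GET_QUEUE_HANDLERS = 30

# client-3: slow gets that touch many HFiles (assumed 50 ms each).
client3 = busy_handlers(requests_per_sec=1000, service_time_sec=0.050)
# client-4: fast gets served from cache (assumed 0.5 ms each).
client4 = busy_handlers(requests_per_sec=10000, service_time_sec=0.0005)

print(f"client-3 needs ~{client3:.0f} handlers, client-4 needs ~{client4:.0f}")
print("shared queue saturated by client-3 alone:", client3 >= GET_QUEUE_HANDLERS)
```

Under these assumptions client-4 sends ten times more requests but needs a
tenth of the handlers, so it is the one that suffers when the pool is shared.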


We are trying to implement a new Multiple-Type Queue. An administrator can
create a queue with a specified number of handlers and grant it to a
specified table and request type (get/write/scan), for example:

create_queue 'queue1', {'handler' => 100}

grant_queue 'table1','scan','queue1'

grant_queue 'tableN','scan','queue1'


create_queue 'queue2', {'handler' => 50}

grant_queue 'table2','write','queue2'


The idea rests on the premise that each combination of table and request
type has its own distinct workload.
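
The routing implied by the commands above can be sketched as a lookup from
(table, request type) to a queue name. This is only an illustration of the
idea, not the patch: the class name QueueRouter, the route method, and the
fallback to a "default" queue for pairs without a grant are all assumptions.

```python
class QueueRouter:
    """Maps (table, request type) pairs to named queues, with a fallback."""

    REQUEST_TYPES = {"get", "write", "scan"}

    def __init__(self, default_queue="default"):
        # Assumption: calls without a grant go to a shared default queue.
        self.default_queue = default_queue
        self.handlers = {}  # queue name -> number of handlers
        self.grants = {}    # (table, request type) -> queue name

    def create_queue(self, name, handler):
        if handler <= 0:
            raise ValueError("a queue needs at least one handler")
        self.handlers[name] = handler

    def grant_queue(self, table, request_type, queue):
        if request_type not in self.REQUEST_TYPES:
            raise ValueError(f"unknown request type: {request_type}")
        if queue not in self.handlers:
            raise KeyError(f"no such queue: {queue}")
        self.grants[(table, request_type)] = queue

    def route(self, table, request_type):
        """Return the name of the queue that should serve this call."""
        return self.grants.get((table, request_type), self.default_queue)


router = QueueRouter()
router.create_queue("queue1", handler=100)
router.grant_queue("table1", "scan", "queue1")
router.grant_queue("tableN", "scan", "queue1")
router.create_queue("queue2", handler=50)
router.grant_queue("table2", "write", "queue2")

print(router.route("table1", "scan"))   # queue1
print(router.route("table2", "write"))  # queue2
print(router.route("table2", "get"))    # default (no grant for this pair)
```

Several tables can share one queue, as table1 and tableN do here, which
keeps the number of handler pools small when workloads are similar.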

In addition, an administrator can manage the queues with these commands:

// increase or decrease the number of handlers

alter_queue 'queue1', {'handler' => 50}

//list all queues

list_queues

//drop the specified queue

drop_queue 'queue1'
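
One possible way for alter_queue to change the handler count at runtime is
a pool whose worker threads can be added or retired on demand. The sketch
below is an assumption about how this could work, not the actual design:
ResizableHandlerPool and its methods are hypothetical names, and a shrink
only takes effect once the calls queued ahead of it have been served.

```python
import queue
import threading


class ResizableHandlerPool:
    """A call queue served by a number of handler threads that can change."""

    _STOP = object()  # sentinel: the handler that receives it exits

    def __init__(self, handlers):
        self._calls = queue.Queue()
        self._lock = threading.Lock()
        self._count = 0
        self.resize(handlers)

    @property
    def handlers(self):
        return self._count

    def _run(self):
        while True:
            call = self._calls.get()
            if call is self._STOP:
                return
            call()

    def resize(self, handlers):
        """Grow by starting threads, shrink by queueing stop sentinels."""
        with self._lock:
            delta = handlers - self._count
            for _ in range(max(delta, 0)):
                threading.Thread(target=self._run, daemon=True).start()
            for _ in range(max(-delta, 0)):
                self._calls.put(self._STOP)
            self._count = handlers

    def submit(self, call):
        self._calls.put(call)


pool = ResizableHandlerPool(handlers=4)
done = threading.Event()
pool.submit(done.set)      # a trivial "request"
done.wait(timeout=2)
pool.resize(2)             # roughly what alter_queue would do
print(pool.handlers)
```

Because the stop sentinels travel through the same queue as the calls, no
in-flight request is interrupted when the pool shrinks.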


I am wondering if the developers could look at the idea and let me know if
anything is incorrect or inaccurate, or if I have missed anything.


Thanks a lot for your help!

Re: A new Multiple-Type Queue idea to handle multiple workloads

Posted by libis <li...@gmail.com>.
> bq. client-3 will ran out of all handler threads of the get-queue

> Looks like you meant to say that client-3 would occupy all handlers.

Yes, client-3 would occupy all handlers assigned to the get-queue, which
also accepts requests from client-4.

> The pictures didn't go through.

> Since you have code for this improvement, consider opening a JIRA where
> you can publish your patch and pictures so that we get a better idea of
> your enhancement.

I will create a JIRA today to publish some documents about the idea,
including the design and the test charts. As for the patch, it is still
under development. We hope to get more suggestions on the functional design
first, so that we avoid doing a lot of work in the wrong direction.

Thanks a lot!


2017-04-16 18:30 GMT+08:00 Ted Yu <yu...@gmail.com>:

> bq. client-3 will ran out of all handler threads of the get-queue
>
> Looks like you meant to say that client-3 would occupy all handlers.
>
> The pictures didn't go through.
>
> Since you have code for this improvement, consider opening a JIRA where
> you can publish your patch and pictures so that we get a better idea of
> your enhancement.
>
> Thanks

Re: A new Multiple-Type Queue idea to handle multiple workloads

Posted by Ted Yu <yu...@gmail.com>.
bq. client-3 will ran out of all handler threads of the get-queue

Looks like you meant to say that client-3 would occupy all handlers.

The pictures didn't go through.

Since you have code for this improvement, consider opening a JIRA where you
can publish your patch and pictures so that we get a better idea of your
enhancement.

Thanks

On Sun, Apr 16, 2017 at 3:14 AM, libis <li...@gmail.com> wrote:


Re: A new Multiple-Type Queue idea to handle multiple workloads

Posted by libis <li...@gmail.com>.
I am running an experiment with the following scenario: client-1 and
client-2 both send update requests. Client-1 writes large objects (100KB
records) into table-1, and client-2 writes small objects (1KB records)
into table-2. The charts below show the effect of the Multiple-Type Queue
feature:

[image: inline image 1]

As shown in figure 1 (the default version of HBase 1.1.2):

1. At the beginning, client-2 is sending requests and client-1 is not.
The throughput of client-2 stays stable.

2. When client-1 starts, the throughput of both client-1 and client-2
becomes very unstable; that is, client-1 seriously affects client-2.

3. The average HBase throughput is 36641.

[image: inline image 4]

Afterwards, we assigned 150 handlers to table-2 and 3 handlers to table-1
using the Multiple-Type Queue feature. As shown in figure 2, the throughput
of both client-1 and client-2 is smoother, and the average HBase throughput
is 51653, 41% higher.
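
For reference, the 41% figure follows directly from the two averages
reported above. The variable names here are only for illustration.

```python
baseline = 36641     # average throughput, default HBase 1.1.2 (figure 1)
with_queues = 51653  # average throughput with per-table queues (figure 2)

# Relative gain over the baseline.
gain = (with_queues - baseline) / baseline
print(f"{gain:.1%}")
```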

So we think that in some cases the feature is an improvement for running
multiple workloads on a single HBase cluster. Let me know if anything is
incorrect or inaccurate.

Thanks a lot for your help!
