You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Mike Freyberger <mf...@appnexus.com> on 2018/08/07 15:04:26 UTC

Hbase mutate is hogging my CPU

Kafka Dev,

I’d love some help investigating a slow Hbase mutator.

The cluster is Hbase 1.2 and cluster has 22 region servers. The region servers are pretty big: 24 cores, 126 GB RAM.

The cluster has 2 tables, each only have 1 column family. Both tables have the same pre splits.

Each table is pre split into 400 regions. The split keys are all 2 bytes and evenly divide the key space.

The keys are 13 bytes. The key is formed by concatenating:
1 byte kafka partition
8 byte random int
4 byte timestamp (second level granularity)

The workload is 100% write for now. There are about 1M writes per second with a total data volume of .6GB per second.

I find that my application is spending the majority of its CPU time (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate (), which is in turn spending most of its time calling org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion ().

Attached are two images showing the performance of my application. The first is an overview showing that my application is spending a lot of in mutate. The next is a deep dive into the functions that mutate is calling internally.

I am very surprised to see this function taking so long. My intuition is that all this needs to do is:
1) Determine which region the Mutation belongs in
2) Append the Mutation to a queue for async write to HBase.


Any thoughts, comments of suggestions from the community would be much appreciated! I’m really hoping to improve the performance profile here so that my CPU can be freed up.

Thanks,

Mike Freyberger

Re: Hbase mutate is hogging my CPU

Posted by Mike Freyberger <mf...@appnexus.com>.
I'm currently doing about 10K Puts/Sec/Mutator. And there are 10 mutators running on each machine. So, each machine is doing 100K Puts/Sec.

To put these results in context, the application is also doing updating an in memory dictionary for every put that it sends to HBase. I can see that my client code accessing a simple HashMap is significantly less than the time it takes to access the region locations.

Time spent updating my local in memory cache is .07% of CPU time.
Time spent running mutate is 46% of CPU time. 

I'd expect both of these operations to take a similar amount of time because they are doing a look up based on the same key.

Please let me know if there is any more context I can provide! 

Mike


On 8/7/18, 1:17 PM, "Josh Elser" <el...@apache.org> wrote:

    Thanks for bringing this out to the mailing list, Mike.
    
    https://issues.apache.org/jira/browse/HBASE-14708 is the Jira issue that 
    came to mind -- this was an area that Elliott was trying to optimize, 
    but it seems like your profiling is showing that we're spending a 
    significant amount of time to submit your mutations just finding the 
    Region that they need to go to.
    
    What kind of "Put" throughput are you getting? The context might be 
    important -- 10K Puts/sec is a bit different than 100K Puts/sec...
    
    I think we need to understand if the performance you're getting out of 
    HBase is reasonable (e.g. is the Client working as we'd expect), and 
    then dig in to understand if you're running a fundamentally different 
    kind of experiment (or seeing different results) than Elliott did.
    
    On 8/7/18 11:39 AM, Mike Freyberger wrote:
    > The version on the Hbase servers is CDH 5.14.2 (https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.14.2.releasenotes.html).
    > 
    > The perf images can be found here: https://drive.google.com/drive/u/0/folders/1s3RGzWhb-IN16dLal9Kzs_nLCxJBLB-U
    > 
    > I can dig into the server that is hosting hbase:meta, but I am under the impression that all of this strictly on the client side. These snapshots were taken once the application had been up for a while and all of the hbase region information should be cached client side. The screenshot of the mutators internal calls also do not have any network calls in them, so I'd expect this is not related to the current status of the region server hosting hbase:meta.
    > 
    > On 8/7/18, 11:23 AM, "Ted Yu" <yu...@gmail.com> wrote:
    > 
    >      There have been several releases for Hbase 1.2
    >      Which release are you using ?
    >      
    >      The images you sent didn't go through the mailing list. Please consider
    >      using third party site for delivery.
    >      
    >      Have you taken a look at what the server hosting hbase:meta was doing
    >      during this period of time ?
    >      
    >      Thanks
    >      
    >      On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger <mf...@appnexus.com>
    >      wrote:
    >      
    >      > Kafka Dev,
    >      >
    >      >
    >      >
    >      > I’d love some help investigating a slow Hbase mutator.
    >      >
    >      >
    >      >
    >      > The cluster is Hbase 1.2 and cluster has 22 region servers. The region
    >      > servers are pretty big: 24 cores, 126 GB RAM.
    >      >
    >      >
    >      >
    >      > The cluster has 2 tables, each only have 1 column family. Both tables have
    >      > the same pre splits.
    >      >
    >      >
    >      >
    >      > Each table is pre split into 400 regions. The split keys are all 2 bytes
    >      > and evenly divide the key space.
    >      >
    >      >
    >      >
    >      > The keys are 13 bytes. The key is formed by concatenating:
    >      >
    >      > 1 byte kafka partition
    >      >
    >      > 8 byte random int
    >      >
    >      > 4 byte timestamp (second level granularity)
    >      >
    >      > The workload is 100% write for now. There are about 1M writes per second
    >      > with a total data volume of .6GB per second.
    >      >
    >      >
    >      >
    >      > I find that my application is spending the majority of its CPU time
    >      > (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate
    >      > (), which is in turn spending most of its time calling
    >      > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion
    >      > ().
    >      >
    >      >
    >      >
    >      > Attached are two images showing the performance of my application. The
    >      > first is an overview showing that my application is spending a lot of in
    >      > mutate. The next is a deep dive into the functions that mutate is calling
    >      > internally.
    >      >
    >      > I am very surprised to see this function taking so long. My intuition is
    >      > that all this needs to do is:
    >      > 1) Determine which region the Mutation belongs in
    >      > 2) Append the Mutation to a queue for async write to HBase.
    >      >
    >      >
    >      > Any thoughts, comments of suggestions from the community would be much
    >      > appreciated! I’m really hoping to improve the performance profile here so
    >      > that my CPU can be freed up.
    >      >
    >      >
    >      >
    >      > Thanks,
    >      >
    >      >
    >      >
    >      > Mike Freyberger
    >      >
    >      
    > 
    


Re: Hbase mutate is hogging my CPU

Posted by Josh Elser <el...@apache.org>.
Thanks for bringing this out to the mailing list, Mike.

https://issues.apache.org/jira/browse/HBASE-14708 is the Jira issue that 
came to mind -- this was an area that Elliott was trying to optimize, 
but it seems like your profiling is showing that we're spending a 
significant amount of time to submit your mutations just finding the 
Region that they need to go to.

What kind of "Put" throughput are you getting? The context might be 
important -- 10K Puts/sec is a bit different than 100K Puts/sec...

I think we need to understand if the performance you're getting out of 
HBase is reasonable (e.g. is the Client working as we'd expect), and 
then dig in to understand if you're running a fundamentally different 
kind of experiment (or seeing different results) than Elliott did.

On 8/7/18 11:39 AM, Mike Freyberger wrote:
> The version on the Hbase servers is CDH 5.14.2 (https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.14.2.releasenotes.html).
> 
> The perf images can be found here: https://drive.google.com/drive/u/0/folders/1s3RGzWhb-IN16dLal9Kzs_nLCxJBLB-U
> 
> I can dig into the server that is hosting hbase:meta, but I am under the impression that all of this strictly on the client side. These snapshots were taken once the application had been up for a while and all of the hbase region information should be cached client side. The screenshot of the mutators internal calls also do not have any network calls in them, so I'd expect this is not related to the current status of the region server hosting hbase:meta.
> 
> On 8/7/18, 11:23 AM, "Ted Yu" <yu...@gmail.com> wrote:
> 
>      There have been several releases for Hbase 1.2
>      Which release are you using ?
>      
>      The images you sent didn't go through the mailing list. Please consider
>      using third party site for delivery.
>      
>      Have you taken a look at what the server hosting hbase:meta was doing
>      during this period of time ?
>      
>      Thanks
>      
>      On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger <mf...@appnexus.com>
>      wrote:
>      
>      > Kafka Dev,
>      >
>      >
>      >
>      > I’d love some help investigating a slow Hbase mutator.
>      >
>      >
>      >
>      > The cluster is Hbase 1.2 and cluster has 22 region servers. The region
>      > servers are pretty big: 24 cores, 126 GB RAM.
>      >
>      >
>      >
>      > The cluster has 2 tables, each only have 1 column family. Both tables have
>      > the same pre splits.
>      >
>      >
>      >
>      > Each table is pre split into 400 regions. The split keys are all 2 bytes
>      > and evenly divide the key space.
>      >
>      >
>      >
>      > The keys are 13 bytes. The key is formed by concatenating:
>      >
>      > 1 byte kafka partition
>      >
>      > 8 byte random int
>      >
>      > 4 byte timestamp (second level granularity)
>      >
>      > The workload is 100% write for now. There are about 1M writes per second
>      > with a total data volume of .6GB per second.
>      >
>      >
>      >
>      > I find that my application is spending the majority of its CPU time
>      > (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate
>      > (), which is in turn spending most of its time calling
>      > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion
>      > ().
>      >
>      >
>      >
>      > Attached are two images showing the performance of my application. The
>      > first is an overview showing that my application is spending a lot of in
>      > mutate. The next is a deep dive into the functions that mutate is calling
>      > internally.
>      >
>      > I am very surprised to see this function taking so long. My intuition is
>      > that all this needs to do is:
>      > 1) Determine which region the Mutation belongs in
>      > 2) Append the Mutation to a queue for async write to HBase.
>      >
>      >
>      > Any thoughts, comments of suggestions from the community would be much
>      > appreciated! I’m really hoping to improve the performance profile here so
>      > that my CPU can be freed up.
>      >
>      >
>      >
>      > Thanks,
>      >
>      >
>      >
>      > Mike Freyberger
>      >
>      
> 

Re: Hbase mutate is hogging my CPU

Posted by Mike Freyberger <mf...@appnexus.com>.
The version on the Hbase servers is CDH 5.14.2 (https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.14.2.releasenotes.html).

The perf images can be found here: https://drive.google.com/drive/u/0/folders/1s3RGzWhb-IN16dLal9Kzs_nLCxJBLB-U

I can dig into the server that is hosting hbase:meta, but I am under the impression that all of this strictly on the client side. These snapshots were taken once the application had been up for a while and all of the hbase region information should be cached client side. The screenshot of the mutators internal calls also do not have any network calls in them, so I'd expect this is not related to the current status of the region server hosting hbase:meta. 

On 8/7/18, 11:23 AM, "Ted Yu" <yu...@gmail.com> wrote:

    There have been several releases for Hbase 1.2
    Which release are you using ?
    
    The images you sent didn't go through the mailing list. Please consider
    using third party site for delivery.
    
    Have you taken a look at what the server hosting hbase:meta was doing
    during this period of time ?
    
    Thanks
    
    On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger <mf...@appnexus.com>
    wrote:
    
    > Kafka Dev,
    >
    >
    >
    > I’d love some help investigating a slow Hbase mutator.
    >
    >
    >
    > The cluster is Hbase 1.2 and cluster has 22 region servers. The region
    > servers are pretty big: 24 cores, 126 GB RAM.
    >
    >
    >
    > The cluster has 2 tables, each only have 1 column family. Both tables have
    > the same pre splits.
    >
    >
    >
    > Each table is pre split into 400 regions. The split keys are all 2 bytes
    > and evenly divide the key space.
    >
    >
    >
    > The keys are 13 bytes. The key is formed by concatenating:
    >
    > 1 byte kafka partition
    >
    > 8 byte random int
    >
    > 4 byte timestamp (second level granularity)
    >
    > The workload is 100% write for now. There are about 1M writes per second
    > with a total data volume of .6GB per second.
    >
    >
    >
    > I find that my application is spending the majority of its CPU time
    > (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate
    > (), which is in turn spending most of its time calling
    > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion
    > ().
    >
    >
    >
    > Attached are two images showing the performance of my application. The
    > first is an overview showing that my application is spending a lot of in
    > mutate. The next is a deep dive into the functions that mutate is calling
    > internally.
    >
    > I am very surprised to see this function taking so long. My intuition is
    > that all this needs to do is:
    > 1) Determine which region the Mutation belongs in
    > 2) Append the Mutation to a queue for async write to HBase.
    >
    >
    > Any thoughts, comments of suggestions from the community would be much
    > appreciated! I’m really hoping to improve the performance profile here so
    > that my CPU can be freed up.
    >
    >
    >
    > Thanks,
    >
    >
    >
    > Mike Freyberger
    >
    


Re: Hbase mutate is hogging my CPU

Posted by Ted Yu <yu...@gmail.com>.
There have been several releases for Hbase 1.2
Which release are you using ?

The images you sent didn't go through the mailing list. Please consider
using third party site for delivery.

Have you taken a look at what the server hosting hbase:meta was doing
during this period of time ?

Thanks

On Tue, Aug 7, 2018 at 8:04 AM Mike Freyberger <mf...@appnexus.com>
wrote:

> Kafka Dev,
>
>
>
> I’d love some help investigating a slow Hbase mutator.
>
>
>
> The cluster is Hbase 1.2 and cluster has 22 region servers. The region
> servers are pretty big: 24 cores, 126 GB RAM.
>
>
>
> The cluster has 2 tables, each only have 1 column family. Both tables have
> the same pre splits.
>
>
>
> Each table is pre split into 400 regions. The split keys are all 2 bytes
> and evenly divide the key space.
>
>
>
> The keys are 13 bytes. The key is formed by concatenating:
>
> 1 byte kafka partition
>
> 8 byte random int
>
> 4 byte timestamp (second level granularity)
>
> The workload is 100% write for now. There are about 1M writes per second
> with a total data volume of .6GB per second.
>
>
>
> I find that my application is spending the majority of its CPU time
> (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate
> (), which is in turn spending most of its time calling
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion
> ().
>
>
>
> Attached are two images showing the performance of my application. The
> first is an overview showing that my application is spending a lot of in
> mutate. The next is a deep dive into the functions that mutate is calling
> internally.
>
> I am very surprised to see this function taking so long. My intuition is
> that all this needs to do is:
> 1) Determine which region the Mutation belongs in
> 2) Append the Mutation to a queue for async write to HBase.
>
>
> Any thoughts, comments of suggestions from the community would be much
> appreciated! I’m really hoping to improve the performance profile here so
> that my CPU can be freed up.
>
>
>
> Thanks,
>
>
>
> Mike Freyberger
>

Re: Hbase mutate is hogging my CPU

Posted by Mike Freyberger <mf...@appnexus.com>.
Apologies for addressing this to Kafka Dev. That should have been HBase Dev.

From: Mike Freyberger <mf...@appnexus.com>
Date: Tuesday, August 7, 2018 at 11:04 AM
To: "dev@hbase.apache.org" <de...@hbase.apache.org>
Subject: Hbase mutate is hogging my CPU

Kafka Dev,

I’d love some help investigating a slow Hbase mutator.

The cluster is Hbase 1.2 and cluster has 22 region servers. The region servers are pretty big: 24 cores, 126 GB RAM.

The cluster has 2 tables, each only have 1 column family. Both tables have the same pre splits.

Each table is pre split into 400 regions. The split keys are all 2 bytes and evenly divide the key space.

The keys are 13 bytes. The key is formed by concatenating:
1 byte kafka partition
8 byte random int
4 byte timestamp (second level granularity)

The workload is 100% write for now. There are about 1M writes per second with a total data volume of .6GB per second.

I find that my application is spending the majority of its CPU time (71.7%) calling org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate (), which is in turn spending most of its time calling org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion ().

Attached are two images showing the performance of my application. The first is an overview showing that my application is spending a lot of in mutate. The next is a deep dive into the functions that mutate is calling internally.

I am very surprised to see this function taking so long. My intuition is that all this needs to do is:
1) Determine which region the Mutation belongs in
2) Append the Mutation to a queue for async write to HBase.


Any thoughts, comments of suggestions from the community would be much appreciated! I’m really hoping to improve the performance profile here so that my CPU can be freed up.

Thanks,

Mike Freyberger