Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2022/08/06 14:20:00 UTC

[jira] [Commented] (HBASE-27276) Reduce reflection usage in Filter deserialization

    [ https://issues.apache.org/jira/browse/HBASE-27276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576240#comment-17576240 ] 

Bryan Beaudreault commented on HBASE-27276:
-------------------------------------------

It's not super straightforward to entirely remove reflection. I realized that caching alone wouldn't help much, because we'd still need reflection to invoke the parseFrom method. I've been doing some research and learned about LambdaMetafactory, which uses the same machinery that backs lambdas to create much more performant reflective call sites. For example, at startup we can use reflection once to find all filters, iterate over them, and create a lambda function for each one's parseFrom. I wrote up a POC of this and benchmarked it using jmh:

 
{code:java}
Benchmark                                          Mode  Cnt     Score     Error  Units
BenchmarkFilterDeserialization.testWithCreator     avgt    5  1327.474 ±  84.116  ns/op
BenchmarkFilterDeserialization.testWithoutCreator  avgt    5  1943.972 ± 193.370  ns/op {code}
 

This benchmark repeatedly deserialized the same filter through the two code paths. The filter in question was a simple FilterList with 2 inner filters. As the numbers show, the LambdaMetafactory path cut the time per op by roughly 32% (about 1944 ns down to 1327 ns, a ~1.5x speedup). I think the savings would grow with larger and more complicated filters, but more testing is needed. I also haven't yet converted the Comparator parseFrom calls (which one of the filters in this test uses), so there are further savings available there.
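For readers unfamiliar with LambdaMetafactory, here is a minimal, self-contained sketch of the technique described above. It is not the HBase POC: MyFilter and buildCreator are hypothetical stand-ins for the real Filter classes and their static parseFrom(byte[]) factories, used only to show how a one-time reflective lookup is turned into a reusable Function that avoids per-call Method.invoke overhead.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

public class LambdaMetafactoryDemo {

    // Hypothetical stand-in for a Filter class exposing a static
    // parseFrom(byte[]) factory, like the HBase filter classes do.
    static class MyFilter {
        final int serializedLength;
        MyFilter(int serializedLength) { this.serializedLength = serializedLength; }
        public static MyFilter parseFrom(byte[] bytes) {
            return new MyFilter(bytes.length);
        }
    }

    // Done once (e.g. at startup, per filter class): wrap the static
    // parseFrom method in a Function, so deserialization hot paths can
    // call it like an ordinary lambda instead of via Method.invoke.
    static Function<byte[], MyFilter> buildCreator() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle parseFrom = lookup.findStatic(
            MyFilter.class, "parseFrom",
            MethodType.methodType(MyFilter.class, byte[].class));

        CallSite site = LambdaMetafactory.metafactory(
            lookup,
            "apply",                                              // SAM method on Function
            MethodType.methodType(Function.class),                // factory returns a Function
            MethodType.methodType(Object.class, Object.class),    // erased Function.apply signature
            parseFrom,                                            // implementation to delegate to
            MethodType.methodType(MyFilter.class, byte[].class)); // specialized apply signature

        @SuppressWarnings("unchecked")
        Function<byte[], MyFilter> creator =
            (Function<byte[], MyFilter>) site.getTarget().invokeExact();
        return creator;
    }

    public static void main(String[] args) throws Throwable {
        Function<byte[], MyFilter> creator = buildCreator();
        MyFilter f = creator.apply(new byte[] {1, 2, 3});
        System.out.println(f.serializedLength); // prints 3
    }
}
```

In a real registry you would build one such Function per filter class discovered via reflection and store them in a map keyed by class name, so only the initial scan pays the reflection cost.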

> Reduce reflection usage in Filter deserialization
> -------------------------------------------------
>
>                 Key: HBASE-27276
>                 URL: https://issues.apache.org/jira/browse/HBASE-27276
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Priority: Major
>         Attachments: async-prof-pid-9037-cpu-1.html
>
>
> Running hbase 2.4.x, I recently profiled one of our clusters which does a very high volume of random reads. An astonishing 12% of CPU time was just spent deserializing in ProtobufUtil.toFilter.
> One immediate thought would be to cache String -> Class mappings. Currently Class.forName shows up multiple times (6 in my example) in the profile, each time taking over 1%. I think this is partially due to using FilterList in this example.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)