Posted to issues@hbase.apache.org by "Minwoo Kang (Jira)" <ji...@apache.org> on 2022/07/18 07:08:00 UTC

[jira] [Created] (HBASE-27219) Change JONI encoding in RegexStringComparator

Minwoo Kang created HBASE-27219:
-----------------------------------

             Summary: Change JONI encoding in RegexStringComparator
                 Key: HBASE-27219
                 URL: https://issues.apache.org/jira/browse/HBASE-27219
             Project: HBase
          Issue Type: Bug
          Components: Filters
            Reporter: Minwoo Kang
            Assignee: Minwoo Kang
         Attachments: rs-heap.png

I changed the regex engine of RegexStringComparator to JONI.
After I sent a request with a regex filter, the RegionServer's heap memory usage spiked and the RegionServer became unresponsive due to GC.
 
!rs-heap.png|width=609,height=55!
(RegionServer Heap Memory Usage)
 
{code:java}
INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1435ms
GC pool 'ParNew' had collection(s): count=1 time=1550ms
INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1073ms
GC pool 'ParNew' had collection(s): count=1 time=1534ms
INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1456ms
GC pool 'ParNew' had collection(s): count=1 time=1574ms
INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1297ms
GC pool 'ParNew' had collection(s): count=1 time=1415ms {code}
(RegionServer Log)
 
I looked into the cause: with UTF8Encoding, JONI can reportedly enter an infinite loop when the input contains invalid UTF-8.
Trino addressed this by using NonStrictUTF8Encoding instead of UTF8Encoding.
(https://github.com/trinodb/trino/commit/ea66e8cb27b098a5cea184106fe245064351b567)

After changing the encoding of JoniRegexEngine in RegexStringComparator to NonStrictUTF8Encoding, I confirmed that the heap memory usage spike no longer occurred.
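
Since the problem is tied to invalid UTF-8 in the bytes being matched, it may help when reproducing it to check whether a given cell value is well-formed UTF-8. The following is a minimal sketch using only the JDK; it does not use JONI or HBase code, and the class name Utf8Check and the method isValidUtf8 are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of HBase or JONI): reports whether a byte
// array is a well-formed UTF-8 sequence, using the JDK's strict decoder.
public class Utf8Check {

    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw on malformed input instead of
        // silently substituting the replacement character.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Well-formed: ASCII plus a two-byte character (U+00E9).
        byte[] valid = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // Malformed: 0xE3 starts a three-byte sequence but is followed
        // by an ASCII byte rather than a continuation byte.
        byte[] invalid = {(byte) 'a', (byte) 0xE3, (byte) 'b'};

        System.out.println("valid input:   " + isValidUtf8(valid));
        System.out.println("invalid input: " + isValidUtf8(invalid));
    }
}
```

HBase row keys and cell values are arbitrary byte arrays, so values like the second one can reach the comparator even when most of the data is text.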



--
This message was sent by Atlassian Jira
(v8.20.10#820010)