Posted to issues@hbase.apache.org by "Bingbing Wang (JIRA)" <ji...@apache.org> on 2017/02/21 05:44:44 UTC

[jira] [Comment Edited] (HBASE-17671) HBase Thrift2 OutOfMemory

    [ https://issues.apache.org/jira/browse/HBASE-17671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875393#comment-15875393 ] 

Bingbing Wang edited comment on HBASE-17671 at 2/21/17 5:43 AM:
----------------------------------------------------------------

Yes, I have checked the hprof, and most of the heap is taken up by writeBuffer instances in org.apache.thrift.transport.TFramedTransport. Most of these writeBuffers exceed 128 MB. I am very curious why such large writeBuffers are allocated and not recycled in time. Please see the attached ClassHistogram.png.
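
For illustration, here is a minimal Java sketch (a simplified model, not the actual libthrift code) of why a framed transport can pin such large buffers: each connection reuses one write buffer that grows to hold an entire response frame, and resetting it keeps the enlarged backing array, so a single 128 MB reply leaves 128 MB allocated for the connection's lifetime. The class name and sizes below are made up for the example.

    import java.io.ByteArrayOutputStream;

    // Simplified model of a per-connection framed-transport write buffer.
    public class FramedWriteBufferModel {
        // One buffer per open client connection, reused across calls.
        private final ByteArrayOutputStream writeBuffer = new ByteArrayOutputStream(1024);

        // The whole response frame is accumulated in memory before flushing.
        public void writeResponse(byte[] payload) {
            writeBuffer.write(payload, 0, payload.length); // backing array grows to payload size
        }

        public int flushFrame() {
            int frameSize = writeBuffer.size();
            // reset() clears the count but keeps the enlarged backing array,
            // so the memory is only reclaimed when the connection is closed.
            writeBuffer.reset();
            return frameSize;
        }

        public static void main(String[] args) {
            FramedWriteBufferModel conn = new FramedWriteBufferModel();
            conn.writeResponse(new byte[128 * 1024 * 1024]); // one large scan reply
            System.out.println("flushed frame of " + conn.flushFrame() + " bytes");
            // The 128 MB backing array is still referenced by writeBuffer here;
            // a few dozen such connections can exhaust a 4 GB heap.
        }
    }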

Yes, we close scanners in time; we can confirm this. We rely on C++ destructors, which run automatically when a scanner leaves scope, to ensure that every scanner is closed. We have fixed bugs of that kind before, so there should be no scanner leak in our application.
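
In Java terms, the same guarantee can be sketched with try/finally (our client is C++, so this is only an analogue for illustration; the openScanner/getScannerRows/closeScanner names follow the thrift2 hbase.thrift IDL, and the generated-client signatures used here are assumptions):

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.List;
    import org.apache.hadoop.hbase.thrift2.generated.THBaseService;
    import org.apache.hadoop.hbase.thrift2.generated.TResult;
    import org.apache.hadoop.hbase.thrift2.generated.TScan;
    import org.apache.thrift.TException;

    public class ScanWithGuaranteedClose {
        // Open a scanner, drain it in small batches, and always close it,
        // mirroring what the C++ destructor does when the scanner leaves scope.
        public static void scanAll(THBaseService.Client client, String table) throws TException {
            int scannerId = client.openScanner(
                ByteBuffer.wrap(table.getBytes(StandardCharsets.UTF_8)), new TScan());
            try {
                List<TResult> rows = client.getScannerRows(scannerId, 100);
                while (!rows.isEmpty()) {
                    // process rows ...
                    rows = client.getScannerRows(scannerId, 100);
                }
            } finally {
                client.closeScanner(scannerId); // runs on every exit path, so no scanner leaks
            }
        }
    }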

Previously we used CMS, but ran into many Java GC issues. Later we switched to G1GC and made some adjustments, and the issue now occurs less often than before.



> HBase Thrift2 OutOfMemory
> -------------------------
>
>                 Key: HBASE-17671
>                 URL: https://issues.apache.org/jira/browse/HBASE-17671
>             Project: HBase
>          Issue Type: Bug
>          Components: Thrift
>    Affects Versions: 0.98.6
>         Environment: Product
>            Reporter: Bingbing Wang
>            Priority: Critical
>         Attachments: hbase-site.xml, hbase-thrift2.log, log_gc.log.0.zip
>
>
> We have an HBase Thrift2 server deployed on Windows; the physical view basically looks like:
> QueryEngine <==> HBase Thrift2 <==> HBase cluster
> Here QueryEngine is a C++ application, and the HBase cluster has about 50 nodes (CDH 5.3.3, i.e. HBase version 0.98.6).
> Our Thrift2 Java options look like:
> -server -Xms4096m -Xmx4096m -XX:MaxDirectMemorySize=8192m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=4M -XX:InitiatingHeapOccupancyPercent=40 -XX:+PrintAdaptiveSizePolicy -XX:+PrintPromotionFailure -Dhbase.log.dir=d:\vhayu\thrift2\log -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:PrintFLSStatistics=1 -Xloggc:log_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=200M -Dhbase.log.file=hbase-thrift2.log  -Dhbase.home.dir=D:\vhayu\thrift2\hbase0.98 -Dhbase.id.str=root -Dlog4j.info -Dhbase.root.logger=INFO,DRFA -cp "d:\vhayu\thrift2\hbase0.98\*;d:\vhayu\thrift2\conf" org.apache.hadoop.hbase.thrift2.ThriftServer -b 127.0.0.1 -f framed start
> The symptom is that after running for some time, Thrift2 sometimes reports an OOM and a heap dump (.hprof) file is generated. This always results in high latency from the HBase cluster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)