Posted to issues@hbase.apache.org by "Sean Busbey (JIRA)" <ji...@apache.org> on 2016/02/29 16:31:18 UTC

[jira] [Comment Edited] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171987#comment-15171987 ] 

Sean Busbey edited comment on HBASE-9393 at 2/29/16 3:30 PM:
-------------------------------------------------------------

{quote}
The static boolean tells whether the underlying FS supports unbuffer call or not. Depending on the useHBaseChecksum and stream it wont change.. So there is no confusion as such abt this.
{quote}

But this is static across all instances in the same JVM. If it "depends on the stream", then we can't have a single static instance unless we know all the streams will be the same. If they'll all be the same, then we can do this in a static initializer.
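A minimal sketch of the per-instance alternative, assuming the answer really can differ from stream to stream. All names here (Unbufferable, UnbufferableStream, StreamWrapper) are hypothetical stand-ins and are not taken from the patch: the flag is computed once per wrapper in the constructor, so wrappers around different stream types cannot overwrite each other's answer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical stand-in for a "supports unbuffer" marker interface.
interface Unbufferable {
  void unbuffer();
}

// Hypothetical stream type that supports unbuffer.
class UnbufferableStream extends ByteArrayInputStream implements Unbufferable {
  UnbufferableStream() {
    super(new byte[0]);
  }

  @Override
  public void unbuffer() {
    // A real stream would release its buffers and sockets here.
  }
}

// Hypothetical wrapper: the capability is decided per instance, not per JVM.
class StreamWrapper {
  // final: any thread that sees the fully constructed wrapper sees this value.
  private final boolean canUnbuffer;

  StreamWrapper(InputStream stream) {
    this.canUnbuffer = stream instanceof Unbufferable;
  }

  boolean canUnbuffer() {
    return canUnbuffer;
  }
}
```

With this shape, two wrappers in the same JVM can report different answers, which a single static field cannot express.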

{quote}
Regarding the init of the static boolean, IMHO there is no need to worry abt the multi thread. We dont want every op to do the interfaces listing and check (String op). Even if, at begin, 2 threads do it parallely, its ok. Both will give same result only.
{quote}

This is not true. The current code is not threadsafe and will fail in some cases. If we can't do this in a static initializer or otherwise assure that the code itself will be threadsafe, then we need to mark the method as not thread safe (similar to the current non-threadsafe portion of this class) and then we need to ensure that it is only called within proper fencing _across all instances within the same JVM_.
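For illustration only, here is a sketch of the static-initializer route, assuming the answer is the same for every stream in the JVM. The patch's actual code is not reproduced in this message, so the class UnbufferSupport, its probe() method, and the Closeable check inside it are hypothetical placeholders. The nested holder class is initialized by the JVM at most once, under the class-initialization lock, so concurrent first callers cannot observe a half-initialized result the way they can with plain static fields written lazily without synchronization.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

class UnbufferSupport {
  // Counts probe runs so the once-only behavior can be observed.
  static final AtomicInteger PROBE_RUNS = new AtomicInteger();

  // Hypothetical probe: a real one would inspect the filesystem's stream class.
  // The Closeable check is only a placeholder for that interface check.
  private static boolean probe() {
    PROBE_RUNS.incrementAndGet();
    return Closeable.class.isAssignableFrom(ByteArrayInputStream.class);
  }

  // Initialization-on-demand holder: SUPPORTED is computed when Holder is
  // first used, and class initialization is serialized by the JVM.
  private static final class Holder {
    static final boolean SUPPORTED = probe();
  }

  static boolean isSupported() {
    return Holder.SUPPORTED;
  }
}
```

Every call to isSupported() after the first is a plain read of a static final field, so the interface check is not repeated per operation.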


To be clear, I am currently -1 for thread safety incorrectness.


was (Author: busbey):
{quote}
The static boolean tells whether the underlying FS supports unbuffer call or not. Depending on the useHBaseChecksum and stream it wont change.. So there is no confusion as such abt this.
{quote}

But this is static across all instances in the same JVM. If it "depends on the stream", then we can't have a single static instance unless we know all the streams will be the same. If they'll all be the same, then we can do this in a static initializer.

{quote}
Regarding the init of the static boolean, IMHO there is no need to worry abt the multi thread. We dont want every op to do the interfaces listing and check (String op). Even if, at begin, 2 threads do it parallely, its ok. Both will give same result only.
{quote}

This is not true. The current code is not threadsafe and will fail in some cases. If we can't do this in a static initializer or otherwise assure that the code itself will be threadsafe, then we need to mark the method as not thread safe (similar to the current non-threadsafe portion of this class) and then we need to ensure that it is only called within proper fencing _across all instance the same JVM_.


To be clear, I am currently -1 for thread safety incorrectness.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>            Assignee: Ashish Singhi
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, HBASE-9393.v10.patch, HBASE-9393.v11.patch, HBASE-9393.v12.patch, HBASE-9393.v13.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase cannot connect to the datanode because there are too many mapped sockets from one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve the problem; over time it will increase to 60-100K sockets in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)