Posted to user@hbase.apache.org by Ryan Rawson <ry...@gmail.com> on 2009/08/04 08:41:57 UTC

Re: HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

We should move the clients to a non-active server API, possibly the
REST one, and avoid using active sessions just for clients.  Something
to address in 0.21 I think.

As for #2, it is now recommended to run a ZooKeeper quorum instead of a
single instance.  This reduces the risk of running out of connections.

Also, the code snippet you listed is a little degenerate; we can never
fully protect ourselves from fork-bomb-like code.  Your snippet
suggests that either:
- you are creating/closing HTable instances a lot (maybe you shouldn't
  do that; consider HTablePool?), or
- you have 1024+ tables and need to access them all from one client at the same time.
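Roughly, pooling means reusing table handles instead of re-creating
them for every operation.  A hypothetical sketch in plain Java (a
generic handle pool, not the real HTablePool API) might look like:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a table-handle pool; not the actual HTablePool API.
// Handles are checked out with getTable() and returned with putTable(),
// so repeated use of the same table does not open a fresh connection each time.
class SimpleTablePool {
    private final Map<String, Deque<Object>> pools = new HashMap<>();
    private final int maxSize;

    SimpleTablePool(int maxSize) {
        this.maxSize = maxSize;
    }

    // In the real client this would hand out an HTable; a plain Object
    // stands in here so the sketch stays self-contained.
    synchronized Object getTable(String name) {
        Deque<Object> pool = pools.computeIfAbsent(name, k -> new ArrayDeque<>());
        Object handle = pool.pollFirst();
        return handle != null ? handle : new Object();
    }

    // Return the handle for reuse instead of closing it.
    synchronized void putTable(String name, Object handle) {
        Deque<Object> pool = pools.computeIfAbsent(name, k -> new ArrayDeque<>());
        if (pool.size() < maxSize) {
            pool.addFirst(handle);
        }
    }
}
```

With this pattern, the 1024-iteration loop above would reuse one
handle rather than opening and abandoning 1024 sessions.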

In the meantime, strongly consider upgrading to a cluster of 5-7 ZK
hosts.  For production, you should consider NOT running them on your
HBase/HDFS/map-reduce nodes.
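For illustration, a 5-node quorum's zoo.cfg might look roughly like
this (the hostnames are made up):

```
# zoo.cfg excerpt for a hypothetical 5-node quorum; example hostnames only
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888
server.5=zk5.example.com:2888:3888
```

Each host also needs a myid file in dataDir whose contents match its
server.N number.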

Good luck!
-ryan

On Mon, Aug 3, 2009 at 11:00 PM, Angus He<an...@gmail.com> wrote:
> Hi All,
>
> In the HBase 0.20 RC, HTable does not explicitly close the connection to
> ZooKeeper in HTable::close.
> I think it should, for two reasons:
>
> 1. It is not well-behaved: ZooKeeper only detects the lost connection
> after the next network I/O operation.
> 2. It is easy to get the ZooKeeper server stuck with exceptions like "Too
> many connections from /0:0:0:0:0:0:0:1 - max is 30" when users
> write code like:
>     for (int i = 0; i < 1024; ++i) {
>         HTable table = new HTable("foobar");
>         table.close();
>     }
>
> In the current implementation, different HTable instances share the
> same connection to ZooKeeper if they have the same HBaseConfiguration
> instance. Because of this, we cannot close the connection directly in
> HTable, but we could probably give the HConnection class
> reference-counting ability.
>
> Any comments?
>
> --
> Regards
> Angus
>
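The reference-counted HConnection that Angus proposes above could work
roughly like this; a hypothetical sketch in plain Java, not actual
HBase code (all names here are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a reference-counted shared connection;
// none of these names come from the real HBase client.
class SharedConnection {
    private int refCount = 0;
    private boolean closed = false;

    synchronized void retain() {
        refCount++;
    }

    // Tear down the underlying connection only when the last holder
    // releases its reference.
    synchronized void release() {
        if (--refCount == 0) {
            closed = true; // a real client would close the ZK session here
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}

class ConnectionManager {
    // One shared connection per configuration, mirroring the 0.20 client's
    // sharing of a ZK connection across HTables with the same configuration.
    private static final Map<String, SharedConnection> CACHE = new HashMap<>();

    static synchronized SharedConnection get(String confKey) {
        SharedConnection c = CACHE.get(confKey);
        if (c == null || c.isClosed()) {
            c = new SharedConnection();
            CACHE.put(confKey, c);
        }
        c.retain();
        return c;
    }
}
```

Each HTable would call get() on open and release() on close; the
shared ZooKeeper session would then be closed exactly once, when the
last table using it goes away.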

Re: HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

Posted by Angus He <an...@gmail.com>.
We have successfully run a quorum of 5 nodes for a few days.

A few hours earlier today, one of our developers reported a failure in
his HBase testing scripts.  I then ran a test on my own machine.
After examining the scripts and digging into some of the HBase
source code, I came up with the first email in this thread.

Thanks again for the great work.




-- 
Regards
Angus

Re: HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

Posted by Ryan Rawson <ry...@gmail.com>.
I generally avoid reading the ZooKeeper log file; it's very noisy
and I have never gotten anything useful out of it :-)

There is more work to do, and best practices to establish, around
ZooKeeper, and these will be encoded in our scripts as we figure it
all out.

The first big step is to make sure to run a quorum on a cluster, and
the startup scripts facilitate that.  Please use them!
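For reference, the relevant knobs at the time were roughly the
following; check your release's documentation for the exact names and
defaults (the hostnames below are examples):

```
# hbase-env.sh: let HBase's start/stop scripts manage the ZK quorum
export HBASE_MANAGES_ZK=true

# hbase-site.xml: the hosts that form the quorum
#   <property>
#     <name>hbase.zookeeper.quorum</name>
#     <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
#   </property>
```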

have fun!
-ryan

On Tue, Aug 4, 2009 at 12:42 AM, Angus He<an...@gmail.com> wrote:
> Thanks for the brilliant comments, Ryan.
>
> For each of this not so graceful close, zookeeper will populate its
> log file with a WARN record that just likes
> 2009-08-04 15:15:35,831 WARN
> org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
> session 0x122e34dd69b00ad due to java.io.IOException: Read error.
>
> It might be confuse some users of HBase, probably we can put some
> information about this in the documentation.
>
>
> On Tue, Aug 4, 2009 at 2:41 PM, Ryan Rawson<ry...@gmail.com> wrote:
>> We should move the clients to a non-active server API, possibly the
>> REST one, and avoid using active sessions just for clients.  Something
>> to address in 0.21 I think.
>>
>> As for #2, it is recommended now to run a quorum of zookeeper instead
>> of a single one.  This reduces the risk of running out of connections.
>>
>> Also the code snippet you listed is a little degenerate, we can never
>> fully protect ourselves from fork-bomb like code.  Your code snippet
>> suggests that:
>> - you are creating/closing HTable a lot.  Maybe you shouldn't do that?
>>  HTablePool?
>> - you have 1024+ tables, and need to access them in one client at one time.
>>
>> In the mean time, highly consider upgrading to a cluster of 5-7 ZK
>> hosts.  For production, you should consider NOT running them on your
>> HBase/HDFS/map-reduce nodes.
>>
>> Good luck!
>> -ryan
>>
>> On Mon, Aug 3, 2009 at 11:00 PM, Angus He<an...@gmail.com> wrote:
>>> Hi All,
>>>
>>> In HBase 0.20rc, HTable does not explicitly close the connection to
>>> zookeeper in HTable::close.
>>> It probably could be better.  And in my opinion, it should be for:
>>>
>>> 1. It is not well-behaved, although zookeeper is able to detect the
>>> lost connection after issuing networking I/O operation, .
>>> 2. It is easy to get zookeeper server stuck with exceptions like "Too
>>> many connections from /0:0:0:0:0     :0:0:1 - max is 30", when user
>>> write codes like:
>>>                        for (int i = 0; i < 1024; ++i) {
>>>                                HTable table = new HTable("foobar");
>>>                                table.close();
>>>                        }
>>>
>>> In the current implementation, different HTable instances share the
>>> same connection to zookeeper if they have same HBaseConfiguration
>>> instance. For this, we cannot close the connection directly in HTable,
>>> but probably we could implement HConnection class with
>>> reference-counting ability.
>>>
>>> Any comments?
>>>
>>> --
>>> Regards
>>> Angus
>>>
>>
>
>
>
> --
> Regards
> Angus
>

Re: HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

Posted by Angus He <an...@gmail.com>.
Thanks for the brilliant comments, Ryan.

For each of these not-so-graceful closes, ZooKeeper writes a WARN
record to its log file like:
2009-08-04 15:15:35,831 WARN
org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x122e34dd69b00ad due to java.io.IOException: Read error.

This might confuse some users of HBase; perhaps we could put some
information about it in the documentation.





-- 
Regards
Angus