Posted to user@hbase.apache.org by Michael Dagaev <mi...@gmail.com> on 2008/10/22 12:26:49 UTC

Question on Connection Pooling

Hi, All

    As I understand it, the HConnectionManager class pools table server
connections (HBASE_INSTANCES) but does not pool connections to the
master server. Is that correct? I guess it is implemented this way because
the master serves many clients and may run out of connections. On the
other hand, what if there are only a few clients? Wouldn't it make sense
to make this behavior configurable?

     I have also noticed that the table server connection pool has no
minimum or maximum number of connections. Don't you think it would be
nice to add them?

Thank you for your cooperation,
M.

Re: Question on Connection Pooling

Posted by Michael Dagaev <mi...@gmail.com>.
    I guess that if my cluster is small I can be pretty sure that
all servers are up and running for a long time. On the other hand,
you are probably right that the performance penalty of lazy connection
opening is insignificant.

Thank you,
M.

On Wed, Oct 22, 2008 at 9:18 PM, Jim Kellerman (POWERSET)
<Ji...@microsoft.com> wrote:
> Well, that is hard to know up front. And even harder to maintain
> as region servers may come and go (either by manual intervention
> or system restart, or system failure). If you only have one table
> I suppose we could set up connections up front, but when a connection
> object is created, so is a socket, which uses up a file handle.
> If it is done lazily, as is currently the case, you only have what
> you need, and the performance penalty you pay is (at this point)
> totally swamped with other overhead.
>
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)

RE: Question on Connection Pooling

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
Well, that is hard to know up front. And even harder to maintain
as region servers may come and go (either by manual intervention
or system restart, or system failure). If you only have one table
I suppose we could set up connections up front, but when a connection
object is created, so is a socket, which uses up a file handle.
If it is done lazily, as is currently the case, you only have what
you need, and the performance penalty you pay is (at this point)
totally swamped with other overhead.

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
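
The lazy approach described above can be illustrated with a small sketch.
This is not HBase's actual HConnectionManager code; the class names
(FakeConnection, LazyConnectionCache) and the server addresses are
hypothetical, and the "connection" only counts how often it would have
opened a socket. The point it shows is that nothing is opened until a
server is first used, and that an entry can be dropped when a region
server goes away.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a region server connection; not an HBase class. */
final class FakeConnection {
    final String serverAddress;

    FakeConnection(String serverAddress) {
        this.serverAddress = serverAddress;
    }
}

/** Opens a connection to a server only on first use, then reuses it. */
final class LazyConnectionCache {
    private final Map<String, FakeConnection> connections = new ConcurrentHashMap<>();
    private final AtomicInteger opened = new AtomicInteger();

    FakeConnection get(String serverAddress) {
        // computeIfAbsent creates the connection at most once per address.
        return connections.computeIfAbsent(serverAddress, addr -> {
            opened.incrementAndGet(); // a real client would open a socket here
            return new FakeConnection(addr);
        });
    }

    /** Drops the entry for a server that went away; the next get() reconnects. */
    void invalidate(String serverAddress) {
        connections.remove(serverAddress);
    }

    int openedCount() {
        return opened.get();
    }
}

class LazyConnectionDemo {
    public static void main(String[] args) {
        LazyConnectionCache cache = new LazyConnectionCache();
        cache.get("rs1:60020");
        cache.get("rs1:60020"); // reused, no second connection
        cache.get("rs2:60020");
        System.out.println(cache.openedCount()); // prints 2
    }
}
```

With this shape, a client that never touches a given server never spends
a file handle on it, which is the trade-off discussed in this message.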


> -----Original Message-----
> From: Michael Dagaev [mailto:michael.dagaev@gmail.com]
> Sent: Wednesday, October 22, 2008 12:00 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Question on Connection Pooling
>
>     Jim, thanks. I got it.
>
>     What about the minimum number of cached connections?
> Let's say I know that my client will access ALL table servers.
> In this case, I guess, it is more efficient to open connections
> to the table servers upfront upon the client startup.
>
> Thank you for your cooperation,
> M.


Re: Question on Connection Pooling

Posted by Michael Dagaev <mi...@gmail.com>.
    Jim, thanks. I got it.

    What about the minimum number of cached connections?
Let's say I know that my client will access ALL table servers.
In this case, I guess, it is more efficient to open connections
to the table servers upfront upon the client startup.

Thank you for your cooperation,
M.

On Wed, Oct 22, 2008 at 8:24 PM, Jim Kellerman (POWERSET)
<Ji...@microsoft.com> wrote:
> Connections to the master are not pooled by the client because
> the only time the client needs to talk to the master is to get
> the root region location (which is then cached).
>
> Making the maximum number of cached connections configurable is
> an entirely different subject. In practice, caching a "connection"
> is not expensive as the connection will close its socket after
> a configurable time interval of inactivity. So if later, the
> client needs to talk to a cached "connection", it merely has to
> reopen the socket rather than go through all the other setup.
>
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)

RE: Question on Connection Pooling

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
Connections to the master are not pooled by the client because
the only time the client needs to talk to the master is to get
the root region location (which is then cached).

Making the maximum number of cached connections configurable is
an entirely different subject. In practice, caching a "connection"
is not expensive as the connection will close its socket after
a configurable time interval of inactivity. So if later, the
client needs to talk to a cached "connection", it merely has to
reopen the socket rather than go through all the other setup.

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
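
The idea that a cached "connection" closes its socket after a period of
inactivity and reopens it on demand can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the HBase
client implementation: IdleClosingConnection, its methods, and the
injected clock are all hypothetical, and the socket is simulated with a
boolean flag and a counter.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical cached "connection": the object stays cached, but its socket
 * is closed after idleTimeoutMillis without use and reopened on demand.
 */
final class IdleClosingConnection {
    private final long idleTimeoutMillis;
    private final LongSupplier clock; // injected so the behaviour is testable
    private boolean socketOpen = false;
    private long lastUsedMillis = 0;
    private int socketOpens = 0;

    IdleClosingConnection(long idleTimeoutMillis, LongSupplier clock) {
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.clock = clock;
    }

    /** Simulates one request; reopens the socket if it was closed. */
    synchronized void call() {
        closeIfIdle();
        if (!socketOpen) {
            socketOpen = true; // a real client would create the socket here
            socketOpens++;
        }
        lastUsedMillis = clock.getAsLong();
    }

    /** Closes the socket once the idle interval has elapsed. */
    synchronized void closeIfIdle() {
        if (socketOpen && clock.getAsLong() - lastUsedMillis >= idleTimeoutMillis) {
            socketOpen = false;
        }
    }

    synchronized boolean isSocketOpen() {
        return socketOpen;
    }

    synchronized int socketOpens() {
        return socketOpens;
    }
}

class IdleClosingDemo {
    public static void main(String[] args) {
        long[] now = {0};
        IdleClosingConnection conn = new IdleClosingConnection(1000, () -> now[0]);
        conn.call();                             // opens the socket
        now[0] = 500;
        conn.call();                             // within the timeout, socket reused
        now[0] = 2000;
        conn.closeIfIdle();                      // idle for 1500 ms, socket closed
        System.out.println(conn.isSocketOpen()); // prints false
        conn.call();                             // socket reopened, object reused
        System.out.println(conn.socketOpens());  // prints 2
    }
}
```

The cached object survives the idle period, so only the socket has to be
recreated later, not the rest of the setup.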


> -----Original Message-----
> From: Michael Dagaev [mailto:michael.dagaev@gmail.com]
> Sent: Wednesday, October 22, 2008 3:27 AM
> To: hbase-user@hadoop.apache.org
> Subject: Question on Connection Pooling
>
> Hi, All
>
>     As I understand it, the HConnectionManager class pools table server
> connections (HBASE_INSTANCES) but does not pool connections to the
> master server. Is that correct? I guess it is implemented this way because
> the master serves many clients and may run out of connections. On the
> other hand, what if there are only a few clients? Wouldn't it make sense
> to make this behavior configurable?
>
>      I have also noticed that the table server connection pool has no
> minimum or maximum number of connections. Don't you think it would be
> nice to add them?
>
> Thank you for your cooperation,
> M.


Re: How can I use PHP with HBase?

Posted by Krzysztof Szlapinski <kr...@starline.hk>.
Qian, Ling wrote:
> Dear All,
>
>   If I want PHP to use HBase as its data source, what is the solution? Thrift? How?
>
You could also try using the PHP/Java Bridge.

krzysiek


Re: Re: How can I use PHP with HBase?

Posted by "Qian, Ling" <la...@gmail.com>.
J-D and Michael,

  I will try them, many thanks!


2008-10-23

Qian, Ling

From: Michael Bieniosek
Sent: 2008-10-23 01:01:27
To: hbase-user@hadoop.apache.org; Jean-Daniel Cryans
Cc:
Subject: Re: How can I use PHP with HBase?

You can also use the REST interface.  http://wiki.apache.org/hadoop/Hbase/HbaseRest
There's a PHP client attached to https://issues.apache.org/jira/browse/HBASE-37.
-Michael

Re: How can I use PHP with HBase?

Posted by Michael Bieniosek <mi...@powerset.com>.
You can also use the REST interface.  http://wiki.apache.org/hadoop/Hbase/HbaseRest

There's a PHP client attached to https://issues.apache.org/jira/browse/HBASE-37.

-Michael

On 10/22/08 9:25 AM, "Jean-Daniel Cryans" <jd...@apache.org> wrote:

Larry,

Thrift will be the solution for you. See the example in
src/examples/thrift/DemoClient.php, and look in the wiki for details on
the IDL.

J-D



Re: How can I use PHP with HBase?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Larry,

Thrift will be the solution for you. See the example in
src/examples/thrift/DemoClient.php, and look in the wiki for details on
the IDL.

J-D

On Wed, Oct 22, 2008 at 12:20 PM, Qian, Ling <la...@gmail.com> wrote:

> Dear All,
>
>  If I want PHP to use HBase as its data source, what is the
> solution? Thrift? How?
>
>  Thank you in advance!
>
>
> 2008-10-23
>
>
>
> Larry
>

How can I use PHP with HBase?

Posted by "Qian, Ling" <la...@gmail.com>.
Dear All,

  If I want PHP to use HBase as its data source, what is the solution? Thrift? How?

  Thank you in advance!


2008-10-23 



Larry