Posted to user@hbase.apache.org by Michael Dagaev <mi...@gmail.com> on 2008/10/13 11:54:16 UTC
Questions on client API
Hi All,
As I understand it, the HTable class uses the HConnectionManager class,
which holds connections to the master and region servers.
The connections are pooled as entries in a thread-safe static table
(map). Thus, a client application does not need to manage connection
pooling itself. Is that correct?
Can several threads share the same instance of HBaseConfiguration? Of HTable?
Thank you for your cooperation,
M.
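A minimal sketch of the "thread-safe static map" idea described above. The names SketchConnection and SketchConnectionRegistry are hypothetical stand-ins for illustration only; this is not the HConnectionManager code, and keying by a cluster string is an assumption.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a connection to a cluster; not an HBase class.
final class SketchConnection {
    final String clusterKey;

    SketchConnection(String clusterKey) {
        this.clusterKey = clusterKey;
    }
}

// Hypothetical registry: one shared connection per key, held in a static map.
final class SketchConnectionRegistry {
    private static final ConcurrentMap<String, SketchConnection> CONNECTIONS =
        new ConcurrentHashMap<>();

    private SketchConnectionRegistry() {}

    // Returns the shared connection for the key. computeIfAbsent is atomic on
    // ConcurrentHashMap, so concurrent callers get the same instance.
    static SketchConnection getConnection(String clusterKey) {
        return CONNECTIONS.computeIfAbsent(clusterKey, SketchConnection::new);
    }
}
```

With a registry of this shape, callers never create or pool connections themselves; they only ask for one by key.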
Re: Questions on client API
Posted by stack <st...@duboce.net>.
Michael Dagaev wrote:
> May several threads share the same instance of HbaseConfiguration ? HTable?
If you can stomach it (the commentary wanders), see HBASE-576,
starting at about "stack - 03/Oct/08 11:45 PM". It might be of interest
if you are trying to write a multithreaded HBase client.
St.Ack
RE: Questions on client API
Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
With respect to Configuration:
All gets go through:
    private synchronized Properties getProps()
All sets go through:
    private synchronized Properties getOverlay()
    private synchronized Properties getProps()
All addResource calls go through:
    private synchronized void addResource(ArrayList<Object> resources, Object resource)
So I think Configuration is thread-safe.
I think that sharing scanners across threads is probably not a good idea
in the first place, but you are correct that the cache is not synchronized.
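A rough sketch of the pattern described above, in which every public get and set is routed through a private synchronized accessor. SketchConfig is a hypothetical class written for illustration, not Hadoop's Configuration, and the lazy initialization is an assumption.

```java
import java.util.Properties;

// Hypothetical configuration holder; not Hadoop's Configuration class.
final class SketchConfig {
    private Properties props; // built lazily on first access

    // All reads and writes obtain the Properties object through this
    // synchronized accessor, so lazy initialization happens only once.
    private synchronized Properties getProps() {
        if (props == null) {
            props = new Properties();
        }
        return props;
    }

    public String get(String name) {
        return getProps().getProperty(name);
    }

    public void set(String name, String value) {
        getProps().setProperty(name, value);
    }
}
```

Note that the synchronized accessor only guards how the Properties reference is obtained. What happens afterwards depends on the returned object; here java.util.Properties is itself thread-safe for individual get and set calls, which is why the sketch holds together.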
---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> -----Original Message-----
> From: Andrew Purtell [mailto:apurtell@yahoo.com]
> Sent: Monday, October 13, 2008 11:14 AM
> To: hbase-user@hadoop.apache.org
> Subject: RE: Questions on client API
>
> Configuration uses unsynchronized lists and hash set. On the other hand if
> it is used in a read-only manner after initialization that would be ok.
>
> I don't think sharing a scanner across threads would be ok because of the
> cache of RowResults as an unsynchronized linked list. Otherwise HTable
> looks ok to me.
>
> Am I being overly conservative?
RE: Questions on client API
Posted by Andrew Purtell <ap...@yahoo.com>.
Configuration uses unsynchronized lists and a hash set. On the other hand, if it is used in a read-only manner after initialization, that would be OK.
I don't think sharing a scanner across threads would be OK, because its cache of RowResults is an unsynchronized linked list. Otherwise HTable looks OK to me.
Am I being overly conservative?
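One way to sidestep the unsynchronized scanner cache is to confine each scanner to the thread that created it. The sketch below assumes a hypothetical SketchScanner with a LinkedList cache; it is not the HBase scanner, and the batch size of 3 is arbitrary.

```java
import java.util.LinkedList;

// Hypothetical scanner with an unsynchronized cache; not an HBase class.
final class SketchScanner {
    private final LinkedList<String> cache = new LinkedList<>();
    private final int endRow;
    private int nextRow;

    SketchScanner(int startRow, int endRow) {
        this.nextRow = startRow;
        this.endRow = endRow;
    }

    // Returns the next row, or null when the range is exhausted.
    String next() {
        if (cache.isEmpty()) {
            refill();
        }
        return cache.poll();
    }

    // Loads up to three rows into the cache, standing in for a server fetch.
    private void refill() {
        for (int i = 0; i < 3 && nextRow < endRow; i++) {
            cache.add("row-" + nextRow++);
        }
    }
}

final class PerThreadScan {
    // The scanner is created and used inside this method only, so it never
    // escapes the calling thread and its cache needs no locking.
    static int countRows(int startRow, int endRow) {
        SketchScanner scanner = new SketchScanner(startRow, endRow);
        int count = 0;
        while (scanner.next() != null) {
            count++;
        }
        return count;
    }
}
```

Each worker thread would call countRows with its own row range, so no two threads ever touch the same cache.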
--- On Mon, 10/13/08, Jim Kellerman (POWERSET) <Ji...@microsoft.com> wrote:
> From: Jim Kellerman (POWERSET) <Ji...@microsoft.com>
> Subject: RE: Questions on client API
> To: "hbase-user@hadoop.apache.org" <hb...@hadoop.apache.org>
> Date: Monday, October 13, 2008, 10:48 AM
> Andrew,
>
> What methods in HBaseConfiguration and HTable do you think
> are not re-entrant?
RE: Questions on client API
Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
Andrew,
What methods in HBaseConfiguration and HTable do you think
are not re-entrant?
---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> -----Original Message-----
> From: Michael Dagaev [mailto:michael.dagaev@gmail.com]
> Sent: Monday, October 13, 2008 10:24 AM
> To: hbase-user@hadoop.apache.org; apurtell@apache.org
> Subject: Re: Questions on client API
>
> Hi Andrew
>
> Hmmm ...I would not like to instantiate HBaseConfiguration per
> thread. I would prefer to create it once per application so many
> threads will use it concurrently.
>
> Thank you for pointing out this issue. I will check the code.
> M.
Re: Questions on client API
Posted by Michael Dagaev <mi...@gmail.com>.
Hi Andrew
Hmmm... I would rather not instantiate HBaseConfiguration per
thread. I would prefer to create it once per application, so that many
threads use it concurrently.
Thank you for pointing out this issue. I will check the code.
M.
On Mon, Oct 13, 2008 at 6:54 PM, Andrew Purtell <ap...@yahoo.com> wrote:
> Hello Michael,
>
> Your understanding regarding connection pooling is correct.
>
> Looking at the code, I see that some methods of
> HBaseConfiguration and HTable are not fully reentrant, so I
> would not share them across multiple threads, or at least I
> would explicitly synchronize access to them.
>
> - Andy
Re: Questions on client API
Posted by Andrew Purtell <ap...@yahoo.com>.
Hello Michael,
Your understanding regarding connection pooling is correct.
Looking at the code, I see that some methods of
HBaseConfiguration and HTable are not fully reentrant, so I
would not share them across multiple threads, or at least I
would explicitly synchronize access to them.
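A minimal sketch of what explicitly synchronizing access could look like. SketchTable is a hypothetical, deliberately non-thread-safe stand-in for a shared client object (it is not HTable), and SharedTableWriter guards every call with one lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical non-thread-safe client object; not an HBase class.
final class SketchTable {
    private final List<String> writes = new ArrayList<>();

    void put(String row) {
        writes.add(row);
    }

    int size() {
        return writes.size();
    }
}

// Wrapper that serializes all access to the shared table through one lock.
final class SharedTableWriter {
    private final SketchTable table = new SketchTable();
    private final Object lock = new Object();

    void put(String row) {
        synchronized (lock) {
            table.put(row);
        }
    }

    int size() {
        synchronized (lock) {
            return table.size();
        }
    }

    // Starts the given number of writer threads against one shared wrapper
    // and returns the number of rows recorded after all of them finish.
    static int runWriters(int threads, int perThread) {
        SharedTableWriter writer = new SharedTableWriter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    writer.put("row-" + id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return writer.size();
    }
}
```

The trade-off is that all threads contend on one lock; giving each thread its own instance avoids the contention at the cost of more objects.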
- Andy
> From: Michael Dagaev <mi...@gmail.com>
> Subject: Questions on client API
> To: hbase-user@hadoop.apache.org
> Date: Monday, October 13, 2008, 2:54 AM
> Hi All
>
> As I understand, the HTable class uses
> HConnectionManager class, which holds connections to the
> master and region servers. The connections are pooled as
> entries in a thread-safe static table (map). Thus, a
> client application should not care about connection
> pooling. Is it correct?
>
> May several threads share the same instance of
> HbaseConfiguration ? HTable?
>
> Thank you for your cooperation,
> M.