Posted to user@hbase.apache.org by "Rose, Joseph" <Jo...@childrens.harvard.edu> on 2015/05/29 20:25:28 UTC

Connections and scalability

A colleague of mine has a question about scalability and connections to
HBase.

We’d like to use the label-based controls for our content. Those labels
are tied to users and users are specified on connections (not when getting
the HBase table, which is really too bad because if they were I wouldn’t
be writing to y’all.) Chances are we’re not going to be able to use the
built-in connection pooling because of this: every user will require a
connection of their own, because every user will be associated with
different labels.

Assuming we create unshared HConnections when each user wants to perform
some operation and destroy that connection at the end of every request,
how scalable is this? (I’m assuming “not at all”, but I’d like
confirmation.) Has anyone run tests to see how many connections per minute
are feasible?

I pointed out to this colleague that one of the first things you learn
when using a database is to pool (and reuse) connections. That connections
are tied to users, which are then tied to labels, causes us to reconsider
these lessons.

Thanks in advance.


-j


Re: Connections and scalability

Posted by Nick Dimiduk <nd...@gmail.com>.
Hi Joseph,

Sorry no one has spoken up yet.

I don't know the details in this area off the top of my head, but I'd
suggest having a look at the gateways (REST, Thrift) to see how they handle
authenticated users. My guess is they are keeping a connection per user,
but I've not looked at the code recently. I think this will give you some
clues as to the current state of things. I will say that the client has a
number of ways of managing the underlying thread pools used to communicate
with the cluster. It may be that one of the deprecated HTable construction
APIs offered this per-user capability. It would be good to know if we
missed this in the design of the 1.0 API.
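
To make the connection-per-user idea concrete, here is a minimal sketch of
a cache that opens one connection per user and reuses it across requests,
instead of creating and destroying one per request. It is only an
illustration of the pattern, not what the gateways actually do. The class
PerUserConnectionCache and its factory argument are hypothetical names,
not HBase classes; the factory stands in for whatever call creates a
connection on behalf of a given user.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of HBase): keeps one long-lived
 * connection per user name so that requests from the same user reuse it.
 */
class PerUserConnectionCache<C extends AutoCloseable> implements AutoCloseable {
    private final Map<String, C> byUser = new ConcurrentHashMap<>();
    // Assumption: the caller supplies the code that opens a connection for a user.
    private final Function<String, C> factory;

    PerUserConnectionCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached connection for this user, opening one on first use. */
    C get(String user) {
        return byUser.computeIfAbsent(user, factory);
    }

    /** Closes and forgets one user's connection, e.g. after an idle timeout. */
    void evict(String user) {
        C conn = byUser.remove(user);
        if (conn == null) {
            return;
        }
        try {
            conn.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close connection for " + user, e);
        }
    }

    /** Number of users that currently hold an open connection. */
    int size() {
        return byUser.size();
    }

    /** Closes every cached connection. */
    @Override
    public void close() {
        for (String user : byUser.keySet()) {
            evict(user);
        }
    }
}
```

A request handler would call get(userName) for each request and leave
eviction to an idle timer or to shutdown. Whether holding that many open
connections is acceptable depends on the number of distinct users, which
this sketch does not address.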

Do let us know what you find!

Thanks,
Nick
