Posted to user@hbase.apache.org by Eric <er...@gmail.com> on 2012/07/31 13:32:53 UTC

Where to run Thrift

I'm currently running thrift on all region server nodes. The reasoning is
that you can run jobs on this cluster and these jobs, when using thrift,
can connect to localhost.
The drawback is that I'm running lots of thrift daemons of course which all
need to be monitored.

An alternative would be to create one or more dedicated Thrift / REST nodes
which have high specs (RAID, etc.), possibly with a load balancer in front
of them. What would you guys recommend?
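
To make the monitoring burden mentioned above concrete, here is a minimal sketch of a TCP reachability check that could be run against every node hosting a Thrift daemon. It is only an illustration: the function names are made up for this example, port 9090 is assumed (the usual HBase Thrift server default), and a successful connect only shows that something is listening, not that the Thrift service is healthy.

```python
import socket


def thrift_port_open(host, port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out,
        # or name resolution failure) when the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_daemons(hosts, port=9090, timeout=2.0):
    """Return the subset of hosts whose Thrift port did not accept a connection."""
    return [h for h in hosts if not thrift_port_open(h, port, timeout)]
```

Calling unreachable_daemons() with the list of region server host names returns the nodes that need attention. A more thorough check would probably issue a real Thrift call instead of just opening a socket.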

Re: Where to run Thrift

Posted by lars hofhansl <lh...@yahoo.com>.
As I said, I have not used this myself... So take this with a grain of salt :)


I imagine the advantage would be that there are no additional servers/processes to monitor and manage, as well as a (slight) reduction in overall resource consumption.
On the downside, any resource leak in the embedded Thrift server (or, in fact, any other bug there) would now also impact the region server. And you are forced to run the Thrift service on the same machine that hosts a region server.


-- Lars
________________________________
From: Shrijeet Paliwal <sh...@rocketfuel.com>
To: user@hbase.apache.org; lars hofhansl <lh...@yahoo.com> 
Sent: Wednesday, August 1, 2012 10:39 PM
Subject: Re: Where to run Thrift


Lars, 
Thanks for the pointer, it's indeed an interesting approach. Two follow-up questions:
    1. The author states "Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer". Do you happen to have insights on what those situations are?
    2. Could there be a situation or application use case where HBASE-4460 proves unfavorable to the extent that it impacts the region server's performance?



Re: Where to run Thrift

Posted by Shrijeet Paliwal <sh...@rocketfuel.com>.
Lars,
Thanks for the pointer, it's indeed an interesting approach. Two follow-up
questions:

   1. The author states "Rather than a separate process, it can be
   advantageous in some situations for each RegionServer to embed their own
   ThriftServer". Do you happen to have insights on what those situations
   are?
   2. Could there be a situation or application use case where HBASE-4460
   proves unfavorable to the extent that it impacts the region server's
   performance?


On Wed, Aug 1, 2012 at 9:19 PM, lars hofhansl <lh...@yahoo.com> wrote:

> There is a little-documented feature that Jonathan Gray added a while
> back: running a Thrift server as a thread inside each region server.
> This is enabled by setting hbase.regionserver.export.thrift to true in
> your configuration.
>
> While I have not personally tried it, it looks like a fairly lightweight
> approach and does not add to the monitoring overhead.
>
> This is the jira: HBASE-4460. This is only available in 0.94+ (0.94.1
> should be out soon).
>
>
> -- Lars

Re: Where to run Thrift

Posted by lars hofhansl <lh...@yahoo.com>.
There is a little-documented feature that Jonathan Gray added a while back: running a Thrift server as a thread inside each region server.
This is enabled by setting hbase.regionserver.export.thrift to true in your configuration.
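
For illustration, the setting would presumably go into hbase-site.xml on each region server roughly as follows. This is a sketch assuming the standard property layout of that file; only the property name is taken from this message, and it has not been verified against a running 0.94 cluster.

```xml
<!-- hbase-site.xml: assumed placement, property name as given above -->
<property>
  <name>hbase.regionserver.export.thrift</name>
  <value>true</value>
</property>
```

The region servers would most likely need a restart to pick up the change.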

While I have not personally tried it, it looks like a fairly lightweight approach and does not add to the monitoring overhead.

This is the jira: HBASE-4460. This is only available in 0.94+ (0.94.1 should be out soon).


-- Lars





Re: Where to run Thrift

Posted by syed kather <in...@gmail.com>.
Eric,
    Why are you running Thrift on all the servers? Why not run it on
only the master machine? After seeing your post I have the same question:
do we need a separate Thrift setup or not? Is it enough to run Thrift
on a single machine?

            Thanks and Regards,
        S SYED ABDUL KATHER




Re: Where to run Thrift

Posted by Trung Pham <tr...@phamcom.com>.
Running the Thrift server on the client machine is preferable: it saves one
network hop.

On Tue, Jul 31, 2012 at 2:22 PM, Stack <st...@duboce.net> wrote:

> On Tue, Jul 31, 2012 at 12:32 PM, Eric <er...@gmail.com> wrote:
> > I'm currently running thrift on all region server nodes. The reasoning is
> > that you can run jobs on this cluster and these jobs, when using thrift,
> > can connect to localhost.
> > The drawback is that I'm running lots of thrift daemons of course which all
> > need to be monitored.
> >
>
> Is the drawback that bad?
>
> > An alternative would be to create one or more dedicated Thrift / REST nodes
> > which have high specs (RAID, etc.), possibly with a load balancer in front
> > of them. What would you guys recommend?
>
> IIRC, where I work, we run a thrift server beside the client, the http
> server: i.e. between the two extremes you have above (Correct me if
> I'm wrong lads).  It seems to work fine.
>
> St.Ack
>

Re: Where to run Thrift

Posted by Stack <st...@duboce.net>.
On Tue, Jul 31, 2012 at 12:32 PM, Eric <er...@gmail.com> wrote:
> I'm currently running thrift on all region server nodes. The reasoning is
> that you can run jobs on this cluster and these jobs, when using thrift,
> can connect to localhost.
> The drawback is that I'm running lots of thrift daemons of course which all
> need to be monitored.
>

Is the drawback that bad?

> An alternative would be to create one or more dedicated Thrift / REST nodes
> which have high specs (RAID, etc.), possibly with a load balancer in front
> of them. What would you guys recommend?

IIRC, where I work, we run a thrift server beside the client, the http
server: i.e. between the two extremes you have above (Correct me if
I'm wrong lads).  It seems to work fine.

St.Ack