Posted to user@accumulo.apache.org by Aaron G <aa...@gmail.com> on 2013/05/01 21:57:59 UTC

Newbie question: where do the services live?

Generic/best practices question about the 6 services:  master, gc, tserver,
logger, monitor, tracer

For this example/discussion let's say I have a cluster, with 10 nodes
(n01-n10)...3 of the nodes running zookeeper

n01:  NameNode, zooKeeper
n02:  SecondaryNameNode, zooKeeper
n03:  JobTracker, zooKeeper
n04:  empty (for now)

Let's label these storage/compute nodes:
n05-n10:  dataNode, taskTracker

So, here is how I thought the Accumulo services could be set up:

n04:  master & gc
n05-n10:  each runs a tserver & logger

I think my main questions revolve around the monitor & tracer services and
where they run:

1.  Do those need to run on every "compute node"?
2.  Do you only need one running instance of the monitor?  Perhaps on n04?
 Or does it need to run on every tserver as well?
3.  Do you only need the tracer service running on compute nodes?  Or
everywhere (master & gc included)?  Do you only need the tracer for
developing Iterators, Scanners, Writers?  Is it primarily there to help
with that activity?  Or is it useful to have running "all the time"?

Thanks in advance,
Aaron

Re: Newbie question: where do the services live?

Posted by Eric Newton <er...@gmail.com>.
On all but the largest clusters, the master, gc, monitor, and tracer can
all run on one node, and that node can be co-located with a zookeeper
server.

Strictly speaking, you don't need a tracer.  Unless you are running a very
large cluster, one tracer will be enough.

The tracer can be used to identify components that are performing slowly,
so it is useful to keep it running all the time.

-Eric
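
A minimal sketch of the layout discussed above, applied to the example
cluster. It assumes the per-role host-list files that the Accumulo 1.4-era
start scripts read from the conf directory (masters, gc, monitor, tracers,
slaves); check the file names against your version. Putting the
single-instance services on n04 follows the proposal in the question, and
the helper name write_host_files is hypothetical.

```python
import os
import tempfile

# Hypothetical layout for the example cluster in this thread.
# ASSUMPTION: the file names below are the host-list files read by the
# Accumulo 1.4-era start scripts; verify them for your release.
LAYOUT = {
    "masters": ["n04"],
    "gc": ["n04"],
    "monitor": ["n04"],
    "tracers": ["n04"],
    # One tserver (and, where loggers exist, one logger) per storage node.
    "slaves": ["n%02d" % i for i in range(5, 11)],
}


def write_host_files(conf_dir, layout=LAYOUT):
    """Write one host-list file per role, one hostname per line."""
    os.makedirs(conf_dir, exist_ok=True)
    for role, hosts in layout.items():
        with open(os.path.join(conf_dir, role), "w") as f:
            f.write("\n".join(hosts) + "\n")


if __name__ == "__main__":
    # Write to a scratch directory rather than a real Accumulo install.
    out = tempfile.mkdtemp(prefix="accumulo-conf-")
    write_host_files(out)
    print("wrote host files to", out)
```

On a small cluster the n04 entries could equally name one of the zookeeper
nodes, since these services can share a node with a zookeeper server.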

