Posted to user@accumulo.apache.org by Josh Elser <jo...@gmail.com> on 2012/12/20 02:10:06 UTC

Running Continuous Ingest on small cluster

In playing around with the continuous ingest collection of code (ingest, 
walkers, batchwalkers, scanners and agitators), I found myself blindly 
guessing at how many of each of these processes I should use.

Are there some generic thoughts as to what might be an ideal saturation 
point for N tservers?

I initially split my hosts 4 ways and ran (N/4) of each process (ingest, 
walkers, batchwalkers, and scanners), ratcheting down the number of 
threads for ingest and batchwalkers (to avoid saturating CPU and memory). 
Should I try to balance (query threads * query clients) + (ingest 
threads * ingest clients) against the available threads per host, and 
size the BatchWriter send buffers similarly against the memory available?
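
For concreteness, here is roughly the call I'm talking about (a sketch
against the 1.4-style createBatchWriter signature; the table name and the
numbers are just placeholders I'm experimenting with, not anything I've
settled on):

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.TableNotFoundException;

    public class IngestKnobs {
      // Open a writer whose send buffer and thread count are the knobs I'm
      // trying to balance against each host's memory and cores.
      public static BatchWriter openWriter(Connector conn, String table)
          throws TableNotFoundException {
        long maxMemory = 50 * 1024 * 1024; // client-side send buffer, in bytes
        long maxLatency = 60 * 1000;       // flush at least once a minute (ms)
        int writeThreads = 4;              // concurrent send threads per client
        return conn.createBatchWriter(table, maxMemory, maxLatency, writeThreads);
      }
    }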

I appreciate anyone's insight.

- Josh

Re: Running Continuous Ingest on small cluster

Posted by Josh Elser <jo...@gmail.com>.
Thanks for the great info, Eric!

On 12/19/2012 8:59 PM, Eric Newton wrote:
> It depends on the number of drives you have, too.
>
> I run ingesters and scanners on every slave node, a single batchwalker
> on the master node.
>
> You want at least 100K of outgoing buffer in your ingester for each
> slave node you have.
>
> A large in-memory map is probably less useful than the block index cache
> for making your queries faster.
>
> Once you start getting to 5G / tablet, the number of files per tablet
> causes merging minor compactions, which cuts your performance in half.
>   Increasing the number of files will reduce query performance, so that
> gives you a basic way to control ingest vs query performance.
>
> Pre-splitting to 20 tablets/server will give you the sweet-spot for
> ingest performance; add more if you have more drives.  It allows for
> more parallel writes during minor compactions.
>
> If you have more than 4 drives per node, try doubling the number of
> ingesters you run.
>
> I like to tweak everything until I get the system load on each machine
> to be roughly the number of real cores after 12 hours.  This is hard to
> do without a spindle for every two CPUs.
>
> It's important to watch for failed/failing hardware.  You can sample the
> outgoing write buffers of the ingesters (using netstat).  If you see a
> node constantly having data queued going to it, try taking it out of the
> cluster.  You can do the same for datanodes, too.  At dozens of nodes,
> this is not really important.  When you get to hundreds, it becomes much
> more likely that a single node will flake out after a day of abuse.
>
> -Eric
>
>
> On Wed, Dec 19, 2012 at 8:10 PM, Josh Elser <josh.elser@gmail.com
> <ma...@gmail.com>> wrote:
>
>     In playing around with the continuous ingest collection of code
>     (ingest, walkers, batchwalkers, scanners and agitators), I found
>     myself blindly guessing at how many of each of these processes I
>     should use.
>
>     Are there some generic thoughts as to what might be an ideal
>     saturation point for N tservers?
>
>     I initially split my hosts 4 ways and ran (N/4) of each process
>     (ingest, walkers, batchwalkers, and scanners), ratcheting down the
>     number of threads for ingest and batchwalkers (to avoid saturating
>     CPU and memory). Should I try to balance (query threads * query
>     clients) + (ingest threads * ingest clients) against the available
>     threads per host, and size the BatchWriter send buffers similarly
>     against the memory available?
>
>     I appreciate anyone's insight.
>
>     - Josh
>
>

Re: Running Continuous Ingest on small cluster

Posted by Eric Newton <er...@gmail.com>.
It depends on the number of drives you have, too.

I run ingesters and scanners on every slave node, a single batchwalker on
the master node.

You want at least 100K of outgoing buffer in your ingester for each
slave node you have.
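For example, with 10 slave nodes that works out to at least a 1M send
buffer on each ingest client (the maxMemory argument you pass when
creating the BatchWriter); round up rather than down.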

A large in-memory map is probably less useful than the block index cache
for making your queries faster.
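
If you want to push memory that way, these are the knobs I mean. A rough
sketch: the property names are the standard tserver/table ones, but the
sizes are only examples, so tune them to your heap:

    import org.apache.accumulo.core.client.Connector;

    public class CacheTuning {
      public static void favorIndexCache(Connector conn, String table) throws Exception {
        // Shrink the in-memory map and give that memory to the RFile index
        // cache instead (these two may need a tserver restart to take effect).
        conn.instanceOperations().setProperty("tserver.memory.maps.max", "512M");
        conn.instanceOperations().setProperty("tserver.cache.index.size", "256M");
        // Make sure the table actually uses the index cache (default is on).
        conn.tableOperations().setProperty(table, "table.cache.index.enable", "true");
      }
    }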

Once you start getting to 5G / tablet, the number of files per tablet
causes merging minor compactions, which cuts your performance in half.
 Increasing the number of files will reduce query performance, so that
gives you a basic way to control ingest vs query performance.
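
Both sides of that tradeoff are per-table properties, so you can adjust
them without touching the site config. A sketch with made-up values, not
recommendations:

    import org.apache.accumulo.core.client.Connector;

    public class TabletTuning {
      public static void tune(Connector conn, String table) throws Exception {
        // How big a tablet grows before it splits; keep it well under the
        // point where the file count starts forcing merging minor compactions.
        conn.tableOperations().setProperty(table, "table.split.threshold", "2G");
        // Max files per tablet before a merging minor compaction kicks in:
        // raising it favors ingest, lowering it favors queries.
        conn.tableOperations().setProperty(table, "table.file.max", "15");
      }
    }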

Pre-splitting to 20 tablets/server will give you the sweet-spot for ingest
performance; add more if you have more drives.  It allows for more
parallel writes during minor compactions.
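
Pre-splitting is just an addSplits call before you start the ingesters.
A sketch, assuming your row keys are spread roughly uniformly over hex
space like the continuous ingest keys; adjust the split points to your
own keyspace:

    import java.util.SortedSet;
    import java.util.TreeSet;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.hadoop.io.Text;

    public class PreSplit {
      // Create roughly (numServers * 20) tablets with evenly spaced splits.
      public static void preSplit(Connector conn, String table, int numServers)
          throws Exception {
        int numTablets = numServers * 20;
        SortedSet<Text> splits = new TreeSet<Text>();
        for (int i = 1; i < numTablets; i++) {
          long point = (long) (((double) i / numTablets) * Long.MAX_VALUE);
          splits.add(new Text(String.format("%016x", point)));
        }
        conn.tableOperations().addSplits(table, splits);
      }
    }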

If you have more than 4 drives per node, try doubling the number of
ingesters you run.

I like to tweak everything until I get the system load on each machine to
be roughly the number of real cores after 12 hours.  This is hard to do
without a spindle for every two CPUs.

It's important to watch for failed/failing hardware.  You can sample the
outgoing write buffers of the ingesters (using netstat).  If you see a node
constantly having data queued going to it, try taking it out of the
cluster.  You can do the same for datanodes, too.  At dozens of nodes, this
is not really important.  When you get to hundreds, it becomes much more
likely that a single node will flake out after a day of abuse.

-Eric


On Wed, Dec 19, 2012 at 8:10 PM, Josh Elser <jo...@gmail.com> wrote:

> In playing around with the continuous ingest collection of code (ingest,
> walkers, batchwalkers, scanners and agitators), I found myself blindly
> guessing at how many of each of these processes I should use.
>
> Are there some generic thoughts as to what might be an ideal saturation
> point for N tservers?
>
> I initially split my hosts 4 ways and ran (N/4) of each process (ingest,
> walkers, batchwalkers, and scanners), ratcheting down the number of threads
> for ingest and batchwalkers (to avoid saturating CPU and memory). Should I
> try to balance (query threads * query clients) + (ingest threads * ingest
> clients) against the available threads per host, and size the BatchWriter
> send buffers similarly against the memory available?
>
> I appreciate anyone's insight.
>
> - Josh
>