Posted to user@drill.apache.org by Christian Hotz-Behofsits <ch...@gmail.com> on 2017/02/04 17:37:15 UTC

ideal drill node size

Hi,

I have a big server (512 GB RAM, 32 cores, storage connected via FC) and
want to use it for Drill. Should I split it into several VMs and build a
cluster, or should I use it as a single node? I know that splitting would
introduce overhead (guest OS), but a cluster might provide better
parallelization of the tasks.


   - Are there any best practices in terms of node size (memory, CPU)?
   - Does Drill favor many small nodes or a few big nodes (or just one)?


cheers

Re: ideal drill node size

Posted by Parth Chandra <pa...@apache.org>.
I would second John's suggestion that you should try a single large
machine, taking care to get your memory settings right. In general, Drill
will use both CPU and memory, and in your setup, you will probably get
better (and more predictable) performance with a single-node setup.
As John mentioned, please share your results.


On Mon, Feb 6, 2017 at 10:22 AM, John Omernik <jo...@omernik.com> wrote:

> I think you would be wasting quite a bit of your server if you split it up
> into multiple VMs. Instead, I think a single larger drillbit (with its
> memory raised as high as you can) would be best. Note that I am not an
> expert on this, and I would like an expert's take as well. Here is a link
> on configuring Drill memory:
> https://drill.apache.org/docs/configuring-drill-memory/
>
> Another thing with such a heavyweight server is that you will likely need to
> adjust the memory defaults to take advantage of more of the memory (Drill
> folks, correct me if I am wrong). Settings like
> planner.memory.max_query_memory_per_node
> <https://drill.apache.org/docs/configuration-options-introduction/#system-options>
> will need to be raised to take advantage of more of your memory. It will be
> very interesting to see where the bottleneck in a setup like yours is;
> please share results!

Re: ideal drill node size

Posted by John Omernik <jo...@omernik.com>.
I think you would be wasting quite a bit of your server if you split it up
into multiple VMs. Instead, I think a single larger drillbit (with its
memory raised as high as you can) would be best. Note that I am not an
expert on this, and I would like an expert's take as well. Here is a link
on configuring Drill memory:
https://drill.apache.org/docs/configuring-drill-memory/

Another thing with such a heavyweight server is that you will likely need to
adjust the memory defaults to take advantage of more of the memory (Drill
folks, correct me if I am wrong). Settings like
planner.memory.max_query_memory_per_node
<https://drill.apache.org/docs/configuration-options-introduction/#system-options>
will need to be raised to take advantage of more of your memory. It will be
very interesting to see where the bottleneck in a setup like yours is;
please share results!
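
For illustration only, here is a rough Python sketch of how one might
generate the statement that raises that option. The option takes a value in
bytes and is changed with ALTER SYSTEM; the helper name and the 32 GiB figure
are made up for the example and are not a recommendation for this server.

```python
GIB = 1024 ** 3  # bytes in one GiB


def max_query_memory_statement(gib: int) -> str:
    """Build the Drill SQL statement that sets the per-node query memory limit.

    `gib` is a hypothetical limit in GiB; Drill expects the value in bytes.
    """
    if gib <= 0:
        raise ValueError("memory limit must be a positive number of GiB")
    return (
        "ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = "
        f"{gib * GIB};"
    )


if __name__ == "__main__":
    # Example value only; pick a limit that fits the drillbit's direct memory.
    print(max_query_memory_statement(32))
```

The printed statement can then be run from any Drill client (sqlline, the web
UI, JDBC). The right value depends on how much direct memory the drillbit has
been given and on how many queries run concurrently, so it is worth testing.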


