Posted to user@drill.apache.org by Nicolas Paris <ni...@gmail.com> on 2016/07/30 13:38:18 UTC

Extrapolate performances standalone -> cluster

Hey,

I have run tests on Drill on a standalone installation (1 computer,
8 cores/32 GB RAM).
I will soon get a 5-computer cluster (8 cores/96 GB RAM each).
Is it possible to get an estimate of the performance gain?
Is it linear? Will the performance get better? Worse?

I just want an estimation/extrapolation.

Thanks !

Re: Extrapolate performances standalone -> cluster

Posted by Andries Engelbrecht <ae...@maprtech.com>.
A couple of quick things to check.

Look at the query plans of the queries you are interested in. In general, expect query times to improve (though not quite linearly) for larger data sets and for workloads with lots of concurrency, since planning, execution, etc. will be spread out across the cluster.

Expect query planning time to increase on a cluster.
- You can offset that by using metadata caching.

For major fragments that are broken up into many minor fragments, expect close to linear improvement, because the bigger cluster (5 nodes vs 1) can run more minor fragments in parallel. Make sure that queries and data sets are optimized to parallelize in a cluster; see Drill Best Practices.

For exchange operators, expect an increase in time, because on a cluster the data has to move over the network rather than staying within a single node.
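
To make the points above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Drill API and not a measured model: the function name, the three-way split of query time (planning, parallel work, exchanges) and all the overhead factors are assumptions for illustration. Plug in your own single-node timings (for example, read from the query profile) and adjust the factors after measuring on the real cluster.

```python
def estimate_cluster_time(planning_s, parallel_s, exchange_s, nodes,
                          planning_overhead=1.5,    # assumed: planning gets slower on a cluster
                          exchange_overhead=2.0,    # assumed: exchanges now cross the network
                          parallel_efficiency=0.9): # assumed: "close to linear", not perfect
    """Very rough estimate of a query's time on `nodes` nodes.

    planning_s, parallel_s and exchange_s are single-node timings in seconds
    for planning, parallelizable fragment work and exchange operators.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if nodes == 1:
        # Single node: no cluster overheads, just the measured baseline.
        return planning_s + parallel_s + exchange_s

    # Each extra node contributes a bit less than a full node of speedup.
    speedup = 1 + (nodes - 1) * parallel_efficiency
    return (planning_s * planning_overhead
            + parallel_s / speedup
            + exchange_s * exchange_overhead)


# Hypothetical single-node timings: 1 s planning, 50 s scans/joins, 2 s exchanges.
baseline = estimate_cluster_time(1.0, 50.0, 2.0, nodes=1)
clustered = estimate_cluster_time(1.0, 50.0, 2.0, nodes=5)
print(f"1 node: {baseline:.1f} s, 5 nodes: {clustered:.1f} s, "
      f"gain: {baseline / clustered:.1f}x")
```

With these made-up inputs the gain stays below 5x, which is the "not quite linear" behavior described above: only the parallel part shrinks, while planning and exchange time grow. Short queries dominated by planning may see little or no gain.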


These links cover some best practices that will help you optimize your cluster:
https://community.mapr.com/docs/DOC-1497
https://community.mapr.com/thread/18549

--Andries


