Posted to user@drill.apache.org by Kunal Khatua <kk...@mapr.com> on 2018/02/16 19:15:46 UTC

Looking for user feedback on DRILL-5741

Hi everyone

We're working on simplifying the Drill memory configuration process, so that users no longer need to get into the specifics of Heap and Direct memory allocations.
Here is the original JIRA https://issues.apache.org/jira/browse/DRILL-5741

The idea is to simply provide the Drill process with a single value for total process memory (either as an absolute value, or as a percentage of the total system memory), while Drill's startup scripts automatically work out an optimal allocation. This, of course, can be overridden.

I'm looking for feedback on this, so you're welcome to try the commit (1ad11ee44902c11efa69cde908002f59169f61d7) from the following pull request:
https://github.com/apache/drill/pull/1082

You can try building Drill with this private branch (to which the pull request is linked): https://github.com/kkhatua/drill/commit/1ad11ee44902c11efa69cde908002f59169f61d7

Once you've done a clean setup, you should only need to edit

and uncomment the line having the property - "DRILLBIT_MAX_PROC_MEM" with a setting like (say) 50%.
export DRILLBIT_MAX_PROC_MEM=50%

After that, Drill should start up successfully. Log messages should appear in drillbit.out showing that Drill has auto-configured the memory.
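
To make the idea concrete, here is a rough sketch of the kind of calculation involved. It is not the logic from the patch: the function names, the accepted suffixes, and the 25% heap share are all assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the code from the DRILL-5741 pull request.

# Hypothetical helper: resolve a DRILLBIT_MAX_PROC_MEM-style setting to MB.
#   $1 = setting, e.g. "50%", "8G" or "4096M" (suffix handling is assumed)
#   $2 = total system memory in MB
resolve_proc_mem_mb() {
  local setting="$1" total_mb="$2"
  case "$setting" in
    *%)    echo $(( total_mb * ${setting%\%} / 100 )) ;;  # percentage of total
    *[gG]) echo $(( ${setting%[gG]} * 1024 )) ;;          # absolute, gigabytes
    *[mM]) echo "${setting%[mM]}" ;;                      # absolute, megabytes
    *)     echo "unsupported value: $setting" >&2; return 1 ;;
  esac
}

# Hypothetical helper: split a process budget into heap and direct memory.
# The default 25% heap share is an arbitrary assumption for this sketch.
#   $1 = process budget in MB, $2 = heap share in percent (optional)
# Prints "<heap_mb> <direct_mb>".
split_heap_direct_mb() {
  local budget_mb="$1" heap_pct="${2:-25}"
  local heap_mb=$(( budget_mb * heap_pct / 100 ))
  echo "$heap_mb $(( budget_mb - heap_mb ))"
}
```

For example, on a machine with 16384 MB of memory, resolve_proc_mem_mb 50% 16384 yields a budget of 8192 MB, which split_heap_direct_mb 8192 would divide into 2048 MB of heap and 6144 MB of direct memory under the assumed 25% share.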

I'm looking forward to hearing back from folks who've tried this.

TIA
~ Kunal

Re: Looking for user feedback on DRILL-5741

Posted by John Omernik <jo...@omernik.com>.
I like this idea in general.  When running under orchestrators like
Yarn, Marathon, or Kubernetes, it's true that the things that start Drill
"manage" memory. However, there are issues, in that you need to set up the
variables in Drill so they do not exceed the amount the orchestrator has
allocated.  Once an orchestrator sees a managed process exceed what it
allocated, it often kills the process. In Drill that can mean drillbits
get killed during queries, which leads to a bad user
experience.  Folks configuring Drill in the field have had to set the Heap,
Direct and other settings and hope that they did it right to ensure this
didn't happen.

This option provides a way for people to start working with reasonable
settings, and I like that it accepts either a % or absolute values. This is
important in multi-tenant environments.

I think I saw in the JIRA that Drill will indicate at startup what
allocation was used, based on what variables.  I think this is important.
Log it at drillbit start, both to stdout and to the drillbit.log file.
Indicate what method was used for allocation, what the user-provided values
were, and for auto allocations the split that was chosen. Maybe even provide
it in such a way that if a user read it and wanted to tweak, they could take
the auto-allocated output message and cut and paste it into drill-env.sh,
i.e. print the variables and the values that got auto-allocated. That way,
as an administrator, if I felt the need to tweak settings, I can take
exactly what the auto-allocation output, put it into my env script, and
then tweak to my heart's desire.
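
Something along these lines is what I have in mind for the startup message. This is only a sketch: the function name is made up, and I'm assuming the existing DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY variable names and an "M" suffix for the values, none of which I have checked against the patch.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual startup script output.

# Hypothetical helper: print the auto-allocated values as export lines that
# can be pasted into the env script.
#   $1 = the user-provided DRILLBIT_MAX_PROC_MEM setting
#   $2 = heap size in MB, $3 = direct memory size in MB
# Variable names and the "M" suffix are assumptions.
print_allocation_exports() {
  local setting="$1" heap_mb="$2" direct_mb="$3"
  printf '# Auto-allocated from DRILLBIT_MAX_PROC_MEM=%s\n' "$setting"
  printf 'export DRILL_HEAP="%sM"\n' "$heap_mb"
  printf 'export DRILL_MAX_DIRECT_MEMORY="%sM"\n' "$direct_mb"
}
```

Calling print_allocation_exports 50% 2048 6144 prints one comment line recording where the numbers came from, followed by two export lines an administrator could copy as-is and then adjust.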


This is pretty cool.

John


