You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by kkhatua <gi...@git.apache.org> on 2018/01/03 22:39:57 UTC

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

GitHub user kkhatua opened a pull request:

    https://github.com/apache/drill/pull/1082

    DRILL-5741: Automatically manage memory allocations during startup

    Providing an environment variable - DRILLBIT_MAX_PROC_MEM to ensure that a Drillbit's max memory parameters, cumulatively, don't exceed the value specified.
    The variable can be defined in KB, MB, or  GB; similar in syntax to how the JVM MaxHeap is specified. For e.g. 
    ```
    DRILLBIT_MAX_PROC_MEM=13G
    DRILLBIT_MAX_PROC_MEM=8192m
    DRILLBIT_MAX_PROC_MEM=4194304K
    ```
    In addition, you can specify it as a percent of the total sytem memory prior to the Drillbit starting up:
    `DRILLBIT_MAX_PROC_MEM=40%`
    
    For a system with with 48GB free memory, when set to (say) 25% (with settings defined in drill-env.sh), and heap (8GB) and direct (10GB) are defined; the Drillbit fails startup with the following message:
    ```
    2018-01-03 14:27:57  [WARN] 25% of System Memory (47 GB) translates to 12 GB
    2018-01-03 14:27:57  [ERROR]    Unable to start Drillbit due to memory constraint violations
      Total Memory Requested : 19 GB
      Check the following settings to possibly modify (or increase the Max Memory Permitted):
            DRILLBIT_MAX_PROC_MEM=25%
            DRILL_HEAP=8G
            DRILL_MAX_DIRECT_MEMORY=10G
            DRILLBIT_CODE_CACHE_SIZE=1024m 
            *NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB
    ```
    
    For all other combinations, the undefined parameters are adjusted to ensure that the total memory allocated is within the value specified by `DRILLBIT_MAX_PROC_MEM`,
    
    For a system with with 48GB free memory, when set to (say) 50% (with settings defined in drill-env.sh), and heap (8GB) and direct (10GB) are defined; the Drillbit startup with the following warning:
    ```
    2018-01-03 14:31:06  [WARN] 50% of System Memory (47 GB) translates to 24 GB
    2018-01-03 14:31:06  [WARN] You have an allocation of 4 GB that is currently unused from a total of 24 GB. You can increase your existing memory configuration to use this extra memory
            DRILLBIT_MAX_PROC_MEM=50%
            DRILL_HEAP=8G
            DRILL_MAX_DIRECT_MEMORY=10G
            DRILLBIT_CODE_CACHE_SIZE=1024m 
            *NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB
    ```
    
    In addition, if the available free memory is less than the allocation, an additional warning is provided under the assumption that the OS will reclaim more free memory when required:
    ```
    2018-01-03 14:31:06  [WARN] Total Memory Allocation for Drillbit (19GB) exceeds available free memory (11GB)
    2018-01-03 14:31:06  [WARN] Drillbit will start up, but can potentially crash due to oversubscribing of system memory.
    ```
    
    For more details, refer the attachments in https://issues.apache.org/jira/browse/DRILL-5741

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-5741

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1082
    
----
commit 2fea048835f729175f77b0a0dea731f741bb70e9
Author: Kunal Khatua <kk...@...>
Date:   2018-01-03T21:22:52Z

    DRILL-6068: Support user/distrib-specific config checks during startup
    
    1. Allows for distrib/user specific checks to be done
    2. Place-holder files for distribution and user specific checks
    3. Moved JVM Version Check to head of script

commit 266529f2338c607bd7845d408cddb721a41ae4ac
Author: Kunal Khatua <kk...@...>
Date:   2018-01-03T22:15:03Z

    DRILL-5741: Automatically manage memory allocations during startup
    
    Providing an environment variable - DRILLBIT_MAX_PROC_MEM to ensure that a Drillbit's max memory parameters, cumulatively, don't exceed the value specified.
    The variable can be defined in KB, MB, or  GB; similar in syntax to how the JVM MaxHeap is specified.
    e.g. 
    ```
    DRILLBIT_MAX_PROC_MEM=13G
    DRILLBIT_MAX_PROC_MEM=8192m
    DRILLBIT_MAX_PROC_MEM=4194304K
    ```
    In addition, you can specify it as a percent of the total sytem memory prior to the Drillbit starting up:
    `DRILLBIT_MAX_PROC_MEM=40%`
    
    For a system with with 48GB free memory, when set to (say) 25% (with settings defined in drill-env.sh), and heap (8GB) and direct (10GB) are defined; the Drillbit fails startup with the following message:
    ```
    2018-01-03 14:27:57  [WARN] 25% of System Memory (47 GB) translates to 12 GB
    2018-01-03 14:27:57  [ERROR]    Unable to start Drillbit due to memory constraint violations
      Total Memory Requested : 19 GB
      Check the following settings to possibly modify (or increase the Max Memory Permitted):
            DRILLBIT_MAX_PROC_MEM=25%
            DRILL_HEAP=8G
            DRILL_MAX_DIRECT_MEMORY=10G
            DRILLBIT_CODE_CACHE_SIZE=1024m 
            *NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB
    ```
    
    For all other combinations, the undefined parameters are adjusted to ensure that the total memory allocated is within the value specified by `DRILLBIT_MAX_PROC_MEM`,
    
    For a system with with 48GB free memory, when set to (say) 50% (with settings defined in drill-env.sh), and heap (8GB) and direct (10GB) are defined; the Drillbit startup with the following warning:
    ```
    2018-01-03 14:31:06  [WARN] 50% of System Memory (47 GB) translates to 24 GB
    2018-01-03 14:31:06  [WARN] You have an allocation of 4 GB that is currently unused from a total of 24 GB. You can increase your existing memory configuration to use this extra memory
            DRILLBIT_MAX_PROC_MEM=50%
            DRILL_HEAP=8G
            DRILL_MAX_DIRECT_MEMORY=10G
            DRILLBIT_CODE_CACHE_SIZE=1024m 
            *NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB
    ```
    
    In addition, if the available free memory is less than the allocation, an additional warning is provided under the assumption that the OS will reclaim more free memory when required:
    ```
    2018-01-03 14:31:06  [WARN] Total Memory Allocation for Drillbit (19GB) exceeds available free memory (11GB)
    2018-01-03 14:31:06  [WARN] Drillbit will start up, but can potentially crash due to oversubscribing of system memory.
    ```
    
    For more details, refer the attachments in https://issues.apache.org/jira/browse/DRILL-5741

----


---

[GitHub] drill issue #1082: DRILL-5741: Automatically manage memory allocations durin...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/1082
  
    Done the following changes:
    1. The file names have been changed to the following:
    * `./bin/auto-setup.sh`
    * `./conf/distrib-setup.sh`
    * `./conf/drill-setup.sh`
    `auto-setup.sh` and `distrib-setup.sh` files are mutually exclusive in execution.
    	i. `distrib-setup.sh` -> if it has executable content. Else `auto-setup.sh`
    		The assumption here is that the distribution might have a different mechanism for doing things (along lines of specifying a preferred setup service).
    	ii. `drill-setup.sh` --> This is any additional setup activities beyond what the distribution offers. I don't think this should be mutually exclusive to the previous step, but should be an add on.
    
    A distribution of Drill should reimplement the setup task done by `auto-setup.sh` , if it wants to do the same separately.
    
    2. For managing memory automatically, the `auto-setup.sh` script is looking for the `DRILLBIT_MAX_PROC_MEM` variable.
    If this is not set (which is the case by default), the existing procedure for inferring memory parameters will be followed.
    
    3. If the `DRILLBIT_MAX_PROC_MEM` variable is set, the logic defined within the `auto-setup.sh` script is executed to assign and validate that the total memory allocation of Heap, Direct and CodeCache is within that value.
    
    4. The JIRA and the PR has been updated with the details of the feature.



---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160802962
  
    --- Diff: distribution/src/assemble/bin.xml ---
    @@ -345,6 +345,16 @@
           <fileMode>0755</fileMode>
           <outputDirectory>conf</outputDirectory>
         </file>
    +    <file>
    +      <source>src/resources/drill-auto.sh</source>
    --- End diff --
    
    Modifying the base commit (#1081) to reflect the name change from `[distrib/drill]-auto.sh` to `[distrib/drill]-setup.sh`


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160537845
  
    --- Diff: distribution/src/resources/distrib-auto.sh ---
    @@ -0,0 +1,223 @@
    +#!/usr/bin/env bash
    --- End diff --
    
    In general, "distrib" files are meant to be added by Drill distributions, not by Drill itself. If we want to provide the memory checking in Drill, then it should go into some other file other than "distrib."


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160606654
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,46 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then 
    +  # *-auto.sh allows for distrib/user specific checks to be done
    +  distribAuto="$DRILL_CONF_DIR/distrib-auto.sh"
    +  if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi
    +  if [ ! -r "$distribAuto" ]; then distribAuto=""; fi
    +  drillAuto="$DRILL_CONF_DIR/drill-auto.sh"
    +  if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi
    --- End diff --
    
    Agreed. The drill-config file follows this for `drill-env.sh`, but checks for both locations in case of `distrib-env.sh`.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r166196043
  
    --- Diff: distribution/src/resources/auto-setup.sh ---
    @@ -0,0 +1,222 @@
    +#!/usr/bin/env bash
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +
    +# This file is invoked by drill-config.sh during a Drillbit startup and provides
    +# default checks and autoconfiguration.
    +# Distributions should not put anything in this file. Checks can be
    +# specified in ${DRILL_HOME}/conf/distrib-setup.sh
    +# Users should not put anything in this file. Additional checks can be defined
    +# and put in ${DRILL_CONF_DIR}/drill-setup.sh instead.
    +# To FAIL any check, return with a non-zero return code
    +# e.g.
    +# if [ $status == "FAILED" ]; return 1; fi
    +
    +###==========================================================================
    +# FEATURES
    +# 1. Provides checks and auto-configuration for memory settings
    +###==========================================================================
    +
    +# Convert Java memory value to MB
    +function valueInMB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInMB=$inputValue*1024
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInMB=$inputValue/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInMB=$inputValue
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInMB=$inputValue*$totalRAM_inMB/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInMB"
    +  return
    +}
    +
    +# Convert Java memory value to GB
    +function valueInGB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInGB=$inputValue
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInGB=$inputValue/1024/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInGB=$inputValue/1024
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInGB=$inputValue*`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024/1024/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInGB"
    +  return
    +}
    +
    +# Estimates code cache based on total heap and direct
    +function estCodeCacheInMB() {
    +  local totalHeapAndDirect=$1
    +  if [ $totalHeapAndDirect -le 4096 ]; then echo 512;
    +  elif [ $totalHeapAndDirect -le 10240 ]; then echo 768;
    +  else echo 1024;
    +  fi
    +}
    +
    +#Print Current Allocation
    +function printCurrAllocation()
    +{
    +  if [ -n "$DRILLBIT_MAX_PROC_MEM" ]; then echo -e "\tDRILLBIT_MAX_PROC_MEM=$DRILLBIT_MAX_PROC_MEM"; fi
    +  if [ -n "$DRILL_HEAP" ]; then echo -e "\tDRILL_HEAP=$DRILL_HEAP"; fi
    +  if [ -n "$DRILL_MAX_DIRECT_MEMORY" ]; then echo -e "\tDRILL_MAX_DIRECT_MEMORY=$DRILL_MAX_DIRECT_MEMORY"; fi
    +  if [ -n "$DRILLBIT_CODE_CACHE_SIZE" ]; then
    +    echo -e "\tDRILLBIT_CODE_CACHE_SIZE=$DRILLBIT_CODE_CACHE_SIZE "
    +    echo -e "\t*NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB"
    +  fi
    +}
    +
    +#============================================================================
    +# Check and auto-configuration for memory settings
    +#----------------------------------------------------------------------------
    +#Default (Track status of this check: "" => Continue checking ; "PASSED" => no more check required)
    +AutoMemConfigStatus=""
    +
    +#Computing existing system information
    +# Tested on Linux (CentOS/RHEL/Ubuntu); Cygwin (Win10Pro-64bit)
    +if [[ "$OSTYPE" == *linux* ]] || [[ "$OSTYPE" == cygwin* ]]; then
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +elif [[ "$OSTYPE" == darwin* ]]; then
    +  # Mac OSX
    +  #Refer for math: https://apple.stackexchange.com/a/196925
    +  #Page Size
    +  let macOSPageSize=`vm_stat | grep 'page size' | grep -o -E '[0-9]+'`
    +  #MemoryUsage on MacOS
    +  let freePg=`vm_stat | grep free | awk '{ print $NF }' | sed 's/\.//'`
    +  let activePg=`vm_stat | grep -w 'active:' | awk '{ print $NF }' | sed 's/\.//'`
    +  let speculativePg=`vm_stat | grep speculative | awk '{ print $NF }' | sed 's/\.//'`
    +  let fileCachePg=`vm_stat | grep File-backed | awk '{ print $NF }' | sed 's/\.//'`
    +  let wiredMemPg=`vm_stat | grep 'wired down' | awk '{ print $NF }' | sed 's/\.//'`
    +  let compressedPg=`vm_stat | grep 'occupied by compressor' | awk '{ print $NF }' | sed 's/\.//'`
    +  #Total
    +  let totalRAM_inPages=$freePg+$activePg+$speculativePg+$fileCachePg+$wiredMemPg+$compressedPg
    +  let totalRAM_inMB=$totalRAM_inPages*$macOSPageSize/1048576
    +  let freeRAM_inMB=$freePg*$macOSPageSize/1048576
    +elif [[ "$OSTYPE" == "msys" ]]; then
    +  # Msys env on MinGW (TODO: Pending verification)
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +else
    +  # Unknown OS
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Unknown OS ("$OSTYPE"). Will not attempt to auto-configure memory"
    +  AutoMemConfigStatus="PASSED"
    +fi
    +
    +#Read current values
    +DbitMaxProcMem=$(valueInMB $DRILLBIT_MAX_PROC_MEM)
    +DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +
    +# Alert for %age usage
    +if [[ "$DRILLBIT_MAX_PROC_MEM" == *% ]] && [ -z "$AutoMemConfigStatus" ]; then
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] "$DRILLBIT_MAX_PROC_MEM" of System Memory ("$(valueInGB $totalRAM_inMB'm')" GB) translates to "$(valueInGB $DbitMaxProcMem'm')" GB"
    +fi
    +
    +### Performing Auto-Configuration
    +if [ -z "$DbitMaxProcMem" ] && [ -z "$AutoMemConfigStatus" ]; then
    +  if [ -n "$DbitMaxDirectMem" ] && [ -n "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 1]: TotalCap is NOT Defined, but Heap&Direct ARE Defined (i.e. no limit)
    +    let currTotal=$DbitMaxDirectMem+$DbitMaxHeapMem
    +    #Estimating CodeCache size of current total
    +    if [ -z "$DbitMaxCodeCacheMem" ]; then export DRILLBIT_CODE_CACHE_SIZE=$(estCodeCacheInMB $currTotal)'m'; fi
    +  fi
    +  # Default values will be loaded for unspecified memory parameters
    +  AutoMemConfigStatus="PASSED"
    +elif [ -z "$AutoMemConfigStatus" ]; then
    +  ## Scenario: Total IS Defined
    +  if [ -z "$DbitMaxCodeCacheMem" ]; then
    +    let DbitMaxCodeCacheMem=$(estCodeCacheInMB $DbitMaxProcMem)
    +    export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +  fi
    +  if [ -n "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 2]: Heap &Direct ARE Defined
    +    let calcTotalInMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +    # Fail if exceeding process limit
    +    if [ $calcTotalInMB -gt $DbitMaxProcMem ]; then
    +      echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR]    Unable to start Drillbit due to memory constraint violations"
    +      echo "  Total Memory Requested : "$(valueInGB $calcTotalInMB'm')" GB"
    +      echo "  Check the following settings to possibly modify (or increase the Max Memory Permitted):"
    +      printCurrAllocation
    +      exit 127
    +    else
    +      #All numbers align
    +      let deltaInGB=($DbitMaxProcMem-$calcTotalInMB)/1024
    +      if [ $deltaInGB -gt 1 ]; then
    +        echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] You have an allocation of "$deltaInGB" GB that is currently unused from a total of "$(valueInGB $DbitMaxProcMem'm')" GB. You can increase your existing memory configuration to use this extra memory";
    +        printCurrAllocation
    +      fi
    +    fi
    +  elif [ -n "$DbitMaxHeapMem" ] && [ -z "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 3]: Total and only Heap is defined
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 4]: Total and only Direct is defined
    +    let DbitMaxHeapMem=$DbitMaxProcMem-$DbitMaxDirectMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxDirectMem" ] && [ -z "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 5]: Only Total is defined
    +    ## Compute Direct & Heap
    +    let DbitMaxProcMemInGB=$(valueInGB $DbitMaxProcMem'm')
    +    let DbitMaxHeapMemInGB=`echo $DbitMaxProcMemInGB | awk '{heap=-13.2+6.12*log($1); if (heap<1) {heap=1}; printf "%0.0f\n", heap }'`
    +    let DbitMaxHeapMem=$(valueInMB $DbitMaxHeapMemInGB'g')
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  fi
    +  ## Export computed values
    +  export DRILL_HEAP=$(valueInGB $DbitMaxHeapMem'm')"G"
    +  export DRILL_MAX_DIRECT_MEMORY=$(valueInGB $DbitMaxDirectMem'm')"G"
    +  export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +fi
    +
    +### Broad check for System Level capacity
    +if [ -z "$AutoMemConfigStatus" ]; then
    +  # Rereading for recently exported env var
    +  DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +  DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +  DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +  let totalDBitMem_inMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +  if [ $totalDBitMem_inMB -gt $totalRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds total system memory ("$(valueInGB $totalRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit not will start up"
    +    exit 127
    +  elif [ $totalDBitMem_inMB -gt $freeRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds available free memory ("$(valueInGB $freeRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit will start up, but can potentially crash due to oversubscribing of system memory."
    +  fi
    +fi
    +
    +#Implicit that checks have passed
    +AutoMemConfigStatus="PASSED"
    --- End diff --
    
    Added the additional messaging of when auto-configuration is being done. This should ensure no surprises emerge when a partial set of parameters is provided by the user.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1082


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160606013
  
    --- Diff: distribution/src/assemble/bin.xml ---
    @@ -345,6 +345,16 @@
           <fileMode>0755</fileMode>
           <outputDirectory>conf</outputDirectory>
         </file>
    +    <file>
    +      <source>src/resources/drill-auto.sh</source>
    --- End diff --
    
    AutoConfig was the intent, but looked a bit verbose. How about `drill-autocheck.sh` ? :)


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r164324919
  
    --- Diff: distribution/src/resources/auto-setup.sh ---
    @@ -0,0 +1,222 @@
    +#!/usr/bin/env bash
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +
    +# This file is invoked by drill-config.sh during a Drillbit startup and provides
    +# default checks and autoconfiguration.
    +# Distributions should not put anything in this file. Checks can be
    +# specified in ${DRILL_HOME}/conf/distrib-setup.sh
    +# Users should not put anything in this file. Additional checks can be defined
    +# and put in ${DRILL_CONF_DIR}/drill-setup.sh instead.
    +# To FAIL any check, return with a non-zero return code
    +# e.g.
    +# if [ $status == "FAILED" ]; return 1; fi
    +
    +###==========================================================================
    +# FEATURES
    +# 1. Provides checks and auto-configuration for memory settings
    +###==========================================================================
    +
    +# Convert Java memory value to MB
    +function valueInMB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInMB=$inputValue*1024
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInMB=$inputValue/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInMB=$inputValue
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInMB=$inputValue*$totalRAM_inMB/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInMB"
    +  return
    +}
    +
    +# Convert Java memory value to GB
    +function valueInGB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInGB=$inputValue
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInGB=$inputValue/1024/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInGB=$inputValue/1024
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInGB=$inputValue*`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024/1024/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInGB"
    +  return
    +}
    +
    +# Estimates code cache based on total heap and direct
    +function estCodeCacheInMB() {
    +  local totalHeapAndDirect=$1
    +  if [ $totalHeapAndDirect -le 4096 ]; then echo 512;
    +  elif [ $totalHeapAndDirect -le 10240 ]; then echo 768;
    +  else echo 1024;
    +  fi
    +}
    +
    +#Print Current Allocation
    +function printCurrAllocation()
    +{
    +  if [ -n "$DRILLBIT_MAX_PROC_MEM" ]; then echo -e "\tDRILLBIT_MAX_PROC_MEM=$DRILLBIT_MAX_PROC_MEM"; fi
    +  if [ -n "$DRILL_HEAP" ]; then echo -e "\tDRILL_HEAP=$DRILL_HEAP"; fi
    +  if [ -n "$DRILL_MAX_DIRECT_MEMORY" ]; then echo -e "\tDRILL_MAX_DIRECT_MEMORY=$DRILL_MAX_DIRECT_MEMORY"; fi
    +  if [ -n "$DRILLBIT_CODE_CACHE_SIZE" ]; then
    +    echo -e "\tDRILLBIT_CODE_CACHE_SIZE=$DRILLBIT_CODE_CACHE_SIZE "
    +    echo -e "\t*NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB"
    +  fi
    +}
    +
    +#============================================================================
    +# Check and auto-configuration for memory settings
    +#----------------------------------------------------------------------------
    +#Default (Track status of this check: "" => Continue checking ; "PASSED" => no more check required)
    +AutoMemConfigStatus=""
    +
    +#Computing existing system information
    +# Tested on Linux (CentOS/RHEL/Ubuntu); Cygwin (Win10Pro-64bit)
    +if [[ "$OSTYPE" == *linux* ]] || [[ "$OSTYPE" == cygwin* ]]; then
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +elif [[ "$OSTYPE" == darwin* ]]; then
    +  # Mac OSX
    +  #Refer for math: https://apple.stackexchange.com/a/196925
    +  #Page Size
    +  let macOSPageSize=`vm_stat | grep 'page size' | grep -o -E '[0-9]+'`
    +  #MemoryUsage on MacOS
    +  let freePg=`vm_stat | grep free | awk '{ print $NF }' | sed 's/\.//'`
    +  let activePg=`vm_stat | grep -w 'active:' | awk '{ print $NF }' | sed 's/\.//'`
    +  let speculativePg=`vm_stat | grep speculative | awk '{ print $NF }' | sed 's/\.//'`
    +  let fileCachePg=`vm_stat | grep File-backed | awk '{ print $NF }' | sed 's/\.//'`
    +  let wiredMemPg=`vm_stat | grep 'wired down' | awk '{ print $NF }' | sed 's/\.//'`
    +  let compressedPg=`vm_stat | grep 'occupied by compressor' | awk '{ print $NF }' | sed 's/\.//'`
    +  #Total
    +  let totalRAM_inPages=$freePg+$activePg+$speculativePg+$fileCachePg+$wiredMemPg+$compressedPg
    +  let totalRAM_inMB=$totalRAM_inPages*$macOSPageSize/1048576
    +  let freeRAM_inMB=$freePg*$macOSPageSize/1048576
    +elif [[ "$OSTYPE" == "msys" ]]; then
    +  # Msys env on MinGW (TODO: Pending verification)
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +else
    +  # Unknown OS
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Unknown OS ("$OSTYPE"). Will not attempt to auto-configure memory"
    +  AutoMemConfigStatus="PASSED"
    +fi
    +
    +#Read current values
    +DbitMaxProcMem=$(valueInMB $DRILLBIT_MAX_PROC_MEM)
    +DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +
    +# Alert for %age usage
    +if [[ "$DRILLBIT_MAX_PROC_MEM" == *% ]] && [ -z "$AutoMemConfigStatus" ]; then
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] "$DRILLBIT_MAX_PROC_MEM" of System Memory ("$(valueInGB $totalRAM_inMB'm')" GB) translates to "$(valueInGB $DbitMaxProcMem'm')" GB"
    +fi
    +
    +### Performing Auto-Configuration
    +if [ -z "$DbitMaxProcMem" ] && [ -z "$AutoMemConfigStatus" ]; then
    +  if [ -n "$DbitMaxDirectMem" ] && [ -n "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 1]: TotalCap is NOT Defined, but Heap&Direct ARE Defined (i.e. no limit)
    +    let currTotal=$DbitMaxDirectMem+$DbitMaxHeapMem
    +    #Estimating CodeCache size of current total
    +    if [ -z "$DbitMaxCodeCacheMem" ]; then export DRILLBIT_CODE_CACHE_SIZE=$(estCodeCacheInMB $currTotal)'m'; fi
    +  fi
    +  # Default values will be loaded for unspecified memory parameters
    +  AutoMemConfigStatus="PASSED"
    +elif [ -z "$AutoMemConfigStatus" ]; then
    +  ## Scenario: Total IS Defined
    +  if [ -z "$DbitMaxCodeCacheMem" ]; then
    +    let DbitMaxCodeCacheMem=$(estCodeCacheInMB $DbitMaxProcMem)
    +    export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +  fi
    +  if [ -n "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 2]: Heap &Direct ARE Defined
    +    let calcTotalInMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +    # Fail if exceeding process limit
    +    if [ $calcTotalInMB -gt $DbitMaxProcMem ]; then
    +      echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR]    Unable to start Drillbit due to memory constraint violations"
    +      echo "  Total Memory Requested : "$(valueInGB $calcTotalInMB'm')" GB"
    +      echo "  Check the following settings to possibly modify (or increase the Max Memory Permitted):"
    +      printCurrAllocation
    +      exit 127
    +    else
    +      #All numbers align
    +      let deltaInGB=($DbitMaxProcMem-$calcTotalInMB)/1024
    +      if [ $deltaInGB -gt 1 ]; then
    +        echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] You have an allocation of "$deltaInGB" GB that is currently unused from a total of "$(valueInGB $DbitMaxProcMem'm')" GB. You can increase your existing memory configuration to use this extra memory";
    +        printCurrAllocation
    +      fi
    +    fi
    +  elif [ -n "$DbitMaxHeapMem" ] && [ -z "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 3]: Total and only Heap is defined
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 4]: Total and only Direct is defined
    +    let DbitMaxHeapMem=$DbitMaxProcMem-$DbitMaxDirectMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxDirectMem" ] && [ -z "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 5]: Only Total is defined
    +    ## Compute Direct & Heap
    +    let DbitMaxProcMemInGB=$(valueInGB $DbitMaxProcMem'm')
    +    let DbitMaxHeapMemInGB=`echo $DbitMaxProcMemInGB | awk '{heap=-13.2+6.12*log($1); if (heap<1) {heap=1}; printf "%0.0f\n", heap }'`
    +    let DbitMaxHeapMem=$(valueInMB $DbitMaxHeapMemInGB'g')
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  fi
    +  ## Export computed values
    +  export DRILL_HEAP=$(valueInGB $DbitMaxHeapMem'm')"G"
    +  export DRILL_MAX_DIRECT_MEMORY=$(valueInGB $DbitMaxDirectMem'm')"G"
    +  export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +fi
    +
    +### Broad check for System Level capacity
    +if [ -z "$AutoMemConfigStatus" ]; then
    +  # Rereading for recently exported env var
    +  DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +  DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +  DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +  let totalDBitMem_inMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +  if [ $totalDBitMem_inMB -gt $totalRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds total system memory ("$(valueInGB $totalRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit not will start up"
    +    exit 127
    +  elif [ $totalDBitMem_inMB -gt $freeRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds available free memory ("$(valueInGB $freeRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit will start up, but can potentially crash due to oversubscribing of system memory."
    +  fi
    +fi
    +
    +#Implicit that checks have passed
    +AutoMemConfigStatus="PASSED"
    --- End diff --
    
    In Drill 1.8, we added a "debug" option to `drillbit.sh`:
    
    ```
    drillbit.sh --debug
    ```
    
    This does nothing other than dump the environment and config settings. That feature has saved our bacon on several occasions because we can easily see exactly what Drill will use when launched.
    
    Should we provide something like that here? In debug mode, emit the values that this script thinks it is getting as input, then emit the auto-configured values that it is setting as output. This will be handy if, say, a community user seems to have a problem and we want to see what the script is doing.
    
    If you feel that the `--debug` output already shows this info (because it dumps the environment), then all is good. Else, consider if we need something more here.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r164324543
  
    --- Diff: distribution/src/assemble/bin.xml ---
    @@ -345,6 +345,21 @@
           <fileMode>0755</fileMode>
           <outputDirectory>conf</outputDirectory>
         </file>
    +    <file>
    +      <source>src/resources/auto-setup.sh</source>
    +      <fileMode>0755</fileMode>
    +      <outputDirectory>bin</outputDirectory>
    +    </file>
    +    <file>
    +      <source>src/resources/drill-setup.sh</source>
    +      <fileMode>0755</fileMode>
    +      <outputDirectory>conf</outputDirectory>
    +    </file>
    +    <file>
    +      <source>src/resources/distrib-setup.sh</source>
    --- End diff --
    
    There should be no "distrib" files in Drill. The scripts should check if they exist, and if not (the default), just ignore the file. Distributions add their own files. Having no default file makes clear that these are not Apache-provided files.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160606258
  
    --- Diff: distribution/src/resources/distrib-auto.sh ---
    @@ -0,0 +1,223 @@
    +#!/usr/bin/env bash
    --- End diff --
    
    Wasn't sure which would be a a suitable place. I reasoned the "Apache" distribution is what we are defining it for, hence the choice


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r166157864
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,61 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
    +# distrib-setup.sh is optional; it is created by some distribution installers
    +# that need additional distribution-specific setup to be done.
    +# Because installers will have site-specific steps, the file
    +# should be moved into the site directory, if the user employs one.
    +
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then
    +  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
    +  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; #Site-based distrib-setup.sh
    +  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    --- End diff --
    
    I'd have liked the KISS principle, but I thought there was a need for placeholder `distrib-setup.sh` file.
    Based on that, I need to figure out whether there is a distribtion-specific setup, or should we revert to executing the `auto-setup.sh`. Unlike sourcing environment files, where an unset variable can be set, for auto-setup, the choice of execution has to be mutually exclusive.
    This block looks complicated (and verbose with the comments), but is only identifying *what* setup script needs to execute. Hence, all we do here is an assignment of the variables.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r164328389
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,61 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
    +# distrib-setup.sh is optional; it is created by some distribution installers
    +# that need additional distribution-specific setup to be done.
    +# Because installers will have site-specific steps, the file
    +# should be moved into the site directory, if the user employs one.
    +
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then
    +  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
    +  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; #Site-based distrib-setup.sh
    +  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    +    distribSetup="$DRILL_HOME/conf/distrib-setup.sh" ; #Install-based distrib-setup.sh
    +    if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    +      # Run Default Auto Setup
    +      distribSetup="$DRILL_HOME/bin/auto-setup.sh"
    +    fi
    +  fi
    +  # Check and run additional setup defined by user
    +  drillSetup="$DRILL_CONF_DIR/drill-setup.sh" ; #Site-based drill-setup.sh
    +  if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then
    +    drillSetup="$DRILL_HOME/conf/drill-setup.sh" ; #Install-based drill-setup.sh
    +    if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then drillSetup=""; fi
    +  fi
    +
    +  # Enforcing checks in order (distrib-setup.sh , drill-setup.sh)
    +  # (NOTE: A script is executed only if it has relevant executable lines)
    +  # Both distribSetup & drillSetup are executed because the user might have introduced additional checks
    +  if [ -n "$distribSetup" ]; then
    +    . "$distribSetup"
    +    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed setup by $distribSetup"; fi
    --- End diff --
    
    In the interests of simplicity, I would just have the script itself report errors and exit. See the existing code in `drill-config.sh`. This way, the script can give a reasonable error message. And, it is not very hard to write `exit 1` to exit the scripts.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160539726
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,46 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then 
    +  # *-auto.sh allows for distrib/user specific checks to be done
    +  distribAuto="$DRILL_CONF_DIR/distrib-auto.sh"
    +  if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi
    +  if [ ! -r "$distribAuto" ]; then distribAuto=""; fi
    +  drillAuto="$DRILL_CONF_DIR/drill-auto.sh"
    +  if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi
    +  if [ ! -r "$drillAuto" ]; then drillAuto=""; fi
    +
    +  # Enforcing checks in order (distrib-auto.sh , drill-auto.sh)
    +  # (NOTE: A script is executed only if it has relevant executable lines)
    +  if [ -n "$distribAuto" ] && [ $(executableLineCount $distribAuto) -gt 0 ]; then
    +    . "$distribAuto"
    +    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed checks from $distribAuto"; fi
    +  fi
    +  if [ -n "$drillAuto" ] && [ $(executableLineCount $drillAuto) -gt 0 ]; then
    --- End diff --
    
    To make the code a bit simpler, touch each file once:
    
    * Check if it exists.
    * Invoke it if so.
    * Handle any errors that are reported.



---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r166158426
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,61 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
    +# distrib-setup.sh is optional; it is created by some distribution installers
    +# that need additional distribution-specific setup to be done.
    +# Because installers will have site-specific steps, the file
    +# should be moved into the site directory, if the user employs one.
    +
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then
    +  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
    +  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; #Site-based distrib-setup.sh
    +  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    +    distribSetup="$DRILL_HOME/conf/distrib-setup.sh" ; #Install-based distrib-setup.sh
    +    if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    +      # Run Default Auto Setup
    +      distribSetup="$DRILL_HOME/bin/auto-setup.sh"
    +    fi
    +  fi
    +  # Check and run additional setup defined by user
    +  drillSetup="$DRILL_CONF_DIR/drill-setup.sh" ; #Site-based drill-setup.sh
    +  if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then
    +    drillSetup="$DRILL_HOME/conf/drill-setup.sh" ; #Install-based drill-setup.sh
    +    if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then drillSetup=""; fi
    +  fi
    +
    +  # Enforcing checks in order (distrib-setup.sh , drill-setup.sh)
    +  # (NOTE: A script is executed only if it has relevant executable lines)
    +  # Both distribSetup & drillSetup are executed because the user might have introduced additional checks
    +  if [ -n "$distribSetup" ]; then
    +    . "$distribSetup"
    +    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed setup by $distribSetup"; fi
    --- End diff --
    
    The auto-configuration scripts do indeed do that. However, I thought it would be good to have a higher level error message also indicating the source of the failure. This allows us to catch any non-zero exit codes that might be thrown and not handled cleanly. Other sections of `drill-config.sh` followed this principle.


---

[GitHub] drill issue #1082: DRILL-5741: Automatically manage memory allocations durin...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/1082
  
    Good point. Let me send out a post on the user-list since this affects a broader audience beyond just Dev. Thanks for reviewing!


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160539584
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,46 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then 
    +  # *-auto.sh allows for distrib/user specific checks to be done
    +  distribAuto="$DRILL_CONF_DIR/distrib-auto.sh"
    +  if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi
    +  if [ ! -r "$distribAuto" ]; then distribAuto=""; fi
    +  drillAuto="$DRILL_CONF_DIR/drill-auto.sh"
    +  if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi
    --- End diff --
    
    This part is just a bit tricky. Drill- and distribution- specific files do, in fact, reside in `$DRILL_HOME/conf`. But, user files reside in a number of places. So, please use:
    
    - `$DRILL_HOME/conf` for `distrib-auto.sh`
    - `$DRILL_CONF_DIR` for user-supplied files (such as `drill-auto.sh`)


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r166157369
  
    --- Diff: distribution/src/assemble/bin.xml ---
    @@ -345,6 +345,21 @@
           <fileMode>0755</fileMode>
           <outputDirectory>conf</outputDirectory>
         </file>
    +    <file>
    +      <source>src/resources/auto-setup.sh</source>
    +      <fileMode>0755</fileMode>
    +      <outputDirectory>bin</outputDirectory>
    +    </file>
    +    <file>
    +      <source>src/resources/drill-setup.sh</source>
    +      <fileMode>0755</fileMode>
    +      <outputDirectory>conf</outputDirectory>
    +    </file>
    +    <file>
    +      <source>src/resources/distrib-setup.sh</source>
    --- End diff --
    
    The `distrib-setup.sh` file is empty, but provided the placeholder to indicate where distributions should make the change.
    This is identical to the intent of having `distrib-env.sh` in the Apache distribution, which is also empty but serves the same purpose.
    https://github.com/apache/drill/blob/master/distribution/src/resources/distrib-env.sh
    Just following the same convention.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160803438
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,46 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then 
    +  # *-auto.sh allows for distrib/user specific checks to be done
    +  distribAuto="$DRILL_CONF_DIR/distrib-auto.sh"
    +  if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi
    +  if [ ! -r "$distribAuto" ]; then distribAuto=""; fi
    +  drillAuto="$DRILL_CONF_DIR/drill-auto.sh"
    +  if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi
    +  if [ ! -r "$drillAuto" ]; then drillAuto=""; fi
    +
    +  # Enforcing checks in order (distrib-auto.sh , drill-auto.sh)
    +  # (NOTE: A script is executed only if it has relevant executable lines)
    +  if [ -n "$distribAuto" ] && [ $(executableLineCount $distribAuto) -gt 0 ]; then
    +    . "$distribAuto"
    +    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed checks from $distribAuto"; fi
    +  fi
    +  if [ -n "$drillAuto" ] && [ $(executableLineCount $drillAuto) -gt 0 ]; then
    --- End diff --
    
    Passed the checks for the file to the renamed function: `checkExecutableLineCount`


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r166157561
  
    --- Diff: distribution/src/resources/auto-setup.sh ---
    @@ -0,0 +1,222 @@
    +#!/usr/bin/env bash
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    +# contributor license agreements.  See the NOTICE file distributed with
    +# this work for additional information regarding copyright ownership.
    +# The ASF licenses this file to You under the Apache License, Version 2.0
    +# (the "License"); you may not use this file except in compliance with
    +# the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +
    +# This file is invoked by drill-config.sh during a Drillbit startup and provides
    +# default checks and autoconfiguration.
    +# Distributions should not put anything in this file. Checks can be
    +# specified in ${DRILL_HOME}/conf/distrib-setup.sh
    +# Users should not put anything in this file. Additional checks can be defined
    +# and put in ${DRILL_CONF_DIR}/drill-setup.sh instead.
    +# To FAIL any check, return with a non-zero return code
    +# e.g.
    +# if [ $status == "FAILED" ]; return 1; fi
    +
    +###==========================================================================
    +# FEATURES
    +# 1. Provides checks and auto-configuration for memory settings
    +###==========================================================================
    +
    +# Convert Java memory value to MB
    +function valueInMB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInMB=$inputValue*1024
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInMB=$inputValue/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInMB=$inputValue
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInMB=$inputValue*$totalRAM_inMB/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInMB"
    +  return
    +}
    +
    +# Convert Java memory value to GB
    +function valueInGB() {
    +  if [ -z "$1" ]; then echo ""; return; fi
    +  local inputTxt=`echo $1| tr '[A-Z]' '[a-z]'`
    +  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
    +  # Extracting Numeric Value
    +  if [[ "$inputTxt" == *g ]]; then
    +    let valueInGB=$inputValue
    +  elif [[ "$DbitMaxProcMem" == *k ]]; then
    +    let valueInGB=$inputValue/1024/1024
    +  elif [[ "$inputTxt" == *m ]]; then
    +    let valueInGB=$inputValue/1024
    +  elif [[ "$inputTxt" == *% ]]; then
    +    #TotalRAM_inMB*percentage [Works on Linux]
    +    let valueInGB=$inputValue*`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024/1024/100;
    +  else
    +    echo error;
    +    return 1;
    +  fi
    +  echo "$valueInGB"
    +  return
    +}
    +
    +# Estimates code cache based on total heap and direct
    +function estCodeCacheInMB() {
    +  local totalHeapAndDirect=$1
    +  if [ $totalHeapAndDirect -le 4096 ]; then echo 512;
    +  elif [ $totalHeapAndDirect -le 10240 ]; then echo 768;
    +  else echo 1024;
    +  fi
    +}
    +
    +#Print Current Allocation
    +function printCurrAllocation()
    +{
    +  if [ -n "$DRILLBIT_MAX_PROC_MEM" ]; then echo -e "\tDRILLBIT_MAX_PROC_MEM=$DRILLBIT_MAX_PROC_MEM"; fi
    +  if [ -n "$DRILL_HEAP" ]; then echo -e "\tDRILL_HEAP=$DRILL_HEAP"; fi
    +  if [ -n "$DRILL_MAX_DIRECT_MEMORY" ]; then echo -e "\tDRILL_MAX_DIRECT_MEMORY=$DRILL_MAX_DIRECT_MEMORY"; fi
    +  if [ -n "$DRILLBIT_CODE_CACHE_SIZE" ]; then
    +    echo -e "\tDRILLBIT_CODE_CACHE_SIZE=$DRILLBIT_CODE_CACHE_SIZE "
    +    echo -e "\t*NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE as this will be auto-computed based on the HeapSize and would not exceed 1GB"
    +  fi
    +}
    +
    +#============================================================================
    +# Check and auto-configuration for memory settings
    +#----------------------------------------------------------------------------
    +#Default (Track status of this check: "" => Continue checking ; "PASSED" => no more check required)
    +AutoMemConfigStatus=""
    +
    +#Computing existing system information
    +# Tested on Linux (CentOS/RHEL/Ubuntu); Cygwin (Win10Pro-64bit)
    +if [[ "$OSTYPE" == *linux* ]] || [[ "$OSTYPE" == cygwin* ]]; then
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +elif [[ "$OSTYPE" == darwin* ]]; then
    +  # Mac OSX
    +  #Refer for math: https://apple.stackexchange.com/a/196925
    +  #Page Size
    +  let macOSPageSize=`vm_stat | grep 'page size' | grep -o -E '[0-9]+'`
    +  #MemoryUsage on MacOS
    +  let freePg=`vm_stat | grep free | awk '{ print $NF }' | sed 's/\.//'`
    +  let activePg=`vm_stat | grep -w 'active:' | awk '{ print $NF }' | sed 's/\.//'`
    +  let speculativePg=`vm_stat | grep speculative | awk '{ print $NF }' | sed 's/\.//'`
    +  let fileCachePg=`vm_stat | grep File-backed | awk '{ print $NF }' | sed 's/\.//'`
    +  let wiredMemPg=`vm_stat | grep 'wired down' | awk '{ print $NF }' | sed 's/\.//'`
    +  let compressedPg=`vm_stat | grep 'occupied by compressor' | awk '{ print $NF }' | sed 's/\.//'`
    +  #Total
    +  let totalRAM_inPages=$freePg+$activePg+$speculativePg+$fileCachePg+$wiredMemPg+$compressedPg
    +  let totalRAM_inMB=$totalRAM_inPages*$macOSPageSize/1048576
    +  let freeRAM_inMB=$freePg*$macOSPageSize/1048576
    +elif [[ "$OSTYPE" == "msys" ]]; then
    +  # Msys env on MinGW (TODO: Pending verification)
    +  let totalRAM_inMB=`cat /proc/meminfo | grep MemTotal | tr ' ' '\n'| grep '[0-9]'`/1024
    +  let freeRAM_inMB=`cat /proc/meminfo | grep MemFree | tr ' ' '\n'| grep '[0-9]'`/1024
    +else
    +  # Unknown OS
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Unknown OS ("$OSTYPE"). Will not attempt to auto-configure memory"
    +  AutoMemConfigStatus="PASSED"
    +fi
    +
    +#Read current values
    +DbitMaxProcMem=$(valueInMB $DRILLBIT_MAX_PROC_MEM)
    +DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +
    +# Alert for %age usage
    +if [[ "$DRILLBIT_MAX_PROC_MEM" == *% ]] && [ -z "$AutoMemConfigStatus" ]; then
    +  echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] "$DRILLBIT_MAX_PROC_MEM" of System Memory ("$(valueInGB $totalRAM_inMB'm')" GB) translates to "$(valueInGB $DbitMaxProcMem'm')" GB"
    +fi
    +
    +### Performing Auto-Configuration
    +if [ -z "$DbitMaxProcMem" ] && [ -z "$AutoMemConfigStatus" ]; then
    +  if [ -n "$DbitMaxDirectMem" ] && [ -n "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 1]: TotalCap is NOT Defined, but Heap&Direct ARE Defined (i.e. no limit)
    +    let currTotal=$DbitMaxDirectMem+$DbitMaxHeapMem
    +    #Estimating CodeCache size of current total
    +    if [ -z "$DbitMaxCodeCacheMem" ]; then export DRILLBIT_CODE_CACHE_SIZE=$(estCodeCacheInMB $currTotal)'m'; fi
    +  fi
    +  # Default values will be loaded for unspecified memory parameters
    +  AutoMemConfigStatus="PASSED"
    +elif [ -z "$AutoMemConfigStatus" ]; then
    +  ## Scenario: Total IS Defined
    +  if [ -z "$DbitMaxCodeCacheMem" ]; then
    +    let DbitMaxCodeCacheMem=$(estCodeCacheInMB $DbitMaxProcMem)
    +    export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +  fi
    +  if [ -n "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 2]: Heap &Direct ARE Defined
    +    let calcTotalInMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +    # Fail if exceeding process limit
    +    if [ $calcTotalInMB -gt $DbitMaxProcMem ]; then
    +      echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR]    Unable to start Drillbit due to memory constraint violations"
    +      echo "  Total Memory Requested : "$(valueInGB $calcTotalInMB'm')" GB"
    +      echo "  Check the following settings to possibly modify (or increase the Max Memory Permitted):"
    +      printCurrAllocation
    +      exit 127
    +    else
    +      #All numbers align
    +      let deltaInGB=($DbitMaxProcMem-$calcTotalInMB)/1024
    +      if [ $deltaInGB -gt 1 ]; then
    +        echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] You have an allocation of "$deltaInGB" GB that is currently unused from a total of "$(valueInGB $DbitMaxProcMem'm')" GB. You can increase your existing memory configuration to use this extra memory";
    +        printCurrAllocation
    +      fi
    +    fi
    +  elif [ -n "$DbitMaxHeapMem" ] && [ -z "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 3]: Total and only Heap is defined
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxHeapMem" ] && [ -n "$DbitMaxDirectMem" ]; then
    +    ## [SCENARIO 4]: Total and only Direct is defined
    +    let DbitMaxHeapMem=$DbitMaxProcMem-$DbitMaxDirectMem-$DbitMaxCodeCacheMem
    +  elif [ -z "$DbitMaxDirectMem" ] && [ -z "$DbitMaxHeapMem" ]; then
    +    ## [SCENARIO 5]: Only Total is defined
    +    ## Compute Direct & Heap
    +    let DbitMaxProcMemInGB=$(valueInGB $DbitMaxProcMem'm')
    +    let DbitMaxHeapMemInGB=`echo $DbitMaxProcMemInGB | awk '{heap=-13.2+6.12*log($1); if (heap<1) {heap=1}; printf "%0.0f\n", heap }'`
    +    let DbitMaxHeapMem=$(valueInMB $DbitMaxHeapMemInGB'g')
    +    let DbitMaxDirectMem=$DbitMaxProcMem-$DbitMaxHeapMem-$DbitMaxCodeCacheMem
    +  fi
    +  ## Export computed values
    +  export DRILL_HEAP=$(valueInGB $DbitMaxHeapMem'm')"G"
    +  export DRILL_MAX_DIRECT_MEMORY=$(valueInGB $DbitMaxDirectMem'm')"G"
    +  export DRILLBIT_CODE_CACHE_SIZE=$DbitMaxCodeCacheMem'm'
    +fi
    +
    +### Broad check for System Level capacity
    +if [ -z "$AutoMemConfigStatus" ]; then
    +  # Rereading for recently exported env var
    +  DbitMaxDirectMem=$(valueInMB $DRILL_MAX_DIRECT_MEMORY)
    +  DbitMaxHeapMem=$(valueInMB $DRILL_HEAP)
    +  DbitMaxCodeCacheMem=$(valueInMB $DRILLBIT_CODE_CACHE_SIZE)
    +  let totalDBitMem_inMB=$DbitMaxDirectMem+$DbitMaxHeapMem+$DbitMaxCodeCacheMem
    +  if [ $totalDBitMem_inMB -gt $totalRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [ERROR] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds total system memory ("$(valueInGB $totalRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit not will start up"
    +    exit 127
    +  elif [ $totalDBitMem_inMB -gt $freeRAM_inMB ]; then
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Total Memory Allocation for Drillbit ("$(valueInGB $totalDBitMem_inMB'm')"GB) exceeds available free memory ("$(valueInGB $freeRAM_inMB'm')"GB)"
    +    echo `date +%Y-%m-%d" "%H:%M:%S`"  [WARN] Drillbit will start up, but can potentially crash due to oversubscribing of system memory."
    +  fi
    +fi
    +
    +#Implicit that checks have passed
    +AutoMemConfigStatus="PASSED"
    --- End diff --
    
    I tried with the `debug` argument and it not only shows the environment variables, but if there are any warnings, they also would appear. So, I'd say we are self-sufficient in this regard.
    I think we can indicate that we are auto-configuring the memory parameters and allow users to trace back the source of the settings. 


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r164325226
  
    --- Diff: distribution/src/resources/drill-config.sh ---
    @@ -180,18 +251,61 @@ else
       fi
     fi
     
    -# Default memory settings if none provided by the environment or
    +# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
    +# distrib-setup.sh is optional; it is created by some distribution installers
    +# that need additional distribution-specific setup to be done.
    +# Because installers will have site-specific steps, the file
    +# should be moved into the site directory, if the user employs one.
    +
    +# Checking if being executed in context of Drillbit and not SQLLine
    +if [ "$DRILLBIT_CONTEXT" == "1" ]; then
    +  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
    +  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; #Site-based distrib-setup.sh
    +  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
    --- End diff --
    
    Honestly, this seems overly complex. First, if we simply omit `distrib-setup.sh` then we only need to check for existence, as we did for `distrib-env.sh` in the original file.
    
    There is no need at all to check the executable line count: this is just asking for problems. Just source the file if it exists. If the file contains only comments (as in the out-of-the-box `drill-env.sh`), then the shell is perfectly capable of doing nothing. There is no time savings either because the file will be read by the executable line check or the shell.
    
    In short, let's apply the KISS principle here.


---

[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1082#discussion_r160538214
  
    --- Diff: distribution/src/assemble/bin.xml ---
    @@ -345,6 +345,16 @@
           <fileMode>0755</fileMode>
           <outputDirectory>conf</outputDirectory>
         </file>
    +    <file>
    +      <source>src/resources/drill-auto.sh</source>
    --- End diff --
    
    "-auto"? Is this for the self-driving car feature of Drill? :-)
    
    Is this meant to be for startup-time environment checks? For other add-on processing? Perhaps we can pick a name that makes that a bit clearer.


---