Posted to dev@drill.apache.org by kkhatua <gi...@git.apache.org> on 2018/01/09 00:12:30 UTC

[GitHub] drill pull request #1086: DRILL-6076: Reduce the default memory from a total...

GitHub user kkhatua opened a pull request:

    https://github.com/apache/drill/pull/1086

    DRILL-6076: Reduce the default memory from a total of 13GB to 5GB

    Reduce minimum memory requirements from 13GB to <= 5GB
    For this, the defaults need to be changed as follows:
    
    Allocation | Current | New 
    ------------|--------|------
     Heap | 4GB | 1GB 
     Direct | 8GB | 3GB 
     CodeCache | 1GB | 512MB 
     MaxPermSize | 512MB | 512MB (Ignored if JDK8+) 
     **Total** | 13.5GB | 5GB 
     **Total (JDK8+)** | 13GB | 4.5GB 
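
The totals in the table can be checked with a few lines of arithmetic. This is only a minimal sketch: the dictionary keys and the helper name `total_gb` are illustrative and do not come from the PR; the sizes are copied from the table.

```python
# Allocation sizes in GB, copied from the table above.
# Key names and the helper below are illustrative, not part of Drill.
CURRENT = {"heap": 4.0, "direct": 8.0, "code_cache": 1.0, "max_perm": 0.5}
PROPOSED = {"heap": 1.0, "direct": 3.0, "code_cache": 0.5, "max_perm": 0.5}


def total_gb(alloc, jdk8_plus=False):
    """Sum the allocations; MaxPermSize is ignored on JDK 8 and later."""
    return sum(size for name, size in alloc.items()
               if not (jdk8_plus and name == "max_perm"))


if __name__ == "__main__":
    for label, alloc in (("current", CURRENT), ("proposed", PROPOSED)):
        print(label, total_gb(alloc), total_gb(alloc, jdk8_plus=True))
```

The JDK8+ rows simply drop the MaxPermSize entry from each sum.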


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-6076

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1086.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1086
    
----
commit e7d3cbf6abf904096efeea8391fc46a768e08bb8
Author: Kunal Khatua <kk...@...>
Date:   2018-01-05T01:49:50Z

    DRILL-6076: Reduce the default memory from a total of 13GB to 5GB
    
    Reduce minimum memory requirements from 13GB to <= 5GB
    For this, the defaults need to be changed as follows:
    Heap: 4GB -> 1GB
    Direct: 8GB -> 3GB
    CodeCache: 1GB -> 512MB
    MaxPermSize: 512MB -> 512MB (Ignored if JDK8+)

----


---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    The settings for running the unit tests are (from the pom file):
                  -Xms512m -Xmx4096m 
                  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
                  -XX:+CMSClassUnloadingEnabled -ea
    
    At the very least, the default settings for Drill and the unit tests should match.
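
To see how far apart the two sets of limits are, the flags quoted above can be parsed and compared with the proposed defaults. The following is a rough sketch, not anything from the Drill build: the helper names `size_mb` and `memory_limits` are made up for illustration, and the proposed values (1GB heap, 3GB direct) are taken from the PR description.

```python
import re

# Flags quoted above from the unit-test section of the pom file.
TEST_FLAGS = ("-Xms512m -Xmx4096m -XX:MaxPermSize=512M "
              "-XX:MaxDirectMemorySize=4096M -XX:+CMSClassUnloadingEnabled -ea")

# Proposed Drillbit defaults from the PR description, in MB.
PROPOSED_MB = {"heap_mb": 1024, "direct_mb": 3072}

# Multipliers that convert a JVM size suffix to MB.
UNIT_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def size_mb(flags, prefix):
    """Return the size in MB given after `prefix`, or None if the flag is absent."""
    match = re.search(re.escape(prefix) + r"(\d+)([kmgKMG])", flags)
    if match is None:
        return None
    return int(match.group(1)) * UNIT_MB[match.group(2).lower()]


def memory_limits(flags):
    """Extract the maximum heap and direct memory limits from a flag string."""
    return {
        "heap_mb": size_mb(flags, "-Xmx"),
        "direct_mb": size_mb(flags, "-XX:MaxDirectMemorySize="),
    }


if __name__ == "__main__":
    limits = memory_limits(TEST_FLAGS)
    for key, proposed in PROPOSED_MB.items():
        print(key, "tests:", limits[key], "proposed:", proposed,
              "gap:", limits[key] - proposed)
```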


---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by priteshm <gi...@git.apache.org>.
Github user priteshm commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    @paul-rogers, @parthchandra  can you review/ comment on this change?


---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    Ok. Looks like [DRILL-5926](https://issues.apache.org/jira/browse/DRILL-5926) (committed into Apache master) explicitly has a need for bumping up the memory to 4GB for heap alone. I guess we'll need to figure out something else to manage this.


---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by Agirish <gi...@git.apache.org>.
Github user Agirish commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    +1. I think this is a welcome change for newbies trying out Drill on personal systems. For production use, the current defaults were not always sufficient anyway and had to be raised.


---

[GitHub] drill pull request #1086: DRILL-6076: Reduce the default memory from a total...

Posted by kkhatua <gi...@git.apache.org>.
GitHub user kkhatua reopened a pull request:

    https://github.com/apache/drill/pull/1086

    DRILL-6076: Reduce the default memory from a total of 13GB to 5GB

    Reduce minimum memory requirements from 13GB to <= 5GB
    For this, the defaults need to be changed as follows:
    
    Allocation | Current | New 
    ------------|--------|------
     Heap | 4GB | 1GB 
     Direct | 8GB | 3GB 
     CodeCache | 1GB | 512MB 
     MaxPermSize | 512MB | 512MB (Ignored if JDK8+) 
     **Total** | 13.5GB | 5GB 
     **Total (JDK8+)** | 13GB | 4.5GB 


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-6076

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1086.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1086
    
----
commit 1e21039be3b39c38d7fcee21df2b1c914fc916ff
Author: Kunal Khatua <kk...@...>
Date:   2018-01-05T01:49:50Z

    DRILL-6076: Reduce the default memory from a total of 13GB to 5GB
    
    Reduce minimum memory requirements from 13GB to <= 5GB
    For this, the defaults need to be changed as follows:
    Heap: 4GB -> 1GB
    Direct: 8GB -> 3GB
    CodeCache: 1GB -> 512MB
    MaxPermSize: 512MB -> 512MB (Ignored if JDK8+)

----


---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    I think folks running unit tests can continue with the existing limits of 4GB heap + 4GB direct. The idea is to get Drill up and running for minimal use cases, and you are expected to bump up memory to higher limits if queries hit OOM errors. Unit tests that require much more memory fall under the same expectation.
    I've launched a [fork](https://github.com/kkhatua/drill/commits/LowerMemUnitTest) of this branch with reduced memory settings for the unit tests as well, and am letting TravisCI test it out:
    https://travis-ci.org/kkhatua/drill/builds/328250155
    If it passes, we can either reduce those settings in the POM files as well, or leave them intact to ensure developers are able to run the unit tests in a reasonable amount of time.



---

[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/1086
  
    The primary objective of the JIRA (and this PR) is to allow Drill to truly start up in a bare-minimum memory footprint. Most people trying Drill would start experimenting with a LocalFS, before trying a small distributed FS and then eventually scaling out the number of nodes for the DFS.
    
    The challenge has been to figure out what is a reasonably small size for Drill to function without failing. A setting of *5GB* (`1G Heap`+`3G Direct`+`512M CodeCache`) was sufficient for a single-node Drill instance to start up and run the TPCH benchmark (single user, via SQLLine) for *scaleFactor=1* against CSV text and uncompressed Parquet formats. The queries that failed did so because the HashAgg operator did not have sufficient memory. To work around this, as the error message suggested, the HashAgg fallback option was enabled. The heap memory peaked at `800MB`, while the Drillbit process itself maxed out at `2.2GB`.
    Functional tests from the drill-test framework also ran to completion without issue.
    
    However, the unit tests hang because of insufficient memory. The tests typically need about `3GB Heap` and `4GB Direct` (~8GB total), which is defined separately in the `pom.xml`.
    
    Based on this, we have the following options, if we wish to move towards reducing the default memory footprint:
    
    1. Reduce default memory to 5GB, but leave the unit test’s memory requirements (8GB at last check) intact.
      We don’t expect typical users to run unit tests; anyone who does run them with the 8GB setting would see them run to completion, provided the machine has enough memory.
      If a user gives Drill an excessively large input to process, Drill is expected to report insufficient memory correctly, and there is good documentation explaining how to increase memory.
    
    2. Reduce default memory to match the unit-test memory requirements of 8GB.
      This reduces the memory requirement substantially from the current 13GB, but is still high for someone trying out Drill in a memory-constrained VM (or sandboxed) environment. It also means that we need to track the minimum memory requirements of the unit tests and keep the defaults in sync with them, even if the user does not intend to run unit tests in the limited-memory environment.
    
    3. Change nothing, but introduce sample files along the lines of `$DRILL_HOME/conf/drill-override.conf.example`.
      The proposal here would be to have a minimum viable memory footprint config file (e.g. `drill-env.sh.minMem`), which a first-time user of Drill can swap with `./conf/drill-env.sh` when bringing up Drill. The flip side is that a Drill trial user would need to know about swapping the config file for low-memory usage before starting up Drill.
    
    There seems to be consensus that users should not feel intimidated to install Drill and start working with small-to-medium workloads on something as low end as a laptop. Based on that, I'd recommend option #1, which is this PR. I'll leave this PR open to discussion.
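
For option 3, a sample file such as `drill-env.sh.minMem` could be generated rather than hand-written. The sketch below is hypothetical: the variable names `DRILL_HEAP`, `DRILL_MAX_DIRECT_MEMORY` and `DRILLBIT_CODE_CACHE_SIZE` are assumed to match those read by the stock `drill-env.sh` and should be checked against the installed distribution, and `render_env` is an illustrative helper, not part of Drill.

```python
# Sizes from option 1 of the comment above.
# Variable names are ASSUMED to match the stock drill-env.sh; verify locally.
MIN_MEM = {
    "DRILL_HEAP": "1G",
    "DRILL_MAX_DIRECT_MEMORY": "3G",
    "DRILLBIT_CODE_CACHE_SIZE": "512M",
}


def render_env(settings):
    """Render `export NAME=${NAME:-"value"}` lines so an already-set variable wins."""
    lines = ["#!/usr/bin/env bash",
             "# Minimal-memory settings (illustrative only)."]
    for name, value in settings.items():
        lines.append('export %s=${%s:-"%s"}' % (name, name, value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file body; redirect to conf/drill-env.sh.minMem to save it.
    print(render_env(MIN_MEM), end="")
```

The `${NAME:-"value"}` form keeps any value already exported in the environment, so a user who later needs more memory can override a single setting without editing the file.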


---

[GitHub] drill pull request #1086: DRILL-6076: Reduce the default memory from a total...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua closed the pull request at:

    https://github.com/apache/drill/pull/1086


---