You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Timothy Farkas (JIRA)" <ji...@apache.org> on 2017/11/07 19:20:01 UTC
[jira] [Commented] (DRILL-5942) Drill Resource Management

    [ https://issues.apache.org/jira/browse/DRILL-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242689#comment-16242689 ] 

Timothy Farkas commented on DRILL-5942:
---------------------------------------

My initial analysis of this issue is as follows:

This is a very complex issue which will take a considerable amount of time to solve correctly. Rehashing the points that Paul has already mentioned in various discussions I think there are two main Phases this would need to be tackled in.

h2. Phase 1: Running Queries Non-Concurrently Without Running Out of Memory

h3. Goal

The goal here would be to run one query at a time successfully in all cases. I think this is possible to achieve with incremental improvements to the existing architecture. *Note:* I think achieving *Phase 2* will require significant changes to Drill's architecture.

h3. Tasks

In order to avoid out of memory exceptions for the single query case, it is necessary and sufficient to have solutions for all of these sub-tasks.

  * Make each operator memory aware. Given a specific memory budget each operator must be capable of obeying it. All the operators need to be analyze and made memory aware.
    * *Relevant Pending Work:* The HashJoin work Boaz is doing.
  * Account for the memory used by Drillbit to Drillbit communication. Currently exchanges use buffers on both the sending and recieving drill bits. These buffers can use a significant amount of memory. We would have to be able to set a limit on the amount of memory used by these buffers and make the exchange operations smart enough to obey the limit.
    * *Relevant Pending Work:* The exchange operator work Vlad is doing.
  * Make drill aware of the amount of direct memory allocated to the jvm. This is necessary because the Drillibit needs to know how much memory it has available to allocate to operators, and buffers.
  * Control batch sizes. We cannot effectively obey memory limits in the operators and buffers unless we can bound the size of batches. 
    * *Relevant Pending Work:* The work Paul is doing to limit batch sizes.
  * Once everything above is satisfied, we need to test the Parallelizer code to make sure it doesn't overallocate memory. I believe there are cases where incorrect configuration can cause the Parallelizers to grossly over allocate memory.

h2. Phase 2: Running Multiple Concurrently Without Running Out of Memory

h3. Goal

The goal here would be to run multiple concurrent queries successfully in all cases. However, before we can even think about *Phase 2* we must first solve the issues outlined in *Phase 1*.

h3. Theory

This will require significant changes to Drill's architecture. This is because we will have to solve the problem of distributed resource management in order to effectively allocate resources to concurrent queries without exceeding the resources we have in our cluster. 

h4. Desired State

The good news is that existing cluster managers like YARN already solve this problem for batch jobs by doing the following:

# The amount of available memory and cpu cores is reported to a resource manager.
# New jobs are submitted to the resource manager. A job includes a description of all the containers it needs to run. Each container description also includes the amount of memory and cpu it will need.
# The resource manager places the job in a Queue.
# The resource manager uses a scheduler to prioritize jobs in the Queue.
# A job with high priority is scheduled to run on the cluster if and only if the cluster has enough resources to execute the job.
# When a job is deployed to the cluster, the remaining unused resources on the cluster are updated appropriately.

h4. Current State

Currently Drill does not do any of this. Instead it does the following:

# A query is sent to Drill.
# A foreman is created for the query.
# The query is planned.
# During planning, each query assumes it has access to all of the cluster resources and is oblivious to any other queries running on the cluster.

Because of this the following issues can occur:

* Queries sporadically run out of memory when too many concurrent queries are run. For example, assume we have a cluster with 100 Gb of memory. Let's run *Query A* and assume it consumes 80Gb of memory on the cluster. While *Query A* is running let's try to run *Query B*. During the planning of *Query B* Drill is completely unaware that *Query A* is running and assumes that *Query B* has the full 100 GB at it's disposal. So Drill may launch *Query B* and give it 80 GB. Now there are two queries with 80 GB allocated to each of them for a total of 160 GB when the cluster only has 100 GB.
* Even if we try to have some smart heuristics to avoid the first issue, a resource deadlock can occur. For example *Query A* could be partially deployed to the cluster and take up half of the cluster resources, similarly *Query B* could be partially deployed to the cluster and take up the other half of the cluster resources. In this case both *Query A* and *Query B* will wait to get the remaining resources they need, but they will never get them unless one of them is cancelled.

In summary we need a true distributed resource management system like YARN described above to effectively resolve these issues. *Note:* I am not saying we should use YARN for this purpose, I am just saying that we will have to solve the same problem and will have to use a similar architecture.

h3. Short Term Actionable Items

We can start working on tasks outlined in *Phase 1* in order to incrementally step towards a solution to this problem. But we will continue to see this issue pop up until all of the issues / tasks for *Phase 1* and *Phase 2* outlined above are address. 



> Drill Resource Management
> -------------------------
>
>                 Key: DRILL-5942
>                 URL: https://issues.apache.org/jira/browse/DRILL-5942
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Timothy Farkas
>
> OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)