Posted to commits@impala.apache.org by ta...@apache.org on 2018/07/04 01:07:45 UTC
[3/6] impala git commit: [DOCS] Clarification on admission control and DDL statements
[DOCS] Clarification on admission control and DDL statements
Removed the confusing example and paragraphs.
Change-Id: I2e3e82bd34e88e7a13de1864aeb97f01023bc715
Reviewed-on: http://gerrit.cloudera.org:8080/10829
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/6f52ce10
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/6f52ce10
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/6f52ce10
Branch: refs/heads/master
Commit: 6f52ce10e302ed9d168731dc11db07aabbfa2e53
Parents: 83448f1
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Tue Jun 26 14:30:38 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 18:41:47 2018 +0000
----------------------------------------------------------------------
docs/topics/impala_admission.xml | 146 ++++++++++++++--------------------
1 file changed, 61 insertions(+), 85 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/6f52ce10/docs/topics/impala_admission.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_admission.xml b/docs/topics/impala_admission.xml
index 5de246b..317fa80 100644
--- a/docs/topics/impala_admission.xml
+++ b/docs/topics/impala_admission.xml
@@ -51,6 +51,11 @@ under the License.
not wait indefinitely, so that you can detect and correct <q>starvation</q> scenarios.
</p>
<p>
+ Queries, DML statements, and some DDL statements, including
+ <codeph>CREATE TABLE AS SELECT</codeph> and <codeph>COMPUTE
+ STATS</codeph> are affected by admission control.
+ </p>
+ <p>
Enable this feature if your cluster is
underutilized at some times and overutilized at others. Overutilization is indicated by performance
bottlenecks and queries being cancelled due to out-of-memory conditions, when those same queries are
@@ -765,38 +770,42 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
<!-- End Config -->
<concept id="admission_guidelines">
-
- <title>Guidelines for Using Admission Control</title>
- <prolog>
- <metadata>
- <data name="Category" value="Planning"/>
- <data name="Category" value="Guidelines"/>
- <data name="Category" value="Best Practices"/>
- </metadata>
- </prolog>
-
- <conbody>
-
- <p>
- To see how admission control works for particular queries, examine the profile output for the query. This
- information is available through the <codeph>PROFILE</codeph> statement in <cmdname>impala-shell</cmdname>
- immediately after running a query in the shell, on the <uicontrol>queries</uicontrol> page of the Impala
- debug web UI, or in the Impala log file (basic information at log level 1, more detailed information at log
- level 2). The profile output contains details about the admission decision, such as whether the query was
- queued or not and which resource pool it was assigned to. It also includes the estimated and actual memory
- usage for the query, so you can fine-tune the configuration for the memory limits of the resource pools.
- </p>
-
- <p>
- Remember that the limits imposed by admission control are <q>soft</q> limits.
- The decentralized nature of this mechanism means that each Impala node makes its own decisions about whether
- to allow queries to run immediately or to queue them. These decisions rely on information passed back and forth
- between nodes by the statestore service. If a sudden surge in requests causes more queries than anticipated to run
- concurrently, then throughput could decrease due to queries spilling to disk or contending for resources;
- or queries could be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting while running.
- </p>
-
-<!--
+ <title>Guidelines for Using Admission Control</title>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Planning"/>
+ <data name="Category" value="Guidelines"/>
+ <data name="Category" value="Best Practices"/>
+ </metadata>
+ </prolog>
+ <conbody>
+ <p>
+ To see how admission control works for particular queries, examine
+ the profile output for the query. This information is available
+ through the <codeph>PROFILE</codeph> statement in
+ <cmdname>impala-shell</cmdname> immediately after running a query in
+ the shell, on the <uicontrol>queries</uicontrol> page of the Impala
+ debug web UI, or in the Impala log file (basic information at log
+ level 1, more detailed information at log level 2). The profile output
+ contains details about the admission decision, such as whether the
+ query was queued or not and which resource pool it was assigned to. It
+ also includes the estimated and actual memory usage for the query, so
+ you can fine-tune the configuration for the memory limits of the
+ resource pools.
+ </p>
+ <p>
+ Remember that the limits imposed by admission control are
+ <q>soft</q> limits. The decentralized nature of this mechanism means
+ that each Impala node makes its own decisions about whether to allow
+ queries to run immediately or to queue them. These decisions rely on
+ information passed back and forth between nodes by the statestore
+ service. If a sudden surge in requests causes more queries than
+ anticipated to run concurrently, then throughput could decrease due to
+ queries spilling to disk or contending for resources; or queries could
+ be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting
+ while running.
+ </p>
+ <!--
<p>
If you have trouble getting a query to run because its estimated memory usage is too high, you can override
the estimate by setting the <codeph>MEM_LIMIT</codeph> query option in <cmdname>impala-shell</cmdname>,
@@ -806,58 +815,25 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
pre-allocated by the query.
</p>
-->
-
- <p>
- In <cmdname>impala-shell</cmdname>, you can also specify which resource pool to direct queries to by
- setting the <codeph>REQUEST_POOL</codeph> query option.
- </p>
-
- <p>
- The statements affected by the admission control feature are primarily queries, but also include statements
- that write data such as <codeph>INSERT</codeph> and <codeph>CREATE TABLE AS SELECT</codeph>. Most write
- operations in Impala are not resource-intensive, but inserting into a Parquet table can require substantial
- memory due to buffering intermediate data before writing out each Parquet data block. See
- <xref href="impala_parquet.xml#parquet_etl"/> for instructions about inserting data efficiently into
- Parquet tables.
- </p>
-
- <p>
- Although admission control does not scrutinize memory usage for other kinds of DDL statements, if a query
- is queued due to a limit on concurrent queries or memory usage, subsequent statements in the same session
- are also queued so that they are processed in the correct order:
- </p>
-
-<codeblock>-- This query could be queued to avoid out-of-memory at times of heavy load.
-select * from huge_table join enormous_table using (id);
--- If so, this subsequent statement in the same session is also queued
--- until the previous statement completes.
-drop table huge_table;
-</codeblock>
-
- <p>
- If you set up different resource pools for different users and groups, consider reusing any classifications
- you developed for use with Sentry security. See <xref href="impala_authorization.xml#authorization"/> for details.
- </p>
-
- <p>
- For details about all the Fair Scheduler configuration settings, see
- <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in particular the tags such as <codeph>&lt;queue&gt;</codeph> and
- <codeph>&lt;aclSubmitApps&gt;</codeph> to map users and groups to particular resource pools (queues).
- </p>
-
-<!-- Wait a sec. We say admission control doesn't use RESERVATION_REQUEST_TIMEOUT at all.
- What's the real story here? Matt did refer to some timeout option that was
- available through the shell but not the DB-centric APIs.
-<p>
- Because you cannot override query options such as
- <codeph>RESERVATION_REQUEST_TIMEOUT</codeph>
- in a JDBC or ODBC application, consider configuring timeout periods
- on the application side to cancel queries that take
- too long due to being queued during times of high load.
-</p>
--->
- </conbody>
- </concept>
+ <p>
+ In <cmdname>impala-shell</cmdname>, you can also specify which
+ resource pool to direct queries to by setting the
+ <codeph>REQUEST_POOL</codeph> query option.
+ </p>
+ <p>
+ If you set up different resource pools for different users and
+ groups, consider reusing any classifications you developed for use
+ with Sentry security. See <xref
+ href="impala_authorization.xml#authorization"/> for details.
+ </p>
+ <p>
+ For details about all the Fair Scheduler configuration settings, see
+ <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in
+ particular the tags such as <codeph>&lt;queue&gt;</codeph> and
+ <codeph>&lt;aclSubmitApps&gt;</codeph> to map users and groups to
+ particular resource pools (queues).
+ </p>
+ </conbody>
+ </concept>
</concept>
</concept>
-