Posted to commits@impala.apache.org by ta...@apache.org on 2018/04/12 05:28:18 UTC

[05/28] impala git commit: IMPALA-6779: [DOCS] Improve the REPLICA_PREFERENCE doc

IMPALA-6779: [DOCS] Improve the REPLICA_PREFERENCE doc

Added detailed usage notes for REPLICA_PREFERENCE.

Change-Id: If38f9c881f553568c2516ecc23ec501f23ee1f28
Reviewed-on: http://gerrit.cloudera.org:8080/9877
Reviewed-by: John Russell <jr...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/b96cbfd0
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/b96cbfd0
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/b96cbfd0

Branch: refs/heads/2.x
Commit: b96cbfd09a76ad1e14b970e1e450ac3935042db2
Parents: a0450d2
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Fri Mar 30 14:54:39 2018 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Wed Apr 11 22:55:59 2018 +0000

----------------------------------------------------------------------
 docs/topics/impala_replica_preference.xml | 49 ++++++++++++++++++++------
 1 file changed, 38 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/b96cbfd0/docs/topics/impala_replica_preference.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_replica_preference.xml b/docs/topics/impala_replica_preference.xml
index 45a5dbd..6c0d3ab 100644
--- a/docs/topics/impala_replica_preference.xml
+++ b/docs/topics/impala_replica_preference.xml
@@ -21,7 +21,13 @@ under the License.
 <concept id="replica_preference" rev="2.7.0">
 
   <title>REPLICA_PREFERENCE Query Option (<keyword keyref="impala27"/> or higher only)</title>
-  <titlealts audience="PDF"><navtitle>REPLICA_PREFERENCE</navtitle></titlealts>
+
+  <titlealts audience="PDF">
+
+    <navtitle>REPLICA_PREFERENCE</navtitle>
+
+  </titlealts>
+
   <prolog>
     <metadata>
       <data name="Category" value="Impala"/>
@@ -38,29 +44,50 @@ under the License.
     </p>
 
     <p>
-      The <codeph>REPLICA_PREFERENCE</codeph> query option
-      lets you spread the load more evenly if hotspots and bottlenecks persist, by allowing hosts to do local reads,
-      or even remote reads, to retrieve the data for cached blocks if Impala can determine that it would be
-      too expensive to do all such processing on a particular host.
+      The <codeph>REPLICA_PREFERENCE</codeph> query option lets you distribute the work more
+      evenly if hotspots and bottlenecks persist. It causes the access cost of all replicas of a
+      data block to be considered equal to or worse than the configured value. This allows
+      Impala to schedule reads to suboptimal replicas (e.g. local in the presence of cached
+      ones) in order to distribute the work across more executor nodes.
     </p>
 
     <p>
-      <b>Type:</b> numeric (0, 2, 4)
-      or corresponding mnemonic strings (<codeph>CACHE_LOCAL</codeph>, <codeph>DISK_LOCAL</codeph>, <codeph>REMOTE</codeph>).
-      The gaps in the numeric sequence are to accomodate other intermediate
-      values that might be added in the future.
+      Allowed values are: <codeph>CACHE_LOCAL</codeph> (<codeph>0</codeph>),
+      <codeph>DISK_LOCAL</codeph> (<codeph>2</codeph>), <codeph>REMOTE</codeph>
+      (<codeph>4</codeph>).
     </p>
 
     <p>
-      <b>Default:</b> 0 (equivalent to <codeph>CACHE_LOCAL</codeph>)
+      <b>Type:</b> Enum
+    </p>
+
+    <p>
+      <b>Default:</b> <codeph>CACHE_LOCAL (0)</codeph>
     </p>
 
     <p conref="../shared/impala_common.xml#common/added_in_270"/>
 
+    <p>
+      <b>Usage Notes:</b>
+    </p>
+
+    <p>
+      By default Impala selects the best replica it can find in terms of access cost. The
+      preferred order is cached, local, and remote. With <codeph>REPLICA_PREFERENCE</codeph>,
+      the preference of all replicas is capped at the selected value. For example, when
+      <codeph>REPLICA_PREFERENCE</codeph> is set to <codeph>DISK_LOCAL</codeph>, cached and
+      local replicas are treated with equal preference. When set to
+      <codeph>REMOTE</codeph>, all three types of replicas (cached, local, and remote) are
+      treated with equal preference.
+    </p>
+
     <p conref="../shared/impala_common.xml#common/related_info"/>
+
     <p>
-      <xref href="impala_perf_hdfs_caching.xml#hdfs_caching"/>, <xref href="impala_schedule_random_replica.xml#schedule_random_replica"/>
+      <xref href="impala_perf_hdfs_caching.xml#hdfs_caching"/>,
+      <xref href="impala_schedule_random_replica.xml#schedule_random_replica"/>
     </p>
 
   </conbody>
+
 </concept>
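
The capping rule described in the usage notes can be illustrated with a minimal sketch. This is not Impala's scheduler code: the names ReplicaPreference, effective_cost and candidate_replicas are hypothetical and exist only for illustration. The numeric values 0, 2 and 4 come from the documented allowed values. The sketch assumes a lower value means a cheaper read, and that the option raises each replica's cost to at least the configured value.

```python
from enum import IntEnum


class ReplicaPreference(IntEnum):
    """Documented option values; lower is assumed to mean cheaper access."""
    CACHE_LOCAL = 0
    DISK_LOCAL = 2
    REMOTE = 4


def effective_cost(replica_cost, preference):
    # A replica is never treated as cheaper than the configured value,
    # i.e. its cost is "equal to or worse than" the preference.
    return max(replica_cost, preference)


def candidate_replicas(replicas, preference):
    """Return the hosts that tie for the best effective cost.

    replicas: list of (host, ReplicaPreference) pairs for one data block.
    This helper is hypothetical and only models the tie-breaking effect.
    """
    costs = {host: effective_cost(cost, preference) for host, cost in replicas}
    best = min(costs.values())
    return sorted(host for host, cost in costs.items() if cost == best)


if __name__ == "__main__":
    block = [
        ("host-a", ReplicaPreference.CACHE_LOCAL),
        ("host-b", ReplicaPreference.DISK_LOCAL),
        ("host-c", ReplicaPreference.REMOTE),
    ]
    for pref in ReplicaPreference:
        print(pref.name, candidate_replicas(block, pref))
```

With the default CACHE_LOCAL only the cached replica qualifies. Raising the value to DISK_LOCAL or REMOTE widens the set of equally preferred hosts, which is what lets work spread across more executor nodes.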