Posted to commits@impala.apache.org by ta...@apache.org on 2017/04/14 23:48:06 UTC

[2/2] incubator-impala git commit: IMPALA-2924: [DOCS] Add docs for HDFS cache-related hints

IMPALA-2924: [DOCS] Add docs for HDFS cache-related hints

The JIRA discusses a RANDOM_REPLICA query option but Impala only
has a SCHEDULE_RANDOM_REPLICA option. So I stated that the
RANDOM_REPLICA hint is the same as specifying
SCHEDULE_RANDOM_REPLICA=true. Please confirm.

Change-Id: I7284dd45c8173eef104ebd32789429e8c16c7bf2
Reviewed-on: http://gerrit.cloudera.org:8080/6631
Reviewed-by: Lars Volker <lv...@cloudera.com>
Reviewed-by: John Russell <jr...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/8bdfe032
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/8bdfe032
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/8bdfe032

Branch: refs/heads/master
Commit: 8bdfe032012e0b52550bc6784dc972b9dcfb5f7b
Parents: cb1e4f6
Author: John Russell <jr...@cloudera.com>
Authored: Thu Apr 13 14:10:07 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Fri Apr 14 22:37:34 2017 +0000

----------------------------------------------------------------------
 docs/topics/impala_hints.xml | 42 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/8bdfe032/docs/topics/impala_hints.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_hints.xml b/docs/topics/impala_hints.xml
index 7d833f6..4524c14 100644
--- a/docs/topics/impala_hints.xml
+++ b/docs/topics/impala_hints.xml
@@ -80,7 +80,8 @@ INSERT <varname>insert_clauses</varname>
     <p rev="2.0.0">
       In <keyword keyref="impala20_full"/> and higher, you can also specify the hints inside comments that use
       either the <codeph>/* */</codeph> or <codeph>--</codeph> notation. Specify a <codeph>+</codeph> symbol
-      immediately before the hint name.
+      immediately before the hint name. Recently added hints are only available using the <codeph>/* */</codeph>
+      and <codeph>--</codeph> notation.
     </p>
 
 <codeblock rev="2.0.0">SELECT STRAIGHT_JOIN <varname>select_list</varname> FROM
@@ -102,6 +103,12 @@ INSERT <varname>insert_clauses</varname>
 INSERT <varname>insert_clauses</varname>
   -- +SHUFFLE|NOSHUFFLE
   SELECT <varname>remainder_of_query</varname>;
+
+<ph rev="IMPALA-2924">SELECT <varname>select_list</varname> FROM
+<varname>table_ref</varname>
+  /* +{SCHEDULE_CACHE_LOCAL | SCHEDULE_DISK_LOCAL | SCHEDULE_REMOTE}
+    [,RANDOM_REPLICA] */
+<varname>remainder_of_query</varname>;</ph>
 </codeblock>
 
     <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
@@ -109,7 +116,7 @@ INSERT <varname>insert_clauses</varname>
     <p>
       With both forms of hint syntax, include the <codeph>STRAIGHT_JOIN</codeph>
       keyword immediately after the <codeph>SELECT</codeph> keyword to prevent Impala from
-      reordering the tables in a way that makes the hint ineffective.
+      reordering the tables in a way that makes the join-related hints ineffective.
     </p>
 
     <p>
@@ -163,6 +170,37 @@ INSERT <varname>insert_clauses</varname>
 
     <p conref="../shared/impala_common.xml#common/insert_hints"/>
 
+    <p rev="IMPALA-2924">
+      <b>Hints for scheduling of HDFS blocks:</b>
+    </p>
+
+    <p rev="IMPALA-2924">
+      The hints <codeph>/* +SCHEDULE_CACHE_LOCAL */</codeph>,
+      <codeph>/* +SCHEDULE_DISK_LOCAL */</codeph>, and
+      <codeph>/* +SCHEDULE_REMOTE */</codeph> have the same effect
+      as specifying the <codeph>REPLICA_PREFERENCE</codeph> query
+      option with the respective option settings of <codeph>CACHE_LOCAL</codeph>,
+      <codeph>DISK_LOCAL</codeph>, or <codeph>REMOTE</codeph>.
+      The hint <codeph>/* +RANDOM_REPLICA */</codeph> is the same as
+      enabling the <codeph>SCHEDULE_RANDOM_REPLICA</codeph> query option.
+    </p>
+
+    <p rev="IMPALA-2924">
+      You can use these hints in combination by separating them with commas,
+      for example, <codeph>/* +SCHEDULE_CACHE_LOCAL,RANDOM_REPLICA */</codeph>.
+      See <xref keyref="replica_preference"/> and
+      <xref keyref="schedule_random_replica"/> for information about how
+      these settings influence the way Impala processes HDFS data blocks.
+    </p>
+
+    <p rev="IMPALA-2924">
+      Specifying the replica preference as a query hint always overrides the
+      query option setting. Specifying either the <codeph>SCHEDULE_RANDOM_REPLICA</codeph>
+      query option or the corresponding <codeph>RANDOM_REPLICA</codeph> query hint
+      enables the random tie-breaking behavior when processing data blocks
+      during the query.
+    </p>
+
     <p>
       <b>Suggestions versus directives:</b>
     </p>