Posted to commits@impala.apache.org by ta...@apache.org on 2018/07/04 01:07:43 UTC
[1/6] impala git commit: IMPALA-5981: [DOCS] Documented SET=""
Repository: impala
Updated Branches:
refs/heads/master 2b6d71fee -> 61e6a4777
IMPALA-5981: [DOCS] Documented SET=""
Also, refactored the Impala SET doc and moved the command SET to
the Impala Shell Commands doc.
Change-Id: I7211405d5cc0a548c05ea5218798591873c14417
Reviewed-on: http://gerrit.cloudera.org:8080/10816
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/d03a2d63
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/d03a2d63
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/d03a2d63
Branch: refs/heads/master
Commit: d03a2d63fef2d29083fa5ee85b89b85891e923fc
Parents: 2b6d71f
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Mon Jun 25 16:30:14 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 16:52:42 2018 +0000
----------------------------------------------------------------------
docs/topics/impala_set.xml | 305 +++++++++--------------------
docs/topics/impala_shell_commands.xml | 35 ++--
2 files changed, 110 insertions(+), 230 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/d03a2d63/docs/topics/impala_set.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_set.xml b/docs/topics/impala_set.xml
index 0020188..ddf89da 100644
--- a/docs/topics/impala_set.xml
+++ b/docs/topics/impala_set.xml
@@ -21,7 +21,13 @@ under the License.
<concept rev="2.0.0" id="set">
<title>SET Statement</title>
- <titlealts audience="PDF"><navtitle>SET</navtitle></titlealts>
+
+ <titlealts audience="PDF">
+
+ <navtitle>SET</navtitle>
+
+ </titlealts>
+
<prolog>
<metadata>
<data name="Category" value="Impala"/>
@@ -36,255 +42,130 @@ under the License.
<conbody>
<p rev="2.0.0">
- <indexterm audience="hidden">SET statement</indexterm>
- Specifies values for query options that control the runtime behavior of other statements within the same
- session.
+ The <codeph>SET</codeph> statement specifies values for query options that control the
+ runtime behavior of other statements within the same session.
</p>
- <p rev="2.5.0 IMPALA-2180">
- In <keyword keyref="impala25_full"/> and higher, <codeph>SET</codeph> also defines user-specified substitution variables for
- the <cmdname>impala-shell</cmdname> interpreter. This feature uses the <codeph>SET</codeph> command
- built into <cmdname>impala-shell</cmdname> instead of the SQL <codeph>SET</codeph> statement.
- Therefore the substitution mechanism only works with queries processed by <cmdname>impala-shell</cmdname>,
- not with queries submitted through JDBC or ODBC.
+ <p>
+ When issued in <codeph>impala-shell</codeph>, the <codeph>SET</codeph> command is
+ interpreted as an <codeph>impala-shell</codeph> command that has differences from the SQL
+ <codeph>SET</codeph> statement. See
+ <xref
+ href="impala_shell_commands.xml#shell_commands/set_cmd"/> for
+ information about the <codeph>SET</codeph> command in <codeph>impala-shell</codeph>.
</p>
- <note type="important" rev="2.11.0 IMPALA-2181">
- <p>
- In <keyword keyref="impala211_full"/> and higher, the output of the <codeph>SET</codeph>
- statement changes in some important ways:
- </p>
- <ul>
- <li>
- <p>
- The options are divided into groups: <codeph>Regular Query Options</codeph>,
- <codeph>Advanced Query Options</codeph>, <codeph>Development Query Options</codeph>, and
- <codeph>Deprecated Query Options</codeph>.
- </p>
- </li>
- <li>
- <p>
- The advanced options are intended for use in specific
- kinds of performance tuning and debugging scenarios. The development options are
- related to internal development of Impala or features that are not yet finalized;
- these options might be changed or removed without notice.
- The deprecated options are related to features that are removed or changed so that
- the options no longer have any purpose; these options might be removed in future
- versions.
- </p>
- </li>
- <li>
- <p>
- By default, only the first two groups (regular and advanced) are
- displayed by the <codeph>SET</codeph> command. Use the syntax <codeph>SET ALL</codeph>
- to see all groups of options.
- </p>
- </li>
- <li>
- <p>
- <cmdname>impala-shell</cmdname> options and user-specified variables are always displayed
- at the end of the list of query options, after all appropriate option groups.
- </p>
- </li>
- <li>
- <p>
- When the <codeph>SET</codeph> command is run through the JDBC or ODBC interfaces,
- the result set has a new third column, <codeph>level</codeph>, indicating which
- group each option belongs to. The same distinction of <codeph>SET</codeph>
- returning the regular and advanced options, and <codeph>SET ALL</codeph>
- returning all option groups, applies to JDBC and ODBC also.
- </p>
- </li>
- </ul>
- </note>
-
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
-<codeblock>SET [<varname>query_option</varname>=<varname>option_value</varname>]
+<codeblock>SET
<ph rev="2.11.0 IMPALA-2181">SET ALL</ph>
+SET <varname>query_option</varname>=<varname>option_value</varname>
+SET <varname>query_option</varname>=""
</codeblock>
<p rev="2.11.0 IMPALA-2181">
- <codeph>SET</codeph> and <codeph>SET ALL</codeph> with no arguments return a
- result set consisting of all the applicable query options and their current values.
+ <codeph>SET</codeph> and <codeph>SET ALL</codeph> with no arguments return a result set
+ consisting of all the applicable query options and their current values.
</p>
<p>
- The query option name and any string argument values are case-insensitive.
+ The <varname>query_option</varname> and <varname>option_value</varname> are
+ case-insensitive.
</p>
<p>
- Each query option has a specific allowed notation for its arguments. Boolean options can be enabled and
- disabled by assigning values of either <codeph>true</codeph> and <codeph>false</codeph>, or
- <codeph>1</codeph> and <codeph>0</codeph>. Some numeric options accept a final character signifying the unit,
- such as <codeph>2g</codeph> for 2 gigabytes or <codeph>100m</codeph> for 100 megabytes. See
- <xref href="impala_query_options.xml#query_options"/> for the details of each query option.
+ Unlike the <codeph>impala-shell</codeph> command version of <codeph>SET</codeph>, when
+ used as a SQL statement, the string values for <varname>option_value</varname> must be
+ quoted, for example, <codeph>SET option="new_value"</codeph>.
</p>
<p>
- <b>Setting query options during impala-shell invocation:</b>
- </p>
-
- <p rev="2.11.0 IMPALA-5736">
- In <keyword keyref="impala211_full"/> and higher, you can use one or more command-line options
- of the form <codeph>--query_option=<varname>option</varname>=<varname>value</varname></codeph>
- when running the <cmdname>impala-shell</cmdname> command. The corresponding query option settings
- take effect for that <cmdname>impala-shell</cmdname> session.
+ The <codeph>SET <varname>query_option</varname> = ""</codeph> statement unsets the value
+ of the <varname>query_option</varname> in the current session, reverting it to the default
+ state. In <codeph>impala-shell</codeph>, use the <codeph>UNSET</codeph> command to reset a
+ query option to its default.
</p>
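A minimal sketch of the behavior described above (the option name and value are illustrative):

```sql
-- Override a query option for the current session.
SET MEM_LIMIT=2g;

-- Revert it to the default by assigning the empty string
-- (SQL statement form; in impala-shell you would instead run: UNSET MEM_LIMIT).
SET MEM_LIMIT="";
```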
<p>
- <b>User-specified substitution variables:</b>
- </p>
-
- <p rev="2.5.0 IMPALA-2180">
- In <keyword keyref="impala25_full"/> and higher, you can specify your own names and string substitution values
- within the <cmdname>impala-shell</cmdname> interpreter. Once a substitution variable is set up,
- its value is inserted into any SQL statement in that same <cmdname>impala-shell</cmdname> session
- that contains the notation <codeph>${var:<varname>varname</varname>}</codeph>.
- Using <codeph>SET</codeph> in an interactive <cmdname>impala-shell</cmdname> session overrides
- any value for that same variable passed in through the <codeph>--var=<varname>varname</varname>=<varname>value</varname></codeph>
- command-line option.
- </p>
-
- <p rev="2.5.0 IMPALA-2180">
- For example, to set up some default parameters for report queries, but then override those default
- within an <cmdname>impala-shell</cmdname> session, you might issue commands and statements such as
- the following:
- </p>
-
-<codeblock rev="2.5.0 IMPALA-2180">
--- Initial setup for this example.
-create table staging_table (s string);
-insert into staging_table values ('foo'), ('bar'), ('bletch');
-
-create table production_table (s string);
-insert into production_table values ('North America'), ('EMEA'), ('Asia');
-quit;
-
--- Start impala-shell with user-specified substitution variables,
--- run a query, then override the variables with SET and run the query again.
-$ impala-shell --var=table_name=staging_table --var=cutoff=2
-... <varname>banner message</varname> ...
-[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
-Query: select s from staging_table order by s limit 2
-+--------+
-| s |
-+--------+
-| bar |
-| bletch |
-+--------+
-Fetched 2 row(s) in 1.06s
-
-[localhost:21000] > set var:table_name=production_table;
-Variable TABLE_NAME set to production_table
-[localhost:21000] > set var:cutoff=3;
-Variable CUTOFF set to 3
-
-[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
-Query: select s from production_table order by s limit 3
-+---------------+
-| s |
-+---------------+
-| Asia |
-| EMEA |
-| North America |
-+---------------+
-</codeblock>
-
- <p rev="2.5.0 IMPALA-2180">
- The following example shows how <codeph>SET ALL</codeph> with no parameters displays
- all user-specified substitution variables, and how <codeph>UNSET</codeph> removes
- the substitution variable entirely:
- </p>
-
-<codeblock rev="2.11.0 IMPALA-2181">
-[localhost:21000] > set all;
-Query options (defaults shown in []):
-ABORT_ON_ERROR: [0]
-COMPRESSION_CODEC: []
-DISABLE_CODEGEN: [0]
-...
-
-Advanced Query Options:
-APPX_COUNT_DISTINCT: [0]
-BUFFER_POOL_LIMIT: []
-DEFAULT_JOIN_DISTRIBUTION_MODE: [0]
-...
-
-Development Query Options:
-BATCH_SIZE: [0]
-DEBUG_ACTION: []
-DECIMAL_V2: [0]
-...
-
-Deprecated Query Options:
-ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
-ALLOW_UNSUPPORTED_FORMATS: [0]
-DEFAULT_ORDER_BY_LIMIT: [-1]
-...
-
-Shell Options
- LIVE_PROGRESS: False
- LIVE_SUMMARY: False
-
-Variables:
- CUTOFF: 3
- TABLE_NAME: staging_table
-
-[localhost:21000] > unset var:cutoff;
-Unsetting variable CUTOFF
-[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
-Error: Unknown variable CUTOFF
-</codeblock>
-
- <p rev="2.5.0 IMPALA-2180">
- See <xref href="impala_shell_running_commands.xml"/> for more examples of using the
- <codeph>--var</codeph>, <codeph>SET</codeph>, and <codeph>${var:<varname>varname</varname>}</codeph>
- substitution technique in <cmdname>impala-shell</cmdname>.
+ Each query option has a specific allowed notation for its arguments. See
+ <xref href="impala_query_options.xml#query_options"/> for the details of each query
+ option.
</p>
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
<p>
- <codeph>MEM_LIMIT</codeph> is probably the most commonly used query option. You can specify a high value to
- allow a resource-intensive query to complete. For testing how queries would work on memory-constrained
- systems, you might specify an artificially low value.
- </p>
-
- <p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
-
- <p conref="../shared/impala_common.xml#common/example_blurb"/>
-
- <p>
- The following example sets some numeric and some Boolean query options to control usage of memory, disk
- space, and timeout periods, then runs a query whose success could depend on the options in effect:
- </p>
-
-<codeblock>set mem_limit=64g;
-set DISABLE_UNSAFE_SPILLS=true;
-set parquet_file_size=400m;
-set RESERVATION_REQUEST_TIMEOUT=900000;
-insert overwrite parquet_table select c1, c2, count(c3) from text_table group by c1, c2, c3;
-</codeblock>
+ In <keyword keyref="impala211_full"/> and higher, the output of the <codeph>SET</codeph>
+ and <codeph>SET ALL</codeph> statements is organized as follows:
+ </p>
+
+ <ul>
+ <li>
+ <p>
+ The options are divided into groups: <codeph>Regular Query Options</codeph>,
+ <codeph>Advanced Query Options</codeph>, <codeph>Development Query Options</codeph>,
+ and <codeph>Deprecated Query Options</codeph>.
+ </p>
+ <ul>
+ <li>
+ <p>
+ The advanced options are intended for use in specific kinds of performance tuning
+ and debugging scenarios.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ The development options are related to internal development of Impala or features
+ that are not yet finalized. These options might be changed or removed without
+ notice.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ The deprecated options are related to features that are removed or changed so that
+ the options no longer have any purpose. These options might be removed in future
+ versions.
+ </p>
+ </li>
+ </ul>
+ </li>
+
+ <li>
+ <p>
+ By default, only the first two groups, regular and advanced, are displayed by the
+ <codeph>SET</codeph> command. Use <codeph>SET ALL</codeph> to see all groups of
+ options.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ <cmdname>impala-shell</cmdname> options and user-specified variables are always
+ displayed at the end of the list of query options, after all appropriate option
+ groups.
+ </p>
+ </li>
+ </ul>
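The grouping above can be observed by comparing the two forms of the statement (a sketch; the actual option lists vary by release):

```sql
-- Shows only the Regular and Advanced query option groups.
SET;

-- Additionally shows the Development and Deprecated groups, followed by
-- impala-shell options and user-specified variables when run in impala-shell.
SET ALL;
```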
<p conref="../shared/impala_common.xml#common/added_in_20"/>
<p>
- <codeph>SET</codeph> has always been available as an <cmdname>impala-shell</cmdname> command. Promoting it to
- a SQL statement lets you use this feature in client applications through the JDBC and ODBC APIs.
+ <codeph>SET</codeph> has always been available as an <cmdname>impala-shell</cmdname>
+ command. Promoting it to a SQL statement lets you use this feature in client applications
+ through the JDBC and ODBC APIs.
</p>
-<!-- <p conref="../shared/impala_common.xml#common/jdbc_blurb"/> -->
-
- <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
-
<p conref="../shared/impala_common.xml#common/permissions_blurb_no"/>
<p conref="../shared/impala_common.xml#common/related_info"/>
<p>
- See <xref href="impala_query_options.xml#query_options"/> for the query options you can adjust using this
- statement.
+ See <xref href="impala_query_options.xml#query_options"/> for the query options you can
+ adjust using this statement.
</p>
+
</conbody>
+
</concept>
http://git-wip-us.apache.org/repos/asf/impala/blob/d03a2d63/docs/topics/impala_shell_commands.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_shell_commands.xml b/docs/topics/impala_shell_commands.xml
index f9a48d9..91c6d13 100644
--- a/docs/topics/impala_shell_commands.xml
+++ b/docs/topics/impala_shell_commands.xml
@@ -291,29 +291,28 @@ under the License.
</entry>
<entry>
<p>
- Manages query options for an <cmdname>impala-shell</cmdname> session. The available options are the
- ones listed in <xref href="impala_query_options.xml#query_options"/>. These options are used for
- query tuning and troubleshooting. Issue <codeph>SET</codeph> with no arguments to see the current
- query options, either based on the <cmdname>impalad</cmdname> defaults, as specified by you at
- <cmdname>impalad</cmdname> startup, or based on earlier <codeph>SET</codeph> statements in the same
- session. To modify option values, issue commands with the syntax <codeph>set
- <varname>option</varname>=<varname>value</varname></codeph>. To restore an option to its default,
- use the <codeph>unset</codeph> command. Some options take Boolean values of <codeph>true</codeph>
- and <codeph>false</codeph>. Others take numeric arguments, or quoted string values.
+ Manages query options for an <cmdname>impala-shell</cmdname>
+ session. The available options are the ones listed in <xref
+ href="impala_query_options.xml#query_options"/>. These options
+ are used for query tuning and troubleshooting. Issue
+ <codeph>SET</codeph> with no arguments to see the current
+ query options, either based on the <cmdname>impalad</cmdname>
+ defaults, as specified by you at <cmdname>impalad</cmdname>
+ startup, or based on earlier <codeph>SET</codeph> statements in
+ the same session. To modify option values, issue commands with
+ the syntax <codeph>set
+ <varname>option</varname>=<varname>value</varname></codeph>.
+ To restore an option to its default, use the
+ <codeph>unset</codeph> command.
</p>
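A short illustrative session for the commands described above (the option name is an arbitrary example):

```sql
-- Inside an impala-shell session:
set explain_level=2;   -- modify an option for this session
set;                   -- display current query options and their values
unset explain_level;   -- restore the option to its default
```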
<p conref="../shared/impala_common.xml#common/set_vs_connect"/>
<p rev="2.0.0">
- In Impala 2.0 and later, <codeph>SET</codeph> is available as a SQL statement for any kind of
- application, not only through <cmdname>impala-shell</cmdname>. See
- <xref href="impala_set.xml#set"/> for details.
- </p>
-
- <p rev="2.5.0 IMPALA-2180">
- In Impala 2.5 and later, you can use <codeph>SET</codeph> to define your own substitution variables
- within an <cmdname>impala-shell</cmdname> session.
- Within a SQL statement, you substitute the value by using the notation <codeph>${var:<varname>variable_name</varname>}</codeph>.
+ In Impala 2.0 and later, <codeph>SET</codeph> is
+ available as a SQL statement for any kind of application as well
+ as in <cmdname>impala-shell</cmdname>. See <xref
+ href="impala_set.xml#set"/> for details.
</p>
</entry>
</row>
[5/6] impala git commit: IMPALA-6883: [DOCS] Refactor impala_authorization doc
Posted by ta...@apache.org.
IMPALA-6883: [DOCS] Refactor impala_authorization doc
Change-Id: I3df72adb25dcdcbc286934b048645f47d876b33d
Reviewed-on: http://gerrit.cloudera.org:8080/10786
Reviewed-by: Alex Rodoni <ar...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/4ff9f5f3
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/4ff9f5f3
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/4ff9f5f3
Branch: refs/heads/master
Commit: 4ff9f5f3d280607ca523652319c8691803c5db57
Parents: 30e82c6
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Thu Jun 21 13:44:38 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 23:21:07 2018 +0000
----------------------------------------------------------------------
docs/shared/impala_common.xml | 867 ++++++++++++++----------------
docs/topics/impala_authorization.xml | 266 ++++-----
docs/topics/impala_grant.xml | 111 +---
3 files changed, 543 insertions(+), 701 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/4ff9f5f3/docs/shared/impala_common.xml
----------------------------------------------------------------------
diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index 6faa9c1..4dcfffb 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -115,451 +115,388 @@ under the License.
nested topics at the end of this file.
</p>
- <table id="sentry_privileges_objects">
- <title>Valid privilege types and objects they apply to</title>
- <tgroup cols="2">
- <colspec colnum="1" colname="col1" colwidth="1*"/>
- <colspec colnum="2" colname="col2" colwidth="2*"/>
- <thead>
- <row>
- <entry><b>Privilege</b></entry>
- <entry><b>Object</b></entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>INSERT</entry>
- <entry>DB, TABLE</entry>
- </row>
- <row>
- <entry>SELECT</entry>
- <entry>DB, TABLE, COLUMN</entry>
- </row>
- <row>
- <entry>ALL</entry>
- <entry>SERVER, TABLE, DB, URI</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <table id="privileges_sql">
- <title>Privilege table for Hive & Impala operations</title>
- <tgroup cols="4">
- <colspec colnum="1" colname="col1" colwidth="1.31*"/>
- <colspec colnum="2" colname="col2" colwidth="1.17*"/>
- <colspec colnum="3" colname="col3" colwidth="1*"/>
- <colspec colname="newCol4" colnum="4" colwidth="1*"/>
- <thead>
- <row>
- <entry>Operation</entry>
- <entry>Scope</entry>
- <entry>Privileges Required</entry>
- <entry>URI</entry>
- </row>
- </thead>
- <tbody>
- <row id="create_database_privs">
- <entry>CREATE DATABASE</entry>
- <entry>SERVER</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="drop_database_privs">
- <entry>DROP DATABASE</entry>
- <entry>DATABASE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="create_table_privs">
- <entry>CREATE TABLE</entry>
- <entry>DATABASE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="drop_table_privs">
- <entry>DROP TABLE</entry>
- <entry>TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="create_view_privs">
- <entry>CREATE VIEW<p>-This operation is allowed if you have
- column-level <codeph>SELECT</codeph> access to the columns
- being used.</p></entry>
- <entry>DATABASE; SELECT on TABLE; </entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row>
- <entry>ALTER VIEW<p>-This operation is allowed if you have
- column-level <codeph>SELECT</codeph> access to the columns
- being used.</p></entry>
- <entry>VIEW/TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="drop_view_privs">
- <entry>DROP VIEW</entry>
- <entry>VIEW/TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="alter_table_add_columns_privs">
- <entry>ALTER TABLE .. ADD COLUMNS</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_replace_columns_privs">
- <entry>ALTER TABLE .. REPLACE COLUMNS</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_change_column_privs">
- <entry>ALTER TABLE .. CHANGE column</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_rename_privs">
- <entry>ALTER TABLE .. RENAME</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_set_tblproperties_privs">
- <entry>ALTER TABLE .. SET TBLPROPERTIES</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_set_fileformat_privs">
- <entry>ALTER TABLE .. SET FILEFORMAT</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_set_location_privs">
- <entry>ALTER TABLE .. SET LOCATION</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry>URI</entry>
- </row>
- <row id="alter_table_add_partition_privs">
- <entry>ALTER TABLE .. ADD PARTITION</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_add_partition_location_privs">
- <entry>ALTER TABLE .. ADD PARTITION location</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry>URI</entry>
- </row>
- <row id="alter_table_drop_partition_privs">
- <entry>ALTER TABLE .. DROP PARTITION</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_partition_set_fileformat_privs">
- <entry>ALTER TABLE .. PARTITION SET FILEFORMAT</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="show_create_table_privs">
- <entry>SHOW CREATE TABLE</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="show_partitions_privs">
- <entry>SHOW PARTITIONS</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row>
- <entry>SHOW TABLES<p>-Output includes all the tables for which
- the user has table-level privileges and all the tables for
- which the user has some column-level privileges.</p></entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row>
- <entry>SHOW GRANT ROLE<p>-Output includes an additional field
- for any column-level privileges.</p></entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="describe_table_privs">
- <entry>DESCRIBE TABLE<p>-Output shows <i>all</i> columns if the
- user has table level-privileges or <codeph>SELECT</codeph>
- privilege on at least one table column</p></entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="load_data_privs">
- <entry>LOAD DATA</entry>
- <entry>TABLE</entry>
- <entry>INSERT</entry>
- <entry>URI</entry>
- </row>
- <row id="select_privs">
- <entry>SELECT<p>-You can grant the SELECT privilege on a view to
- give users access to specific columns of a table they do not
- otherwise have access to.</p><p>-See
- <xref keyref="sg_hive_sql"/>
- for details on allowed column-level
- operations.</p></entry>
- <entry>VIEW/TABLE; COLUMN</entry>
- <entry>SELECT</entry>
- <entry/>
- </row>
- <row id="insert_overwrite_table_privs">
- <entry>INSERT OVERWRITE TABLE</entry>
- <entry>TABLE</entry>
- <entry>INSERT</entry>
- <entry/>
- </row>
- <row id="create_table_as_select_privs">
- <entry>CREATE TABLE .. AS SELECT<p>-This operation is allowed if
- you have column-level <codeph>SELECT</codeph> access to the
- columns being used.</p></entry>
- <entry>DATABASE; SELECT on TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="use_privs">
- <entry>USE <dbName></entry>
- <entry>Any</entry>
- <entry/>
- <entry/>
- </row>
- <row id="create_function_privs">
- <entry>CREATE FUNCTION</entry>
- <entry>SERVER</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="alter_table_set_serdeproperties_privs">
- <entry>ALTER TABLE .. SET SERDEPROPERTIES</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row>
- <entry>ALTER TABLE .. PARTITION SET SERDEPROPERTIES</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="_privs">
- <entry namest="col1" nameend="newCol4"><b>Hive-Only
- Operations</b></entry>
- </row>
- <row id="insert_overwrite_directory_privs">
- <entry>INSERT OVERWRITE DIRECTORY</entry>
- <entry>TABLE</entry>
- <entry>INSERT</entry>
- <entry>URI</entry>
- </row>
- <row id="analyze_table_privs">
- <entry>Analyze TABLE</entry>
- <entry>TABLE</entry>
- <entry>SELECT + INSERT</entry>
- <entry/>
- </row>
- <row id="import_table_privs">
- <entry>IMPORT TABLE</entry>
- <entry>DATABASE</entry>
- <entry>ALL</entry>
- <entry>URI</entry>
- </row>
- <row id="export_table_privs">
- <entry>EXPORT TABLE</entry>
- <entry>TABLE</entry>
- <entry>SELECT</entry>
- <entry>URI</entry>
- </row>
- <row id="alter_table_touch_privs">
- <entry>ALTER TABLE TOUCH</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_touch_partition_privs">
- <entry>ALTER TABLE TOUCH PARTITION</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_clustered_by_privs">
- <entry>ALTER TABLE .. CLUSTERED BY SORTED BY</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_enable_privs">
- <entry>ALTER TABLE .. ENABLE/DISABLE</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_partition_enable_privs">
- <entry>ALTER TABLE .. PARTITION ENABLE/DISABLE</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row id="alter_table_partition_rename_privs">
- <entry>ALTER TABLE .. PARTITION.. RENAME TO PARTITION</entry>
- <entry>TABLE</entry>
- <entry>ALL on DATABASE</entry>
- <entry/>
- </row>
- <row>
- <entry>MSCK REPAIR TABLE</entry>
- <entry>TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="alter_database_privs">
- <entry>ALTER DATABASE</entry>
- <entry>DATABASE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="describe_database_privs">
- <entry>DESCRIBE DATABASE</entry>
- <entry>DATABASE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="show_columns_privs">
- <entry>SHOW COLUMNS<p>-Output for this operation filters columns
- to which the user does not have explicit
- <codeph>SELECT</codeph> access </p></entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="create_index_privs">
- <entry>CREATE INDEX</entry>
- <entry>TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="drop_index_privs">
- <entry>DROP INDEX</entry>
- <entry>TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="show_indexes_privs">
- <entry>SHOW INDEXES</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="grant_privs">
- <entry>GRANT PRIVILEGE</entry>
- <entry>Allowed only for Sentry admin users</entry>
- <entry/>
- <entry/>
- </row>
- <row id="revoke_privs">
- <entry>REVOKE PRIVILEGE</entry>
- <entry>Allowed only for Sentry admin users</entry>
- <entry/>
- <entry/>
- </row>
- <row id="show_grants_privs">
- <entry>SHOW GRANTS</entry>
- <entry>Allowed only for Sentry admin users</entry>
- <entry/>
- <entry/>
- </row>
- <row id="show_tblproperties_privs">
- <entry>SHOW TBLPROPERTIES</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="describe_table_partition_privs">
- <entry>DESCRIBE TABLE .. PARTITION</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="add_jar_privs">
- <entry>ADD JAR</entry>
- <entry>Not Allowed</entry>
- <entry/>
- <entry/>
- </row>
- <row id="add_file_privs">
- <entry>ADD FILE</entry>
- <entry>Not Allowed</entry>
- <entry/>
- <entry/>
- </row>
- <row id="dfs_privs">
- <entry>DFS</entry>
- <entry>Not Allowed</entry>
- <entry/>
- <entry/>
- </row>
- <row>
- <entry namest="col1" nameend="newCol4"><b>Impala-Only
- Operations</b></entry>
- </row>
- <row id="explain_privs">
- <entry>EXPLAIN</entry>
- <entry>TABLE; COLUMN</entry>
- <entry>SELECT</entry>
- <entry/>
- </row>
- <row id="invalidate_metadata_privs">
- <entry>INVALIDATE METADATA</entry>
- <entry>SERVER</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="invalidate_metadata_table_privs">
- <entry>INVALIDATE METADATA <table name></entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="refresh_privs">
- <entry>REFRESH <table name> or REFRESH <table name> PARTITION (<partition_spec>)</entry>
- <entry>TABLE</entry>
- <entry>SELECT/INSERT</entry>
- <entry/>
- </row>
- <row id="drop_function_privs">
- <entry>DROP FUNCTION</entry>
- <entry>SERVER</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- <row id="compute_stats_privs">
- <entry>COMPUTE STATS</entry>
- <entry>TABLE</entry>
- <entry>ALL</entry>
- <entry/>
- </row>
- </tbody>
- </tgroup>
- </table>
+ <p id="sentry_privileges_objects">The table below lists the minimum level
+ of privileges and the scope required to execute SQL statements in
+ <keyword keyref="impala30_full"/> and higher. The following notations
+ are used:<ul>
+ <li><b>ANY</b> denotes the <codeph>SELECT</codeph>,
+ <codeph>INSERT</codeph>, <codeph>CREATE</codeph>,
+ <codeph>ALTER</codeph>, <codeph>DROP</codeph>, <b><i>or</i></b>
+ <codeph>REFRESH</codeph> privilege.</li>
+ <li><b>ALL</b> privilege denotes the <codeph>SELECT</codeph>,
+ <codeph>INSERT</codeph>, <codeph>CREATE</codeph>,
+ <codeph>ALTER</codeph>, <codeph>DROP</codeph>, <b><i>and</i></b>
+ <codeph>REFRESH</codeph> privileges.</li>
+ <li>The parent levels of the specified scope are implicitly supported.
+ For example, if a privilege is listed with the
+ <codeph>TABLE</codeph> scope, the same privilege granted on
+ <codeph>DATABASE</codeph> and <codeph>SERVER</codeph> will allow
+ the user to execute the specified SQL statement.</li>
+ </ul><table id="sentry_privileges_objects_tab" frame="all" colsep="1"
+ rowsep="1">
+ <tgroup cols="3">
+ <colspec colnum="1" colname="col1"/>
+ <colspec colnum="2" colname="col2"/>
+ <colspec colnum="3" colname="col3"/>
+ <tbody>
+ <row>
+ <entry><b>SQL Statement</b></entry>
+ <entry><b>Privileges</b></entry>
+ <entry><b>Scope</b></entry>
+ </row>
+ <row>
+ <entry>SELECT</entry>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>WITH SELECT</entry>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN SELECT</entry>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>INSERT</entry>
+ <entry>INSERT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN INSERT</entry>
+ <entry>INSERT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>TRUNCATE</entry>
+ <entry>INSERT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>LOAD</entry>
+ <entry>INSERT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALL</entry>
+ <entry>URI</entry>
+ </row>
+ <row>
+ <entry>CREATE DATABASE</entry>
+ <entry>CREATE</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>CREATE DATABASE LOCATION</entry>
+ <entry>CREATE</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALL</entry>
+ <entry>URI</entry>
+ </row>
+ <row>
+ <entry>CREATE TABLE</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>CREATE TABLE LIKE</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>CREATE TABLE AS SELECT</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>INSERT</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN CREATE TABLE AS SELECT</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>INSERT</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>CREATE TABLE LOCATION</entry>
+ <entry>CREATE</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALL</entry>
+ <entry>URI</entry>
+ </row>
+ <row>
+ <entry>CREATE VIEW</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>ALTER DATABASE</entry>
+ <entry>ALTER</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>ALTER TABLE</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>ALTER TABLE SET LOCATION</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALL</entry>
+ <entry>URI</entry>
+ </row>
+ <row>
+ <entry>ALTER TABLE RENAME</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>ALTER VIEW</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>ALTER VIEW RENAME</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>DROP DATABASE</entry>
+ <entry>DROP</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>DROP TABLE</entry>
+ <entry>DROP</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>DROP VIEW</entry>
+ <entry>DROP</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>CREATE FUNCTION</entry>
+ <entry>CREATE</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry/>
+ <entry>ALL</entry>
+ <entry>URI</entry>
+ </row>
+ <row>
+ <entry>DROP FUNCTION</entry>
+ <entry>DROP</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>COMPUTE STATS</entry>
+ <entry>ALTER and SELECT</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>DROP STATS</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>INVALIDATE METADATA</entry>
+ <entry>REFRESH</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>INVALIDATE METADATA <table></entry>
+ <entry>REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>REFRESH <table></entry>
+ <entry>REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>REFRESH FUNCTIONS</entry>
+ <entry>REFRESH</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>COMMENT ON DATABASE</entry>
+ <entry>ALTER</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>COMMENT ON TABLE</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>COMMENT ON VIEW</entry>
+ <entry>ALTER</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>DESCRIBE DATABASE</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>DESCRIBE <table/view></entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>USE</entry>
+ <entry>ANY</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW DATABASES</entry>
+ <entry>ANY</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW TABLES</entry>
+ <entry>ANY</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW FUNCTIONS</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>SHOW PARTITIONS</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW TABLE STATS</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW COLUMN STATS</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW FILES</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW CREATE TABLE</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW CREATE VIEW</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>SHOW CREATE FUNCTION</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>DATABASE</entry>
+ </row>
+ <row>
+ <entry>SHOW RANGE PARTITIONS (Kudu only)</entry>
+ <entry>SELECT, INSERT, <b><i>or</i></b> REFRESH</entry>
+ <entry>TABLE</entry>
+ </row>
+ <row>
+ <entry>UPDATE (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN UPDATE (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>UPSERT (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>WITH UPSERT (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN UPSERT (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>DELETE (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ <row>
+ <entry>EXPLAIN DELETE (Kudu only)</entry>
+ <entry>ALL</entry>
+ <entry>SERVER</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table></p>
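For illustration, the minimum grants from the table above can be expressed as Impala SQL statements such as the following. The role and object names here are hypothetical, chosen only to match rows of the table:

```sql
-- Hypothetical role and object names; privilege levels follow the table above.
GRANT SELECT ON TABLE analytics_db.events TO ROLE analyst_role;   -- SELECT, EXPLAIN SELECT
GRANT INSERT ON TABLE analytics_db.events TO ROLE loader_role;    -- INSERT, TRUNCATE
GRANT REFRESH ON DATABASE analytics_db TO ROLE metadata_role;     -- REFRESH FUNCTIONS
```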
<p rev="IMPALA-2660" id="auth_to_local_instructions">
In <keyword keyref="impala26_full"/> and higher, Impala recognizes the <codeph>auth_to_local</codeph> setting,
@@ -590,29 +527,23 @@ under the License.
<b><ph id="title_sentry_debug">Debugging Failed Sentry Authorization Requests</ph></b>
</p>
- <p id="sentry_debug">
- Sentry logs all facts that lead up to authorization decisions at the debug level. If you do not understand
- why Sentry is denying access, the best way to debug is to temporarily turn on debug logging:
- <ul>
- <li>
- Add <codeph>log4j.logger.org.apache.sentry=DEBUG</codeph> to the <filepath>log4j.properties</filepath>
- file on each host in the cluster, in the appropriate configuration directory for each service.
- </li>
- </ul>
- Specifically, look for exceptions and messages such as:
-<codeblock xml:space="preserve">FilePermission server..., RequestPermission server...., result [true|false]</codeblock>
- which indicate each evaluation Sentry makes. The <codeph>FilePermission</codeph> is from the policy file,
- while <codeph>RequestPermission</codeph> is the privilege required for the query. A
- <codeph>RequestPermission</codeph> will iterate over all appropriate <codeph>FilePermission</codeph>
- settings until a match is found. If no matching privilege is found, Sentry returns <codeph>false</codeph>
- indicating <q>Access Denied</q> .
-<!--
-[1]
-Impala: Impala Daemon -> Advanced -> Impala Daemon Logging Safety Valve
-Hive: Hive Server 2 -> Advanced -> HiveServer2 Logging Safety Valve
-Search: Solr Server -> Advanced -> HiveServer2 Logging Safety Valve
--->
- </p>
+ <p id="sentry_debug"> Sentry logs all facts that lead up to authorization
+ decisions at the debug level. If you do not understand why Sentry is
+ denying access, the best way to debug is to temporarily turn on debug
+ logging: <ul>
+ <li> Add <codeph>log4j.logger.org.apache.sentry=DEBUG</codeph> to the
+ <filepath>log4j.properties</filepath> file on each host in the
+ cluster, in the appropriate configuration directory for each
+ service. </li>
+ </ul> Specifically, look for exceptions and messages such as:
+ <codeblock xml:space="preserve">FilePermission server..., RequestPermission server...., result [true|false]</codeblock>
+ which indicate each evaluation Sentry makes. The
+ <codeph>FilePermission</codeph> is from the policy file, while
+ <codeph>RequestPermission</codeph> is the privilege required for the
+ query. A <codeph>RequestPermission</codeph> will iterate over all
+ appropriate <codeph>FilePermission</codeph> settings until a match is
+ found. If no matching privilege is found, Sentry returns
+ <codeph>false</codeph> indicating <q>Access Denied</q>.</p>
</section>
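The first-match evaluation described in the debugging note above can be sketched as a small loop. This is an illustrative model only, not Sentry's actual implementation; the dictionary-based representation of FilePermission and RequestPermission is an assumption made for the sketch.

```python
# Illustrative model of the evaluation described above: each
# RequestPermission is checked against the policy's FilePermission
# entries in turn; access is denied only if no entry implies it.
# This is NOT Sentry's real code; the dict representation is assumed.
def implies(file_perm, request):
    """A policy entry implies a request if every field it names matches,
    treating "*" as a wildcard."""
    return all(v == "*" or request.get(k) == v for k, v in file_perm.items())

def evaluate(policy, request):
    for entry in policy:
        if implies(entry, request):
            return True   # logged as "result [true]"
    return False          # no match: "result [false]" -> Access Denied

policy = [
    {"server": "server1", "db": "training", "table": "*", "action": "SELECT"},
]
request = {"server": "server1", "db": "training",
           "table": "lesson_1", "action": "SELECT"}
print(evaluate(policy, request))  # True
```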
http://git-wip-us.apache.org/repos/asf/impala/blob/4ff9f5f3/docs/topics/impala_authorization.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_authorization.xml b/docs/topics/impala_authorization.xml
index 4e4a243..39932f6 100644
--- a/docs/topics/impala_authorization.xml
+++ b/docs/topics/impala_authorization.xml
@@ -65,12 +65,23 @@ under the License.
<conbody>
<p>
- Privileges can be granted on different objects in the schema. Any privilege that can be granted is
- associated with a level in the object hierarchy. If a privilege is granted on a container object in the
- hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other
- database systems such as MySQL.
+ Privileges can be granted on different objects in the schema. Any privilege that can be
+ granted is associated with a level in the object hierarchy. If a privilege is granted on
+ a parent object in the hierarchy, the child object automatically inherits it. This is
+ the same privilege model as Hive and other database systems.
+ </p>
+
+ <p>
+ The objects in the Impala schema hierarchy are:
</p>
+<codeblock>Server
+ URI
+ Database
+ Table
+ Column
+</codeblock>
+
<p rev="2.3.0 collevelauth">
The object hierarchy for Impala covers Server, URI, Database, Table, and Column. (The Table privileges apply to views as well;
anywhere you specify a table name, you can specify a view name instead.)
@@ -81,63 +92,7 @@ under the License.
in a table.
</p>
- <p>
- A restricted set of privileges determines what you can do with each object:
- </p>
-
- <dl>
- <dlentry id="select_priv">
-
- <dt>
- SELECT privilege
- </dt>
-
- <dd>
- Lets you read data from a table or view, for example with the <codeph>SELECT</codeph> statement, the
- <codeph>INSERT...SELECT</codeph> syntax, or <codeph>CREATE TABLE...LIKE</codeph>. Also required to
- issue the <codeph>DESCRIBE</codeph> statement or the <codeph>EXPLAIN</codeph> statement for a query
- against a particular table. Only objects for which a user has this privilege are shown in the output
- for <codeph>SHOW DATABASES</codeph> and <codeph>SHOW TABLES</codeph> statements. The
- <codeph>REFRESH</codeph> statement and <codeph>INVALIDATE METADATA</codeph> statements only access
- metadata for tables for which the user has this privilege.
- </dd>
-
- </dlentry>
-
- <dlentry id="insert_priv">
-
- <dt>
- INSERT privilege
- </dt>
-
- <dd>
- Lets you write data to a table. Applies to the <codeph>INSERT</codeph> and <codeph>LOAD DATA</codeph>
- statements.
- </dd>
-
- </dlentry>
-
- <dlentry id="all_priv">
-
- <dt>
- ALL privilege
- </dt>
-
- <dd>
- Lets you create or modify the object. Required to run DDL statements such as <codeph>CREATE
- TABLE</codeph>, <codeph>ALTER TABLE</codeph>, or <codeph>DROP TABLE</codeph> for a table,
- <codeph>CREATE DATABASE</codeph> or <codeph>DROP DATABASE</codeph> for a database, or <codeph>CREATE
- VIEW</codeph>, <codeph>ALTER VIEW</codeph>, or <codeph>DROP VIEW</codeph> for a view. Also required for
- the URI of the <q>location</q> parameter for the <codeph>CREATE EXTERNAL TABLE</codeph> and
- <codeph>LOAD DATA</codeph> statements.
-<!-- Have to think about the best wording, how often to repeat, how best to conref this caveat.
- You do not actually code the keyword <codeph>ALL</codeph> in the policy file; instead you use
- <codeph>action=*</codeph> or shorten the right-hand portion of the rule.
- -->
- </dd>
-
- </dlentry>
- </dl>
+ <p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>
<p>
Privileges can be specified for a table or view before that object actually exists. If you do not have
@@ -145,6 +100,30 @@ under the License.
not.
</p>
+ <note>
+ <p>
+ Although this document refers to the <codeph>ALL</codeph> privilege, currently if you
+ use the policy file mode, you do not use the actual keyword <codeph>ALL</codeph> in
+ the policy file. When you code role entries in the policy file:
+ </p>
+ <ul>
+ <li>
+ To specify the <codeph>ALL</codeph> privilege for a server, use a role like
+ <codeph>server=<varname>server_name</varname></codeph>.
+ </li>
+
+ <li>
+ To specify the <codeph>ALL</codeph> privilege for a database, use a role like
+ <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>.
+ </li>
+
+ <li>
+ To specify the <codeph>ALL</codeph> privilege for a table, use a role like
+ <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>.
+ </li>
+ </ul>
+ </note>
+
<p>
Originally, privileges were encoded in a policy file, stored in HDFS. This mode of operation is still an
option, but the emphasis of privilege management is moving towards being SQL-based. Although currently
@@ -176,17 +155,21 @@ under the License.
<ul>
<li>
- The <codeph>-server_name</codeph> option turns on Sentry authorization for Impala. The authorization
- rules refer to a symbolic server name, and you specify the name to use as the argument to the
- <codeph>-server_name</codeph> option.
+ <codeph>-server_name</codeph>: Turns on Sentry authorization for Impala. The
+ authorization rules refer to a symbolic server name, and you specify the same name to
+ use as the argument to the <codeph>-server_name</codeph> option for all
+ <cmdname>impalad</cmdname> nodes in the cluster.
+ <p>
+ Starting in Impala 1.4.0 and higher, if you specify just
+ <codeph>-server_name</codeph> without <codeph>-authorization_policy_file</codeph>,
+ Impala uses the Sentry service for authorization.
+ </p>
</li>
- <li rev="1.4.0">
- If you specify just <codeph>-server_name</codeph>, Impala uses the Sentry service for authorization,
- relying on the results of <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements issued through
- Hive. (This mode of operation is available in Impala 1.4.0 and higher.) Prior to Impala 1.4.0, or if you
- want to continue storing privilege rules in the policy file, also specify the
- <codeph>-authorization_policy_file</codeph> option as in the following item.
+ <li>
+ <codeph>-sentry_config</codeph>: Specifies the local path to the
+ <codeph>sentry-site.xml</codeph> configuration file. This setting is required to
+ enable authorization.
</li>
<li>
@@ -218,6 +201,14 @@ under the License.
</codeblock>
<p>
+ The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to
+ the current instance of Impala. Specify the symbolic name for the
+ <codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath>
+ configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for
+ <cmdname>impalad</cmdname>.
+ </p>
+
+ <p>
The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to the current instance
of Impala. This symbolic name is used in the following ways:
</p>
@@ -307,7 +298,44 @@ report_generator = server=server1->db=reporting_db->table=*->action=SEL
to security policies, restart all Impala daemons to pick up the changes immediately.
</p>
- <p outputclass="toc inpage"/>
+ <p>
+ URIs represent the file paths you specify as part of statements such as <codeph>CREATE
+ EXTERNAL TABLE</codeph> and <codeph>LOAD DATA</codeph>. Typically, you specify what look
+ like UNIX paths, but these locations can also be prefixed with <codeph>hdfs://</codeph>
+ to make clear that they are really URIs. To set privileges for a URI, specify the name
+ of a directory, and the privilege applies to all the files in that directory and any
+ directories underneath it.
+ </p>
+
+ <p>
+ URIs must start with <codeph>hdfs://</codeph>, <codeph>s3a://</codeph>,
+ <codeph>adl://</codeph>, or <codeph>file://</codeph>. If a URI starts with an absolute
+ path, the path will be appended to the default filesystem prefix. For example, if you
+ specify:
+<codeblock>
+GRANT ALL ON URI '/tmp';
+</codeblock>
+ When the default filesystem is HDFS, the above statement effectively becomes the
+ following:
+<codeblock>
+GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
+</codeblock>
+ </p>
+
+ <p>
+ When defining URIs for HDFS, you must also specify the NameNode. For example:
+<codeblock>GRANT ALL ON URI file:///path/to/dir TO <role>
+GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role></codeblock>
+ <note type="warning">
+ <p>
+ Because the NameNode host and port must be specified, it is strongly recommended
+ that you use High Availability (HA). This ensures that the URI will remain constant
+ even if the NameNode changes. For example:
+ </p>
+<codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role></codeblock>
+ </note>
+ </p>
+
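The default-filesystem behavior described above can be sketched as a small helper. The scheme list comes from the paragraph above; the default filesystem value mirrors the hdfs://localhost:20500 example and is otherwise an assumption, and this is not Impala's actual code:

```python
# Sketch of the URI handling described above; not Impala's actual code.
SCHEMES = ("hdfs://", "s3a://", "adl://", "file://")

def normalize_uri(uri, default_fs="hdfs://localhost:20500"):
    if uri.startswith(SCHEMES):
        return uri                 # already a full URI
    if uri.startswith("/"):
        return default_fs + uri    # absolute path: prepend default filesystem
    raise ValueError("URI must start with a supported scheme or be absolute")

print(normalize_uri("/tmp"))              # hdfs://localhost:20500/tmp
print(normalize_uri("s3a://bucket/dir"))  # s3a://bucket/dir
```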
</conbody>
<concept id="security_policy_file_details">
@@ -520,14 +548,15 @@ student = server=server1->db=training->table=lesson_*->action=SELECT
<codeblock></codeblock>
-</example>
-
-<example id="sec_ex_superuser_single_table">
-<title>A User with Full Privileges for a Specific Table</title>
- <p>
- If a user has <codeph>SELECT</codeph> privilege for a table, they can query, describe, or explain queries for
- that table.
- </p>
+ <li>
+ The <codeph>staging_dir</codeph> role can specify the HDFS path
+ <filepath>/user/impala-user/external_data</filepath> with the <codeph>LOAD
+ DATA</codeph> statement. When Impala queries or loads data files, it operates on
+ all the files in that directory, not just a single file, so any Impala
+ <codeph>LOCATION</codeph> parameters refer to a directory rather than an
+ individual file.
+ </li>
+ </ul>
<codeblock></codeblock>
</example>
@@ -564,33 +593,10 @@ student = server=server1->db=training->table=lesson_*->action=SELECT
</li>
<li>
- The <codeph>staging_dir</codeph> role lets us specify the HDFS path
- <filepath>/user/username/external_data</filepath> with the <codeph>LOAD DATA</codeph> statement.
- Remember, when Impala queries or loads data files, it operates on all the files in that directory,
- not just a single file, so any Impala <codeph>LOCATION</codeph> parameters refer to a directory
- rather than an individual file.
- </li>
-
- <li>
- We included the IP address and port of the Hadoop name node in the HDFS URI of the
- <codeph>staging_dir</codeph> rule. We found those details in
- <filepath>/etc/hadoop/conf/core-site.xml</filepath>, under the <codeph>fs.default.name</codeph>
- element. That is what we use in any roles that specify URIs (that is, the locations of directories in
- HDFS).
- </li>
-
- <li>
- We start this example after the table <codeph>external_table.sample</codeph> is already created. In
- the policy file for the example, we have already taken away the <codeph>external_table_admin</codeph>
- role from the <codeph>username</codeph> group, and replaced it with the lesser-privileged
- <codeph>external_table</codeph> role.
- </li>
-
- <li>
- We assign privileges to a subdirectory underneath <filepath>/user/username</filepath> in HDFS,
- because such privileges also apply to any subdirectories underneath. If we had assigned privileges to
- the parent directory <filepath>/user/username</filepath>, it would be too likely to mess up other
- files by specifying a wrong location by mistake.
+ Members of the <codeph>impala_users</codeph> group have the
+ <codeph>instructor</codeph> role and so can create, insert into, and query any
+ tables in the <codeph>training</codeph> database, but cannot create or drop the
+ database itself.
</li>
<li>
@@ -705,15 +711,14 @@ ERROR: AuthorizationException: User 'username' does not have privileges to acces
with sensitive information, then create a view that only exposes the non-confidential columns.
</p>
-<codeblock>[localhost:21000] > create table sensitive_info
- > (
- > name string,
- > address string,
- > credit_card string,
- > taxpayer_id string
- > );
-[localhost:21000] > create view name_address_view as select name, address from sensitive_info;
-</codeblock>
+ <note rev="1.4.0">
+ In <ph rev="upstream">CDH 5</ph> and higher, <ph
+ rev="upstream">Cloudera</ph>
+ recommends managing privileges through SQL statements, as described in
+ <xref
+ href="impala_authorization.xml#sentry_service"/>. If you are still using
+ policy files, plan to migrate to the new approach some time in the future.
+ </note>
<p>
Then the following policy file specifies read-only privilege for that view, without authorizing access
@@ -771,15 +776,28 @@ view_only_privs = server=server1->db=reports->table=name_address_view->
</li>
</ul>
-<codeblock>[groups]
-supergroup = training_sysadmin
-employee = instructor
-visitor = student
-
-[roles]
-training_sysadmin = server=server1->db=training
-instructor = server=server1->db=training->table=*->action=*
-student = server=server1->db=training->table=*->action=SELECT
+ <p>
+ In the <codeph>[roles]</codeph> section, you define a set of roles. For each role, you
+ specify precisely the set of privileges that is available: which objects users
+ with that role can access, and what operations they can perform on those objects. This
+ is the lowest-level category of security information; the other sections in the policy
+ file map the privileges to higher-level divisions of groups and users. In the
+ <codeph>[groups]</codeph> section, you specify which roles are associated with which
+ groups. The group and usernames correspond to Linux groups and users on the server
+ where the <cmdname>impalad</cmdname> daemon runs. The privileges are specified using
+ patterns like:
+<codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT
+server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=CREATE
+server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL
+</codeblock>
+ For the <varname>server_name</varname> value, substitute the same symbolic name you
+ specify with the <cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You
+ can use <codeph>*</codeph> wildcard characters at each level of the privilege
+ specification to allow access to all such objects. For example:
+<codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT
+server=impala-host.example.com->db=*->table=*->action=CREATE
+server=impala-host.example.com->db=*->table=audit_log->action=SELECT
+server=impala-host.example.com->db=default->table=t1->action=*
</codeblock>
</example>
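The privilege patterns above can be modeled with a small parser and matcher. This is a simplified sketch, not Sentry's actual policy engine; following the shortened-rule convention mentioned earlier in this document, a level omitted from the rule is treated as a wildcard:

```python
# Simplified sketch (not Sentry's real parser) of matching a policy-file
# privilege pattern such as server=...->db=...->table=...->action=SELECT.
def parse_rule(spec):
    return dict(part.split("=", 1) for part in spec.split("->"))

def rule_grants(spec, server, db, table, action):
    rule = parse_rule(spec)
    request = {"server": server, "db": db, "table": table, "action": action}
    # A level omitted from the rule, or written as "*", matches anything.
    return all(rule.get(k, "*") in ("*", v) for k, v in request.items())

rule = "server=server1->db=training->table=*->action=SELECT"
print(rule_grants(rule, "server1", "training", "lesson_1", "SELECT"))  # True
print(rule_grants(rule, "server1", "training", "lesson_1", "INSERT"))  # False
```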
http://git-wip-us.apache.org/repos/asf/impala/blob/4ff9f5f3/docs/topics/impala_grant.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_grant.xml b/docs/topics/impala_grant.xml
index 956a458..bdc71db 100644
--- a/docs/topics/impala_grant.xml
+++ b/docs/topics/impala_grant.xml
@@ -74,7 +74,7 @@ GRANT <varname>privilege</varname> ON <varname>object_type</varname> <varname>ob
<p> The <codeph>WITH GRANT OPTION</codeph> clause allows members of the
specified role to issue <codeph>GRANT</codeph> and <codeph>REVOKE</codeph>
- statements for those same privileges Hence, if a role has the
+ statements for those same privileges. Hence, if a role has the
<codeph>ALL</codeph> privilege on a database and the <codeph>WITH GRANT
OPTION</codeph> set, users granted that role can execute
<codeph>GRANT</codeph>/<codeph>REVOKE</codeph> statements only for that
@@ -100,114 +100,7 @@ GRANT <varname>privilege</varname> ON <varname>object_type</varname> <varname>ob
<codeph>URI</codeph> object. Finer-grained privileges mentioned below on
a <codeph>URI</codeph> are not supported.
</p>
-
- <p>
- Starting in <keyword keyref="impala30_full"/>, finer grained privileges
- are enforced as below.<simpletable frame="all" relcolwidth="1* 1* 1*"
- id="simpletable_kmb_ppn_ndb">
- <sthead>
- <stentry>Privilege</stentry>
- <stentry>Scope</stentry>
- <stentry>SQL Allowed to Execute</stentry>
- </sthead>
- <strow>
- <stentry><codeph>REFRESH</codeph></stentry>
- <stentry><codeph>SERVER</codeph></stentry>
- <stentry><codeph>INVALIDATE METADATA</codeph> on all tables in all
- databases<p><codeph>REFRESH</codeph> on all tables and functions
- in all databases</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>REFRESH</codeph></stentry>
- <stentry><codeph>DATABASE</codeph></stentry>
- <stentry><codeph>INVALIDATE METADATA</codeph> on all tables in the
- named database<p><codeph>REFRESH</codeph> on all tables and
- functions in the named database</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>REFRESH</codeph></stentry>
- <stentry><codeph>TABLE</codeph></stentry>
- <stentry><codeph>INVALIDATE METADATA</codeph> on the named
- table<p><codeph>REFRESH</codeph> on the named
- table</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>CREATE</codeph></stentry>
- <stentry><codeph>SERVER</codeph></stentry>
- <stentry><codeph>CREATE DATABASE</codeph> on all
- databases<p><codeph>CREATE TABLE</codeph> on all
- tables</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>CREATE</codeph></stentry>
- <stentry><codeph>DATABASE</codeph></stentry>
- <stentry><codeph>CREATE TABLE</codeph> on all tables in the named
- database</stentry>
- </strow>
- <strow>
- <stentry><codeph>DROP</codeph></stentry>
- <stentry><codeph>SERVER</codeph></stentry>
- <stentry><codeph>DROP DATBASE</codeph> on all databases<p><codeph>DROP
- TABLE</codeph> on all tables</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>DROP</codeph></stentry>
- <stentry><codeph>DATABASE</codeph></stentry>
- <stentry><codeph>DROP DATABASE</codeph> on the named
- database<p><codeph>DROP TABLE</codeph> on all tables in the
- named database</p></stentry>
- </strow>
- <strow>
- <stentry><codeph>DROP</codeph></stentry>
- <stentry><codeph>TABLE</codeph></stentry>
- <stentry><codeph>DROP TABLE</codeph> on the named table</stentry>
- </strow>
- <strow>
- <stentry><codeph>ALTER</codeph></stentry>
- <stentry><codeph>SERVER</codeph></stentry>
- <stentry><codeph>ALTER TABLE</codeph> on all tables</stentry>
- </strow>
- <strow>
- <stentry><codeph>ALTER</codeph></stentry>
- <stentry><codeph>DATABASE</codeph></stentry>
- <stentry><codeph>ALTER TABLE</codeph> on the tables in the named
- database</stentry>
- </strow>
- <strow>
- <stentry><codeph>ALTER</codeph></stentry>
- <stentry><codeph>TABLE</codeph></stentry>
- <stentry><codeph>ALTER TABLE</codeph> on the named table</stentry>
- </strow>
- </simpletable>
- </p>
-
- <p>
- <note>
- <p>
- <ul>
- <li>
- <codeph>ALTER TABLE RENAME</codeph> requires the
- <codeph>ALTER</codeph> privilege at the <codeph>TABLE</codeph>
- level and the <codeph>CREATE</codeph> privilege at the
- <codeph>DATABASE</codeph> level.
- </li>
-
- <li>
- <codeph>CREATE TABLE AS SELECT</codeph> requires the
- <codeph>CREATE</codeph> privilege on the database that should
- contain the new table and the <codeph>SELECT</codeph> privilege on
- the tables referenced in the query portion of the statement.
- </li>
-
- <li>
- <codeph>COMPUTE STATS</codeph> requires the
- <codeph>ALTER</codeph> and <codeph>SELECT</codeph> privileges on
- the target table.
- </li>
- </ul>
- </p>
- </note>
- </p>
+ <p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/>
<p conref="../shared/impala_common.xml#common/compatibility_blurb"/>
[6/6] impala git commit: IMPALA-7236: Fix the parsing of ALLOW_ERASURE_CODED_FILES
Posted by ta...@apache.org.
IMPALA-7236: Fix the parsing of ALLOW_ERASURE_CODED_FILES
This patch adds a missing "break" statement in a switch statement
changed by IMPALA-7102.
Also fixes a non-deterministic test case.
Change-Id: Ife1e791541e3f4fed6bec00945390c7d7681e824
Reviewed-on: http://gerrit.cloudera.org:8080/10857
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/61e6a477
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/61e6a477
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/61e6a477
Branch: refs/heads/master
Commit: 61e6a47776ba7f14139b69f91a49d2072a76178b
Parents: 4ff9f5f
Author: Tianyi Wang <ti...@apache.org>
Authored: Mon Jul 2 19:03:19 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 23:49:44 2018 +0000
----------------------------------------------------------------------
be/src/service/query-options.cc | 1 +
.../functional-query/queries/QueryTest/hdfs-erasure-coding.test | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/61e6a477/be/src/service/query-options.cc
----------------------------------------------------------------------
diff --git a/be/src/service/query-options.cc b/be/src/service/query-options.cc
index 1063fef..5d61664 100644
--- a/be/src/service/query-options.cc
+++ b/be/src/service/query-options.cc
@@ -674,6 +674,7 @@ Status impala::SetQueryOption(const string& key, const string& value,
return Status(Substitute("Invalid kudu_read_mode '$0'. Valid values are "
"DEFAULT, READ_LATEST, and READ_AT_SNAPSHOT.", value));
}
+ break;
}
case TImpalaQueryOptions::ALLOW_ERASURE_CODED_FILES: {
query_options->__set_allow_erasure_coded_files(
http://git-wip-us.apache.org/repos/asf/impala/blob/61e6a477/testdata/workloads/functional-query/queries/QueryTest/hdfs-erasure-coding.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/hdfs-erasure-coding.test b/testdata/workloads/functional-query/queries/QueryTest/hdfs-erasure-coding.test
index 0c773b4..fbe05c6 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/hdfs-erasure-coding.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/hdfs-erasure-coding.test
@@ -3,7 +3,7 @@
set allow_erasure_coded_files=false;
select count(*) from functional.alltypes;
---- CATCH
-ImpalaRuntimeException: Scanning of HDFS erasure-coded file (hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=1/090101.txt) is not supported
+ImpalaRuntimeException: Scanning of HDFS erasure-coded file
====
---- QUERY
set allow_erasure_coded_files=true;
[4/6] impala git commit: IMPALA-7190: Remove unsupported format writer support
Posted by ta...@apache.org.
IMPALA-7190: Remove unsupported format writer support
This patch removes write support for unsupported formats such as Sequence,
Avro, and compressed text. The related query options
ALLOW_UNSUPPORTED_FORMATS and SEQ_COMPRESSION_MODE have also been migrated
to the REMOVED query options type.
Testing:
Ran exhaustive build.
Change-Id: I821dc7495a901f1658daa500daf3791b386c7185
Reviewed-on: http://gerrit.cloudera.org:8080/10823
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/30e82c63
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/30e82c63
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/30e82c63
Branch: refs/heads/master
Commit: 30e82c63ecdd56ded10fed931d95ab6d994b9244
Parents: 6f52ce1
Author: Bikramjeet Vig <bi...@cloudera.com>
Authored: Mon Jun 25 18:11:08 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 20:34:27 2018 +0000
----------------------------------------------------------------------
be/src/exec/CMakeLists.txt | 2 -
be/src/exec/hdfs-avro-table-writer.cc | 295 ---------------
be/src/exec/hdfs-avro-table-writer.h | 121 -------
be/src/exec/hdfs-sequence-table-writer.cc | 361 -------------------
be/src/exec/hdfs-sequence-table-writer.h | 194 ----------
be/src/exec/hdfs-table-sink.cc | 48 +--
be/src/exec/hdfs-text-table-writer.cc | 61 +---
be/src/exec/hdfs-text-table-writer.h | 9 -
be/src/service/query-options-test.cc | 2 -
be/src/service/query-options.cc | 16 -
be/src/service/query-options.h | 5 +-
common/thrift/ImpalaInternalService.thrift | 6 -
common/thrift/ImpalaService.thrift | 6 +-
.../apache/impala/planner/PlannerTestBase.java | 1 -
testdata/bad_avro_snap/README | 4 +-
.../queries/QueryTest/avro-writer.test | 43 ---
.../queries/QueryTest/seq-writer.test | 308 ----------------
.../functional-query/queries/QueryTest/set.test | 3 -
.../queries/QueryTest/text-writer.test | 47 ---
.../queries/QueryTest/unsupported-writers.test | 77 ++++
tests/common/test_dimensions.py | 13 -
tests/hs2/test_hs2.py | 8 +-
tests/metadata/test_partition_metadata.py | 26 +-
tests/query_test/test_compressed_formats.py | 62 +---
tests/shell/test_shell_interactive.py | 10 +-
25 files changed, 121 insertions(+), 1607 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/be/src/exec/CMakeLists.txt b/be/src/exec/CMakeLists.txt
index 1753cb0..4544b95 100644
--- a/be/src/exec/CMakeLists.txt
+++ b/be/src/exec/CMakeLists.txt
@@ -56,12 +56,10 @@ add_library(Exec
hdfs-rcfile-scanner.cc
hdfs-sequence-scanner.cc
hdfs-avro-scanner.cc
- hdfs-avro-table-writer.cc
hdfs-avro-scanner-ir.cc
hdfs-plugin-text-scanner.cc
hdfs-text-scanner.cc
hdfs-text-table-writer.cc
- hdfs-sequence-table-writer.cc
hdfs-parquet-scanner.cc
hdfs-parquet-scanner-ir.cc
hdfs-parquet-table-writer.cc
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-avro-table-writer.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-avro-table-writer.cc b/be/src/exec/hdfs-avro-table-writer.cc
deleted file mode 100644
index 3ce296d..0000000
--- a/be/src/exec/hdfs-avro-table-writer.cc
+++ /dev/null
@@ -1,295 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements. See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership. The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License. You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied. See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#include "exec/hdfs-avro-table-writer.h"
-
-#include <vector>
-#include <hdfs.h>
-#include <boost/scoped_ptr.hpp>
-#include <stdlib.h>
-#include <gutil/strings/substitute.h>
-
-#include "exec/exec-node.h"
-#include "exec/hdfs-table-sink.h"
-#include "util/compress.h"
-#include "util/hdfs-util.h"
-#include "util/uid-util.h"
-#include "exprs/scalar-expr.h"
-#include "exprs/scalar-expr-evaluator.h"
-#include "runtime/mem-pool.h"
-#include "runtime/mem-tracker.h"
-#include "runtime/raw-value.h"
-#include "runtime/row-batch.h"
-#include "runtime/runtime-state.h"
-#include "runtime/hdfs-fs-cache.h"
-#include "runtime/types.h"
-#include "util/runtime-profile-counters.h"
-#include "write-stream.inline.h"
-
-#include "common/names.h"
-
-using namespace strings;
-using namespace impala;
-
-const uint8_t OBJ1[4] = {'O', 'b', 'j', 1};
-const char* AVRO_SCHEMA_STR = "avro.schema";
-const char* AVRO_CODEC_STR = "avro.codec";
-const THdfsCompression::type AVRO_DEFAULT_CODEC = THdfsCompression::SNAPPY;
-// Desired size of each Avro block (bytes); actual block size will vary +/- the
-// size of a row. This is approximate size of the block before compression.
-const int DEFAULT_AVRO_BLOCK_SIZE = 64 * 1024;
-
-HdfsAvroTableWriter::HdfsAvroTableWriter(HdfsTableSink* parent,
- RuntimeState* state, OutputPartition* output,
- const HdfsPartitionDescriptor* partition, const HdfsTableDescriptor* table_desc)
- : HdfsTableWriter(parent, state, output, partition, table_desc),
- unflushed_rows_(0) {
- mem_pool_.reset(new MemPool(parent->mem_tracker()));
-}
-
-void HdfsAvroTableWriter::ConsumeRow(TupleRow* row) {
- ++unflushed_rows_;
- int num_non_partition_cols =
- table_desc_->num_cols() - table_desc_->num_clustering_cols();
- for (int j = 0; j < num_non_partition_cols; ++j) {
- void* value = output_expr_evals_[j]->GetValue(row);
- AppendField(output_expr_evals_[j]->root().type(), value);
- }
-}
-
-inline void HdfsAvroTableWriter::AppendField(const ColumnType& type, const void* value) {
- // Each avro field is written as union, which is a ZLong indicating the union
- // field followed by the encoded value. Impala/Hive always stores values as
- // a union of [ColumnType, NULL].
- // TODO: the writer may be asked to write [NULL, ColumnType] unions. It is wrong
- // for us to assume [ColumnType, NULL].
-
- if (value == NULL) {
- // indicate the second field of the union
- out_.WriteZLong(1);
- // No bytes are written for a null value.
- return;
- }
-
- // indicate that we are using the first field of the union
- out_.WriteZLong(0);
-
- switch (type.type) {
- case TYPE_BOOLEAN:
- out_.WriteByte(*reinterpret_cast<const char*>(value));
- break;
- case TYPE_TINYINT:
- out_.WriteZInt(*reinterpret_cast<const int8_t*>(value));
- break;
- case TYPE_SMALLINT:
- out_.WriteZInt(*reinterpret_cast<const int16_t*>(value));
- break;
- case TYPE_INT:
- out_.WriteZInt(*reinterpret_cast<const int32_t*>(value));
- break;
- case TYPE_BIGINT:
- out_.WriteZLong(*reinterpret_cast<const int64_t*>(value));
- break;
- case TYPE_FLOAT:
- out_.WriteBytes(4, reinterpret_cast<const char*>(value));
- break;
- case TYPE_DOUBLE:
- out_.WriteBytes(8, reinterpret_cast<const char*>(value));
- break;
- case TYPE_STRING: {
- const StringValue& sv = *reinterpret_cast<const StringValue*>(value);
- out_.WriteZLong(sv.len);
- out_.WriteBytes(sv.len, sv.ptr);
- break;
- }
- case TYPE_DECIMAL: {
- int byte_size = ColumnType::GetDecimalByteSize(type.precision);
- out_.WriteZLong(byte_size);
-#if __BYTE_ORDER == __LITTLE_ENDIAN
- char tmp[16];
- BitUtil::ByteSwap(tmp, value, byte_size);
- out_.WriteBytes(byte_size, tmp);
-#else
- out_.WriteBytes(byte_size, reinterpret_cast<const char*>(value));
-#endif
- break;
- }
- case TYPE_TIMESTAMP:
- case TYPE_BINARY:
- case INVALID_TYPE:
- case TYPE_NULL:
- case TYPE_DATE:
- case TYPE_DATETIME:
- default:
- DCHECK(false);
- }
-}
-
-Status HdfsAvroTableWriter::Init() {
- // create the Sync marker
- sync_marker_ = GenerateUUIDString();
-
- THdfsCompression::type codec = AVRO_DEFAULT_CODEC;
- if (state_->query_options().__isset.compression_codec) {
- codec = state_->query_options().compression_codec;
- }
-
- // sets codec_name_ and compressor_
- codec_type_ = codec;
- switch (codec) {
- case THdfsCompression::SNAPPY:
- codec_name_ = "snappy";
- break;
- case THdfsCompression::DEFLATE:
- codec_name_ = "deflate";
- break;
- case THdfsCompression::NONE:
- codec_name_ = "null";
- return Status::OK();
- default:
- const char* name = _THdfsCompression_VALUES_TO_NAMES.find(codec)->second;
- return Status(Substitute(
- "Avro only supports NONE, DEFLATE, and SNAPPY codecs; unsupported codec $0",
- name));
- }
- RETURN_IF_ERROR(Codec::CreateCompressor(mem_pool_.get(), true, codec, &compressor_));
- DCHECK(compressor_.get() != NULL);
-
- return Status::OK();
-}
-
-void HdfsAvroTableWriter::Close() {
- mem_pool_->FreeAll();
-}
-
-Status HdfsAvroTableWriter::AppendRows(
- RowBatch* batch, const vector<int32_t>& row_group_indices, bool* new_file) {
- int32_t limit;
- bool all_rows = row_group_indices.empty();
- if (all_rows) {
- limit = batch->num_rows();
- } else {
- limit = row_group_indices.size();
- }
- COUNTER_ADD(parent_->rows_inserted_counter(), limit);
-
- {
- SCOPED_TIMER(parent_->encode_timer());
- for (int row_idx = 0; row_idx < limit; ++row_idx) {
- TupleRow* row = all_rows ?
- batch->GetRow(row_idx) : batch->GetRow(row_group_indices[row_idx]);
- ConsumeRow(row);
- }
- }
-
- if (out_.Size() > DEFAULT_AVRO_BLOCK_SIZE) RETURN_IF_ERROR(Flush());
- *new_file = false;
- return Status::OK();
-}
-
-Status HdfsAvroTableWriter::WriteFileHeader() {
- out_.Clear();
- out_.WriteBytes(4, reinterpret_cast<const uint8_t*>(OBJ1));
-
- // Write 'File Metadata' as an encoded avro map
- // number of key/value pairs in the map
- out_.WriteZLong(2);
-
- // Schema information
- out_.WriteZLong(strlen(AVRO_SCHEMA_STR));
- out_.WriteBytes(strlen(AVRO_SCHEMA_STR), AVRO_SCHEMA_STR);
- const string& avro_schema = table_desc_->avro_schema();
- out_.WriteZLong(avro_schema.size());
- out_.WriteBytes(avro_schema.size(), avro_schema.data());
-
- // codec information
- out_.WriteZLong(strlen(AVRO_CODEC_STR));
- out_.WriteBytes(strlen(AVRO_CODEC_STR), AVRO_CODEC_STR);
- out_.WriteZLong(codec_name_.size());
- out_.WriteBytes(codec_name_.size(), codec_name_.data());
-
- // Write end of map marker
- out_.WriteZLong(0);
-
- out_.WriteBytes(sync_marker_.size(), sync_marker_.data());
-
- const string& text = out_.String();
- RETURN_IF_ERROR(Write(reinterpret_cast<const uint8_t*>(text.c_str()),
- text.size()));
- out_.Clear();
- return Status::OK();
-}
-
-Status HdfsAvroTableWriter::Flush() {
- if (unflushed_rows_ == 0) return Status::OK();
-
- WriteStream header;
- // 1. Count of objects in this block
- header.WriteZLong(unflushed_rows_);
-
- const uint8_t* output;
- int64_t output_length;
- // Snappy format requires a CRC after the compressed data
- uint32_t crc;
- const string& text = out_.String();
-
- if (codec_type_ != THdfsCompression::NONE) {
- SCOPED_TIMER(parent_->compress_timer());
- uint8_t* temp;
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, text.size(),
- reinterpret_cast<const uint8_t*>(text.data()), &output_length, &temp));
- output = temp;
- if (codec_type_ == THdfsCompression::SNAPPY) {
- crc = SnappyCompressor::ComputeChecksum(
- text.size(), reinterpret_cast<const uint8_t*>(text.data()));
- }
- } else {
- output = reinterpret_cast<const uint8_t*>(text.data());
- output_length = out_.Size();
- }
-
- // 2. length of serialized objects
- if (codec_type_ == THdfsCompression::SNAPPY) {
- // + 4 for the CRC checksum at the end of the compressed block
- header.WriteZLong(output_length + 4);
- } else {
- header.WriteZLong(output_length);
- }
-
- const string& head = header.String();
- {
- SCOPED_TIMER(parent_->hdfs_write_timer());
- // Flush (1) and (2) to HDFS
- RETURN_IF_ERROR(
- Write(reinterpret_cast<const uint8_t*>(head.data()), head.size()));
- // 3. serialized objects
- RETURN_IF_ERROR(Write(output, output_length));
-
- // Write CRC checksum
- if (codec_type_ == THdfsCompression::SNAPPY) {
- RETURN_IF_ERROR(Write(reinterpret_cast<const uint8_t*>(&crc), sizeof(uint32_t)));
- }
- }
-
- // 4. sync marker
- RETURN_IF_ERROR(
- Write(reinterpret_cast<const uint8_t*>(sync_marker_.data()), sync_marker_.size()));
-
- out_.Clear();
- unflushed_rows_ = 0;
- return Status::OK();
-}
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-avro-table-writer.h
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-avro-table-writer.h b/be/src/exec/hdfs-avro-table-writer.h
deleted file mode 100644
index 6966860..0000000
--- a/be/src/exec/hdfs-avro-table-writer.h
+++ /dev/null
@@ -1,121 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements. See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership. The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License. You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied. See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#ifndef IMPALA_EXEC_HDFS_AVRO_WRITER_H
-#define IMPALA_EXEC_HDFS_AVRO_WRITER_H
-
-#include <hdfs.h>
-#include <sstream>
-#include <string>
-
-#include "common/status.h"
-#include "exec/hdfs-table-writer.h"
-#include "runtime/mem-pool.h"
-#include "util/codec.h"
-#include "exec/write-stream.h"
-
-namespace impala {
-
-struct ColumnType;
-class HdfsTableSink;
-class RuntimeState;
-class ScalarExprEvaluator;
-class TupleDescriptor;
-class TupleRow;
-struct OutputPartition;
-struct StringValue;
-
-/// Consumes rows and outputs the rows into an Avro file in HDFS
-/// Each Avro file contains a block of records (rows). The file metadata specifies the
-/// schema of the records in addition to the name of the codec, if any, used to compress
-/// blocks. The structure is:
-/// [ Metadata ]
-/// [ Sync Marker ]
-/// [ Data Block ]
-/// ...
-/// [ Data Block ]
-//
-/// Each Data Block consists of:
-/// [ Number of Rows in Block ]
-/// [ Size of serialized objects, after compression ]
-/// [ Serialized objects, compressed ]
-/// [ Sync Marker ]
-//
-/// If compression is used, each block is compressed individually. The block size defaults
-/// to about 64KB before compression.
-/// This writer implements the Avro 1.7.7 spec:
-/// http://avro.apache.org/docs/1.7.7/spec.html
-class HdfsAvroTableWriter : public HdfsTableWriter {
- public:
- HdfsAvroTableWriter(HdfsTableSink* parent,
- RuntimeState* state, OutputPartition* output,
- const HdfsPartitionDescriptor* partition,
- const HdfsTableDescriptor* table_desc);
-
- virtual ~HdfsAvroTableWriter() { }
-
- virtual Status Init() override;
- virtual Status Finalize() override { return Flush(); }
- virtual Status InitNewFile() override { return WriteFileHeader(); }
- virtual void Close() override;
- virtual uint64_t default_block_size() const override { return 0; }
- virtual std::string file_extension() const override { return "avro"; }
-
- /// Outputs the given rows into an HDFS sequence file. The rows are buffered
- /// to fill a sequence file block.
- virtual Status AppendRows(RowBatch* rows,
- const std::vector<int32_t>& row_group_indices, bool* new_file) override;
-
- private:
- /// Processes a single row, appending to out_
- void ConsumeRow(TupleRow* row);
-
- /// Adds an encoded field to out_
- inline void AppendField(const ColumnType& type, const void* value);
-
- /// Writes the Avro file header to HDFS
- Status WriteFileHeader() WARN_UNUSED_RESULT;
-
- /// Writes the contents of out_ to HDFS as a single Avro file block.
- /// Returns an error if write to HDFS fails.
- Status Flush() WARN_UNUSED_RESULT;
-
- /// Buffer which holds accumulated output
- WriteStream out_;
-
- /// Memory pool used by codec to allocate output buffer.
- /// Owned by this class. Initialized using parent's memtracker.
- boost::scoped_ptr<MemPool> mem_pool_;
-
- /// Number of rows consumed since last flush
- uint64_t unflushed_rows_;
-
- /// Name of codec, only set if codec_type_ != NONE
- std::string codec_name_;
-
- /// Type of the codec, will be NONE if no compression is used
- THdfsCompression::type codec_type_;
-
- /// The codec for compressing, only set if codec_type_ != NONE
- boost::scoped_ptr<Codec> compressor_;
-
- /// 16 byte sync marker (a uuid)
- std::string sync_marker_;
-};
-
-} // namespace impala
-#endif
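The deleted Avro writer encodes union indices and lengths with WriteZLong, the zigzag-plus-varint "long" encoding from the Avro 1.7.7 spec it implements. A standalone sketch of that encoding (mirrors the spec, not Impala's exact WriteStream internals):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Avro "long" encoding: zigzag the sign bit into the low bit, then emit
// base-128 varint bytes, low group first, high bit = continuation.
std::vector<uint8_t> EncodeZLong(int64_t v) {
  // ZigZag: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ... so small magnitudes
  // of either sign stay small.
  uint64_t u = (static_cast<uint64_t>(v) << 1) ^ static_cast<uint64_t>(v >> 63);
  std::vector<uint8_t> out;
  while (u >= 0x80) {
    out.push_back(static_cast<uint8_t>(u) | 0x80);
    u >>= 7;
  }
  out.push_back(static_cast<uint8_t>(u));
  return out;
}
```

This is why writing the union branch index for a non-null value (`WriteZLong(0)` in the deleted ConsumeRow/AppendField path) costs exactly one byte.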
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-sequence-table-writer.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-sequence-table-writer.cc b/be/src/exec/hdfs-sequence-table-writer.cc
deleted file mode 100644
index 42a70f0..0000000
--- a/be/src/exec/hdfs-sequence-table-writer.cc
+++ /dev/null
@@ -1,361 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements. See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership. The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License. You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied. See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#include "exec/hdfs-sequence-table-writer.h"
-#include "exec/write-stream.inline.h"
-#include "exec/exec-node.h"
-#include "util/hdfs-util.h"
-#include "util/uid-util.h"
-#include "exprs/scalar-expr.h"
-#include "exprs/scalar-expr-evaluator.h"
-#include "runtime/mem-tracker.h"
-#include "runtime/raw-value.h"
-#include "runtime/row-batch.h"
-#include "runtime/runtime-state.h"
-#include "runtime/hdfs-fs-cache.h"
-#include "util/runtime-profile-counters.h"
-
-#include <vector>
-#include <hdfs.h>
-#include <boost/scoped_ptr.hpp>
-#include <stdlib.h>
-
-#include "common/names.h"
-
-namespace impala {
-
-const uint8_t HdfsSequenceTableWriter::SEQ6_CODE[4] = {'S', 'E', 'Q', 6};
-const char* HdfsSequenceTableWriter::VALUE_CLASS_NAME = "org.apache.hadoop.io.Text";
-const char* HdfsSequenceTableWriter::KEY_CLASS_NAME =
- "org.apache.hadoop.io.BytesWritable";
-
-HdfsSequenceTableWriter::HdfsSequenceTableWriter(HdfsTableSink* parent,
- RuntimeState* state, OutputPartition* output,
- const HdfsPartitionDescriptor* partition, const HdfsTableDescriptor* table_desc)
- : HdfsTableWriter(parent, state, output, partition, table_desc),
- mem_pool_(new MemPool(parent->mem_tracker())), compress_flag_(false),
- unflushed_rows_(0), record_compression_(false) {
- approx_block_size_ = 64 * 1024 * 1024;
- parent->mem_tracker()->Consume(approx_block_size_);
- field_delim_ = partition->field_delim();
- escape_char_ = partition->escape_char();
-}
-
-Status HdfsSequenceTableWriter::Init() {
- THdfsCompression::type codec = THdfsCompression::SNAPPY_BLOCKED;
- const TQueryOptions& query_options = state_->query_options();
- if (query_options.__isset.compression_codec) {
- codec = query_options.compression_codec;
- if (codec == THdfsCompression::SNAPPY) {
- // Seq file (and in general things that use hadoop.io.codec) always
- // mean snappy_blocked.
- codec = THdfsCompression::SNAPPY_BLOCKED;
- }
- }
- if (codec != THdfsCompression::NONE) {
- compress_flag_ = true;
- if (query_options.__isset.seq_compression_mode) {
- record_compression_ =
- query_options.seq_compression_mode == THdfsSeqCompressionMode::RECORD;
- }
- RETURN_IF_ERROR(Codec::GetHadoopCodecClassName(codec, &codec_name_));
- RETURN_IF_ERROR(Codec::CreateCompressor(
- mem_pool_.get(), true, codec_name_, &compressor_));
- DCHECK(compressor_.get() != NULL);
- }
-
- // create the Sync marker
- string uuid = GenerateUUIDString();
- uint8_t sync_neg1[20];
-
- ReadWriteUtil::PutInt(sync_neg1, static_cast<uint32_t>(-1));
- DCHECK(uuid.size() == 16);
- memcpy(sync_neg1 + sizeof(int32_t), uuid.data(), uuid.size());
- neg1_sync_marker_ = string(reinterpret_cast<char*>(sync_neg1), 20);
- sync_marker_ = uuid;
-
- return Status::OK();
-}
-
-Status HdfsSequenceTableWriter::AppendRows(
- RowBatch* batch, const vector<int32_t>& row_group_indices, bool* new_file) {
- int32_t limit;
- if (row_group_indices.empty()) {
- limit = batch->num_rows();
- } else {
- limit = row_group_indices.size();
- }
- COUNTER_ADD(parent_->rows_inserted_counter(), limit);
-
- bool all_rows = row_group_indices.empty();
- int num_non_partition_cols =
- table_desc_->num_cols() - table_desc_->num_clustering_cols();
- DCHECK_GE(output_expr_evals_.size(), num_non_partition_cols)
- << parent_->DebugString();
-
- {
- SCOPED_TIMER(parent_->encode_timer());
- if (all_rows) {
- for (int row_idx = 0; row_idx < limit; ++row_idx) {
- RETURN_IF_ERROR(ConsumeRow(batch->GetRow(row_idx)));
- }
- } else {
- for (int row_idx = 0; row_idx < limit; ++row_idx) {
- TupleRow* row = batch->GetRow(row_group_indices[row_idx]);
- RETURN_IF_ERROR(ConsumeRow(row));
- }
- }
- }
-
- if (!compress_flag_) {
- out_.WriteBytes(neg1_sync_marker_.size(), neg1_sync_marker_.data());
- }
-
- if (out_.Size() >= approx_block_size_) RETURN_IF_ERROR(Flush());
- *new_file = false;
- return Status::OK();
-}
-
-Status HdfsSequenceTableWriter::WriteFileHeader() {
- out_.WriteBytes(sizeof(SEQ6_CODE), SEQ6_CODE);
-
- // Setup to be correct key class
- out_.WriteText(strlen(KEY_CLASS_NAME),
- reinterpret_cast<const uint8_t*>(KEY_CLASS_NAME));
-
- // Setup to be correct value class
- out_.WriteText(strlen(VALUE_CLASS_NAME),
- reinterpret_cast<const uint8_t*>(VALUE_CLASS_NAME));
-
- // Flag for if compression is used
- out_.WriteBoolean(compress_flag_);
- // Only valid if compression is used. Indicates if block compression is used.
- out_.WriteBoolean(compress_flag_ && !record_compression_);
-
- // Output the name of our compression codec, parsed by readers
- if (compress_flag_) {
- out_.WriteText(codec_name_.size(),
- reinterpret_cast<const uint8_t*>(codec_name_.data()));
- }
-
- // Meta data is formated as an integer N followed by N*2 strings,
- // which are key-value pairs. Hive does not write meta data, so neither does Impala
- out_.WriteInt(0);
-
- // write the sync marker
- out_.WriteBytes(sync_marker_.size(), sync_marker_.data());
-
- string text = out_.String();
- RETURN_IF_ERROR(Write(reinterpret_cast<const uint8_t*>(text.c_str()), text.size()));
- out_.Clear();
- return Status::OK();
-}
-
-Status HdfsSequenceTableWriter::WriteCompressedBlock() {
- WriteStream record;
- uint8_t *output;
- int64_t output_length;
- DCHECK(compress_flag_);
-
- // Add a sync marker to start of the block
- record.WriteBytes(neg1_sync_marker_.size(), neg1_sync_marker_.data());
-
- // Output the number of rows in this block
- record.WriteVLong(unflushed_rows_);
-
- // Output compressed key-lengths block-size & compressed key-lengths block.
- // The key-lengths block contains byte value of 4 as a key length for each row (this is
- // what Hive does).
- string key_lengths_text(unflushed_rows_, '\x04');
- {
- SCOPED_TIMER(parent_->compress_timer());
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, key_lengths_text.size(),
- reinterpret_cast<const uint8_t*>(key_lengths_text.data()), &output_length,
- &output));
- }
- record.WriteVInt(output_length);
- record.WriteBytes(output_length, output);
-
- // Output compressed keys block-size & compressed keys block.
- // The keys block contains "\0\0\0\0" byte sequence as a key for each row (this is what
- // Hive does).
- string keys_text(unflushed_rows_ * 4, '\0');
- {
- SCOPED_TIMER(parent_->compress_timer());
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, keys_text.size(),
- reinterpret_cast<const uint8_t*>(keys_text.data()), &output_length, &output));
- }
- record.WriteVInt(output_length);
- record.WriteBytes(output_length, output);
-
- // Output compressed value-lengths block-size & compressed value-lengths block
- string value_lengths_text = out_value_lengths_block_.String();
- {
- SCOPED_TIMER(parent_->compress_timer());
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, value_lengths_text.size(),
- reinterpret_cast<const uint8_t*>(value_lengths_text.data()), &output_length, &output));
- }
- record.WriteVInt(output_length);
- record.WriteBytes(output_length, output);
-
- // Output compressed values block-size & compressed values block
- string text = out_.String();
- {
- SCOPED_TIMER(parent_->compress_timer());
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, text.size(),
- reinterpret_cast<const uint8_t*>(text.data()), &output_length, &output));
- }
- record.WriteVInt(output_length);
- record.WriteBytes(output_length, output);
-
- string rec = record.String();
- RETURN_IF_ERROR(Write(reinterpret_cast<const uint8_t*>(rec.data()), rec.size()));
- return Status::OK();
-}
-
-inline void HdfsSequenceTableWriter::WriteEscapedString(const StringValue* str_val,
- WriteStream* buf) {
- for (int i = 0; i < str_val->len; ++i) {
- if (str_val->ptr[i] == field_delim_ || str_val->ptr[i] == escape_char_) {
- buf->WriteByte(escape_char_);
- }
- buf->WriteByte(str_val->ptr[i]);
- }
-}
-
-void HdfsSequenceTableWriter::EncodeRow(TupleRow* row, WriteStream* buf) {
- // TODO Unify with text table writer
- int num_non_partition_cols =
- table_desc_->num_cols() - table_desc_->num_clustering_cols();
- DCHECK_GE(output_expr_evals_.size(), num_non_partition_cols)
- << parent_->DebugString();
- for (int j = 0; j < num_non_partition_cols; ++j) {
- void* value = output_expr_evals_[j]->GetValue(row);
- if (value != NULL) {
- if (output_expr_evals_[j]->root().type().type == TYPE_STRING) {
- WriteEscapedString(reinterpret_cast<const StringValue*>(value), &row_buf_);
- } else {
- string str;
- output_expr_evals_[j]->PrintValue(value, &str);
- buf->WriteBytes(str.size(), str.data());
- }
- } else {
- // NULLs in hive are encoded based on the 'serialization.null.format' property.
- const string& null_val = table_desc_->null_column_value();
- buf->WriteBytes(null_val.size(), null_val.data());
- }
- // Append field delimiter.
- if (j + 1 < num_non_partition_cols) {
- buf->WriteByte(field_delim_);
- }
- }
-}
-
-inline Status HdfsSequenceTableWriter::ConsumeRow(TupleRow* row) {
- ++unflushed_rows_;
- row_buf_.Clear();
- if (compress_flag_ && !record_compression_) {
- // Output row for a block compressed sequence file.
- // Value block: Write the length as a vlong and then write the contents.
- EncodeRow(row, &row_buf_);
- out_.WriteVLong(row_buf_.Size());
- out_.WriteBytes(row_buf_.Size(), row_buf_.String().data());
- // Value-lengths block: Write the number of bytes we have just written to out_ as
- // vlong
- out_value_lengths_block_.WriteVLong(
- ReadWriteUtil::VLongRequiredBytes(row_buf_.Size()) + row_buf_.Size());
- return Status::OK();
- }
-
- EncodeRow(row, &row_buf_);
-
- const uint8_t* value_bytes;
- int64_t value_length;
- string text = row_buf_.String();
- if (compress_flag_) {
- // apply compression to row_buf_
- // the length of the buffer must be prefixed to the buffer prior to compression
- //
- // TODO this incurs copy overhead to place the length in front of the
- // buffer prior to compression. We may want to rewrite to avoid copying.
- row_buf_.Clear();
- // encoding as "Text" writes the length before the text
- row_buf_.WriteText(text.size(), reinterpret_cast<const uint8_t*>(&text.data()[0]));
- text = row_buf_.String();
- uint8_t *tmp;
- {
- SCOPED_TIMER(parent_->compress_timer());
- RETURN_IF_ERROR(compressor_->ProcessBlock(false, text.size(),
- reinterpret_cast<const uint8_t*>(text.data()), &value_length, &tmp));
- }
- value_bytes = tmp;
- } else {
- value_length = text.size();
- DCHECK_EQ(value_length, row_buf_.Size());
- value_bytes = reinterpret_cast<const uint8_t*>(text.data());
- }
-
- int rec_len = value_length;
- // if the record is compressed, the length is part of the compressed text
- // if not, then we need to write the length (below) and account for it's size
- if (!compress_flag_) {
- rec_len += ReadWriteUtil::VLongRequiredBytes(value_length);
- }
- // The record contains the key, account for it's size (we use "\0\0\0\0" byte sequence
- // as a key just like Hive).
- rec_len += 4;
-
- // Length of the record (incl. key and value length)
- out_.WriteInt(rec_len);
-
- // Write length of the key and the key
- out_.WriteInt(4);
- out_.WriteBytes(4, "\0\0\0\0");
-
- // if the record is compressed, the length is part of the compressed text
- if (!compress_flag_) out_.WriteVLong(value_length);
-
- // write out the value (possibly compressed)
- out_.WriteBytes(value_length, value_bytes);
- return Status::OK();
-}
-
-Status HdfsSequenceTableWriter::Flush() {
- if (unflushed_rows_ == 0) return Status::OK();
-
- SCOPED_TIMER(parent_->hdfs_write_timer());
-
- if (compress_flag_ && !record_compression_) {
- RETURN_IF_ERROR(WriteCompressedBlock());
- } else {
- string out_str = out_.String();
- RETURN_IF_ERROR(
- Write(reinterpret_cast<const uint8_t*>(out_str.data()), out_str.size()));
- }
- out_.Clear();
- out_value_lengths_block_.Clear();
- unflushed_rows_ = 0;
- return Status::OK();
-}
-
-void HdfsSequenceTableWriter::Close() {
- // TODO: double check there is no memory leak.
- parent_->mem_tracker()->Release(approx_block_size_);
- mem_pool_->FreeAll();
-}
-
-} // namespace impala
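The deleted sequence writer's WriteEscapedString applied a simple rule to string values: prefix any occurrence of the field delimiter or of the escape character itself with the escape character. A minimal standalone sketch of that rule (operates on std::string rather than Impala's StringValue):

```cpp
#include <cassert>
#include <string>

// Escapes a field for delimited text output: any byte equal to the field
// delimiter or the escape character is preceded by the escape character.
std::string EscapeField(const std::string& field, char field_delim,
                        char escape_char) {
  std::string out;
  for (char c : field) {
    if (c == field_delim || c == escape_char) out.push_back(escape_char);
    out.push_back(c);
  }
  return out;
}
```

A reader splitting on the unescaped delimiter can then recover the original field boundaries unambiguously.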
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-sequence-table-writer.h
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-sequence-table-writer.h b/be/src/exec/hdfs-sequence-table-writer.h
deleted file mode 100644
index f315920..0000000
--- a/be/src/exec/hdfs-sequence-table-writer.h
+++ /dev/null
@@ -1,194 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements. See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership. The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License. You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied. See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#ifndef IMPALA_EXEC_HDFS_SEQUENCE_WRITER_H
-#define IMPALA_EXEC_HDFS_SEQUENCE_WRITER_H
-
-#include <hdfs.h>
-#include <sstream>
-
-#include "runtime/descriptors.h"
-#include "exec/hdfs-table-sink.h"
-#include "exec/hdfs-table-writer.h"
-#include "util/codec.h"
-#include "write-stream.h"
-
-namespace impala {
-
-class Expr;
-class TupleDescriptor;
-class TupleRow;
-class RuntimeState;
-struct StringValue;
-struct OutputPartition;
-
-/// Sequence files are flat files consisting of binary key/value pairs. Essentially there
-/// are 3 different formats for sequence files depending on the 'compression_codec' and
-/// 'seq_compression_mode' query options:
-/// - Uncompressed sequence file format
-/// - Record-compressed sequence file format
-/// - Block-compressed sequence file format
-/// All of them share a common header described below.
-///
-/// Sequence File Header
-/// --------------------
-/// - version - 3 bytes of magic header SEQ, followed by 1 byte of actual version number
-/// (e.g. SEQ4 or SEQ6)
-/// - keyClassName - key class
-/// - valueClassName - value class
-/// - compression - A boolean which specifies if compression is turned on for keys/values
-/// in this file.
-/// - blockCompression - A boolean which specifies if block-compression is turned on for
-/// keys/values in this file.
-/// - compression codec - compression codec class which is used for compression of keys
-/// and/or values (if compression is enabled).
-/// - metadata - SequenceFile.Metadata for this file.
-/// - sync - A 16 byte sync marker to denote end of the header.
-///
-/// Uncompressed Sequence File Format
-/// ---------------------------------
-/// - Header
-/// - Record
-/// - Record length
-/// - Key length
-/// - Key
-/// - Value
-/// - "\xFF\xFF\xFF\xFF" followed by a sync-marker every few 100 bytes or so.
-///
-/// Record-Compressed Sequence File Format
-/// --------------------------------------
-/// - Header
-/// - Record
-/// - Record length
-/// - Key length
-/// - Key
-/// - Compressed Value
-/// - "\xFF\xFF\xFF\xFF" followed by a sync-marker every few 100 bytes or so.
-///
-/// Block-Compressed Sequence File Format
-/// -------------------------------------
-/// - Header
-/// - Record Block
-/// - Uncompressed number of records in the block
-/// - Compressed key-lengths block-size
-/// - Compressed key-lengths block
-/// - Compressed keys block-size
-/// - Compressed keys block
-/// - Compressed value-lengths block-size
-/// - Compressed value-lengths block
-/// - Compressed values block-size
-/// - Compressed values block
-/// - "\xFF\xFF\xFF\xFF" followed by a sync-marker every block.
-/// The compressed blocks of key lengths and value lengths consist of the actual lengths
-/// of individual keys/values encoded in zero-compressed integer format.
-
-/// Consumes rows and outputs the rows into a sequence file in HDFS
-/// Output is buffered to fill sequence file blocks.
-class HdfsSequenceTableWriter : public HdfsTableWriter {
- public:
- HdfsSequenceTableWriter(HdfsTableSink* parent, RuntimeState* state,
- OutputPartition* output, const HdfsPartitionDescriptor* partition,
- const HdfsTableDescriptor* table_desc);
-
- ~HdfsSequenceTableWriter() { }
-
- virtual Status Init();
- virtual Status Finalize() { return Flush(); }
- virtual Status InitNewFile() { return WriteFileHeader(); }
- virtual void Close();
- virtual uint64_t default_block_size() const { return 0; }
- virtual std::string file_extension() const { return "seq"; }
-
- /// Outputs the given rows into an HDFS sequence file. The rows are buffered
- /// to fill a sequence file block.
- virtual Status AppendRows(
- RowBatch* rows, const std::vector<int32_t>& row_group_indices, bool* new_file);
-
- private:
- /// processes a single row, delegates to Compress or NoCompress ConsumeRow().
- inline Status ConsumeRow(TupleRow* row);
-
- /// writes the SEQ file header to HDFS
- Status WriteFileHeader();
-
- /// writes the contents of out_value_lengths_block_ and out_ as a single
- /// block-compressed record.
- Status WriteCompressedBlock();
-
- /// writes the tuple row to the given buffer; separates fields by field_delim_,
- /// escapes string.
- inline void EncodeRow(TupleRow* row, WriteStream* buf);
-
- /// writes the str_val to the buffer, escaping special characters
- inline void WriteEscapedString(const StringValue* str_val, WriteStream* buf);
-
- /// flushes the output -- clearing out_ and writing to HDFS
- /// if compress_flag_, will write contents of out_ as a single compressed block
- Status Flush();
-
- /// desired size of each block (bytes); actual block size will vary +/- the
- /// size of a row; this is before compression is applied.
- uint64_t approx_block_size_;
-
- /// buffer which holds accumulated output
- WriteStream out_;
-
- /// buffer which holds accumulated value-lengths output (used with block-compressed
- /// sequence files)
- WriteStream out_value_lengths_block_;
-
- /// Temporary Buffer for a single row
- WriteStream row_buf_;
-
- /// memory pool used by codec to allocate output buffer
- boost::scoped_ptr<MemPool> mem_pool_;
-
- /// true if compression is enabled
- bool compress_flag_;
-
- /// number of rows consumed since last flush
- uint64_t unflushed_rows_;
-
- /// name of codec, only set if compress_flag_
- std::string codec_name_;
- /// the codec for compressing, only set if compress_flag_
- boost::scoped_ptr<Codec> compressor_;
-
- /// true if compression is applied on each record individually
- bool record_compression_;
-
- /// Character delimiting fields
- char field_delim_;
-
- /// Escape character for text encoding
- char escape_char_;
-
- /// 16 byte sync marker (a uuid)
- std::string sync_marker_;
- /// A -1 in front of the sync marker, used in decompressed formats
- std::string neg1_sync_marker_;
-
- /// Name of java class to use when reading the keys
- static const char* KEY_CLASS_NAME;
- /// Name of java class to use when reading the values
- static const char* VALUE_CLASS_NAME;
- /// Magic characters used to identify the file type
- static const uint8_t SEQ6_CODE[4];
-};
-
-} // namespace impala
-#endif
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-table-sink.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-table-sink.cc b/be/src/exec/hdfs-table-sink.cc
index b6de7cf..9c46638 100644
--- a/be/src/exec/hdfs-table-sink.cc
+++ b/be/src/exec/hdfs-table-sink.cc
@@ -18,8 +18,6 @@
#include "exec/hdfs-table-sink.h"
#include "exec/hdfs-table-writer.h"
#include "exec/hdfs-text-table-writer.h"
-#include "exec/hdfs-sequence-table-writer.h"
-#include "exec/hdfs-avro-table-writer.h"
#include "exec/hdfs-parquet-table-writer.h"
#include "exec/exec-node.h"
#include "gen-cpp/ImpalaInternalService_constants.h"
@@ -469,28 +467,20 @@ Status HdfsTableSink::InitOutputPartition(RuntimeState* state,
output_partition->partition_descriptor = &partition_descriptor;
- bool allow_unsupported_formats =
- state->query_options().__isset.allow_unsupported_formats &&
- state->query_options().allow_unsupported_formats;
- if (!allow_unsupported_formats) {
- if (partition_descriptor.file_format() == THdfsFileFormat::SEQUENCE_FILE ||
- partition_descriptor.file_format() == THdfsFileFormat::AVRO) {
- stringstream error_msg;
- map<int, const char*>::const_iterator i =
- _THdfsFileFormat_VALUES_TO_NAMES.find(partition_descriptor.file_format());
- error_msg << "Writing to table format " << i->second
- << " is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS"
- " to override.";
- return Status(error_msg.str());
- }
- if (partition_descriptor.file_format() == THdfsFileFormat::TEXT &&
- state->query_options().__isset.compression_codec &&
- state->query_options().compression_codec != THdfsCompression::NONE) {
- stringstream error_msg;
- error_msg << "Writing to compressed text table is not supported. "
- "Use query option ALLOW_UNSUPPORTED_FORMATS to override.";
- return Status(error_msg.str());
- }
+ if (partition_descriptor.file_format() == THdfsFileFormat::SEQUENCE_FILE ||
+ partition_descriptor.file_format() == THdfsFileFormat::AVRO) {
+ stringstream error_msg;
+ map<int, const char*>::const_iterator i =
+ _THdfsFileFormat_VALUES_TO_NAMES.find(partition_descriptor.file_format());
+ error_msg << "Writing to table format " << i->second << " is not supported.";
+ return Status(error_msg.str());
+ }
+ if (partition_descriptor.file_format() == THdfsFileFormat::TEXT &&
+ state->query_options().__isset.compression_codec &&
+ state->query_options().compression_codec != THdfsCompression::NONE) {
+ stringstream error_msg;
+ error_msg << "Writing to compressed text table is not supported. ";
+ return Status(error_msg.str());
}
// It is incorrect to initialize a writer if there are no rows to feed it. The writer
@@ -508,16 +498,6 @@ Status HdfsTableSink::InitOutputPartition(RuntimeState* state,
new HdfsParquetTableWriter(
this, state, output_partition, &partition_descriptor, table_desc_));
break;
- case THdfsFileFormat::SEQUENCE_FILE:
- output_partition->writer.reset(
- new HdfsSequenceTableWriter(
- this, state, output_partition, &partition_descriptor, table_desc_));
- break;
- case THdfsFileFormat::AVRO:
- output_partition->writer.reset(
- new HdfsAvroTableWriter(
- this, state, output_partition, &partition_descriptor, table_desc_));
- break;
default:
stringstream error_msg;
map<int, const char*>::const_iterator i =
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-text-table-writer.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-text-table-writer.cc b/be/src/exec/hdfs-text-table-writer.cc
index aaee773..f09b161 100644
--- a/be/src/exec/hdfs-text-table-writer.cc
+++ b/be/src/exec/hdfs-text-table-writer.cc
@@ -25,8 +25,6 @@
#include "runtime/row-batch.h"
#include "runtime/runtime-state.h"
#include "runtime/string-value.inline.h"
-#include "util/codec.h"
-#include "util/compress.h"
#include "util/hdfs-util.h"
#include "util/runtime-profile-counters.h"
@@ -35,13 +33,6 @@
#include "common/names.h"
-// Hdfs block size for compressed text.
-static const int64_t COMPRESSED_BLOCK_SIZE = 64 * 1024 * 1024;
-
-// Size to buffer before compression. We want this to be less than the block size
-// (compressed text is not splittable).
-static const int64_t COMPRESSED_BUFFERED_SIZE = 60 * 1024 * 1024;
-
namespace impala {
HdfsTextTableWriter::HdfsTextTableWriter(HdfsTableSink* parent,
@@ -61,41 +52,17 @@ HdfsTextTableWriter::HdfsTextTableWriter(HdfsTableSink* parent,
}
Status HdfsTextTableWriter::Init() {
- const TQueryOptions& query_options = state_->query_options();
- codec_ = THdfsCompression::NONE;
- if (query_options.__isset.compression_codec) {
- codec_ = query_options.compression_codec;
- if (codec_ == THdfsCompression::SNAPPY) {
- // hadoop.io.codec always means SNAPPY_BLOCKED. Alias the two.
- codec_ = THdfsCompression::SNAPPY_BLOCKED;
- }
- }
-
- if (codec_ != THdfsCompression::NONE) {
- mem_pool_.reset(new MemPool(parent_->mem_tracker()));
- RETURN_IF_ERROR(Codec::CreateCompressor(
- mem_pool_.get(), true, codec_, &compressor_));
- flush_size_ = COMPRESSED_BUFFERED_SIZE;
- } else {
- flush_size_ = HDFS_FLUSH_WRITE_SIZE;
- }
parent_->mem_tracker()->Consume(flush_size_);
return Status::OK();
}
void HdfsTextTableWriter::Close() {
parent_->mem_tracker()->Release(flush_size_);
- if (mem_pool_.get() != NULL) mem_pool_->FreeAll();
}
-uint64_t HdfsTextTableWriter::default_block_size() const {
- return compressor_.get() == NULL ? 0 : COMPRESSED_BLOCK_SIZE;
-}
+uint64_t HdfsTextTableWriter::default_block_size() const { return 0; }
-string HdfsTextTableWriter::file_extension() const {
- if (compressor_.get() == NULL) return "";
- return compressor_->file_extension();
-}
+string HdfsTextTableWriter::file_extension() const { return ""; }
Status HdfsTextTableWriter::AppendRows(
RowBatch* batch, const vector<int32_t>& row_group_indices, bool* new_file) {
@@ -152,12 +119,7 @@ Status HdfsTextTableWriter::AppendRows(
}
*new_file = false;
- if (rowbatch_stringstream_.tellp() >= flush_size_) {
- RETURN_IF_ERROR(Flush());
-
- // If compressed, start a new file (compressed data is not splittable).
- *new_file = compressor_.get() != NULL;
- }
+ if (rowbatch_stringstream_.tellp() >= flush_size_) RETURN_IF_ERROR(Flush());
return Status::OK();
}
@@ -178,22 +140,9 @@ Status HdfsTextTableWriter::InitNewFile() {
Status HdfsTextTableWriter::Flush() {
string rowbatch_string = rowbatch_stringstream_.str();
rowbatch_stringstream_.str(string());
- const uint8_t* uncompressed_data =
+ const uint8_t* data =
reinterpret_cast<const uint8_t*>(rowbatch_string.data());
- int64_t uncompressed_len = rowbatch_string.size();
- const uint8_t* data = uncompressed_data;
- int64_t len = uncompressed_len;
-
- if (compressor_.get() != NULL) {
- SCOPED_TIMER(parent_->compress_timer());
- uint8_t* compressed_data;
- int64_t compressed_len;
- RETURN_IF_ERROR(compressor_->ProcessBlock(false,
- uncompressed_len, uncompressed_data,
- &compressed_len, &compressed_data));
- data = compressed_data;
- len = compressed_len;
- }
+ int64_t len = rowbatch_string.size();
{
SCOPED_TIMER(parent_->hdfs_write_timer());
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/exec/hdfs-text-table-writer.h
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-text-table-writer.h b/be/src/exec/hdfs-text-table-writer.h
index 589ed23..e2f6135 100644
--- a/be/src/exec/hdfs-text-table-writer.h
+++ b/be/src/exec/hdfs-text-table-writer.h
@@ -87,15 +87,6 @@ class HdfsTextTableWriter : public HdfsTableWriter {
/// Stringstream to buffer output. The stream is cleared between HDFS
/// Write calls to allow for the internal buffers to be reused.
std::stringstream rowbatch_stringstream_;
-
- /// Compression codec.
- THdfsCompression::type codec_;
-
- /// Compressor if compression is enabled.
- boost::scoped_ptr<Codec> compressor_;
-
- /// Memory pool to use with compressor_.
- boost::scoped_ptr<MemPool> mem_pool_;
};
}
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/service/query-options-test.cc
----------------------------------------------------------------------
diff --git a/be/src/service/query-options-test.cc b/be/src/service/query-options-test.cc
index e5bc48d..b9bda60 100644
--- a/be/src/service/query-options-test.cc
+++ b/be/src/service/query-options-test.cc
@@ -208,8 +208,6 @@ TEST(QueryOptions, SetEnumOptions) {
TParquetFallbackSchemaResolution, (POSITION, NAME)), true);
TestEnumCase(options, CASE(parquet_array_resolution, TParquetArrayResolution,
(THREE_LEVEL, TWO_LEVEL, TWO_LEVEL_THEN_THREE_LEVEL)), true);
- TestEnumCase(options, CASE(seq_compression_mode, THdfsSeqCompressionMode,
- (BLOCK, RECORD)), false);
TestEnumCase(options, CASE(compression_codec, THdfsCompression,
(NONE, GZIP, BZIP2, DEFAULT, SNAPPY, SNAPPY_BLOCKED)), false);
#undef CASE
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/service/query-options.cc
----------------------------------------------------------------------
diff --git a/be/src/service/query-options.cc b/be/src/service/query-options.cc
index 2e3415f..1063fef 100644
--- a/be/src/service/query-options.cc
+++ b/be/src/service/query-options.cc
@@ -226,25 +226,9 @@ Status impala::SetQueryOption(const string& key, const string& value,
case TImpalaQueryOptions::NUM_SCANNER_THREADS:
query_options->__set_num_scanner_threads(atoi(value.c_str()));
break;
- case TImpalaQueryOptions::ALLOW_UNSUPPORTED_FORMATS:
- query_options->__set_allow_unsupported_formats(
- iequals(value, "true") || iequals(value, "1"));
- break;
case TImpalaQueryOptions::DEBUG_ACTION:
query_options->__set_debug_action(value.c_str());
break;
- case TImpalaQueryOptions::SEQ_COMPRESSION_MODE: {
- if (iequals(value, "block")) {
- query_options->__set_seq_compression_mode(THdfsSeqCompressionMode::BLOCK);
- } else if (iequals(value, "record")) {
- query_options->__set_seq_compression_mode(THdfsSeqCompressionMode::RECORD);
- } else {
- stringstream ss;
- ss << "Invalid sequence file compression mode: " << value;
- return Status(ss.str());
- }
- break;
- }
case TImpalaQueryOptions::COMPRESSION_CODEC: {
if (iequals(value, "none")) {
query_options->__set_compression_codec(THdfsCompression::NONE);
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/be/src/service/query-options.h
----------------------------------------------------------------------
diff --git a/be/src/service/query-options.h b/be/src/service/query-options.h
index fce042c..01f6e74 100644
--- a/be/src/service/query-options.h
+++ b/be/src/service/query-options.h
@@ -44,8 +44,7 @@ typedef std::unordered_map<string, beeswax::TQueryOptionLevel::type>
TImpalaQueryOptions::ALLOW_ERASURE_CODED_FILES + 1);\
REMOVED_QUERY_OPT_FN(abort_on_default_limit_exceeded, ABORT_ON_DEFAULT_LIMIT_EXCEEDED)\
QUERY_OPT_FN(abort_on_error, ABORT_ON_ERROR, TQueryOptionLevel::REGULAR)\
- QUERY_OPT_FN(allow_unsupported_formats, ALLOW_UNSUPPORTED_FORMATS,\
- TQueryOptionLevel::DEPRECATED)\
+ REMOVED_QUERY_OPT_FN(allow_unsupported_formats, ALLOW_UNSUPPORTED_FORMATS)\
QUERY_OPT_FN(batch_size, BATCH_SIZE, TQueryOptionLevel::DEVELOPMENT)\
QUERY_OPT_FN(debug_action, DEBUG_ACTION, TQueryOptionLevel::DEVELOPMENT)\
REMOVED_QUERY_OPT_FN(default_order_by_limit, DEFAULT_ORDER_BY_LIMIT)\
@@ -74,7 +73,7 @@ typedef std::unordered_map<string, beeswax::TQueryOptionLevel::type>
QUERY_OPT_FN(buffer_pool_limit, BUFFER_POOL_LIMIT, TQueryOptionLevel::ADVANCED)\
QUERY_OPT_FN(appx_count_distinct, APPX_COUNT_DISTINCT, TQueryOptionLevel::ADVANCED)\
QUERY_OPT_FN(disable_unsafe_spills, DISABLE_UNSAFE_SPILLS, TQueryOptionLevel::REGULAR)\
- QUERY_OPT_FN(seq_compression_mode, SEQ_COMPRESSION_MODE, TQueryOptionLevel::REGULAR)\
+ REMOVED_QUERY_OPT_FN(seq_compression_mode, SEQ_COMPRESSION_MODE)\
QUERY_OPT_FN(exec_single_node_rows_threshold, EXEC_SINGLE_NODE_ROWS_THRESHOLD,\
TQueryOptionLevel::ADVANCED)\
QUERY_OPT_FN(optimize_partition_key_scans, OPTIMIZE_PARTITION_KEY_SCANS,\
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/common/thrift/ImpalaInternalService.thrift
----------------------------------------------------------------------
diff --git a/common/thrift/ImpalaInternalService.thrift b/common/thrift/ImpalaInternalService.thrift
index 6780138..120aebc 100644
--- a/common/thrift/ImpalaInternalService.thrift
+++ b/common/thrift/ImpalaInternalService.thrift
@@ -101,7 +101,6 @@ struct TQueryOptions {
5: optional i32 num_nodes = NUM_NODES_ALL
6: optional i64 max_scan_range_length = 0
7: optional i32 num_scanner_threads = 0
- 9: optional bool allow_unsupported_formats = 0
11: optional string debug_action = ""
12: optional i64 mem_limit = 0
14: optional CatalogObjects.THdfsCompression compression_codec
@@ -133,11 +132,6 @@ struct TQueryOptions {
// has no plan hints, and at least one table is missing relevant stats.
29: optional bool disable_unsafe_spills = 0
- // Mode for compression; RECORD, or BLOCK
- // This field only applies for certain file types and is ignored
- // by all other file types.
- 30: optional CatalogObjects.THdfsSeqCompressionMode seq_compression_mode
-
// If the number of rows that are processed for a single query is below the
// threshold, it will be executed on the coordinator only with codegen disabled
31: optional i32 exec_single_node_rows_threshold = 100
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/common/thrift/ImpalaService.thrift
----------------------------------------------------------------------
diff --git a/common/thrift/ImpalaService.thrift b/common/thrift/ImpalaService.thrift
index 529af04..665144f 100644
--- a/common/thrift/ImpalaService.thrift
+++ b/common/thrift/ImpalaService.thrift
@@ -72,8 +72,7 @@ enum TImpalaQueryOptions {
// Number of scanner threads.
NUM_SCANNER_THREADS,
- // If true, Impala will try to execute on file formats that are not fully supported yet
- ALLOW_UNSUPPORTED_FORMATS,
+ ALLOW_UNSUPPORTED_FORMATS, // Removed
DEFAULT_ORDER_BY_LIMIT, // Removed
@@ -110,8 +109,7 @@ enum TImpalaQueryOptions {
// Leave blank to use default.
COMPRESSION_CODEC,
- // Mode for compressing sequence files; either BLOCK, RECORD, or DEFAULT
- SEQ_COMPRESSION_MODE,
+ SEQ_COMPRESSION_MODE, // Removed
// HBase scan query option. If set and > 0, HBASE_CACHING is the value for
// "hbase.client.Scan.setCaching()" when querying HBase table. Otherwise, use backend
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java b/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
index 54ad57f..b671a1e 100644
--- a/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
+++ b/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
@@ -390,7 +390,6 @@ public class PlannerTestBase extends FrontendTestBase {
protected TQueryOptions defaultQueryOptions() {
TQueryOptions options = new TQueryOptions();
options.setExplain_level(TExplainLevel.STANDARD);
- options.setAllow_unsupported_formats(true);
options.setExec_single_node_rows_threshold(0);
return options;
}
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/bad_avro_snap/README
----------------------------------------------------------------------
diff --git a/testdata/bad_avro_snap/README b/testdata/bad_avro_snap/README
index 6271967..71eb398 100644
--- a/testdata/bad_avro_snap/README
+++ b/testdata/bad_avro_snap/README
@@ -1,6 +1,6 @@
String Data
-----------
-Created by modifying Impala's HdfsAvroTableWriter.
+Created by modifying Impala's HdfsAvroTableWriter (now removed).
These files' schemas have a single nullable string column 's'.
@@ -14,7 +14,7 @@ truncated_string.avro: contains one value, which is missing the last byte.
Float Data
----------
-Created by modifying Impala's HdfsAvroTableWriter.
+Created by modifying Impala's HdfsAvroTableWriter (now removed).
These files' schemas have a single nullable float column 'c1'.
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test b/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test
deleted file mode 100644
index 6dc0899..0000000
--- a/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test
+++ /dev/null
@@ -1,43 +0,0 @@
-====
----- QUERY
-drop table if exists __avro_write;
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-create table __avro_write (i int, s string, d double)
-stored as AVRO
-TBLPROPERTIES ('avro.schema.literal'='{
- "name": "my_record",
- "type": "record",
- "fields": [
- {"name":"i", "type":["int", "null"]},
- {"name":"s", "type":["string", "null"]},
- {"name":"d", "type":["double", "null"]}]}');
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __avro_write select 0, "a", 1.1;
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __avro_write select 1, "b", 2.2;
-====
----- QUERY
-select * from __avro_write;
----- RESULTS
-0,'a',1.1
-1,'b',2.2
----- TYPES
-INT,STRING,DOUBLE
-====
----- QUERY
-SET ALLOW_UNSUPPORTED_FORMATS=0;
-insert into __avro_write select 1, "b", 2.2;
----- CATCH
-Writing to table format AVRO is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS
-====
----- QUERY
-drop table __avro_write;
-====
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test b/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
deleted file mode 100644
index 7e2363f..0000000
--- a/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
+++ /dev/null
@@ -1,308 +0,0 @@
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-create table __seq_write (i int, s string, d double)
-stored as SEQUENCEFILE;
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write select 0, "a", 1.1;
-====
----- QUERY
-SET COMPRESSION_CODEC=DEFAULT;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (1, "b", 2.2);
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (2, "c", 3.3);
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY_BLOCKED;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (3, "d", 4.4);
-====
----- QUERY
-SET COMPRESSION_CODEC=GZIP;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (4, "e", 5.5);
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write select 5, "a", 1.1;
-====
----- QUERY
-SET COMPRESSION_CODEC=DEFAULT;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (6, "b", 2.2);
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (7, "c", 3.3);
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY_BLOCKED;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (8, "d", 4.4);
-====
----- QUERY
-SET COMPRESSION_CODEC=GZIP;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __seq_write values (9, "e", 5.5);
-====
----- QUERY
-SET ALLOW_UNSUPPORTED_FORMATS=0;
-insert into __seq_write values (4, "e", 5.5);
----- CATCH
-Writing to table format SEQUENCE_FILE is not supported. Use query option
-====
----- QUERY
-select * from __seq_write;
----- RESULTS
-0,'a',1.1
-1,'b',2.2
-2,'c',3.3
-3,'d',4.4
-4,'e',5.5
-5,'a',1.1
-6,'b',2.2
-7,'c',3.3
-8,'d',4.4
-9,'e',5.5
----- TYPES
-INT,STRING,DOUBLE
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with NONE+RECORD and then read
-# it back
-SET COMPRESSION_CODEC=NONE;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_none_rec like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_none_rec partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_none_rec;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with DEFAULT+RECORD and then
-# read it back
-SET COMPRESSION_CODEC=DEFAULT;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_def_rec like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_def_rec partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_def_rec;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with SNAPPY_BLOCKED+RECORD and
-# then read it back
-SET COMPRESSION_CODEC=SNAPPY_BLOCKED;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_snapb_rec like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_snapb_rec partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_snapb_rec;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with SNAPPY+RECORD and then read
-# it back
-SET COMPRESSION_CODEC=SNAPPY;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_snap_rec like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_snap_rec partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_snap_rec;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with GZIP+RECORD and then read
-# it back
-SET COMPRESSION_CODEC=GZIP;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_gzip_rec like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_gzip_rec partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_gzip_rec;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with NONE+BLOCK and then read it
-# back
-SET COMPRESSION_CODEC=NONE;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_none_block like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_none_block partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_none_block;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with DEFAULT+BLOCK and then read
-# it back
-SET COMPRESSION_CODEC=DEFAULT;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_def_block like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_def_block partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_def_block;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with SNAPPY_BLOCKED+BLOCK and
-# then read it back
-SET COMPRESSION_CODEC=SNAPPY_BLOCKED;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_snapb_block like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_snapb_block partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_snapb_block;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with SNAPPY+BLOCK and then read
-# it back
-SET COMPRESSION_CODEC=SNAPPY;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_snap_block like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_snap_block partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_snap_block;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-3079: Create a table containing larger seq files with GZIP+BLOCK and then read it
-# back
-SET COMPRESSION_CODEC=GZIP;
-SET SEQ_COMPRESSION_MODE=BLOCK;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table store_sales_seq_gzip_block like tpcds_parquet.store_sales
-stored as SEQUENCEFILE;
-insert into store_sales_seq_gzip_block partition(ss_sold_date_sk)
-select * from tpcds_parquet.store_sales
-where (ss_sold_date_sk between 2451175 and 2451200) or
- (ss_sold_date_sk is null and ss_sold_time_sk > 60000);
-====
----- QUERY
-select count(*) from store_sales_seq_gzip_block;
----- RESULTS
-60091
----- TYPES
-BIGINT
-====
----- QUERY
-# IMPALA-5407: Create a table containing seq files with GZIP+RECORD. If the number of
-# impalad workers is three, three files will be created, two of which are large enough
-# (> 64MB) to force multiple flushes. Make sure that the files have been created
-# successfully.
-SET COMPRESSION_CODEC=GZIP;
-SET SEQ_COMPRESSION_MODE=RECORD;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-create table catalog_sales_seq_gzip_rec like tpcds.catalog_sales stored as SEQUENCEFILE;
-insert into catalog_sales_seq_gzip_rec select * from tpcds.catalog_sales;
-====
----- QUERY
-select count(*) from catalog_sales_seq_gzip_rec;
----- RESULTS
-1441548
----- TYPES
-BIGINT
-====
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/workloads/functional-query/queries/QueryTest/set.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/set.test b/testdata/workloads/functional-query/queries/QueryTest/set.test
index 5a2c56a..ffb53a1 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/set.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/set.test
@@ -8,7 +8,6 @@ set buffer_pool_limit=7;
set all;
---- RESULTS: VERIFY_IS_SUBSET
'ABORT_ON_ERROR','0','REGULAR'
-'ALLOW_UNSUPPORTED_FORMATS','0','DEPRECATED'
'BATCH_SIZE','0','DEVELOPMENT'
'BUFFER_POOL_LIMIT','','ADVANCED'
'DEBUG_ACTION','','DEVELOPMENT'
@@ -34,7 +33,6 @@ set explain_level=3;
set all;
---- RESULTS: VERIFY_IS_SUBSET
'ABORT_ON_ERROR','0','REGULAR'
-'ALLOW_UNSUPPORTED_FORMATS','0','DEPRECATED'
'BATCH_SIZE','0','DEVELOPMENT'
'BUFFER_POOL_LIMIT','','ADVANCED'
'DEBUG_ACTION','','DEVELOPMENT'
@@ -60,7 +58,6 @@ set explain_level='0';
set all;
---- RESULTS: VERIFY_IS_SUBSET
'ABORT_ON_ERROR','0','REGULAR'
-'ALLOW_UNSUPPORTED_FORMATS','0','DEPRECATED'
'BATCH_SIZE','0','DEVELOPMENT'
'BUFFER_POOL_LIMIT','','ADVANCED'
'DEBUG_ACTION','','DEVELOPMENT'
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/workloads/functional-query/queries/QueryTest/text-writer.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/text-writer.test b/testdata/workloads/functional-query/queries/QueryTest/text-writer.test
deleted file mode 100644
index 89cd730..0000000
--- a/testdata/workloads/functional-query/queries/QueryTest/text-writer.test
+++ /dev/null
@@ -1,47 +0,0 @@
-====
----- QUERY
-drop table if exists __text_write;
-====
----- QUERY
-create table __text_write (i int, s string, d double);
-====
----- QUERY
-SET COMPRESSION_CODEC=NONE;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __text_write select 0, "a", 1.1;
-====
----- QUERY
-SET COMPRESSION_CODEC=DEFAULT;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __text_write values (1, "b", 2.2);
-====
----- QUERY
-SET COMPRESSION_CODEC=SNAPPY;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __text_write values (2, "c", 3.3);
-====
----- QUERY
-SET COMPRESSION_CODEC=GZIP;
-SET ALLOW_UNSUPPORTED_FORMATS=1;
-insert into __text_write values (3, "d", 4.4);
-====
----- QUERY
-SET COMPRESSION_CODEC=GZIP;
-SET ALLOW_UNSUPPORTED_FORMATS=0;
-insert into __text_write values (3, "d", 4.4);
----- CATCH
-Writing to compressed text table is not supported.
-====
----- QUERY
-select * from __text_write;
----- RESULTS
-0,'a',1.1
-1,'b',2.2
-2,'c',3.3
-3,'d',4.4
----- TYPES
-INT,STRING,DOUBLE
-====
----- QUERY
-drop table __text_write;
-====
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/testdata/workloads/functional-query/queries/QueryTest/unsupported-writers.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/unsupported-writers.test b/testdata/workloads/functional-query/queries/QueryTest/unsupported-writers.test
new file mode 100644
index 0000000..68f355f
--- /dev/null
+++ b/testdata/workloads/functional-query/queries/QueryTest/unsupported-writers.test
@@ -0,0 +1,77 @@
+====
+---- QUERY
+create table __text_write (i int, s string, d double);
+====
+---- QUERY
+SET COMPRESSION_CODEC=NONE;
+insert into __text_write select 0, "a", 1.1;
+====
+---- QUERY
+SET COMPRESSION_CODEC=GZIP;
+insert into __text_write values (3, "d", 4.4);
+---- CATCH
+Writing to compressed text table is not supported.
+====
+---- QUERY
+select * from __text_write;
+---- RESULTS
+0,'a',1.1
+---- TYPES
+INT,STRING,DOUBLE
+====
+---- QUERY
+create table __avro_write (i int, s string, d double)
+stored as AVRO
+TBLPROPERTIES ('avro.schema.literal'='{
+ "name": "my_record",
+ "type": "record",
+ "fields": [
+ {"name":"i", "type":["int", "null"]},
+ {"name":"s", "type":["string", "null"]},
+ {"name":"d", "type":["double", "null"]}]}');
+====
+---- QUERY
+insert into __avro_write select 1, "b", 2.2;
+---- CATCH
+Writing to table format AVRO is not supported.
+====
+---- QUERY
+create table __seq_write (i int, s string, d double)
+stored as SEQUENCEFILE;
+====
+---- QUERY
+insert into __seq_write values (4, "e", 5.5);
+---- CATCH
+Writing to table format SEQUENCE_FILE is not supported.
+====
+---- QUERY
+# Test writing to mixed format table containing partitions in both supported and
+# unsupported formats where writing to the partition with supported format should succeed.
+# Create a table containing both text(supported) and avro(unsupported) partitions.
+create table __mixed_format_write (id int) partitioned by (part int);
+====
+---- QUERY
+insert into __mixed_format_write partition(part=2000) values(1);
+====
+---- QUERY
+insert into __mixed_format_write partition(part=2001) values(2);
+====
+---- QUERY
+alter table __mixed_format_write partition (part=2001) set fileformat AVRO;
+====
+---- QUERY
+insert into __mixed_format_write partition(part=2000) values(3);
+====
+---- QUERY
+insert into __mixed_format_write partition(part=2001) values(4);
+---- CATCH
+Writing to table format AVRO is not supported.
+====
+---- QUERY
+select id, part from __mixed_format_write where part = 2000;
+---- RESULTS
+1,2000
+3,2000
+---- TYPES
+INT,INT
+====
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/tests/common/test_dimensions.py
----------------------------------------------------------------------
diff --git a/tests/common/test_dimensions.py b/tests/common/test_dimensions.py
index 785cfa9..0460ea7 100644
--- a/tests/common/test_dimensions.py
+++ b/tests/common/test_dimensions.py
@@ -108,19 +108,6 @@ def create_parquet_dimension(workload):
return ImpalaTestDimension('table_format',
TableFormatInfo.create_from_string(dataset, 'parquet/none'))
-# Available Exec Options:
-#01: abort_on_error (bool)
-#02 max_errors (i32)
-#03: disable_codegen (bool)
-#04: batch_size (i32)
-#05: return_as_ascii (bool)
-#06: num_nodes (i32)
-#07: max_scan_range_length (i64)
-#08: num_scanner_threads (i32)
-#09: max_io_buffers (i32)
-#10: allow_unsupported_formats (bool)
-#11: partition_agg (bool)
-
# Common sets of values for the exec option vectors
ALL_BATCH_SIZES = [0]
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/tests/hs2/test_hs2.py
----------------------------------------------------------------------
diff --git a/tests/hs2/test_hs2.py b/tests/hs2/test_hs2.py
index cd861e9..795f45c 100644
--- a/tests/hs2/test_hs2.py
+++ b/tests/hs2/test_hs2.py
@@ -89,11 +89,10 @@ class TestHS2(HS2TestSuite):
# Should be unchanged
assert vals2["SYNC_DDL"] == "0"
- # Verify that 'DEVELOPMENT' and 'DEPRECATED' options are not returned.
assert "MAX_ERRORS" in vals2
assert levels["MAX_ERRORS"] == "ADVANCED"
+ # Verify that 'DEVELOPMENT' options are not returned.
assert "DEBUG_ACTION" not in vals2
- assert "ALLOW_UNSUPPORTED_FORMATS" not in vals2
# Removed options should not be returned.
assert "MAX_IO_BUFFERS" not in vals2
@@ -101,7 +100,8 @@ class TestHS2(HS2TestSuite):
@needs_session()
def test_session_option_levels_via_set_all(self):
"""
- Tests the level of session options returned by a SET ALL query.
+ Tests the level of session options returned by a SET ALL query except DEPRECATED as we
+ currently do not have any of those left.
"""
vals, levels = self.get_session_options("SET ALL")
@@ -109,12 +109,10 @@ class TestHS2(HS2TestSuite):
assert "SYNC_DDL" in vals
assert "MAX_ERRORS" in vals
assert "DEBUG_ACTION" in vals
- assert "ALLOW_UNSUPPORTED_FORMATS" in vals
assert levels["COMPRESSION_CODEC"] == "REGULAR"
assert levels["SYNC_DDL"] == "REGULAR"
assert levels["MAX_ERRORS"] == "ADVANCED"
assert levels["DEBUG_ACTION"] == "DEVELOPMENT"
- assert levels["ALLOW_UNSUPPORTED_FORMATS"] == "DEPRECATED"
# Removed options should not be returned.
assert "MAX_IO_BUFFERS" not in vals
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/tests/metadata/test_partition_metadata.py
----------------------------------------------------------------------
diff --git a/tests/metadata/test_partition_metadata.py b/tests/metadata/test_partition_metadata.py
index 1d77aa5..d23e3f0 100644
--- a/tests/metadata/test_partition_metadata.py
+++ b/tests/metadata/test_partition_metadata.py
@@ -45,10 +45,7 @@ class TestPartitionMetadata(ImpalaTestSuite):
# compression codecs.
cls.ImpalaTestMatrix.add_constraint(lambda v:
(v.get_value('table_format').file_format in ('text', 'parquet') and
- v.get_value('table_format').compression_codec == 'none') or
- (v.get_value('table_format').file_format in ('seq', 'avro') and
- v.get_value('table_format').compression_codec == 'snap' and
- v.get_value('table_format').compression_type == 'block'))
+ v.get_value('table_format').compression_codec == 'none'))
@SkipIfLocal.hdfs_client # TODO: this dependency might not exist anymore
def test_multiple_partitions_same_location(self, vector, unique_database):
@@ -70,9 +67,6 @@ class TestPartitionMetadata(ImpalaTestSuite):
self.client.execute("alter table %s add partition (j=2) location '%s/p'"
% (FQ_TBL_NAME, TBL_LOCATION))
- # Allow unsupported avro and sequence file writer.
- self.client.execute("set allow_unsupported_formats=true")
-
# Insert some data. This will only update partition j=1 (IMPALA-1480).
self.client.execute("insert into table %s partition(j=1) select 1" % FQ_TBL_NAME)
# Refresh to update file metadata of both partitions
@@ -80,31 +74,19 @@ class TestPartitionMetadata(ImpalaTestSuite):
# The data will be read twice because each partition points to the same location.
data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME)
- if file_format == 'avro':
- # Avro writer is broken and produces nulls. Only check partition column.
- assert data.split('\t')[1] == '3'
- else:
- assert data.split('\t') == ['2', '3']
+ assert data.split('\t') == ['2', '3']
self.client.execute("insert into %s partition(j) select 1, 1" % FQ_TBL_NAME)
self.client.execute("insert into %s partition(j) select 1, 2" % FQ_TBL_NAME)
self.client.execute("refresh %s" % FQ_TBL_NAME)
data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME)
- if file_format == 'avro':
- # Avro writer is broken and produces nulls. Only check partition column.
- assert data.split('\t')[1] == '9'
- else:
- assert data.split('\t') == ['6', '9']
+ assert data.split('\t') == ['6', '9']
# Force all scan ranges to be on the same node. It should produce the same
# result as above. See IMPALA-5412.
self.client.execute("set num_nodes=1")
data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME)
- if file_format == 'avro':
- # Avro writer is broken and produces nulls. Only check partition column.
- assert data.split('\t')[1] == '9'
- else:
- assert data.split('\t') == ['6', '9']
+ assert data.split('\t') == ['6', '9']
@SkipIfS3.hive
@SkipIfADLS.hive
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/tests/query_test/test_compressed_formats.py
----------------------------------------------------------------------
diff --git a/tests/query_test/test_compressed_formats.py b/tests/query_test/test_compressed_formats.py
index 694cfe9..2896632 100644
--- a/tests/query_test/test_compressed_formats.py
+++ b/tests/query_test/test_compressed_formats.py
@@ -129,71 +129,25 @@ class TestCompressedFormats(ImpalaTestSuite):
finally:
call(["hive", "-e", drop_cmd]);
-class TestTableWriters(ImpalaTestSuite):
+class TestUnsupportedTableWriters(ImpalaTestSuite):
@classmethod
def get_workload(cls):
return 'functional-query'
@classmethod
def add_test_dimensions(cls):
- super(TestTableWriters, cls).add_test_dimensions()
+ super(TestUnsupportedTableWriters, cls).add_test_dimensions()
cls.ImpalaTestMatrix.add_dimension(create_single_exec_option_dimension())
- # This class tests many formats, but doesn't use the contraints
- # Each format is tested within one test file, we constrain to text/none
- # as each test file only needs to be run once.
+ # This class tests different formats, but doesn't use constraints.
+ # The constraint added below is only to make sure that the test file runs once.
cls.ImpalaTestMatrix.add_constraint(lambda v:
(v.get_value('table_format').file_format =='text' and
v.get_value('table_format').compression_codec == 'none'))
- def test_seq_writer(self, vector, unique_database):
- self.run_test_case('QueryTest/seq-writer', vector, unique_database)
-
- @SkipIfS3.hive
- @SkipIfADLS.hive
- @SkipIfIsilon.hive
- @SkipIfLocal.hive
- def test_seq_writer_hive_compatibility(self, vector, unique_database):
- self.client.execute('set ALLOW_UNSUPPORTED_FORMATS=1')
- # Write sequence files with different compression codec/compression mode and then read
- # it back in Impala and Hive.
- # Note that we don't test snappy here as the snappy codec used by Impala does not seem
- # to be fully compatible with the snappy codec used by Hive.
- for comp_codec, comp_mode in [('NONE', 'RECORD'), ('NONE', 'BLOCK'),
- ('DEFAULT', 'RECORD'), ('DEFAULT', 'BLOCK'),
- ('GZIP', 'RECORD'), ('GZIP', 'BLOCK')]:
- table_name = '%s.seq_tbl_%s_%s' % (unique_database, comp_codec, comp_mode)
- self.client.execute('set COMPRESSION_CODEC=%s' % comp_codec)
- self.client.execute('set SEQ_COMPRESSION_MODE=%s' % comp_mode)
- self.client.execute('create table %s like functional.zipcode_incomes stored as '
- 'sequencefile' % table_name)
- # Write sequence file of size greater than 4K
- self.client.execute('insert into %s select * from functional.zipcode_incomes where '
- 'zip >= "5"' % table_name)
- # Write sequence file of size less than 4K
- self.client.execute('insert into %s select * from functional.zipcode_incomes where '
- 'zip="00601"' % table_name)
-
- count_query = 'select count(*) from %s' % table_name
-
- # Read it back in Impala
- output = self.client.execute(count_query)
- assert '16541' == output.get_data()
- # Read it back in Hive
- # Note that username is passed in for the sake of remote cluster tests. The default
- # HDFS user is typically 'hdfs', and this is needed to run a count() operation using
- # hive. For local mini clusters, the usename can be anything. See IMPALA-5413.
- output = self.run_stmt_in_hive(count_query, username='hdfs')
- assert '16541' == output.split('\n')[1]
-
- def test_avro_writer(self, vector):
- self.run_test_case('QueryTest/avro-writer', vector)
-
- def test_text_writer(self, vector):
- # TODO debug this test.
- # This caused by a zlib failure. Suspected cause is too small a buffer
- # passed to zlib for compression; similar to IMPALA-424
- pytest.skip()
- self.run_test_case('QueryTest/text-writer', vector)
+ def test_error_message(self, vector, unique_database):
+ # Tests that an appropriate error message is displayed for unsupported writers like
+ # compressed text, avro and sequence.
+ self.run_test_case('QueryTest/unsupported-writers', vector, unique_database)
@pytest.mark.execute_serially
class TestLargeCompressedFile(ImpalaTestSuite):
http://git-wip-us.apache.org/repos/asf/impala/blob/30e82c63/tests/shell/test_shell_interactive.py
----------------------------------------------------------------------
diff --git a/tests/shell/test_shell_interactive.py b/tests/shell/test_shell_interactive.py
index fe631cf..eac9d27 100755
--- a/tests/shell/test_shell_interactive.py
+++ b/tests/shell/test_shell_interactive.py
@@ -389,11 +389,10 @@ class TestImpalaShellInteractive(object):
assert "APPX_COUNT_DISTINCT" in result.stdout
assert "SUPPORT_START_OVER" in result.stdout
# Development, deprecated and removed options should not be shown.
+ # Note: there are currently no deprecated options
assert "Development Query Options:" not in result.stdout
- assert "DEBUG_ACTION" not in result.stdout
- assert "Deprecated Query Options:" not in result.stdout
- assert "ALLOW_UNSUPPORTED_FORMATS" not in result.stdout
- assert "MAX_IO_BUFFERS" not in result.stdout
+ assert "DEBUG_ACTION" not in result.stdout # Development option.
+ assert "MAX_IO_BUFFERS" not in result.stdout # Removed option.
shell2 = ImpalaShell()
shell2.send_cmd("set all")
@@ -401,7 +400,7 @@ class TestImpalaShellInteractive(object):
assert "Query options (defaults shown in []):" in result.stdout
assert "Advanced Query Options:" in result.stdout
assert "Development Query Options:" in result.stdout
- assert "Deprecated Query Options:" in result.stdout
+ assert "Deprecated Query Options:" not in result.stdout
advanced_part_start_idx = result.stdout.find("Advanced Query Options")
development_part_start_idx = result.stdout.find("Development Query Options")
deprecated_part_start_idx = result.stdout.find("Deprecated Query Options")
@@ -411,7 +410,6 @@ class TestImpalaShellInteractive(object):
assert "APPX_COUNT_DISTINCT" in advanced_part
assert "SUPPORT_START_OVER" in advanced_part
assert "DEBUG_ACTION" in development_part
- assert "ALLOW_UNSUPPORTED_FORMATS" in result.stdout[deprecated_part_start_idx:]
# Removed options should not be shown.
assert "MAX_IO_BUFFERS" not in result.stdout
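The shell and HS2 tests above all assert the same visibility rule for query options: a plain `set` hides DEVELOPMENT and (now empty) DEPRECATED options, `set all` shows DEVELOPMENT ones, and REMOVED options such as `MAX_IO_BUFFERS` never appear. A minimal Python sketch of that rule (names and structure are illustrative, not Impala's actual implementation):

```python
# Hypothetical option table mapping each query option to its level,
# mirroring the (name, level) pairs asserted in the tests above.
OPTIONS = {
    "ABORT_ON_ERROR": "REGULAR",
    "APPX_COUNT_DISTINCT": "ADVANCED",
    "DEBUG_ACTION": "DEVELOPMENT",
    "MAX_IO_BUFFERS": "REMOVED",
}

def visible_options(show_all=False):
    """Options shown by 'set' (show_all=False) or 'set all' (show_all=True).

    REMOVED options are always hidden; plain 'set' additionally hides
    DEVELOPMENT and DEPRECATED options.
    """
    hidden = {"REMOVED"} if show_all else {"REMOVED", "DEVELOPMENT", "DEPRECATED"}
    return {name: lvl for name, lvl in OPTIONS.items() if lvl not in hidden}
```

With `ALLOW_UNSUPPORTED_FORMATS` deleted from the table entirely (rather than marked DEPRECATED), the "Deprecated Query Options:" section in `set all` output disappears, which is exactly what the updated `test_shell_interactive.py` asserts.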
[3/6] impala git commit: [DOCS] Clarification on admission control and DDL statements
Posted by ta...@apache.org.
[DOCS] Clarification on admission control and DDL statements
Removed the confusing example and paragraphs.
Change-Id: I2e3e82bd34e88e7a13de1864aeb97f01023bc715
Reviewed-on: http://gerrit.cloudera.org:8080/10829
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/6f52ce10
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/6f52ce10
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/6f52ce10
Branch: refs/heads/master
Commit: 6f52ce10e302ed9d168731dc11db07aabbfa2e53
Parents: 83448f1
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Tue Jun 26 14:30:38 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 18:41:47 2018 +0000
----------------------------------------------------------------------
docs/topics/impala_admission.xml | 146 ++++++++++++++--------------------
1 file changed, 61 insertions(+), 85 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/6f52ce10/docs/topics/impala_admission.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_admission.xml b/docs/topics/impala_admission.xml
index 5de246b..317fa80 100644
--- a/docs/topics/impala_admission.xml
+++ b/docs/topics/impala_admission.xml
@@ -51,6 +51,11 @@ under the License.
not wait indefinitely, so that you can detect and correct <q>starvation</q> scenarios.
</p>
<p>
+ Queries, DML statements, and some DDL statements, including
+ <codeph>CREATE TABLE AS SELECT</codeph> and <codeph>COMPUTE
+ STATS</codeph> are affected by admission control.
+ </p>
+ <p>
Enable this feature if your cluster is
underutilized at some times and overutilized at others. Overutilization is indicated by performance
bottlenecks and queries being cancelled due to out-of-memory conditions, when those same queries are
@@ -765,38 +770,42 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
<!-- End Config -->
<concept id="admission_guidelines">
-
- <title>Guidelines for Using Admission Control</title>
- <prolog>
- <metadata>
- <data name="Category" value="Planning"/>
- <data name="Category" value="Guidelines"/>
- <data name="Category" value="Best Practices"/>
- </metadata>
- </prolog>
-
- <conbody>
-
- <p>
- To see how admission control works for particular queries, examine the profile output for the query. This
- information is available through the <codeph>PROFILE</codeph> statement in <cmdname>impala-shell</cmdname>
- immediately after running a query in the shell, on the <uicontrol>queries</uicontrol> page of the Impala
- debug web UI, or in the Impala log file (basic information at log level 1, more detailed information at log
- level 2). The profile output contains details about the admission decision, such as whether the query was
- queued or not and which resource pool it was assigned to. It also includes the estimated and actual memory
- usage for the query, so you can fine-tune the configuration for the memory limits of the resource pools.
- </p>
-
- <p>
- Remember that the limits imposed by admission control are <q>soft</q> limits.
- The decentralized nature of this mechanism means that each Impala node makes its own decisions about whether
- to allow queries to run immediately or to queue them. These decisions rely on information passed back and forth
- between nodes by the statestore service. If a sudden surge in requests causes more queries than anticipated to run
- concurrently, then throughput could decrease due to queries spilling to disk or contending for resources;
- or queries could be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting while running.
- </p>
-
-<!--
+ <title>Guidelines for Using Admission Control</title>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Planning"/>
+ <data name="Category" value="Guidelines"/>
+ <data name="Category" value="Best Practices"/>
+ </metadata>
+ </prolog>
+ <conbody>
+ <p>
+ To see how admission control works for particular queries, examine
+ the profile output for the query. This information is available
+ through the <codeph>PROFILE</codeph> statement in
+ <cmdname>impala-shell</cmdname> immediately after running a query in
+ the shell, on the <uicontrol>queries</uicontrol> page of the Impala
+ debug web UI, or in the Impala log file (basic information at log
+ level 1, more detailed information at log level 2). The profile output
+ contains details about the admission decision, such as whether the
+ query was queued or not and which resource pool it was assigned to. It
+ also includes the estimated and actual memory usage for the query, so
+ you can fine-tune the configuration for the memory limits of the
+ resource pools.
+ </p>
+ <p>
+ Remember that the limits imposed by admission control are
+ <q>soft</q> limits. The decentralized nature of this mechanism means
+ that each Impala node makes its own decisions about whether to allow
+ queries to run immediately or to queue them. These decisions rely on
+ information passed back and forth between nodes by the statestore
+ service. If a sudden surge in requests causes more queries than
+ anticipated to run concurrently, then throughput could decrease due to
+ queries spilling to disk or contending for resources; or queries could
+ be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting
+ while running.
+ </p>
+ <!--
<p>
If you have trouble getting a query to run because its estimated memory usage is too high, you can override
the estimate by setting the <codeph>MEM_LIMIT</codeph> query option in <cmdname>impala-shell</cmdname>,
@@ -806,58 +815,25 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
pre-allocated by the query.
</p>
-->
-
- <p>
- In <cmdname>impala-shell</cmdname>, you can also specify which resource pool to direct queries to by
- setting the <codeph>REQUEST_POOL</codeph> query option.
- </p>
-
- <p>
- The statements affected by the admission control feature are primarily queries, but also include statements
- that write data such as <codeph>INSERT</codeph> and <codeph>CREATE TABLE AS SELECT</codeph>. Most write
- operations in Impala are not resource-intensive, but inserting into a Parquet table can require substantial
- memory due to buffering intermediate data before writing out each Parquet data block. See
- <xref href="impala_parquet.xml#parquet_etl"/> for instructions about inserting data efficiently into
- Parquet tables.
- </p>
-
- <p>
- Although admission control does not scrutinize memory usage for other kinds of DDL statements, if a query
- is queued due to a limit on concurrent queries or memory usage, subsequent statements in the same session
- are also queued so that they are processed in the correct order:
- </p>
-
-<codeblock>-- This query could be queued to avoid out-of-memory at times of heavy load.
-select * from huge_table join enormous_table using (id);
--- If so, this subsequent statement in the same session is also queued
--- until the previous statement completes.
-drop table huge_table;
-</codeblock>
-
- <p>
- If you set up different resource pools for different users and groups, consider reusing any classifications
- you developed for use with Sentry security. See <xref href="impala_authorization.xml#authorization"/> for details.
- </p>
-
- <p>
- For details about all the Fair Scheduler configuration settings, see
- <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in particular the tags such as <codeph><queue></codeph> and
- <codeph><aclSubmitApps></codeph> to map users and groups to particular resource pools (queues).
- </p>
-
-<!-- Wait a sec. We say admission control doesn't use RESERVATION_REQUEST_TIMEOUT at all.
- What's the real story here? Matt did refer to some timeout option that was
- available through the shell but not the DB-centric APIs.
-<p>
- Because you cannot override query options such as
- <codeph>RESERVATION_REQUEST_TIMEOUT</codeph>
- in a JDBC or ODBC application, consider configuring timeout periods
- on the application side to cancel queries that take
- too long due to being queued during times of high load.
-</p>
--->
- </conbody>
- </concept>
+ <p>
+ In <cmdname>impala-shell</cmdname>, you can also specify which
+ resource pool to direct queries to by setting the
+ <codeph>REQUEST_POOL</codeph> query option.
+ </p>
+ <p>
+ If you set up different resource pools for different users and
+ groups, consider reusing any classifications you developed for use
+ with Sentry security. See <xref
+ href="impala_authorization.xml#authorization"/> for details.
+ </p>
+ <p>
+ For details about all the Fair Scheduler configuration settings, see
+ <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in
+ particular the tags such as <codeph><queue></codeph> and
+ <codeph><aclSubmitApps></codeph> to map users and groups to
+ particular resource pools (queues).
+ </p>
+ </conbody>
+ </concept>
</concept>
</concept>
-
[2/6] impala git commit: IMPALA-7237: handle hex digits in ParseSmaps()
Posted by ta...@apache.org.
IMPALA-7237: handle hex digits in ParseSmaps()
Testing:
Manual. Added some temporary logging to print out which branch it took
with each line and confirmed it took the right branch for a line
starting with 'f'.
Change-Id: I3dad846dafb25b414bee1858eb63f3eda31d59ac
Reviewed-on: http://gerrit.cloudera.org:8080/10853
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/83448f1c
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/83448f1c
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/83448f1c
Branch: refs/heads/master
Commit: 83448f1c41d3f73ce0c0174c1725c78af3afd0d0
Parents: d03a2d6
Author: Tim Armstrong <ta...@cloudera.com>
Authored: Mon Jul 2 17:08:46 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Jul 3 18:29:31 2018 +0000
----------------------------------------------------------------------
be/src/util/mem-info.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/83448f1c/be/src/util/mem-info.cc
----------------------------------------------------------------------
diff --git a/be/src/util/mem-info.cc b/be/src/util/mem-info.cc
index 9ffc98d..10ce09f 100644
--- a/be/src/util/mem-info.cc
+++ b/be/src/util/mem-info.cc
@@ -115,9 +115,10 @@ MappedMemInfo MemInfo::ParseSmaps() {
string line;
getline(smaps, line);
if (line.empty()) continue;
- if (isdigit(line[0])) {
+ if (isdigit(line[0]) || (line[0] >= 'a' && line[0] <= 'f')) {
// Line is the start of a new mapping, of form:
// 561ceff9c000-561ceffa1000 rw-p 00000000 00:00 0
+ // We distinguish this case by checking for lower-case hex digits.
++result.num_maps;
continue;
}
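The one-line fix above matters because `/proc/<pid>/smaps` interleaves mapping-start lines, which begin with a lowercase hexadecimal address, with attribute lines such as `Rss: 4 kB`. Checking `isdigit()` alone misclassifies mappings whose address starts with `a`–`f`. A small Python sketch of the corrected classification (illustrative only; the real code is the C++ in `mem-info.cc`):

```python
def is_mapping_start(line):
    """True if an smaps line starts a new mapping, e.g.
    '561ceff9c000-561ceffa1000 rw-p 00000000 00:00 0'.

    Mapping-start lines begin with a lowercase hex digit (0-9 or a-f);
    attribute lines like 'Rss:' begin with an uppercase letter.
    """
    if not line:
        return False
    c = line[0]
    return c.isdigit() or "a" <= c <= "f"
```

Before the fix, a line such as `f7a12000-f7a13000 r-xp ...` took the attribute-line branch, so `num_maps` undercounted mappings in the upper half of the address space.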