Posted to commits@impala.apache.org by jo...@apache.org on 2018/05/25 23:16:39 UTC

[1/2] impala git commit: [DOCS] Correct info about REGEXP

Repository: impala
Updated Branches:
  refs/heads/master 9a5410570 -> 456356ca0


[DOCS] Correct info about REGEXP

Change-Id: I6920003c3903bfd6417a530c3785de9e5676a9dd
Reviewed-on: http://gerrit.cloudera.org:8080/10518
Reviewed-by: Alex Rodoni <ar...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/2d59e4eb
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/2d59e4eb
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/2d59e4eb

Branch: refs/heads/master
Commit: 2d59e4eb0d79ab0da24bf7a394410313b07af270
Parents: 9a54105
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Fri May 25 12:23:09 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Fri May 25 21:26:35 2018 +0000

----------------------------------------------------------------------
 docs/topics/impala_operators.xml | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/2d59e4eb/docs/topics/impala_operators.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_operators.xml b/docs/topics/impala_operators.xml
index 673260e..731022f 100644
--- a/docs/topics/impala_operators.xml
+++ b/docs/topics/impala_operators.xml
@@ -989,8 +989,6 @@ SELECT COUNT(DISTINCT(visitor_id)) FROM web_traffic WHERE month IN ('January','J
 
       <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
 
-      <p conref="../shared/impala_common.xml#common/regular_expression_whole_string"/>
-
 <!-- Currently, there isn't any IRLIKE synonym, so REGEXP and IREGEXP are different in that respect.
      I pinged IMPALA-1787 to check if that's intentional.
       <p>
@@ -1005,9 +1003,7 @@ SELECT COUNT(DISTINCT(visitor_id)) FROM web_traffic WHERE month IN ('January','J
         built-in function. (Currently, there is not any case-insensitive equivalent for the <codeph>regexp_extract()</codeph> function.)
       </p>
 
-      <note rev="1.3.1">
-        <p rev="1.3.1" conref="../shared/impala_common.xml#common/regexp_matching"/>
-      </note>
+      <p rev="1.3.1" conref="../shared/impala_common.xml#common/regexp_matching"/>
 
       <p conref="../shared/impala_common.xml#common/regexp_re2"/>
 
@@ -1632,8 +1628,6 @@ where
 
       <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
 
-      <p conref="../shared/impala_common.xml#common/regular_expression_whole_string"/>
-
       <p>
         The <codeph>RLIKE</codeph> operator is a synonym for <codeph>REGEXP</codeph>.
       </p>
@@ -1645,9 +1639,7 @@ where
         built-in function.
       </p>
 
-      <note rev="1.3.1">
-          <p rev="1.3.1" conref="../shared/impala_common.xml#common/regexp_matching"/>
-      </note>
+      <p rev="1.3.1" conref="../shared/impala_common.xml#common/regexp_matching"/>
 
       <p conref="../shared/impala_common.xml#common/regexp_re2"/>
 


[2/2] impala git commit: [DOCS] Complex types in DDL not supported for text format files

Posted by jo...@apache.org.
[DOCS] Complex types in DDL not supported for text format files

Change-Id: Icc67c9d74de7e952d13b7ecc511ad263b3915272
Reviewed-on: http://gerrit.cloudera.org:8080/10508
Reviewed-by: Alex Rodoni <ar...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/456356ca
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/456356ca
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/456356ca

Branch: refs/heads/master
Commit: 456356ca0a69ad9de6c5acd6c1605fc5db66d174
Parents: 2d59e4e
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Thu May 24 14:45:08 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Fri May 25 21:44:48 2018 +0000

----------------------------------------------------------------------
 docs/topics/impala_complex_types.xml | 82 +++++++++++++------------------
 1 file changed, 33 insertions(+), 49 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/456356ca/docs/topics/impala_complex_types.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_complex_types.xml b/docs/topics/impala_complex_types.xml
index 06a070f..b95e601 100644
--- a/docs/topics/impala_complex_types.xml
+++ b/docs/topics/impala_complex_types.xml
@@ -349,16 +349,6 @@ under the License.
         </p>
 
         <p>
-          Each table, or each partition within a table, can have a separate file format, and you can change file format at the table or
-          partition level through an <codeph>ALTER TABLE</codeph> statement. Because this flexibility makes it difficult to guarantee ahead
-          of time that all the data files for a table or partition are in a compatible format, Impala does not throw any errors when you
-          change the file format for a table or partition using <codeph>ALTER TABLE</codeph>. Any errors come at runtime when Impala
-          actually processes a table or partition that contains nested types and is not in one of the supported formats. If a query on a
-          partitioned table only processes some partitions, and all those partitions are in one of the supported formats, the query
-          succeeds.
-        </p>
-
-        <p>
           Because Impala does not parse the data structures containing nested types for unsupported formats such as text, Avro,
           SequenceFile, or RCFile, you cannot use data files in these formats with Impala, even if the query does not refer to the nested
           type columns. Also, if a table using an unsupported format originally contained nested type columns, and then those columns were
@@ -366,20 +356,24 @@ under the License.
           nested type data and Impala queries on that table will generate errors.
         </p>
 
-        <note rev="2.6.0 IMPALA-2844">
-          <p rev="2.6.0 IMPALA-2844">
+        <p rev="2.6.0 IMPALA-2844">
             The one exception to the preceding rule is <codeph>COUNT(*)</codeph> queries on RCFile tables that include complex types.
             Such queries are allowed in <keyword keyref="impala26_full"/> and higher.
-          </p>
-        </note>
+        </p>
 
         <p>
-          You can perform DDL operations (even <codeph>CREATE TABLE</codeph>) for tables involving complex types in file formats other than
-          Parquet. The DDL support lets you set up intermediate tables in your ETL pipeline, to be populated by Hive, before the final stage
-          where the data resides in a Parquet table and is queryable by Impala. Also, you can have a partitioned table with complex type
-          columns that uses a non-Parquet format, and use <codeph>ALTER TABLE</codeph> to change the file format to Parquet for individual
-          partitions. When you put Parquet data files into those partitions, Impala can execute queries against that data as long as the
-          query does not involve any of the non-Parquet partitions.
+          You can perform DDL operations for tables involving complex types in
+          most file formats other than Parquet. You cannot create tables in
+          Impala with complex types using text files.
+        </p>
+
+        <p>
+          You can have a partitioned table with complex type columns that uses
+          a non-Parquet format, and use <codeph>ALTER TABLE</codeph> to change
+          the file format to Parquet for individual partitions. When you put
+          Parquet data files into those partitions, Impala can execute queries
+          against that data as long as the query does not involve any of the
+          non-Parquet partitions.
         </p>
 
         <p>
@@ -491,21 +485,16 @@ under the License.
 
       <conbody>
 
-<!-- HiveQL functions like nested type constructors and posexplode(): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF -->
-
-<!-- HiveQL complex types: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-ComplexTypes -->
-
-<!-- HiveQL lateral views: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView -->
-
         <p>
           Impala can query Parquet tables containing <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph> columns
           produced by Hive. There are some differences to be aware of between the Impala SQL and HiveQL syntax for complex types, primarily
           for queries.
         </p>
-
         <p>
-          The syntax for specifying <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph> types in a <codeph>CREATE
-          TABLE</codeph> statement is compatible between Impala and Hive.
+          Impala supports a subset of the syntax that Hive supports for
+          specifying <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and
+            <codeph>MAP</codeph> types in the <codeph>CREATE TABLE</codeph>
+          statements.
         </p>
 
         <p>
@@ -674,27 +663,22 @@ under the License.
         <p>
           Unions are not currently supported.
         </p>
-
         <p>
-          Array, struct, and map column type declarations are specified in the <codeph>CREATE TABLE</codeph> statement. You can also add or
-          change the type of complex columns through the <codeph>ALTER TABLE</codeph> statement.
-        </p>
-
-        <note>
-          <p>
-            Currently, Impala queries allow complex types only in tables that use the Parquet format. If an Impala query encounters complex
-            types in a table or partition using another file format, the query returns a runtime error.
-          </p>
-
-          <p>
-            The Impala DDL support for complex types works for all file formats, so that you can create tables using text or other
-            non-Parquet formats for Hive to use as staging tables in an ETL cycle that ends with the data in a Parquet table. You can also
-            use <codeph>ALTER TABLE ... SET FILEFORMAT PARQUET</codeph> to change the file format of an existing table containing complex
-            types to Parquet, after which Impala can query it. Make sure to load Parquet files into the table after changing the file
-            format, because the <codeph>ALTER TABLE ... SET FILEFORMAT</codeph> statement does not convert existing data to the new file
-            format.
-          </p>
-        </note>
+          <codeph>Array</codeph>, <codeph>struct</codeph>, and
+            <codeph>map</codeph> column type declarations are specified in the
+            <codeph>CREATE TABLE</codeph> statement. You can also add or change
+          the type of complex columns through the <codeph>ALTER TABLE</codeph>
+          statement. </p>
+        <p> Currently, Impala queries allow complex types only in tables that
+          use the Parquet format. If an Impala query encounters complex types in
+          a table or partition using another file format, the query returns a
+          runtime error. </p>
+        <p> You can use <codeph>ALTER TABLE ... SET FILEFORMAT PARQUET</codeph>
+          to change the file format of an existing table containing complex
+          types to Parquet, after which Impala can query it. Make sure to load
+          Parquet files into the table after changing the file format, because
+          the <codeph>ALTER TABLE ... SET FILEFORMAT</codeph> statement does not
+          convert existing data to the new file format. </p>
 
         <p conref="../shared/impala_common.xml#common/complex_types_partitioning"/>