You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2019/08/12 18:55:21 UTC

[impala] branch master updated (2066b72 -> f26a32f)

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.


    from 2066b72  IMPALA-8848: fix UNION missing input cardinality bug
     new 8eb5007  IMPALA-7374: [DOCS] Document the new DATE data type in Impala
     new 620329f  IMPALA-8837: [DOCS] HTTP support for proxy/delegation connection
     new f26a32f  IMPALA-8846: Undefined behaviour in RleEncoder::Put

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/util/rle-encoding.h            |   2 +-
 docs/impala.ditamap                   |   2 +-
 docs/topics/impala_authentication.xml |   3 +
 docs/topics/impala_date.xml           | 196 ++++++++++++++++-------
 docs/topics/impala_delegation.xml     |  17 +-
 docs/topics/impala_literals.xml       | 284 ++++++++++++++++++++--------------
 6 files changed, 326 insertions(+), 178 deletions(-)


[impala] 02/03: IMPALA-8837: [DOCS] HTTP support for proxy/delegation connection

Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 620329f6d72d1a09edf560e7da0bc1d09e13a57f
Author: Alex Rodoni <ar...@cloudera.com>
AuthorDate: Wed Aug 7 13:11:36 2019 -0700

    IMPALA-8837: [DOCS] HTTP support for proxy/delegation connection
    
    - Added a line on Knox support.
    
    Change-Id: I591e0fd736ea114aa52a999acf41806a94e49382
    Reviewed-on: http://gerrit.cloudera.org:8080/14033
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
    Reviewed-by: Thomas Tauber-Marshall <tm...@cloudera.com>
---
 docs/topics/impala_authentication.xml |  3 +++
 docs/topics/impala_delegation.xml     | 17 +++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/topics/impala_authentication.xml b/docs/topics/impala_authentication.xml
index 996a927..4405b5c 100644
--- a/docs/topics/impala_authentication.xml
+++ b/docs/topics/impala_authentication.xml
@@ -43,6 +43,9 @@ under the License.
     <p>
       Impala supports authentication using either Kerberos or LDAP.
     </p>
+    <p>
+      You can also make proxy connections to Impala through Apache Knox.
+    </p>
 
     <note conref="../shared/impala_common.xml#common/authentication_vs_authorization"/>
 
diff --git a/docs/topics/impala_delegation.xml b/docs/topics/impala_delegation.xml
index c2a4722..bf2d4cc 100644
--- a/docs/topics/impala_delegation.xml
+++ b/docs/topics/impala_delegation.xml
@@ -20,7 +20,7 @@ under the License.
 <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
 <concept rev="1.2" id="delegation">
 
-  <title>Configuring Impala Delegation for Hue and BI Tools</title>
+  <title>Configuring Impala Delegation for Clients</title>
 
   <prolog>
     <metadata>
@@ -38,10 +38,10 @@ under the License.
   <conbody>
 
     <p>
-      When users submit Impala queries through a separate application, such as Hue or a business
-      intelligence tool, typically all requests are treated as coming from the same user. In
-      Impala 1.2 and higher, Impala supports <q>delegation</q> where users whose names you
-      specify can delegate the execution of a query to another user. The query runs with the
+      When users submit Impala queries through a separate client application, such as Hue or a
+      business intelligence tool, typically all requests are treated as coming from the same
+      user. In Impala 1.2 and higher, Impala supports <q>delegation</q> where users whose names
+      you specify can delegate the execution of a query to another user. The query runs with the
       privileges of the delegated user, not the original authenticated user.
     </p>
 
@@ -147,6 +147,11 @@ under the License.
         When opening a client connection, the client must provide a delegated username via the
         HiveServer2 protocol property,<codeph>impala.doas.user</codeph> or
         <codeph>DelegationUID</codeph>.
+        <p>
+          When the client connects over HTTP, the <codeph>doAs</codeph> parameter can be
+          specified in the HTTP path, e.g.
+          <codeph>/?doAs=</codeph><varname>delegated_user</varname>.
+        </p>
       </li>
 
       <li>
@@ -183,7 +188,7 @@ under the License.
       The user or group delegation process works as follows:
       <ol>
         <li>
-          The Impalad daemon starts with one of the following options:
+          The <codeph>impalad</codeph> daemon starts with one of the following options:
           <ul>
             <li>
               <codeph>&#8209;&#8209;authorized_proxy_user_config=<varname>authenticated_user</varname>=<varname>delegated_user</varname></codeph>


[impala] 01/03: IMPALA-7374: [DOCS] Document the new DATE data type in Impala

Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 8eb50076c2b232b60fa5e44fb9341c752e2bf417
Author: Alex Rodoni <ar...@cloudera.com>
AuthorDate: Thu Aug 1 16:08:50 2019 -0700

    IMPALA-7374: [DOCS] Document the new DATE data type in Impala
    
    Change-Id: I0c28361c7f0d225708eaf4b955c6704520eaaa68
    Reviewed-on: http://gerrit.cloudera.org:8080/13983
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
    Reviewed-by: Attila Jeges <at...@cloudera.com>
---
 docs/impala.ditamap             |   2 +-
 docs/topics/impala_date.xml     | 196 +++++++++++++++++++--------
 docs/topics/impala_literals.xml | 284 ++++++++++++++++++++++++----------------
 3 files changed, 311 insertions(+), 171 deletions(-)

diff --git a/docs/impala.ditamap b/docs/impala.ditamap
index b9bdb6d..2479d72 100644
--- a/docs/impala.ditamap
+++ b/docs/impala.ditamap
@@ -94,7 +94,7 @@ under the License.
       <topicref href="topics/impala_boolean.xml"/>
       <topicref href="topics/impala_char.xml"/>
       <topicref href="topics/impala_decimal.xml"/>
-      <topicref audience="hidden" href="topics/impala_date.xml"/>
+      <topicref href="topics/impala_date.xml"/>
       <topicref href="topics/impala_double.xml"/>
       <topicref href="topics/impala_float.xml"/>
       <topicref href="topics/impala_int.xml"/>
diff --git a/docs/topics/impala_date.xml b/docs/topics/impala_date.xml
index 228fc92..7f528d7 100644
--- a/docs/topics/impala_date.xml
+++ b/docs/topics/impala_date.xml
@@ -18,9 +18,16 @@ specific language governing permissions and limitations
 under the License.
 -->
 <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
-<concept audience="hidden" id="date" rev="2.0.0">
+<concept id="date" rev="2.0.0">
+
+  <title>DATE Data Type</title>
+
+  <titlealts audience="PDF">
+
+    <navtitle>DATE</navtitle>
+
+  </titlealts>
 
-  <title>DATE Data Type (<keyword keyref="impala21"/> or higher only)</title>
   <prolog>
     <metadata>
       <data name="Category" value="Impala"/>
@@ -35,88 +42,169 @@ under the License.
   <conbody>
 
     <p>
-      <indexterm audience="hidden">DATE data type</indexterm>
-      A type representing the date (year, month, and day) as a single numeric value. Used to represent a broader
-      date range than possible with the <codeph>TIMESTAMP</codeph> type, with fewer distinct values than
-      <codeph>TIMESTAMP</codeph>, and in a more compact and efficient form than using a <codeph>STRING</codeph>
-      such as <codeph>'2014-12-31'</codeph>.
+      Use the <codeph>DATE</codeph> data type to store date values. The <codeph>DATE</codeph>
+      type is supported for HBase, Text, and Parquet.
     </p>
 
-    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
-
-<codeblock><varname>column_name</varname> DATE</codeblock>
-
     <p>
-      <b>Range:</b> January 1, -4712 BC .. December 31, 9999 AD.
+      <b>Range:</b>
     </p>
 
-    <p conref="../shared/impala_common.xml#common/hbase_ok"/>
-
-    <p conref="../shared/impala_common.xml#common/parquet_blurb"/>
-
-    <ul>
-      <li>
-        This type can be read from and written to Parquet files.
-      </li>
-
-      <li>
-        There is no requirement for a particular level of Parquet.
-      </li>
+    <p>
+      0000-01-01 to 9999-12-31
+    </p>
 
-      <li>
-        Parquet files generated by Impala and containing this type can be freely interchanged with other components
-        such as Hive and MapReduce.
-      </li>
-    </ul>
+    <p>
+      <b>Literals and expressions:</b>
+    </p>
 
-    <p conref="../shared/impala_common.xml#common/hive_blurb"/>
+    <p>
+      The <codeph>DATE</codeph> literals are in the form of <codeph>DATE'YYYY-MM-DD'</codeph>.
+      For example, <codeph>DATE '2013-01-01'</codeph>
+    </p>
 
     <p>
-      TK.
+      <b>Parquet considerations:</b>
     </p>
 
-    <p conref="../shared/impala_common.xml#common/conversion_blurb"/>
+    <p>
+      Parquet uses <codeph>DATE</codeph> logical type for dates. The <codeph>DATE</codeph>
+      logical type annotates an <codeph>INT32</codeph> that stores the number of days from the
+      Unix epoch, January 1, 1970. This representation introduces a parquet interoperability
+      issue between Impala and older versions of Hive:
+    </p>
 
     <p>
-      TK.
+      If Hive versions lower than 3.1 wrote dates earlier than 1582-10-15 to a parquet table,
+      those dates will be read back incorrectly by Impala and vice versa. In Hive 3.1 and
+      higher, this is no longer an issue.
     </p>
 
-    <p conref="../shared/impala_common.xml#common/partitioning_blurb"/>
+    <p>
+      <b>Explicit casting between DATE and other data types:</b>
+    </p>
 
     <p>
-      This type can be used for partition key columns. Because it has less granularity (and thus fewer distinct
-      values) than an equivalent <codeph>TIMESTAMP</codeph> column, and numeric columns are more efficient as
-      partition keys than strings, prefer to partition by a <codeph>DATE</codeph> column rather than a
-      <codeph>TIMESTAMP</codeph> column or a <codeph>STRING</codeph> representation of a date.
+      <codeph>DATE</codeph> type can only be converted to/from <codeph>DATE</codeph>,
+      <codeph>TIMESTAMP</codeph>, or <codeph>STRING</codeph> types as described below.
     </p>
 
-    <p conref="../shared/impala_common.xml#common/compatibility_blurb"/>
+    <table id="table_ovw_zt1_p3b">
+      <tgroup cols="3">
+        <colspec colnum="1" colname="col1" colwidth="1*"/>
+        <colspec colname="newCol2" colnum="2" colwidth="1.09*"/>
+        <colspec colnum="3" colname="col2" colwidth="3.46*"/>
+        <thead>
+          <row>
+            <entry>
+              Cast from
+            </entry>
+            <entry>
+              Cast to
+            </entry>
+            <entry>
+              Result
+            </entry>
+          </row>
+        </thead>
+        <tbody>
+          <row>
+            <entry>
+              <codeph>TIMESTAMP</codeph>
+            </entry>
+            <entry>
+              <codeph>DATE</codeph>
+            </entry>
+            <entry>
+              The date component of the <codeph>TIMESTAMP</codeph> is returned, and the time of
+              the day component of the <codeph>TIMESTAMP</codeph> is ignored.
+            </entry>
+          </row>
+          <row>
+            <entry>
+              <codeph>STRING</codeph>
+            </entry>
+            <entry>
+              <codeph>DATE</codeph>
+            </entry>
+            <entry>
+              The <codeph>DATE</codeph> value of <codeph>yyyy-MM-dd</codeph> is returned.
+
+              <p>
+                The <codeph>STRING</codeph> value must be in the <codeph>yyyy-MM-dd</codeph> or
+                <codeph>yyyy-MM-dd HH:mm:ss.SSSSSSSSS</codeph> pattern.
+              </p>
+
+
+
+              <p>
+                If the time component is present in <codeph>STRING</codeph>, it is silently
+                ignored.
+              </p>
+
+
+
+              <p>
+                If the <codeph>STRING</codeph> value does not match the above formats, an error
+                is returned.
+              </p>
+            </entry>
+          </row>
+          <row>
+            <entry>
+              <codeph>DATE</codeph>
+            </entry>
+            <entry>
+              <codeph>TIMESTAMP</codeph>
+            </entry>
+            <entry>
+              The year, month, and day of the <codeph>DATE</codeph> is returned along with the
+              time of day component set to <codeph>00:00:00</codeph>.
+            </entry>
+          </row>
+          <row>
+            <entry>
+              <codeph>DATE</codeph>
+            </entry>
+            <entry>
+              <codeph>STRING</codeph>
+            </entry>
+            <entry>
+              The <codeph>STRING</codeph> value, <codeph>'yyyy-MM-dd'</codeph>, is returned.
+            </entry>
+          </row>
+        </tbody>
+      </tgroup>
+    </table>
 
     <p>
-      This type is available on <keyword keyref="impala20_full"/> or higher.
+      <b>Implicit casting between DATE and other types: </b>
     </p>
 
-    <p conref="../shared/impala_common.xml#common/internals_2_bytes"/>
-
-    <p conref="../shared/impala_common.xml#common/added_in_20"/>
+    <p>
+      Implicit casting is supported:
+    </p>
 
-    <p conref="../shared/impala_common.xml#common/column_stats_constant"/>
+    <ul>
+      <li>
+        From <codeph>STRING</codeph> to <codeph>DATE</codeph> if the source
+        <codeph>STRING</codeph> value is in the <codeph>yyyy-MM-dd</codeph> or
+        <codeph>yyyy-MM-dd HH:mm:ss.SSSSSSSSS</codeph> pattern.
+      </li>
 
-    <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>
+      <li>
+        From <codeph>DATE</codeph> to <codeph>TIMESTAMP</codeph>.
+      </li>
+    </ul>
 
     <p>
-      Things happen when converting <codeph>TIMESTAMP</codeph> to <codeph>DATE</codeph> or <codeph>DATE</codeph> to
-      <codeph>TIMESTAMP</codeph>. TK.
+      <b>Added in:</b>
     </p>
 
-    <p conref="../shared/impala_common.xml#common/example_blurb"/>
-
-    <p conref="../shared/impala_common.xml#common/related_info"/>
-
     <p>
-      The <xref href="impala_timestamp.xml#timestamp">TIMESTAMP</xref> data type is closely related. Some functions
-      from <xref href="impala_datetime_functions.xml#datetime_functions"/> accept and return <codeph>DATE</codeph>
-      values.
+      The <codeph>DATE</codeph> type is available in Impala 3.3 and higher.
     </p>
+
   </conbody>
+
 </concept>
diff --git a/docs/topics/impala_literals.xml b/docs/topics/impala_literals.xml
index 32a69da..1bdbad5 100644
--- a/docs/topics/impala_literals.xml
+++ b/docs/topics/impala_literals.xml
@@ -21,6 +21,7 @@ under the License.
 <concept id="literals">
 
   <title>Literals</title>
+
   <prolog>
     <metadata>
       <data name="Category" value="Impala"/>
@@ -34,14 +35,16 @@ under the License.
   <conbody>
 
     <p>
-      <indexterm audience="hidden">literals</indexterm>
-      Each of the Impala data types has corresponding notation for literal values of that type. You specify literal
-      values in SQL statements, such as in the <codeph>SELECT</codeph> list or <codeph>WHERE</codeph> clause of a
-      query, or as an argument to a function call. See <xref href="impala_datatypes.xml#datatypes"/> for a complete
-      list of types, ranges, and conversion rules.
+      Each of the Impala data types has corresponding notation for literal values of that type.
+      You specify literal values in SQL statements, such as in the <codeph>SELECT</codeph> list
+      or <codeph>WHERE</codeph> clause of a query, or as an argument to a function call. See
+      <xref
+        href="impala_datatypes.xml#datatypes"/> for a complete list of types,
+      ranges, and conversion rules.
     </p>
 
     <p outputclass="toc inpage"/>
+
   </conbody>
 
   <concept id="numeric_literals">
@@ -51,17 +54,18 @@ under the License.
     <conbody>
 
       <p>
-        <indexterm audience="hidden">numeric literals</indexterm>
-        To write literals for the integer types (<codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>,
-        <codeph>INT</codeph>, and <codeph>BIGINT</codeph>), use a sequence of digits with optional leading zeros.
+        To write literals for the integer types (<codeph>TINYINT</codeph>,
+        <codeph>SMALLINT</codeph>, <codeph>INT</codeph>, and <codeph>BIGINT</codeph>), use a
+        sequence of digits with optional leading zeros.
       </p>
 
       <p rev="1.4.0">
         To write literals for the floating-point types (<codeph rev="1.4.0">DECIMAL</codeph>,
-        <codeph>FLOAT</codeph>, and <codeph>DOUBLE</codeph>), use a sequence of digits with an optional decimal
-        point (<codeph>.</codeph> character). To preserve accuracy during arithmetic expressions, Impala interprets
-        floating-point literals as the <codeph>DECIMAL</codeph> type with the smallest appropriate precision and
-        scale, until required by the context to convert the result to <codeph>FLOAT</codeph> or
+        <codeph>FLOAT</codeph>, and <codeph>DOUBLE</codeph>), use a sequence of digits with an
+        optional decimal point (<codeph>.</codeph> character). To preserve accuracy during
+        arithmetic expressions, Impala interprets floating-point literals as the
+        <codeph>DECIMAL</codeph> type with the smallest appropriate precision and scale, until
+        required by the context to convert the result to <codeph>FLOAT</codeph> or
         <codeph>DOUBLE</codeph>.
       </p>
 
@@ -70,20 +74,22 @@ under the License.
       </p>
 
       <p>
-        You can also use exponential notation by including an <codeph>e</codeph> character. For example,
-        <codeph>1e6</codeph> is 1 times 10 to the power of 6 (1 million). A number in exponential notation is
-        always interpreted as floating-point.
+        You can also use exponential notation by including an <codeph>e</codeph> character. For
+        example, <codeph>1e6</codeph> is 1 times 10 to the power of 6 (1 million). A number in
+        exponential notation is always interpreted as floating-point.
       </p>
 
       <p rev="tk">
-        When Impala encounters a numeric literal, it considers the type to be the <q>smallest</q> that can
-        accurately represent the value. The type is promoted to larger or more accurate types if necessary, based
-        on subsequent parts of an expression.
+        When Impala encounters a numeric literal, it considers the type to be the
+        <q>smallest</q> that can accurately represent the value. The type is promoted to larger
+        or more accurate types if necessary, based on subsequent parts of an expression.
       </p>
+
       <p>
-        For example, you can see by the types Impala defines for the following table columns
-        how it interprets the corresponding numeric literals:
+        For example, you can see by the types Impala defines for the following table columns how
+        it interprets the corresponding numeric literals:
       </p>
+
 <codeblock>[localhost:21000] > create table ten as select 10 as x;
 +-------------------+
 | summary           |
@@ -136,7 +142,9 @@ under the License.
 | x    | decimal(4,3) |         |
 +------+--------------+---------+
 </codeblock>
+
     </conbody>
+
   </concept>
 
   <concept id="string_literals">
@@ -146,15 +154,16 @@ under the License.
     <conbody>
 
       <p>
-        <indexterm audience="hidden">string literals</indexterm>
-        String literals are quoted using either single or double quotation marks. You can use either kind of quotes
-        for string literals, even both kinds for different literals within the same statement.
+        String literals are quoted using either single or double quotation marks. You can use
+        either kind of quotes for string literals, even both kinds for different literals within
+        the same statement.
       </p>
 
       <p rev="2.0.0">
-        Quoted literals are considered to be of type <codeph>STRING</codeph>. To use quoted literals in contexts
-        requiring a <codeph>CHAR</codeph> or <codeph>VARCHAR</codeph> value, <codeph>CAST()</codeph> the literal to
-        a <codeph>CHAR</codeph> or <codeph>VARCHAR</codeph> of the appropriate length.
+        Quoted literals are considered to be of type <codeph>STRING</codeph>. To use quoted
+        literals in contexts requiring a <codeph>CHAR</codeph> or <codeph>VARCHAR</codeph>
+        value, <codeph>CAST()</codeph> the literal to a <codeph>CHAR</codeph> or
+        <codeph>VARCHAR</codeph> of the appropriate length.
       </p>
 
       <p>
@@ -162,8 +171,8 @@ under the License.
       </p>
 
       <p>
-        To encode special characters within a string literal, precede them with the backslash (<codeph>\</codeph>)
-        escape character:
+        To encode special characters within a string literal, precede them with the backslash
+        (<codeph>\</codeph>) escape character:
       </p>
 
       <ul>
@@ -172,53 +181,57 @@ under the License.
         </li>
 
         <li>
-          <codeph>\n</codeph> represents a newline or linefeed. This might cause extra line breaks in
-          <cmdname>impala-shell</cmdname> output.
+          <codeph>\n</codeph> represents a newline or linefeed. This might cause extra line
+          breaks in <cmdname>impala-shell</cmdname> output.
         </li>
 
         <li>
-          <codeph>\r</codeph> represents a carriage return. This might cause unusual formatting (making it appear
-          that some content is overwritten) in <cmdname>impala-shell</cmdname> output.
+          <codeph>\r</codeph> represents a carriage return. This might cause unusual formatting
+          (making it appear that some content is overwritten) in <cmdname>impala-shell</cmdname>
+          output.
         </li>
 
         <li>
-          <codeph>\b</codeph> represents a backspace. This might cause unusual formatting (making it appear that
-          some content is overwritten) in <cmdname>impala-shell</cmdname> output.
+          <codeph>\b</codeph> represents a backspace. This might cause unusual formatting
+          (making it appear that some content is overwritten) in <cmdname>impala-shell</cmdname>
+          output.
         </li>
 
         <li>
-          <codeph>\0</codeph> represents an ASCII <codeph>nul</codeph> character (not the same as a SQL
-          <codeph>NULL</codeph>). This might not be visible in <cmdname>impala-shell</cmdname> output.
+          <codeph>\0</codeph> represents an ASCII <codeph>nul</codeph> character (not the same
+          as a SQL <codeph>NULL</codeph>). This might not be visible in
+          <cmdname>impala-shell</cmdname> output.
         </li>
 
         <li>
-          <codeph>\Z</codeph> represents a DOS end-of-file character. This might not be visible in
-          <cmdname>impala-shell</cmdname> output.
+          <codeph>\Z</codeph> represents a DOS end-of-file character. This might not be visible
+          in <cmdname>impala-shell</cmdname> output.
         </li>
 
         <li>
-          <codeph>\%</codeph> and <codeph>\_</codeph> can be used to escape wildcard characters within the string
-          passed to the <codeph>LIKE</codeph> operator.
+          <codeph>\%</codeph> and <codeph>\_</codeph> can be used to escape wildcard characters
+          within the string passed to the <codeph>LIKE</codeph> operator.
         </li>
 
         <li>
-          <codeph>\</codeph> followed by 3 octal digits represents the ASCII code of a single character; for
-          example, <codeph>\101</codeph> is ASCII 65, the character <codeph>A</codeph>.
+          <codeph>\</codeph> followed by 3 octal digits represents the ASCII code of a single
+          character; for example, <codeph>\101</codeph> is ASCII 65, the character
+          <codeph>A</codeph>.
         </li>
 
         <li>
-          Use two consecutive backslashes (<codeph>\\</codeph>) to prevent the backslash from being interpreted as
-          an escape character.
+          Use two consecutive backslashes (<codeph>\\</codeph>) to prevent the backslash from
+          being interpreted as an escape character.
         </li>
 
         <li>
-          Use the backslash to escape single or double quotation mark characters within a string literal, if the
-          literal is enclosed by the same type of quotation mark.
+          Use the backslash to escape single or double quotation mark characters within a string
+          literal, if the literal is enclosed by the same type of quotation mark.
         </li>
 
         <li>
-          If the character following the <codeph>\</codeph> does not represent the start of a recognized escape
-          sequence, the character is passed through unchanged.
+          If the character following the <codeph>\</codeph> does not represent the start of a
+          recognized escape sequence, the character is passed through unchanged.
         </li>
       </ul>
 
@@ -227,16 +240,18 @@ under the License.
       </p>
 
       <p>
-        To include a single quotation character within a string value, enclose the literal with either single or
-        double quotation marks, and optionally escape the single quote as a <codeph>\'</codeph> sequence. Earlier
-        releases required escaping a single quote inside double quotes. Continue using escape sequences in this
-        case if you also need to run your SQL code on older versions of Impala.
+        To include a single quotation character within a string value, enclose the literal with
+        either single or double quotation marks, and optionally escape the single quote as a
+        <codeph>\'</codeph> sequence. Earlier releases required escaping a single quote inside
+        double quotes. Continue using escape sequences in this case if you also need to run your
+        SQL code on older versions of Impala.
       </p>
 
       <p>
-        To include a double quotation character within a string value, enclose the literal with single quotation
-        marks, no escaping is necessary in this case. Or, enclose the literal with double quotation marks and
-        escape the double quote as a <codeph>\"</codeph> sequence.
+        To include a double quotation character within a string value, enclose the literal with
+        single quotation marks, no escaping is necessary in this case. Or, enclose the literal
+        with double quotation marks and escape the double quote as a <codeph>\"</codeph>
+        sequence.
       </p>
 
 <codeblock>[localhost:21000] &gt; select "What\'s happening?" as single_within_double,
@@ -261,13 +276,15 @@ under the License.
       </p>
 
       <p>
-        When dealing with output that includes non-ASCII or non-printable characters such as linefeeds and
-        backspaces, use the <cmdname>impala-shell</cmdname> options to save to a file, turn off pretty printing, or
-        both rather than relying on how the output appears visually. See
-        <xref href="impala_shell_options.xml#shell_options"/> for a list of <cmdname>impala-shell</cmdname>
-        options.
+        When dealing with output that includes non-ASCII or non-printable characters such as
+        linefeeds and backspaces, use the <cmdname>impala-shell</cmdname> options to save to a
+        file, turn off pretty printing, or both rather than relying on how the output appears
+        visually. See <xref href="impala_shell_options.xml#shell_options"/> for a list of
+        <cmdname>impala-shell</cmdname> options.
       </p>
+
     </conbody>
+
   </concept>
 
   <concept id="boolean_literals">
@@ -277,8 +294,8 @@ under the License.
     <conbody>
 
       <p>
-        For <codeph>BOOLEAN</codeph> values, the literals are <codeph>TRUE</codeph> and <codeph>FALSE</codeph>,
-        with no quotation marks and case-insensitive.
+        For <codeph>BOOLEAN</codeph> values, the literals are <codeph>TRUE</codeph> and
+        <codeph>FALSE</codeph>, with no quotation marks and case-insensitive.
       </p>
 
       <p conref="../shared/impala_common.xml#common/example_blurb"/>
@@ -286,7 +303,9 @@ under the License.
 <codeblock>select true;
 select * from t1 where assertion = false;
 select case bool_col when true then 'yes' when false 'no' else 'null' end from t1;</codeblock>
+
     </conbody>
+
   </concept>
 
   <concept id="timestamp_literals">
@@ -297,20 +316,41 @@ select case bool_col when true then 'yes' when false 'no' else 'null' end from t
 
       <p conref="../shared/impala_common.xml#common/timestamp_conversions"
         conrefend="../shared/impala_common.xml#common/cast_string_to_timestamp"/>
-      <p> You can also use <codeph>INTERVAL</codeph> expressions to add or
-        subtract from timestamp literal values, such as
-          <codeph>CAST('1966‑07‑30' AS
-          TIMESTAMP) + INTERVAL 5 YEARS + INTERVAL 3 DAYS</codeph>. See <xref
-          href="impala_timestamp.xml#timestamp"/> for details. </p>
 
       <p>
-        Depending on your data pipeline, you might receive date and time data as text, in notation that does not
-        exactly match the format for Impala <codeph>TIMESTAMP</codeph> literals.
-        See <xref href="impala_datetime_functions.xml#datetime_functions"/> for functions that can convert
-        between a variety of string literals (including different field order, separators, and timezone notation)
-        and equivalent <codeph>TIMESTAMP</codeph> or numeric values.
+        You can also use <codeph>INTERVAL</codeph> expressions to add or subtract from timestamp
+        literal values, such as <codeph>CAST('1966‑07‑30' AS
+        TIMESTAMP) + INTERVAL 5 YEARS + INTERVAL 3 DAYS</codeph>. See
+        <xref
+          href="impala_timestamp.xml#timestamp"/> for details.
+      </p>
+
+      <p>
+        Depending on your data pipeline, you might receive date and time data as text, in
+        notation that does not exactly match the format for Impala <codeph>TIMESTAMP</codeph>
+        literals. See <xref href="impala_datetime_functions.xml#datetime_functions"/> for
+        functions that can convert between a variety of string literals (including different
+        field order, separators, and timezone notation) and equivalent
+        <codeph>TIMESTAMP</codeph> or numeric values.
+      </p>
+
+    </conbody>
+
+  </concept>
+
+  <concept id="date_literals">
+
+    <title>Date Literals</title>
+
+    <conbody>
+
+      <p>
+        The <codeph>DATE</codeph> literals are in the form of <codeph>DATE'YYYY-MM-DD'</codeph>.
+        For example, <codeph>DATE '2013-01-01'</codeph>
       </p>
+
     </conbody>
+
   </concept>
 
   <concept id="null">
@@ -320,34 +360,36 @@ select case bool_col when true then 'yes' when false 'no' else 'null' end from t
     <conbody>
 
       <p>
-        <indexterm audience="hidden">NULL</indexterm>
-        The notion of <codeph>NULL</codeph> values is familiar from all kinds of database systems, but each SQL
-        dialect can have its own behavior and restrictions on <codeph>NULL</codeph> values. For Big Data
-        processing, the precise semantics of <codeph>NULL</codeph> values are significant: any misunderstanding
-        could lead to inaccurate results or misformatted data, that could be time-consuming to correct for large
-        data sets.
+        The notion of <codeph>NULL</codeph> values is familiar from all kinds of database
+        systems, but each SQL dialect can have its own behavior and restrictions on
+        <codeph>NULL</codeph> values. For Big Data processing, the precise semantics of
+        <codeph>NULL</codeph> values are significant: any misunderstanding could lead to
+        inaccurate results or misformatted data, that could be time-consuming to correct for
+        large data sets.
       </p>
 
       <ul>
         <li>
-          <codeph>NULL</codeph> is a different value than an empty string. The empty string is represented by a
-          string literal with nothing inside, <codeph>""</codeph> or <codeph>''</codeph>.
+          <codeph>NULL</codeph> is a different value than an empty string. The empty string is
+          represented by a string literal with nothing inside, <codeph>""</codeph> or
+          <codeph>''</codeph>.
         </li>
 
         <li>
-          In a delimited text file, the <codeph>NULL</codeph> value is represented by the special token
-          <codeph>\N</codeph>.
+          In a delimited text file, the <codeph>NULL</codeph> value is represented by the
+          special token <codeph>\N</codeph>.
         </li>
 
         <li>
-          When Impala inserts data into a partitioned table, and the value of one of the partitioning columns is
-          <codeph>NULL</codeph> or the empty string, the data is placed in a special partition that holds only
-          these two kinds of values. When these values are returned in a query, the result is <codeph>NULL</codeph>
-          whether the value was originally <codeph>NULL</codeph> or an empty string. This behavior is compatible
-          with the way Hive treats <codeph>NULL</codeph> values in partitioned tables. Hive does not allow empty
-          strings as partition keys, and it returns a string value such as
-          <codeph>__HIVE_DEFAULT_PARTITION__</codeph> instead of <codeph>NULL</codeph> when such values are
-          returned from a query. For example:
+          When Impala inserts data into a partitioned table, and the value of one of the
+          partitioning columns is <codeph>NULL</codeph> or the empty string, the data is placed
+          in a special partition that holds only these two kinds of values. When these values
+          are returned in a query, the result is <codeph>NULL</codeph> whether the value was
+          originally <codeph>NULL</codeph> or an empty string. This behavior is compatible with
+          the way Hive treats <codeph>NULL</codeph> values in partitioned tables. Hive does not
+          allow empty strings as partition keys, and it returns a string value such as
+          <codeph>__HIVE_DEFAULT_PARTITION__</codeph> instead of <codeph>NULL</codeph> when such
+          values are returned from a query. For example:
 <codeblock>create table t1 (i int) partitioned by (x int, y string);
 -- Select an INT column from another table, with all rows going into a special HDFS subdirectory
 -- named __HIVE_DEFAULT_PARTITION__. Depending on whether one or both of the partitioning keys
@@ -359,62 +401,72 @@ insert into t1 partition(x=NULL, y) select c1, c3  from some_other_table;</codeb
         </li>
 
         <li>
-          There is no <codeph>NOT NULL</codeph> clause when defining a column to prevent <codeph>NULL</codeph>
-          values in that column.
+          There is no <codeph>NOT NULL</codeph> clause when defining a column to prevent
+          <codeph>NULL</codeph> values in that column.
         </li>
 
         <li>
-          There is no <codeph>DEFAULT</codeph> clause to specify a non-<codeph>NULL</codeph> default value.
+          There is no <codeph>DEFAULT</codeph> clause to specify a non-<codeph>NULL</codeph>
+          default value.
         </li>
 
         <li>
-          If an <codeph>INSERT</codeph> operation mentions some columns but not others, the unmentioned columns
-          contain <codeph>NULL</codeph> for all inserted rows.
+          If an <codeph>INSERT</codeph> operation mentions some columns but not others, the
+          unmentioned columns contain <codeph>NULL</codeph> for all inserted rows.
         </li>
 
         <li rev="1.2.1">
           <p conref="../shared/impala_common.xml#common/null_sorting_change"/>
+
           <note>
-            <!-- To do: Probably a bunch of similar view-related restrictions like this that should be collected, reused, or cross-referenced under the Views topic. -->
-            Because the <codeph>NULLS FIRST</codeph> and <codeph>NULLS LAST</codeph> keywords are not currently
-            available in Hive queries, any views you create using those keywords will not be available through
-            Hive.
+<!-- To do: Probably a bunch of similar view-related restrictions like this that should be collected, reused, or cross-referenced under the Views topic. -->
+            Because the <codeph>NULLS FIRST</codeph> and <codeph>NULLS LAST</codeph> keywords
+            are not currently available in Hive queries, any views you create using those
+            keywords will not be available through Hive.
           </note>
         </li>
 
         <li>
-          In all other contexts besides sorting with <codeph>ORDER BY</codeph>, comparing a <codeph>NULL</codeph>
-          to anything else returns <codeph>NULL</codeph>, making the comparison meaningless. For example,
-          <codeph>10 &gt; NULL</codeph> produces <codeph>NULL</codeph>, <codeph>10 &lt; NULL</codeph> also produces
-          <codeph>NULL</codeph>, <codeph>5 BETWEEN 1 AND NULL</codeph> produces <codeph>NULL</codeph>, and so on.
+          In all other contexts besides sorting with <codeph>ORDER BY</codeph>, comparing a
+          <codeph>NULL</codeph> to anything else returns <codeph>NULL</codeph>, making the
+          comparison meaningless. For example, <codeph>10 &gt; NULL</codeph> produces
+          <codeph>NULL</codeph>, <codeph>10 &lt; NULL</codeph> also produces
+          <codeph>NULL</codeph>, <codeph>5 BETWEEN 1 AND NULL</codeph> produces
+          <codeph>NULL</codeph>, and so on.
         </li>
       </ul>
 
       <p>
         Several built-in functions serve as shorthand for evaluating expressions and returning
-        <codeph>NULL</codeph>, 0, or some other substitution value depending on the expression result:
-        <codeph>ifnull()</codeph>, <codeph>isnull()</codeph>, <codeph>nvl()</codeph>, <codeph>nullif()</codeph>,
-        <codeph>nullifzero()</codeph>, and <codeph>zeroifnull()</codeph>. See
+        <codeph>NULL</codeph>, 0, or some other substitution value depending on the expression
+        result: <codeph>ifnull()</codeph>, <codeph>isnull()</codeph>, <codeph>nvl()</codeph>,
+        <codeph>nullif()</codeph>, <codeph>nullifzero()</codeph>, and
+        <codeph>zeroifnull()</codeph>. See
         <xref href="impala_conditional_functions.xml#conditional_functions"/> for details.
       </p>
 
       <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
+
       <p rev="kudu">
         Columns in Kudu tables have an attribute that specifies whether or not they can contain
-        <codeph>NULL</codeph> values. A column with a <codeph>NULL</codeph> attribute can contain
-        nulls. A column with a <codeph>NOT NULL</codeph> attribute cannot contain any nulls, and
-        an <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or <codeph>UPSERT</codeph> statement
-        will skip any row that attempts to store a null in a column designated as <codeph>NOT NULL</codeph>.
-        Kudu tables default to the <codeph>NULL</codeph> setting for each column, except columns that
-        are part of the primary key.
+        <codeph>NULL</codeph> values. A column with a <codeph>NULL</codeph> attribute can
+        contain nulls. A column with a <codeph>NOT NULL</codeph> attribute cannot contain any
+        nulls, and an <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or
+        <codeph>UPSERT</codeph> statement will skip any row that attempts to store a null in a
+        column designated as <codeph>NOT NULL</codeph>. Kudu tables default to the
+        <codeph>NULL</codeph> setting for each column, except columns that are part of the
+        primary key.
       </p>
+
       <p rev="kudu">
-        In addition to columns with the <codeph>NOT NULL</codeph> attribute, Kudu tables also have
-        restrictions on <codeph>NULL</codeph> values in columns that are part of the primary key for
-        a table. No column that is part of the primary key in a Kudu table can contain any
-        <codeph>NULL</codeph> values.
+        In addition to columns with the <codeph>NOT NULL</codeph> attribute, Kudu tables also
+        have restrictions on <codeph>NULL</codeph> values in columns that are part of the
+        primary key for a table. No column that is part of the primary key in a Kudu table can
+        contain any <codeph>NULL</codeph> values.
       </p>
 
     </conbody>
+
   </concept>
+
 </concept>


[impala] 03/03: IMPALA-8846: Undefined behaviour in RleEncoder::Put

Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit f26a32f85542bfdbceb7306a06327f66dc30294a
Author: Daniel Becker <da...@cloudera.com>
AuthorDate: Fri Aug 9 10:08:55 2019 +0200

    IMPALA-8846: Undefined behaviour in RleEncoder::Put
    
    To test for overflow, we used 'repeat_count_ <=
    std::numeric_limits<int32_t>::max()', but this is always true as
    repeat_count_ is an int. This could have lead to undefined behaviour
    because we increment repeat_count_ afterwards.
    
    Changed the comparison not to allow equality.
    
    Change-Id: I269443d1f1680e672fde7dd88eab5fcb56c65613
    Reviewed-on: http://gerrit.cloudera.org:8080/14042
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/util/rle-encoding.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/be/src/util/rle-encoding.h b/be/src/util/rle-encoding.h
index 4da9fd9..4ea3769 100644
--- a/be/src/util/rle-encoding.h
+++ b/be/src/util/rle-encoding.h
@@ -344,7 +344,7 @@ inline bool RleEncoder::Put(uint64_t value) {
   if (UNLIKELY(buffer_full_)) return false;
 
   if (LIKELY(current_value_ == value
-      && repeat_count_ <= std::numeric_limits<int32_t>::max())) {
+      && repeat_count_ < std::numeric_limits<int32_t>::max())) {
     ++repeat_count_;
     if (repeat_count_ > 8) {
       // This is just a continuation of the current run, no need to buffer the