Posted to commits@impala.apache.org by mi...@apache.org on 2018/05/09 21:10:33 UTC
[24/51] [partial] impala git commit: [DOCS] Impala doc site update for 3.0
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_langref_unsupported.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_langref_unsupported.html b/docs/build3x/html/topics/impala_langref_unsupported.html
new file mode 100644
index 0000000..769bf86
--- /dev/null
+++ b/docs/build3x/html/topics/impala_langref_unsupported.html
@@ -0,0 +1,337 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="langref_hiveql_delta"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>SQL Differences Between Impala and Hive</title></head><body id="langref_hiveql_delta"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">SQL Differences Between Impala and Hive</h1>
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+
+ Impala's SQL syntax follows the SQL-92 standard, and includes many industry extensions in areas such as
+ built-in functions. See <a class="xref" href="impala_porting.html#porting">Porting SQL from Other Database Systems to Impala</a> for a general discussion of adapting SQL
+ code from a variety of database systems to Impala.
+ </p>
+
+ <p class="p">
+ Because Impala and Hive share the same metastore database and their tables are often used interchangeably,
+ the following section covers differences between Impala and Hive in detail.
+ </p>
+
+ <p class="p toc inpage"></p>
+ </div>
+
+ <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="langref_hiveql_delta__langref_hiveql_unsupported">
+
+ <h2 class="title topictitle2" id="ariaid-title2">HiveQL Features not Available in Impala</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ The current release of Impala does not support the following SQL features that you might be familiar with
+ from HiveQL:
+ </p>
+
+
+
+ <ul class="ul">
+
+
+ <li class="li">
+ Extensibility mechanisms such as <code class="ph codeph">TRANSFORM</code>, custom file formats, or custom SerDes.
+ </li>
+
+ <li class="li">
+ The <code class="ph codeph">DATE</code> data type.
+ </li>
+
+ <li class="li">
+ XML and JSON functions.
+ </li>
+
+ <li class="li">
+ Certain aggregate functions from HiveQL: <code class="ph codeph">covar_pop</code>, <code class="ph codeph">covar_samp</code>,
+ <code class="ph codeph">corr</code>, <code class="ph codeph">percentile</code>, <code class="ph codeph">percentile_approx</code>,
+ <code class="ph codeph">histogram_numeric</code>, <code class="ph codeph">collect_set</code>; Impala supports the set of aggregate
+ functions listed in <a class="xref" href="impala_aggregate_functions.html#aggregate_functions">Impala Aggregate Functions</a> and analytic
+ functions listed in <a class="xref" href="impala_analytic_functions.html#analytic_functions">Impala Analytic Functions</a>.
+ </li>
+
+ <li class="li">
+ Sampling.
+ </li>
+
+ <li class="li">
+ Lateral views. In <span class="keyword">Impala 2.3</span> and higher, Impala supports queries on complex types
+ (<code class="ph codeph">STRUCT</code>, <code class="ph codeph">ARRAY</code>, or <code class="ph codeph">MAP</code>), using join notation
+ rather than the <code class="ph codeph">EXPLODE()</code> function.
+ See <a class="xref" href="impala_complex_types.html#complex_types">Complex Types (Impala 2.3 or higher only)</a> for details about Impala support for complex types.
+ </li>
+
+ <li class="li">
+ Multiple <code class="ph codeph">DISTINCT</code> clauses per query, although Impala includes some workarounds for this
+ limitation.
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+ <p class="p">
+ By default, Impala only allows a single <code class="ph codeph">COUNT(DISTINCT <var class="keyword varname">columns</var>)</code>
+ expression in each query.
+ </p>
+ <p class="p">
+ If you do not need an exact count, you can produce an estimate of the distinct values for a column by
+ specifying <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>; a query can contain multiple instances of
+ <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>. To make Impala automatically rewrite
+ <code class="ph codeph">COUNT(DISTINCT)</code> expressions to <code class="ph codeph">NDV()</code>, enable the
+ <code class="ph codeph">APPX_COUNT_DISTINCT</code> query option.
+ </p>
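+ <p class="p">
+ As a minimal sketch of the <code class="ph codeph">NDV()</code> approach, assuming an illustrative table
+ <code class="ph codeph">t1</code> with columns <code class="ph codeph">col1</code> and <code class="ph codeph">col2</code>,
+ the two forms might look like this:
+ </p>
+<pre class="pre codeblock"><code>-- Approximate distinct counts for two columns in a single query.
+select ndv(col1) as est1, ndv(col2) as est2 from t1;
+
+-- Alternatively, let Impala rewrite COUNT(DISTINCT) to NDV() for the session.
+set appx_count_distinct=true;
+select count(distinct col1), count(distinct col2) from t1;
+</code></pre>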
+ <p class="p">
+ To produce the same result as multiple <code class="ph codeph">COUNT(DISTINCT)</code> expressions, you can use the
+ following technique for queries involving a single table:
+ </p>
+<pre class="pre codeblock"><code>select v1.c1 result1, v2.c1 result2 from
+ (select count(distinct col1) as c1 from t1) v1
+ cross join
+ (select count(distinct col2) as c1 from t1) v2;
+</code></pre>
+ <p class="p">
+ Because <code class="ph codeph">CROSS JOIN</code> is an expensive operation, prefer to use the <code class="ph codeph">NDV()</code>
+ technique wherever practical.
+ </p>
+ </div>
+ </li>
+ </ul>
+
+ <div class="p">
+ User-defined functions (UDFs) are supported starting in Impala 1.2. See <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a>
+ for full details on Impala UDFs.
+ <ul class="ul">
+ <li class="li">
+ <p class="p">
+ Impala supports high-performance UDFs written in C++, as well as reusing some Java-based Hive UDFs.
+ </p>
+ </li>
+
+ <li class="li">
+ <p class="p">
+ Impala supports scalar UDFs and user-defined aggregate functions (UDAFs). Impala does not currently
+ support user-defined table generating functions (UDTFs).
+ </p>
+ </li>
+
+ <li class="li">
+ <p class="p">
+ Only Impala-supported column types are supported in Java-based UDFs.
+ </p>
+ </li>
+
+ <li class="li">
+ <p class="p">
+ The Hive <code class="ph codeph">current_user()</code> function cannot be
+ called from a Java UDF through Impala.
+ </p>
+ </li>
+ </ul>
+ </div>
+
+ <p class="p">
+ Impala does not currently support these HiveQL statements:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">ANALYZE TABLE</code> (the Impala equivalent is <code class="ph codeph">COMPUTE STATS</code>)
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">DESCRIBE COLUMN</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">DESCRIBE DATABASE</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">EXPORT TABLE</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">IMPORT TABLE</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">SHOW TABLE EXTENDED</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">SHOW INDEXES</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">SHOW COLUMNS</code>
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">INSERT OVERWRITE DIRECTORY</code>; use <code class="ph codeph">INSERT OVERWRITE <var class="keyword varname">table_name</var></code>
+ or <code class="ph codeph">CREATE TABLE AS SELECT</code> to materialize query results into the HDFS directory associated
+ with an Impala table.
+ </li>
+ </ul>
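+ <p class="p">
+ The substitutions named in the list above can be sketched as follows. The table
+ <code class="ph codeph">t1</code>, its column <code class="ph codeph">col1</code>, and the result table
+ <code class="ph codeph">t1_summary</code> are hypothetical names used only for illustration:
+ </p>
+<pre class="pre codeblock"><code>-- Instead of ANALYZE TABLE:
+compute stats t1;
+
+-- Instead of INSERT OVERWRITE DIRECTORY, materialize the results in a table
+-- and read the data files from that table's HDFS directory.
+create table t1_summary as select col1, count(*) as cnt from t1 group by col1;
+
+-- Refresh the same table later, replacing its previous contents.
+insert overwrite t1_summary select col1, count(*) as cnt from t1 group by col1;
+</code></pre>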
+ <p class="p">
+ Impala respects the <code class="ph codeph">serialization.null.format</code> table
+ property only for TEXT tables and ignores the property for Parquet and
+ other formats. Hive respects the <code class="ph codeph">serialization.null.format</code>
+ property for Parquet and other formats and converts matching values
+ to NULL during the scan. See <a class="xref" href="impala_txtfile.html">Using Text Data Files with Impala Tables</a> for
+ using the table property in Impala.
+ </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="langref_hiveql_delta__langref_hiveql_semantics">
+
+ <h2 class="title topictitle2" id="ariaid-title3">Semantic Differences Between Impala and HiveQL Features</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ This section covers instances where Impala and Hive have similar functionality, sometimes including the
+ same syntax, but there are differences in the runtime semantics of those features.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Security:</strong>
+ </p>
+
+ <p class="p">
+ Impala utilizes the <a class="xref" href="http://sentry.apache.org/" target="_blank">Apache
+ Sentry </a> authorization framework, which provides fine-grained role-based access control
+ to protect data against unauthorized access or tampering.
+ </p>
+
+ <p class="p">
+ The Hive component now includes Sentry-enabled <code class="ph codeph">GRANT</code>,
+ <code class="ph codeph">REVOKE</code>, and <code class="ph codeph">CREATE/DROP ROLE</code> statements. Earlier Hive releases had a
+ privilege system with <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements that were primarily
+ intended to prevent accidental deletion of data, rather than a security mechanism to protect against
+ malicious users.
+ </p>
+
+ <p class="p">
+ Impala can make use of privileges set up through Hive <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements.
+ Impala has its own <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements in Impala 2.0 and higher.
+ See <a class="xref" href="impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a> for the details of authorization in Impala, including
+ how to switch from the original policy file-based privilege model to the Sentry service using privileges
+ stored in the metastore database.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">SQL statements and clauses:</strong>
+ </p>
+
+ <p class="p">
+ The semantics of Impala SQL statements differ from HiveQL in some cases, even where the two use similar SQL
+ statement and clause names:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ Impala uses different syntax and names for query hints, <code class="ph codeph">[SHUFFLE]</code> and
+ <code class="ph codeph">[NOSHUFFLE]</code> rather than <code class="ph codeph">MapJoin</code> or <code class="ph codeph">StreamJoin</code>. See
+ <a class="xref" href="impala_joins.html#joins">Joins in Impala SELECT Statements</a> for the Impala details.
+ </li>
+
+ <li class="li">
+ Impala does not expose MapReduce-specific features of <code class="ph codeph">SORT BY</code>, <code class="ph codeph">DISTRIBUTE
+ BY</code>, or <code class="ph codeph">CLUSTER BY</code>.
+ </li>
+
+ <li class="li">
+ Impala does not require queries to include a <code class="ph codeph">FROM</code> clause.
+ </li>
+ </ul>
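+ <p class="p">
+ A brief sketch of two of these differences, assuming hypothetical tables <code class="ph codeph">t1</code> and
+ <code class="ph codeph">t2</code> that share a column <code class="ph codeph">col1</code>:
+ </p>
+<pre class="pre codeblock"><code>-- A query with no FROM clause is valid in Impala.
+select 1 + 1;
+
+-- An Impala join hint is written in square brackets after the JOIN keyword.
+select t1.col1 from t1 join [shuffle] t2 on t1.col1 = t2.col1;
+</code></pre>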
+
+ <p class="p">
+ <strong class="ph b">Data types:</strong>
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ Impala supports a limited set of implicit casts. This can help avoid undesired results from unexpected
+ casting behavior.
+ <ul class="ul">
+ <li class="li">
+ Impala does not implicitly cast between string and numeric or Boolean types. Always use
+ <code class="ph codeph">CAST()</code> for these conversions.
+ </li>
+
+ <li class="li">
+ Impala does perform implicit casts among the numeric types, when going from a smaller or less precise
+ type to a larger or more precise one. For example, Impala will implicitly convert a
+ <code class="ph codeph">SMALLINT</code> to a <code class="ph codeph">BIGINT</code> or <code class="ph codeph">FLOAT</code>, but to convert from
+ <code class="ph codeph">DOUBLE</code> to <code class="ph codeph">FLOAT</code> or <code class="ph codeph">INT</code> to <code class="ph codeph">TINYINT</code>
+ requires a call to <code class="ph codeph">CAST()</code> in the query.
+ </li>
+
+ <li class="li">
+ Impala does perform implicit casts from string to timestamp. Impala has a restricted set of literal
+ formats for the <code class="ph codeph">TIMESTAMP</code> data type and the <code class="ph codeph">from_unixtime()</code> format
+ string; see <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for details.
+ </li>
+ </ul>
+ <p class="p">
+ See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for full details on implicit and explicit casting for
+ all types, and <a class="xref" href="impala_conversion_functions.html#conversion_functions">Impala Type Conversion Functions</a> for details about
+ the <code class="ph codeph">CAST()</code> function.
+ </p>
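+ <p class="p">
+ The casting rules above can be illustrated with a few literal-only queries. This is a sketch rather than an
+ exhaustive list of the conversions Impala accepts:
+ </p>
+<pre class="pre codeblock"><code>-- String to number requires an explicit cast.
+select cast('100' as int) + 1;
+
+-- Widening from SMALLINT to BIGINT happens implicitly.
+select cast(5 as smallint) + cast(10 as bigint);
+
+-- Narrowing from INT to TINYINT must be explicit.
+select cast(cast(100 as int) as tinyint);
+</code></pre>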
+ </li>
+
+ <li class="li">
+ Impala does not store or interpret timestamps using the local timezone, to avoid undesired results from
+ unexpected time zone issues. Timestamps are stored and interpreted relative to UTC. This difference can
+ produce different results for some calls to similarly named date/time functions between Impala and Hive.
+ See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for details about the Impala
+ functions. See <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for a discussion of how Impala handles
+ time zones, and configuration options you can use to make Impala match the Hive behavior more closely
+ when dealing with Parquet-encoded <code class="ph codeph">TIMESTAMP</code> data or when converting between
+ the local time zone and UTC.
+ </li>
+
+ <li class="li">
+ The Impala <code class="ph codeph">TIMESTAMP</code> type can represent dates ranging from 1400-01-01 to 9999-12-31.
+ This is different from the Hive date range, which is 0000-01-01 to 9999-12-31.
+ </li>
+
+ <li class="li">
+ <p class="p">
+ Impala does not return column overflows as <code class="ph codeph">NULL</code>, so that customers can distinguish
+ between <code class="ph codeph">NULL</code> data and overflow conditions similar to how they do so with traditional
+ database systems. Impala returns the largest or smallest value in the range for the type. For example,
+ valid values for a <code class="ph codeph">tinyint</code> range from -128 to 127. In Impala, a <code class="ph codeph">tinyint</code>
+ with a value of -200 returns -128 rather than <code class="ph codeph">NULL</code>. A <code class="ph codeph">tinyint</code> with a
+ value of 200 returns 127.
+ </p>
+ </li>
+
+ </ul>
+
+ <p class="p">
+ <strong class="ph b">Miscellaneous features:</strong>
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ Impala does not provide virtual columns.
+ </li>
+
+ <li class="li">
+ Impala does not expose locking.
+ </li>
+
+ <li class="li">
+ Impala does not expose some configuration properties.
+ </li>
+ </ul>
+ </div>
+ </article>
+</article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_ldap.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_ldap.html b/docs/build3x/html/topics/impala_ldap.html
new file mode 100644
index 0000000..7729e93
--- /dev/null
+++ b/docs/build3x/html/topics/impala_ldap.html
@@ -0,0 +1,294 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_authentication.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="ldap"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Enabling LDAP Authentication for Impala</title></head><body id="ldap"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">Enabling LDAP Authentication for Impala</h1>
+
+
+ <div class="body conbody">
+
+
+
+ <p class="p"> Authentication is the process of allowing only specified named users to
+ access the server (in this case, the Impala server). This feature is
+ crucial for any production deployment, to prevent misuse, tampering, or
+ excessive load on the server. Impala can use LDAP for authentication,
+ verifying the credentials of each user who connects through
+ <span class="keyword cmdname">impala-shell</span>, Hue, a Business Intelligence tool, JDBC
+ or ODBC application, and so on. </p>
+
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+ Regardless of the authentication mechanism used, Impala always creates HDFS directories and data files
+ owned by the same user (typically <code class="ph codeph">impala</code>). To implement user-level access to different
+ databases, tables, columns, partitions, and so on, use the Sentry authorization feature, as explained in
+ <a class="xref" href="../shared/../topics/impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a>.
+ </div>
+
+ <p class="p">
+ An alternative form of authentication you can use is Kerberos, described in
+ <a class="xref" href="impala_kerberos.html#kerberos">Enabling Kerberos Authentication for Impala</a>.
+ </p>
+
+ <p class="p toc inpage"></p>
+
+ </div>
+
+ <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_authentication.html">Impala Authentication</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="ldap__ldap_prereqs">
+
+ <h2 class="title topictitle2" id="ariaid-title2">Requirements for Using Impala with LDAP</h2>
+
+
+ <div class="body conbody">
+
+ <p class="p">
+ Authentication against LDAP servers is available in Impala 1.2.2 and higher. Impala 1.4.0 adds support for
+ secure LDAP authentication through SSL and TLS.
+ </p>
+
+ <p class="p">
+ The Impala LDAP support lets you use Impala with systems such as Active Directory that use LDAP behind the
+ scenes.
+ </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="ldap__ldap_client_server">
+
+ <h2 class="title topictitle2" id="ariaid-title3">Client-Server Considerations for LDAP</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ Only connections from clients to Impala can be authenticated by LDAP.
+ </p>
+
+ <p class="p"> You must use the Kerberos authentication mechanism for connections
+ between internal Impala components, such as between the
+ <span class="keyword cmdname">impalad</span>, <span class="keyword cmdname">statestored</span>, and
+ <span class="keyword cmdname">catalogd</span> daemons. See <a class="xref" href="impala_kerberos.html#kerberos">Enabling Kerberos Authentication for Impala</a> on how to set up Kerberos for
+ Impala. </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="ldap__ldap_config">
+
+ <h2 class="title topictitle2" id="ariaid-title4">Server-Side LDAP Setup</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ These requirements apply on the server side when configuring and starting Impala:
+ </p>
+
+ <p class="p">
+ To enable LDAP authentication, set the following startup options for <span class="keyword cmdname">impalad</span>:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">--enable_ldap_auth</code> enables LDAP-based authentication between the client and Impala.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">--ldap_uri</code> sets the URI of the LDAP server to use. Typically, the URI is prefixed with
+ <code class="ph codeph">ldap://</code>. In Impala 1.4.0 and higher, you can specify secure SSL-based LDAP transport by
+ using the prefix <code class="ph codeph">ldaps://</code>. The URI can optionally specify the port, for example:
+ <code class="ph codeph">ldap://ldap_server.example.com:389</code> or
+ <code class="ph codeph">ldaps://ldap_server.example.com:636</code>. (389 and 636 are the default ports for non-SSL and
+ SSL LDAP connections, respectively.)
+ </li>
+
+
+
+ <li class="li">
+ For <code class="ph codeph">ldaps://</code> connections secured by SSL,
+ <code class="ph codeph">--ldap_ca_certificate="<var class="keyword varname">/path/to/certificate/pem</var>"</code> specifies the
+ location of the certificate in standard <code class="ph codeph">.PEM</code> format. Store this certificate on the local
+ filesystem, in a location that only the <code class="ph codeph">impala</code> user and other trusted users can read.
+ </li>
+
+
+ </ul>
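+
+ <p class="p">
+ Put together, an <span class="keyword cmdname">impalad</span> startup that uses SSL-secured LDAP might look like the
+ following sketch. The host name and certificate path are the placeholder values from the list above, and any
+ other startup options your deployment needs are omitted:
+ </p>
+<pre class="pre codeblock"><code>impalad --enable_ldap_auth \
+  --ldap_uri=ldaps://ldap_server.example.com:636 \
+  --ldap_ca_certificate="/path/to/certificate/pem"
+</code></pre>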
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="ldap__ldap_bind_strings">
+
+ <h2 class="title topictitle2" id="ariaid-title5">Support for Custom Bind Strings</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ When Impala connects to LDAP it issues a bind call to the LDAP server to authenticate as the connected
+ user. Impala clients, including the Impala shell, provide the short name of the user to Impala. This is
+ necessary so that Impala can use Sentry for role-based access, which uses short names.
+ </p>
+
+ <p class="p">
+ However, LDAP servers often require more complex, structured usernames for authentication. Impala supports
+ three ways of transforming the short name (for example, <code class="ph codeph">'henry'</code>) to a more complicated
+ string. If necessary, specify one of the following configuration options
+ when starting the <span class="keyword cmdname">impalad</span> daemon on each DataNode:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">--ldap_domain</code>: Replaces the username with a string
+ <code class="ph codeph"><var class="keyword varname">username</var>@<var class="keyword varname">ldap_domain</var></code>.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">--ldap_baseDN</code>: Replaces the username with a <span class="q">"distinguished name"</span> (DN) of the form:
+ <code class="ph codeph">uid=<var class="keyword varname">userid</var>,ldap_baseDN</code>. (This is equivalent to a Hive option.)
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">--ldap_bind_pattern</code>: This is the most general option, and replaces the username with the
+ string <var class="keyword varname">ldap_bind_pattern</var> where all instances of the string <code class="ph codeph">#UID</code> are
+ replaced with <var class="keyword varname">userid</var>. For example, an <code class="ph codeph">ldap_bind_pattern</code> of
+ <code class="ph codeph">"user=#UID,OU=foo,CN=bar"</code> with a username of <code class="ph codeph">henry</code> will construct a
+ bind name of <code class="ph codeph">"user=henry,OU=foo,CN=bar"</code>.
+ </li>
+ </ul>
+
+ <p class="p">
+ These options are mutually exclusive; Impala does not start if more than one of these options is specified.
+ </p>
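+
+ <p class="p">
+ As an illustration, the following listing shows the bind name that each option would produce for the short
+ name <code class="ph codeph">henry</code>. The domain <code class="ph codeph">example.com</code> and the base DN
+ <code class="ph codeph">ou=People,dc=example,dc=com</code> are hypothetical values; the indented lines are
+ explanatory and are not part of the options. Specify only one of the three:
+ </p>
+<pre class="pre codeblock"><code>--ldap_domain=example.com
+    bind name: henry@example.com
+
+--ldap_baseDN=ou=People,dc=example,dc=com
+    bind name: uid=henry,ou=People,dc=example,dc=com
+
+--ldap_bind_pattern="user=#UID,OU=foo,CN=bar"
+    bind name: user=henry,OU=foo,CN=bar
+</code></pre>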
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title6" id="ldap__ldap_security">
+
+ <h2 class="title topictitle2" id="ariaid-title6">Secure LDAP Connections</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ To avoid sending credentials over the wire in cleartext, you must configure a secure connection both
+ between the client and Impala and between Impala and the LDAP server. The secure connection can use SSL or
+ TLS.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Secure LDAP connections through SSL:</strong>
+ </p>
+
+ <p class="p">
+ For SSL-enabled LDAP connections, specify a prefix of <code class="ph codeph">ldaps://</code> instead of
+ <code class="ph codeph">ldap://</code>. Also, the default port for SSL-enabled LDAP connections is 636 instead of 389.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Secure LDAP connections through TLS:</strong>
+ </p>
+
+ <p class="p">
+ <a class="xref" href="http://en.wikipedia.org/wiki/Transport_Layer_Security" target="_blank">TLS</a>,
+ the successor to the SSL protocol, is supported by most modern LDAP servers. Unlike SSL connections, TLS
+ connections can be made on the same server port as non-TLS connections. To secure all connections using
+ TLS, specify the following flags as startup options to the <span class="keyword cmdname">impalad</span> daemon:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">--ldap_tls</code> tells Impala to start a TLS connection to the LDAP server, and to fail
+ authentication if it cannot be done.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">--ldap_ca_certificate="<var class="keyword varname">/path/to/certificate/pem</var>"</code> specifies the
+ location of the certificate in standard <code class="ph codeph">.PEM</code> format. Store this certificate on the local
+ filesystem, in a location that only the <code class="ph codeph">impala</code> user and other trusted users can read.
+ </li>
+ </ul>
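+
+ <p class="p">
+ A sketch of the TLS variant, again using the placeholder host name and certificate path from this topic.
+ Note that the URI keeps the <code class="ph codeph">ldap://</code> prefix and the non-SSL port, because TLS is
+ negotiated on the same port as unsecured connections:
+ </p>
+<pre class="pre codeblock"><code>impalad --enable_ldap_auth \
+  --ldap_uri=ldap://ldap_server.example.com:389 \
+  --ldap_tls \
+  --ldap_ca_certificate="/path/to/certificate/pem"
+</code></pre>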
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title7" id="ldap__ldap_impala_shell">
+
+ <h2 class="title topictitle2" id="ariaid-title7">LDAP Authentication for impala-shell Interpreter</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ To connect to Impala using LDAP authentication, you specify command-line options to the
+ <span class="keyword cmdname">impala-shell</span> command interpreter and enter the password when prompted:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">-l</code> enables LDAP authentication.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">-u</code> sets the user. Per Active Directory, the user is the short username, not the full
+ LDAP distinguished name. If your LDAP settings include a search base, use the
+ <code class="ph codeph">--ldap_bind_pattern</code> option on the <span class="keyword cmdname">impalad</span> daemon to translate the short user
+ name from <span class="keyword cmdname">impala-shell</span> automatically to the fully qualified name.
+
+ </li>
+
+ <li class="li">
+ <span class="keyword cmdname">impala-shell</span> automatically prompts for the password.
+ </li>
+ </ul>
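+
+ <p class="p">
+ For example, a connection as the user <code class="ph codeph">henry</code> might be started as shown below. The
+ host name is hypothetical, and the <code class="ph codeph">-i</code> option, which names the
+ <span class="keyword cmdname">impalad</span> host and port to connect to, is included only to make the sketch complete:
+ </p>
+<pre class="pre codeblock"><code>impala-shell -l -u henry -i impalad_host.example.com:21000
+</code></pre>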
+
+ <p class="p">
+ For the full list of available <span class="keyword cmdname">impala-shell</span> options, see
+ <a class="xref" href="impala_shell_options.html#shell_options">impala-shell Configuration Options</a>.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">LDAP authentication for JDBC applications:</strong> See <a class="xref" href="impala_jdbc.html#impala_jdbc">Configuring Impala to Work with JDBC</a> for the
+ format to use with the JDBC connection string for servers using LDAP authentication.
+ </p>
+ </div>
+ </article>
+ <article class="topic concept nested1" aria-labelledby="ariaid-title8" id="ldap__ldap_impala_hue">
+ <h2 class="title topictitle2" id="ariaid-title8">Enabling LDAP for Impala in Hue</h2>
+
+ <div class="body conbody">
+ <section class="section" id="ldap_impala_hue__ldap_impala_hue_cmdline"><h3 class="title sectiontitle">Enabling LDAP for Impala in Hue Using the Command Line</h3>
+
+ <div class="p">LDAP authentication for the Impala app in Hue can be enabled by
+ setting the following properties under the <code class="ph codeph">[impala]</code>
+ section in <code class="ph codeph">hue.ini</code>. <table class="table" id="ldap_impala_hue__ldap_impala_hue_configs"><caption></caption><colgroup><col style="width:33.33333333333333%"><col style="width:66.66666666666666%"></colgroup><tbody class="tbody">
+ <tr class="row">
+ <td class="entry nocellnorowborder"><code class="ph codeph">auth_username</code></td>
+ <td class="entry nocellnorowborder">LDAP username of Hue user to be authenticated.</td>
+ </tr>
+ <tr class="row">
+ <td class="entry nocellnorowborder"><code class="ph codeph">auth_password</code></td>
+ <td class="entry nocellnorowborder">
+ <p class="p">LDAP password of Hue user to be authenticated.</p>
+ </td>
+ </tr>
+ </tbody></table>These login details are only used by Impala to authenticate to
+ LDAP. The Impala service trusts Hue to have already validated the user
+ being impersonated, rather than simply passing on the credentials.</div>
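+ <p class="p">
+ A minimal sketch of the corresponding <code class="ph codeph">hue.ini</code> fragment follows. The user name and
+ password shown are placeholders, not defaults:
+ </p>
+<pre class="pre codeblock"><code>[impala]
+auth_username=hue_ldap_user
+auth_password=hue_ldap_password
+</code></pre>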
+ </section>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title9" id="ldap__ldap_delegation">
+ <h2 class="title topictitle2" id="ariaid-title9">Enabling Impala Delegation for LDAP Users</h2>
+ <div class="body conbody">
+ <p class="p">
+ See <a class="xref" href="impala_delegation.html#delegation">Configuring Impala Delegation for Hue and BI Tools</a> for details about the delegation feature
+ that lets certain users submit queries using the credentials of other users.
+ </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title10" id="ldap__ldap_restrictions">
+
+ <h2 class="title topictitle2" id="ariaid-title10">LDAP Restrictions for Impala</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ The LDAP support is preliminary. It has currently been tested only against Active Directory.
+ </p>
+ </div>
+ </article>
+</article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_limit.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_limit.html b/docs/build3x/html/topics/impala_limit.html
new file mode 100644
index 0000000..22dc7a5
--- /dev/null
+++ b/docs/build3x/html/topics/impala_limit.html
@@ -0,0 +1,168 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_select.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="limit"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>LIMIT Clause</title></head><body id="limit"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">LIMIT Clause</h1>
+
+
+ <div class="body conbody">
+
+ <p class="p">
+ The <code class="ph codeph">LIMIT</code> clause in a <code class="ph codeph">SELECT</code> query sets a maximum number of rows for the
+ result set. Pre-selecting the maximum size of the result set helps Impala to optimize memory usage while
+ processing a distributed query.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Syntax:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>LIMIT <var class="keyword varname">constant_integer_expression</var></code></pre>
+
+ <p class="p">
+ The argument to the <code class="ph codeph">LIMIT</code> clause must evaluate to a constant value. It can be a numeric
+ literal, or another kind of numeric expression involving operators, casts, and function return values. You
+ cannot refer to a column or use a subquery.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Usage notes:</strong>
+ </p>
+
+ <p class="p">
+ This clause is useful in contexts such as:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ To return at most N items from a top-N query, such as the 10 highest-rated items in a shopping category or
+ the 50 hostnames that refer the most traffic to a web site.
+ </li>
+
+ <li class="li">
+ To demonstrate some sample values from a table or a particular query. (To display some arbitrary items, use
+ a query with no <code class="ph codeph">ORDER BY</code> clause. An <code class="ph codeph">ORDER BY</code> clause causes additional
+ memory and/or disk usage during the query.)
+ </li>
+
+ <li class="li">
+ To keep queries from returning huge result sets by accident if a table is larger than expected, or a
+ <code class="ph codeph">WHERE</code> clause matches more rows than expected.
+ </li>
+ </ul>
+
+ <p class="p">
+ Originally, the value for the <code class="ph codeph">LIMIT</code> clause had to be a numeric literal. In Impala 1.2.1 and
+ higher, it can be a numeric expression.
+ </p>
+
+ <p class="p">
+ Prior to Impala 1.4.0, Impala required any query including an
+ <code class="ph codeph"><a class="xref" href="../shared/../topics/impala_order_by.html#order_by">ORDER BY</a></code> clause to also use a
+ <code class="ph codeph"><a class="xref" href="../shared/../topics/impala_limit.html#limit">LIMIT</a></code> clause. In Impala 1.4.0 and
+ higher, the <code class="ph codeph">LIMIT</code> clause is optional for <code class="ph codeph">ORDER BY</code> queries. In cases where
+ sorting a huge result set requires enough memory to exceed the Impala memory limit for a particular node,
+ Impala automatically uses a temporary disk work area to perform the sort operation.
+ </p>
+
+ <p class="p">
+ See <a class="xref" href="impala_order_by.html#order_by">ORDER BY Clause</a> for details.
+ </p>
+
+ <p class="p">
+ In Impala 1.2.1 and higher, you can combine a <code class="ph codeph">LIMIT</code> clause with an <code class="ph codeph">OFFSET</code>
+ clause to produce a small result set that is different from a top-N query, for example, to return items 11
+ through 20. This technique can be used to simulate <span class="q">"paged"</span> results. Because Impala queries typically
+ involve substantial amounts of I/O, use this technique only for compatibility in cases where you cannot
+ rewrite the application logic. For best performance and scalability, wherever practical, query as many
+ items as you expect to need, cache them on the application side, and display small groups of results to
+ users using application logic.
+ </p>
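+
+ <p class="p">
+ As a minimal sketch of the paging technique, assume a table such as the <code class="ph codeph">numbers</code> table used in
+ the examples for this topic, with an <code class="ph codeph">INT</code> column <code class="ph codeph">x</code> holding the values 1
+ through 5. The page size of 2 is an arbitrary choice for illustration. Impala expects an
+ <code class="ph codeph">ORDER BY</code> clause in a query that uses <code class="ph codeph">OFFSET</code>, so that the row order is
+ well defined from one page to the next.
+ </p>
+
+<pre class="pre codeblock"><code>-- Page 1: the first 2 rows of the sorted result (the values 1 and 2).
+SELECT x FROM numbers ORDER BY x LIMIT 2;
+-- Page 2: skip the first 2 rows, then return the next 2 (the values 3 and 4).
+SELECT x FROM numbers ORDER BY x LIMIT 2 OFFSET 2;
+-- Page 3: skip 4 rows; only one row (the value 5) remains.
+SELECT x FROM numbers ORDER BY x LIMIT 2 OFFSET 4;
+</code></pre>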
+
+ <p class="p">
+ <strong class="ph b">Restrictions:</strong>
+ </p>
+
+ <p class="p">
+ Correlated subqueries used in <code class="ph codeph">EXISTS</code> and <code class="ph codeph">IN</code> operators cannot include a
+ <code class="ph codeph">LIMIT</code> clause.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+
+ <p class="p">
+ The following example shows how the <code class="ph codeph">LIMIT</code> clause caps the size of the result set, with the
+ limit being applied after any other clauses such as <code class="ph codeph">WHERE</code>.
+ </p>
+
+<pre class="pre codeblock"><code>[localhost:21000] > create database limits;
+[localhost:21000] > use limits;
+[localhost:21000] > create table numbers (x int);
+[localhost:21000] > insert into numbers values (1), (3), (4), (5), (2);
+Inserted 5 rows in 1.34s
+[localhost:21000] > select x from numbers limit 100;
++---+
+| x |
++---+
+| 1 |
+| 3 |
+| 4 |
+| 5 |
+| 2 |
++---+
+Returned 5 row(s) in 0.26s
+[localhost:21000] > select x from numbers limit 3;
++---+
+| x |
++---+
+| 1 |
+| 3 |
+| 4 |
++---+
+Returned 3 row(s) in 0.27s
+[localhost:21000] > select x from numbers where x > 2 limit 2;
++---+
+| x |
++---+
+| 3 |
+| 4 |
++---+
+Returned 2 row(s) in 0.27s</code></pre>
+
+ <p class="p">
+ For top-N and bottom-N queries, you use the <code class="ph codeph">ORDER BY</code> and <code class="ph codeph">LIMIT</code> clauses
+ together:
+ </p>
+
+<pre class="pre codeblock"><code>[localhost:21000] > select x as "Top 3" from numbers order by x desc limit 3;
++-------+
+| top 3 |
++-------+
+| 5 |
+| 4 |
+| 3 |
++-------+
+[localhost:21000] > select x as "Bottom 3" from numbers order by x limit 3;
++----------+
+| bottom 3 |
++----------+
+| 1 |
+| 2 |
+| 3 |
++----------+
+</code></pre>
+
+ <p class="p">
+ You can use constant values besides integer literals as the <code class="ph codeph">LIMIT</code> argument:
+ </p>
+
+<pre class="pre codeblock"><code>-- Other expressions that yield constant integer values work too.
+SELECT x FROM t1 LIMIT 1e6; -- Limit is one million.
+SELECT x FROM t1 LIMIT length('hello world'); -- Limit is 11.
+SELECT x FROM t1 LIMIT 2+2; -- Limit is 4.
+SELECT x FROM t1 LIMIT cast(truncate(9.9) AS INT); -- Limit is 9.
+</code></pre>
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_select.html">SELECT Statement</a></div></div></nav></article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_lineage.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_lineage.html b/docs/build3x/html/topics/impala_lineage.html
new file mode 100644
index 0000000..12b3794
--- /dev/null
+++ b/docs/build3x/html/topics/impala_lineage.html
@@ -0,0 +1,91 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_security.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="lineage"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Viewing Lineage Information for Impala Data</title></head><body id="lineage"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">Viewing Lineage Information for Impala Data</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+
+ <dfn class="term">Lineage</dfn> is a feature that helps you track where data originated, and how
+ data propagates through the system through SQL statements such as
+ <code class="ph codeph">SELECT</code>, <code class="ph codeph">INSERT</code>, and <code class="ph codeph">CREATE
+ TABLE AS SELECT</code>.
+ </p>
+ <p class="p">
+ This type of tracking is important in high-security configurations, especially in
+ highly regulated industries such as healthcare, pharmaceuticals, financial services and
+ intelligence. For such kinds of sensitive data, it is important to know all
+ the places in the system that contain that data or other data derived from it; to verify who has accessed
+ that data; and to be able to double-check that the data used to make a decision was processed correctly and
+ not tampered with.
+ </p>
+
+ <section class="section" id="lineage__column_lineage"><h2 class="title sectiontitle">Column Lineage</h2>
+
+
+
+ <p class="p">
+ <dfn class="term">Column lineage</dfn> tracks information in fine detail, at the level of
+ particular columns rather than entire tables.
+ </p>
+
+ <p class="p">
+ For example, if you have a table with information derived from web logs, you might copy that data into
+ other tables as part of the ETL process. The ETL operations might involve transformations through
+ expressions and function calls, and rearranging the columns into more or fewer tables
+ (<dfn class="term">normalizing</dfn> or <dfn class="term">denormalizing</dfn> the data). Then for reporting, you might issue
+ queries against multiple tables and views. In this example, column lineage helps you determine that data
+ that entered the system as <code class="ph codeph">RAW_LOGS.FIELD1</code> was then turned into
+ <code class="ph codeph">WEBSITE_REPORTS.IP_ADDRESS</code> through an <code class="ph codeph">INSERT ... SELECT</code> statement. Or,
+ conversely, you could start with a reporting query against a view, and trace the origin of the data in a
+ field such as <code class="ph codeph">TOP_10_VISITORS.USER_ID</code> back to the underlying table and even further back
+ to the point where the data was first loaded into Impala.
+ </p>
+
+ <p class="p">
+ When you have tables where you need to track or control access to sensitive information at the column
+ level, see <a class="xref" href="impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a> for how to implement column-level
+ security. You set up authorization using the Sentry framework, create views that refer to specific sets of
+ columns, and then assign authorization privileges to those views rather than the underlying tables.
+ </p>
+
+ </section>
+
+ <section class="section" id="lineage__lineage_data"><h2 class="title sectiontitle">Lineage Data for Impala</h2>
+
+
+
+ <p class="p">
+ The lineage feature is enabled by default. When lineage logging is enabled, the serialized column lineage
+ graph is computed for each query and stored in a specialized log file in JSON format.
+ </p>
+
+ <p class="p">
+ Impala records queries in the lineage log if they complete successfully, or fail due to authorization
+ errors. For write operations such as <code class="ph codeph">INSERT</code> and <code class="ph codeph">CREATE TABLE AS SELECT</code>,
+ the statement is recorded in the lineage log only if it successfully completes. Therefore, the lineage
+ feature tracks data that successful queries accessed, as well as data that unsuccessful queries tried to
+ access before being blocked by an authorization failure. These kinds of queries represent data
+ that really was accessed, or where the attempted access could represent malicious activity.
+ </p>
+
+ <p class="p">
+ Impala does not record queries in the lineage log if they fail due to syntax errors, or if they fail or are
+ cancelled before they reach the stage of requesting rows from the result set.
+ </p>
+
+ <p class="p">
+ To enable or disable this feature, set or remove the <code class="ph codeph">-lineage_event_log_dir</code>
+ configuration option for the <span class="keyword cmdname">impalad</span> daemon.
+ </p>
+
+ </section>
+
+ </div>
+
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_security.html">Impala Security</a></div></div></nav></article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_literals.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_literals.html b/docs/build3x/html/topics/impala_literals.html
new file mode 100644
index 0000000..b9cfe57
--- /dev/null
+++ b/docs/build3x/html/topics/impala_literals.html
@@ -0,0 +1,424 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="literals"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Literals</title></head><body id="literals"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">Literals</h1>
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ Each of the Impala data types has corresponding notation for literal values of that type. You specify literal
+ values in SQL statements, such as in the <code class="ph codeph">SELECT</code> list or <code class="ph codeph">WHERE</code> clause of a
+ query, or as an argument to a function call. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for a complete
+ list of types, ranges, and conversion rules.
+ </p>
+
+ <p class="p toc inpage"></p>
+ </div>
+
+ <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="literals__numeric_literals">
+
+ <h2 class="title topictitle2" id="ariaid-title2">Numeric Literals</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ To write literals for the integer types (<code class="ph codeph">TINYINT</code>, <code class="ph codeph">SMALLINT</code>,
+ <code class="ph codeph">INT</code>, and <code class="ph codeph">BIGINT</code>), use a sequence of digits with optional leading zeros.
+ </p>
+
+ <p class="p">
+ To write literals for the fractional numeric types (<code class="ph codeph">DECIMAL</code>,
+ <code class="ph codeph">FLOAT</code>, and <code class="ph codeph">DOUBLE</code>), use a sequence of digits with an optional decimal
+ point (<code class="ph codeph">.</code> character). To preserve accuracy during arithmetic expressions, Impala interprets
+ literals containing a decimal point as the <code class="ph codeph">DECIMAL</code> type with the smallest appropriate precision and
+ scale, until required by the context to convert the result to <code class="ph codeph">FLOAT</code> or
+ <code class="ph codeph">DOUBLE</code>.
+ </p>
+
+ <p class="p">
+ Integer values are promoted to floating-point when necessary, based on the context.
+ </p>
+
+ <p class="p">
+ You can also use exponential notation by including an <code class="ph codeph">e</code> character. For example,
+ <code class="ph codeph">1e6</code> is 1 times 10 to the power of 6 (1 million). A number in exponential notation is
+ always interpreted as floating-point.
+ </p>
+
+ <p class="p">
+ When Impala encounters a numeric literal, it considers the type to be the <span class="q">"smallest"</span> that can
+ accurately represent the value. The type is promoted to larger or more accurate types if necessary, based
+ on subsequent parts of an expression.
+ </p>
+ <p class="p">
+ For example, you can see by the types Impala defines for the following table columns
+ how it interprets the corresponding numeric literals:
+ </p>
+<pre class="pre codeblock"><code>[localhost:21000] > create table ten as select 10 as x;
++-------------------+
+| summary |
++-------------------+
+| Inserted 1 row(s) |
++-------------------+
+[localhost:21000] > desc ten;
++------+---------+---------+
+| name | type | comment |
++------+---------+---------+
+| x | tinyint | |
++------+---------+---------+
+
+[localhost:21000] > create table four_k as select 4096 as x;
++-------------------+
+| summary |
++-------------------+
+| Inserted 1 row(s) |
++-------------------+
+[localhost:21000] > desc four_k;
++------+----------+---------+
+| name | type | comment |
++------+----------+---------+
+| x | smallint | |
++------+----------+---------+
+
+[localhost:21000] > create table one_point_five as select 1.5 as x;
++-------------------+
+| summary |
++-------------------+
+| Inserted 1 row(s) |
++-------------------+
+[localhost:21000] > desc one_point_five;
++------+--------------+---------+
+| name | type | comment |
++------+--------------+---------+
+| x | decimal(2,1) | |
++------+--------------+---------+
+
+[localhost:21000] > create table one_point_three_three_three as select 1.333 as x;
++-------------------+
+| summary |
++-------------------+
+| Inserted 1 row(s) |
++-------------------+
+[localhost:21000] > desc one_point_three_three_three;
++------+--------------+---------+
+| name | type | comment |
++------+--------------+---------+
+| x | decimal(4,3) | |
++------+--------------+---------+
+</code></pre>
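+
+ <p class="p">
+ As a hedged sketch, you can also check how Impala interprets a literal without creating a table, by passing
+ the literal to the <code class="ph codeph">typeof()</code> built-in function (available in Impala 2.3 and higher). The
+ results are deliberately not shown here, because the reported types can differ between releases.
+ </p>
+
+<pre class="pre codeblock"><code>-- Each query reports the type that Impala assigns to the literal.
+SELECT typeof(10);     -- A small integer literal.
+SELECT typeof(4096);   -- An integer literal too large for the smallest integer type.
+SELECT typeof(1.5);    -- A literal with a decimal point.
+SELECT typeof(1e6);    -- A literal in exponential notation.
+</code></pre>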
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="literals__string_literals">
+
+ <h2 class="title topictitle2" id="ariaid-title3">String Literals</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ String literals are quoted using either single or double quotation marks. You can mix the two styles within
+ a statement, using single quotation marks for some literals and double quotation marks for others.
+ </p>
+
+ <p class="p">
+ Quoted literals are considered to be of type <code class="ph codeph">STRING</code>. To use quoted literals in contexts
+ requiring a <code class="ph codeph">CHAR</code> or <code class="ph codeph">VARCHAR</code> value, <code class="ph codeph">CAST()</code> the literal to
+ a <code class="ph codeph">CHAR</code> or <code class="ph codeph">VARCHAR</code> of the appropriate length.
+ </p>
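+
+ <p class="p">
+ For instance, the following sketch casts quoted literals to <code class="ph codeph">VARCHAR</code> and
+ <code class="ph codeph">CHAR</code> values. The lengths 10 and 5 and the column aliases are arbitrary choices for
+ illustration.
+ </p>
+
+<pre class="pre codeblock"><code>-- A quoted literal is a STRING unless it is cast to another type.
+SELECT CAST('hello' AS VARCHAR(10)) AS varchar_value,
+       CAST('hi' AS CHAR(5)) AS char_value;
+</code></pre>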
+
+ <p class="p">
+ <strong class="ph b">Escaping special characters:</strong>
+ </p>
+
+ <p class="p">
+ To encode special characters within a string literal, precede them with the backslash (<code class="ph codeph">\</code>)
+ escape character:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">\t</code> represents a tab.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\n</code> represents a newline or linefeed. This might cause extra line breaks in
+ <span class="keyword cmdname">impala-shell</span> output.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\r</code> represents a carriage return. This might cause unusual formatting (making it appear
+ that some content is overwritten) in <span class="keyword cmdname">impala-shell</span> output.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\b</code> represents a backspace. This might cause unusual formatting (making it appear that
+ some content is overwritten) in <span class="keyword cmdname">impala-shell</span> output.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\0</code> represents an ASCII <code class="ph codeph">nul</code> character (not the same as a SQL
+ <code class="ph codeph">NULL</code>). This might not be visible in <span class="keyword cmdname">impala-shell</span> output.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\Z</code> represents a DOS end-of-file character. This might not be visible in
+ <span class="keyword cmdname">impala-shell</span> output.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\%</code> and <code class="ph codeph">\_</code> can be used to escape wildcard characters within the string
+ passed to the <code class="ph codeph">LIKE</code> operator.
+ </li>
+
+ <li class="li">
+ <code class="ph codeph">\</code> followed by 3 octal digits represents the ASCII code of a single character; for
+ example, <code class="ph codeph">\101</code> is ASCII 65, the character <code class="ph codeph">A</code>.
+ </li>
+
+ <li class="li">
+ Use two consecutive backslashes (<code class="ph codeph">\\</code>) to prevent the backslash from being interpreted as
+ an escape character.
+ </li>
+
+ <li class="li">
+ Use the backslash to escape single or double quotation mark characters within a string literal, if the
+ literal is enclosed by the same type of quotation mark.
+ </li>
+
+ <li class="li">
+ If the character following the <code class="ph codeph">\</code> does not represent the start of a recognized escape
+ sequence, the character is passed through unchanged.
+ </li>
+ </ul>
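+
+ <p class="p">
+ The following query is a small sketch that exercises a few of the escape sequences listed above. The column
+ aliases are illustrative only, and the way the tab character appears depends on the
+ <span class="keyword cmdname">impala-shell</span> output settings.
+ </p>
+
+<pre class="pre codeblock"><code>SELECT 'col1\tcol2' AS with_tab,       -- \t becomes a tab character.
+       '\101' AS octal_a,              -- Octal 101 is ASCII 65, the character A.
+       'C:\\temp' AS single_backslash; -- \\ yields one literal backslash.
+</code></pre>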
+
+ <p class="p">
+ <strong class="ph b">Quotes within quotes:</strong>
+ </p>
+
+ <p class="p">
+ To include a single quotation character within a string value, enclose the literal with double quotation marks, in which
+ case escaping is optional, or enclose it with single quotation marks and escape the single quote as a <code class="ph codeph">\'</code> sequence. Earlier
+ releases required escaping a single quote inside double quotes. Continue using escape sequences in this
+ case if you also need to run your SQL code on older versions of Impala.
+ </p>
+
+ <p class="p">
+ To include a double quotation character within a string value, enclose the literal with single quotation
+ marks; no escaping is necessary in this case. Or, enclose the literal with double quotation marks and
+ escape the double quote as a <code class="ph codeph">\"</code> sequence.
+ </p>
+
+<pre class="pre codeblock"><code>[localhost:21000] > select "What\'s happening?" as single_within_double,
+ > 'I\'m not sure.' as single_within_single,
+ > "Homer wrote \"The Iliad\"." as double_within_double,
+ > 'Homer also wrote "The Odyssey".' as double_within_single;
++----------------------+----------------------+--------------------------+---------------------------------+
+| single_within_double | single_within_single | double_within_double | double_within_single |
++----------------------+----------------------+--------------------------+---------------------------------+
+| What's happening? | I'm not sure. | Homer wrote "The Iliad". | Homer also wrote "The Odyssey". |
++----------------------+----------------------+--------------------------+---------------------------------+
+</code></pre>
+
+ <p class="p">
+ <strong class="ph b">Field terminator character in CREATE TABLE:</strong>
+ </p>
+
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+ The <code class="ph codeph">CREATE TABLE</code> clauses <code class="ph codeph">FIELDS TERMINATED BY</code>, <code class="ph codeph">ESCAPED
+ BY</code>, and <code class="ph codeph">LINES TERMINATED BY</code> have special rules for the string literal used for
+ their argument, because they all require a single character. You can use a regular character surrounded by
+ single or double quotation marks, an octal sequence such as <code class="ph codeph">'\054'</code> (representing a comma),
+ or an integer in the range '-127'..'128' (with quotation marks but no backslash), which is interpreted as a
+ single-byte ASCII character. Negative values are subtracted from 256; for example, <code class="ph codeph">FIELDS
+ TERMINATED BY '-2'</code> sets the field delimiter to ASCII code 254, the <span class="q">"Icelandic Thorn"</span>
+ character used as a delimiter by some data formats.
+ </div>
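+
+ <p class="p">
+ As an illustration of the three notations described in the note, the following sketch declares text tables
+ whose field delimiters are a regular character, an octal sequence, and a negative integer. The table and
+ column names are hypothetical.
+ </p>
+
+<pre class="pre codeblock"><code>-- Regular character: fields are separated by a comma.
+CREATE TABLE delim_regular (id INT, s STRING)
+  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
+-- Octal sequence: \054 also represents a comma.
+CREATE TABLE delim_octal (id INT, s STRING)
+  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054';
+-- Negative integer: -2 is interpreted as 256 - 2 = ASCII code 254.
+CREATE TABLE delim_thorn (id INT, s STRING)
+  ROW FORMAT DELIMITED FIELDS TERMINATED BY '-2';
+</code></pre>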
+
+ <p class="p">
+ <strong class="ph b">impala-shell considerations:</strong>
+ </p>
+
+ <p class="p">
+ When dealing with output that includes non-ASCII or non-printable characters such as linefeeds and
+ backspaces, use the <span class="keyword cmdname">impala-shell</span> options to save to a file, turn off pretty printing, or
+ both rather than relying on how the output appears visually. See
+ <a class="xref" href="impala_shell_options.html#shell_options">impala-shell Configuration Options</a> for a list of <span class="keyword cmdname">impala-shell</span>
+ options.
+ </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="literals__boolean_literals">
+
+ <h2 class="title topictitle2" id="ariaid-title4">Boolean Literals</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ For <code class="ph codeph">BOOLEAN</code> values, the literals are <code class="ph codeph">TRUE</code> and <code class="ph codeph">FALSE</code>,
+ with no quotation marks and case-insensitive.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>select true;
+select * from t1 where assertion = false;
+select case bool_col when true then 'yes' when false then 'no' else 'null' end from t1;</code></pre>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="literals__timestamp_literals">
+
+ <h2 class="title topictitle2" id="ariaid-title5">Timestamp Literals</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+ Impala automatically converts <code class="ph codeph">STRING</code> literals of the
+ correct format into <code class="ph codeph">TIMESTAMP</code> values. Timestamp values
+ are accepted in the format <code class="ph codeph">"yyyy-MM-dd HH:mm:ss.SSSSSS"</code>,
+ and can consist of just the date, or just the time, with or without the
+ fractional second portion. For example, you can specify <code class="ph codeph">TIMESTAMP</code>
+ values such as <code class="ph codeph">'1966-07-30'</code>, <code class="ph codeph">'08:30:00'</code>,
+ or <code class="ph codeph">'1985-09-25 17:45:30.005'</code>.
+ </p>
+
+ <p class="p">
+ You can also use <code class="ph codeph">INTERVAL</code> expressions to add or subtract from timestamp literal values,
+ such as <code class="ph codeph">CAST('1966-07-30' AS TIMESTAMP) + INTERVAL 5 YEARS + INTERVAL 3 DAYS</code>. See
+ <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for details.
+ </p>
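+
+ <p class="p">
+ As a worked sketch of the <code class="ph codeph">INTERVAL</code> arithmetic, the first query below adds 5 years to
+ July 30, 1966 to reach July 30, 1971, and then 3 more days to reach midnight on August 2, 1971. The column
+ aliases are illustrative only.
+ </p>
+
+<pre class="pre codeblock"><code>-- 1966-07-30 plus 5 years is 1971-07-30; plus 3 days is 1971-08-02.
+SELECT CAST('1966-07-30' AS TIMESTAMP) + INTERVAL 5 YEARS + INTERVAL 3 DAYS AS later;
+-- Subtraction works the same way, here stepping back one hour.
+SELECT CAST('1985-09-25 17:45:30.005' AS TIMESTAMP) - INTERVAL 1 HOUR AS earlier;
+</code></pre>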
+
+ <p class="p">
+ Depending on your data pipeline, you might receive date and time data as text, in notation that does not
+ exactly match the format for Impala <code class="ph codeph">TIMESTAMP</code> literals.
+ See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for functions that can convert
+ between a variety of string literals (including different field order, separators, and timezone notation)
+ and equivalent <code class="ph codeph">TIMESTAMP</code> or numeric values.
+ </p>
+ </div>
+ </article>
+
+ <article class="topic concept nested1" aria-labelledby="ariaid-title6" id="literals__null">
+
+ <h2 class="title topictitle2" id="ariaid-title6">NULL</h2>
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ The notion of <code class="ph codeph">NULL</code> values is familiar from all kinds of database systems, but each SQL
+ dialect can have its own behavior and restrictions on <code class="ph codeph">NULL</code> values. For Big Data
+ processing, the precise semantics of <code class="ph codeph">NULL</code> values are significant: any misunderstanding
+ could lead to inaccurate results or misformatted data that could be time-consuming to correct for large
+ data sets.
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <code class="ph codeph">NULL</code> is a different value than an empty string. The empty string is represented by a
+ string literal with nothing inside, <code class="ph codeph">""</code> or <code class="ph codeph">''</code>.
+ </li>
+
+ <li class="li">
+ In a delimited text file, the <code class="ph codeph">NULL</code> value is represented by the special token
+ <code class="ph codeph">\N</code>.
+ </li>
+
+ <li class="li">
+ When Impala inserts data into a partitioned table, and the value of one of the partitioning columns is
+ <code class="ph codeph">NULL</code> or the empty string, the data is placed in a special partition that holds only
+ these two kinds of values. When these values are returned in a query, the result is <code class="ph codeph">NULL</code>
+ whether the value was originally <code class="ph codeph">NULL</code> or an empty string. This behavior is compatible
+ with the way Hive treats <code class="ph codeph">NULL</code> values in partitioned tables. Hive does not allow empty
+ strings as partition keys, and it returns a string value such as
+ <code class="ph codeph">__HIVE_DEFAULT_PARTITION__</code> instead of <code class="ph codeph">NULL</code> when such values are
+ returned from a query. For example:
+<pre class="pre codeblock"><code>create table t1 (i int) partitioned by (x int, y string);
+-- Select an INT column from another table, with all rows going into a special HDFS subdirectory
+-- named __HIVE_DEFAULT_PARTITION__. Depending on whether one or both of the partitioning keys
+-- are null, this special directory name occurs at different levels of the physical data directory
+-- for the table.
+insert into t1 partition(x=NULL, y=NULL) select c1 from some_other_table;
+insert into t1 partition(x, y=NULL) select c1, c2 from some_other_table;
+insert into t1 partition(x=NULL, y) select c1, c3 from some_other_table;</code></pre>
+ </li>
+
+ <li class="li">
+ There is no <code class="ph codeph">NOT NULL</code> clause when defining a column to prevent <code class="ph codeph">NULL</code>
+ values in that column.
+ </li>
+
+ <li class="li">
+ There is no <code class="ph codeph">DEFAULT</code> clause to specify a non-<code class="ph codeph">NULL</code> default value.
+ </li>
+
+ <li class="li">
+ If an <code class="ph codeph">INSERT</code> operation mentions some columns but not others, the unmentioned columns
+ contain <code class="ph codeph">NULL</code> for all inserted rows.
+ </li>
+
+ <li class="li">
+ <p class="p">
+ In Impala 1.2.1 and higher, all <code class="ph codeph">NULL</code> values come at the end of the result set for
+ <code class="ph codeph">ORDER BY ... ASC</code> queries, and at the beginning of the result set for <code class="ph codeph">ORDER BY ...
+ DESC</code> queries. In effect, <code class="ph codeph">NULL</code> is considered greater than all other values for
+ sorting purposes. The original Impala behavior always put <code class="ph codeph">NULL</code> values at the end, even for
+ <code class="ph codeph">ORDER BY ... DESC</code> queries. The new behavior in Impala 1.2.1 makes Impala more compatible
+ with other popular database systems. In Impala 1.2.1 and higher, you can override or specify the sorting
+ behavior for <code class="ph codeph">NULL</code> by adding the clause <code class="ph codeph">NULLS FIRST</code> or <code class="ph codeph">NULLS
+ LAST</code> at the end of the <code class="ph codeph">ORDER BY</code> clause.
+ </p>
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+
+ Because the <code class="ph codeph">NULLS FIRST</code> and <code class="ph codeph">NULLS LAST</code> keywords are not currently
+ available in Hive queries, any views you create using those keywords will not be available through
+ Hive.
+ </div>
+ </li>
+
+ <li class="li">
+ In all other contexts besides sorting with <code class="ph codeph">ORDER BY</code>, comparing a <code class="ph codeph">NULL</code>
+ to anything else returns <code class="ph codeph">NULL</code>, making the comparison meaningless. For example,
+ <code class="ph codeph">10 > NULL</code> produces <code class="ph codeph">NULL</code>, <code class="ph codeph">10 < NULL</code> also produces
+ <code class="ph codeph">NULL</code>, <code class="ph codeph">5 BETWEEN 1 AND NULL</code> produces <code class="ph codeph">NULL</code>, and so on.
+ </li>
+ </ul>
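+
+ <p class="p">
+ The comparison and sorting rules above can be sketched as follows. The table <code class="ph codeph">t</code> and its
+ nullable column <code class="ph codeph">c</code> are hypothetical names used only for illustration.
+ </p>
+
+<pre class="pre codeblock"><code>-- Comparisons involving NULL produce NULL rather than TRUE or FALSE.
+SELECT 10 > NULL;
+SELECT NULL = NULL;
+-- To test for NULL, use IS NULL or IS NOT NULL instead of a comparison operator.
+SELECT c FROM t WHERE c IS NULL;
+-- Override the default placement of NULL values in a descending sort.
+SELECT c FROM t ORDER BY c DESC NULLS LAST;
+</code></pre>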
+
+ <p class="p">
+ Several built-in functions serve as shorthand for evaluating expressions and returning
+ <code class="ph codeph">NULL</code>, 0, or some other substitution value depending on the expression result:
+ <code class="ph codeph">ifnull()</code>, <code class="ph codeph">isnull()</code>, <code class="ph codeph">nvl()</code>, <code class="ph codeph">nullif()</code>,
+ <code class="ph codeph">nullifzero()</code>, and <code class="ph codeph">zeroifnull()</code>. See
+ <a class="xref" href="impala_conditional_functions.html#conditional_functions">Impala Conditional Functions</a> for details.
+ </p>
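+
+ <p class="p">
+ As a brief sketch of these functions, the following queries use literal arguments so that each result follows
+ directly from the definition of the function.
+ </p>
+
+<pre class="pre codeblock"><code>SELECT ifnull(NULL, 0);    -- First argument is NULL, so the result is 0.
+SELECT nvl(NULL, 'n/a');   -- First argument is NULL, so the result is 'n/a'.
+SELECT nullif(5, 5);       -- The arguments are equal, so the result is NULL.
+SELECT nullifzero(0);      -- The argument is zero, so the result is NULL.
+SELECT zeroifnull(NULL);   -- The argument is NULL, so the result is 0.
+</code></pre>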
+
+ <p class="p">
+ <strong class="ph b">Kudu considerations:</strong>
+ </p>
+ <p class="p">
+ Columns in Kudu tables have an attribute that specifies whether or not they can contain
+ <code class="ph codeph">NULL</code> values. A column with a <code class="ph codeph">NULL</code> attribute can contain
+ nulls. A column with a <code class="ph codeph">NOT NULL</code> attribute cannot contain any nulls, and
+ an <code class="ph codeph">INSERT</code>, <code class="ph codeph">UPDATE</code>, or <code class="ph codeph">UPSERT</code> statement
+ will skip any row that attempts to store a null in a column designated as <code class="ph codeph">NOT NULL</code>.
+ Kudu tables default to the <code class="ph codeph">NULL</code> setting for each column, except columns that
+ are part of the primary key.
+ </p>
+ <p class="p">
+ In addition to columns with the <code class="ph codeph">NOT NULL</code> attribute, Kudu tables also have
+ restrictions on <code class="ph codeph">NULL</code> values in columns that are part of the primary key for
+ a table. No column that is part of the primary key in a Kudu table can contain any
+ <code class="ph codeph">NULL</code> values.
+ </p>
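+ <p class="p">
+ The following sketch shows how these attributes might be declared. The table name
+ <code class="ph codeph">kudu_null_demo</code> and its columns are hypothetical, and the partitioning
+ clause is included only to make the statement complete.
+ </p>
+<pre class="pre codeblock"><code>-- Hypothetical table, for illustration only.
+create table kudu_null_demo
+(
+  id bigint primary key,   -- primary key column: can never hold NULL
+  name string not null,    -- explicitly rejects NULL
+  note string null         -- explicitly allows NULL (the default for non-key columns)
+)
+partition by hash (id) partitions 2
+stored as kudu;
+
+-- Stores a NULL in a nullable column.
+insert into kudu_null_demo values (1, 'a', NULL);
+
+-- Attempts to store a NULL in a NOT NULL column; per the description above, this row is skipped.
+insert into kudu_null_demo values (2, NULL, 'x');
+</code></pre>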
+
+ </div>
+ </article>
+</article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_live_progress.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_live_progress.html b/docs/build3x/html/topics/impala_live_progress.html
new file mode 100644
index 0000000..bce7807
--- /dev/null
+++ b/docs/build3x/html/topics/impala_live_progress.html
@@ -0,0 +1,131 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="live_progress"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>LIVE_PROGRESS Query Option (Impala 2.3 or higher only)</title></head><body id="live_progress"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">LIVE_PROGRESS Query Option (<span class="keyword">Impala 2.3</span> or higher only)</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ For queries submitted through the <span class="keyword cmdname">impala-shell</span> command,
+ displays an interactive progress bar showing roughly what percentage of
+ processing has been completed. When the query finishes, the progress bar is erased
+ from the <span class="keyword cmdname">impala-shell</span> console output.
+ </p>
+
+
+ <p class="p">
+ <strong class="ph b">Type:</strong> Boolean; recognized values are 1 and 0, or <code class="ph codeph">true</code> and <code class="ph codeph">false</code>;
+ any other value is interpreted as <code class="ph codeph">false</code>
+ </p>
+ <p class="p">
+ <strong class="ph b">Default:</strong> <code class="ph codeph">false</code> (shown as 0 in output of <code class="ph codeph">SET</code> statement)
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Command-line equivalent:</strong>
+ </p>
+ <p class="p">
+ You can enable this query option within <span class="keyword cmdname">impala-shell</span>
+ by starting the shell with the <code class="ph codeph">--live_progress</code>
+ command-line option.
+ You can still turn this setting off and on again within the shell through the
+ <code class="ph codeph">SET</code> command.
+ </p>
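+ <p class="p">
+ A minimal sketch of that sequence, assuming an <span class="keyword cmdname">impala-shell</span> session
+ connected to <code class="ph codeph">localhost:21000</code> as in the examples in this topic. Shell
+ responses are omitted.
+ </p>
+<pre class="pre codeblock"><code>$ impala-shell --live_progress
+[localhost:21000] > set live_progress=false;
+[localhost:21000] > set live_progress=true;
+</code></pre>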
+
+ <p class="p">
+ <strong class="ph b">Usage notes:</strong>
+ </p>
+ <p class="p">
+ The output from this query option is printed to standard error. The output is displayed only in interactive mode,
+ that is, not when the <code class="ph codeph">-q</code> or <code class="ph codeph">-f</code> option is used.
+ </p>
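+ <p class="p">
+ For instance, based on the behavior described above, neither of the following invocations is expected
+ to draw a progress bar, because both run non-interactively. The file name
+ <code class="ph codeph">queries.sql</code> is hypothetical, and the <code class="ph codeph">customer</code>
+ table is borrowed from the examples in this topic.
+ </p>
+<pre class="pre codeblock"><code>$ impala-shell --live_progress -q 'select count(*) from customer'
+$ impala-shell --live_progress -f queries.sql
+</code></pre>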
+ <p class="p">
+ For a more detailed way of tracking the progress of an interactive query through
+ all phases of processing, see <a class="xref" href="impala_live_summary.html#live_summary">LIVE_SUMMARY Query Option (Impala 2.3 or higher only)</a>.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Restrictions:</strong>
+ </p>
+ <p class="p">
+ Because the percentage complete figure is calculated using the number of
+ issued and completed <span class="q">"scan ranges"</span>, which occur while reading the table
+ data, the progress bar might reach 100% before the query is entirely finished.
+ For example, the query might do work to perform aggregations after all the
+ table data has been read. If many of your queries fall into this category,
+ consider using the <code class="ph codeph">LIVE_SUMMARY</code> option instead for
+ more granular progress reporting.
+ </p>
+ <p class="p">
+ The <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options
+ currently do not produce any output during <code class="ph codeph">COMPUTE STATS</code> operations.
+ </p>
+ <div class="p">
+ Because the <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options
+ are available only within the <span class="keyword cmdname">impala-shell</span> interpreter:
+ <ul class="ul">
+ <li class="li">
+ <p class="p">
+ You cannot change these query options through the SQL <code class="ph codeph">SET</code>
+ statement using the JDBC or ODBC interfaces. The <code class="ph codeph">SET</code>
+ command in <span class="keyword cmdname">impala-shell</span> recognizes these names as
+ shell-only options.
+ </p>
+ </li>
+ <li class="li">
+ <p class="p">
+ Be careful when using <span class="keyword cmdname">impala-shell</span> on a pre-<span class="keyword">Impala 2.3</span>
+ system to connect to a system running <span class="keyword">Impala 2.3</span> or higher.
+ The older <span class="keyword cmdname">impala-shell</span> does not recognize these
+ query option names. Upgrade <span class="keyword cmdname">impala-shell</span> on the
+ systems where you intend to use these query options.
+ </p>
+ </li>
+ <li class="li">
+ <p class="p">
+ Likewise, the <span class="keyword cmdname">impala-shell</span> command relies on
+ some information only available in <span class="keyword">Impala 2.3</span> and higher
+ to prepare live progress reports and query summaries. The
+ <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code>
+ query options have no effect when <span class="keyword cmdname">impala-shell</span> connects
+ to a cluster running an older version of Impala.
+ </p>
+ </li>
+ </ul>
+ </div>
+
+ <p class="p">
+ <strong class="ph b">Added in:</strong> <span class="keyword">Impala 2.3.0</span>
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+<pre class="pre codeblock"><code>[localhost:21000] > set live_progress=true;
+LIVE_PROGRESS set to true
+[localhost:21000] > select count(*) from customer;
++----------+
+| count(*) |
++----------+
+| 150000 |
++----------+
+[localhost:21000] > select count(*) from customer t1 cross join customer t2;
+[################################### ] 50%
+[######################################################################] 100%
+
+
+</code></pre>
+
+ <p class="p">
+ To see how the <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options
+ work in real time, see <a class="xref" href="https://asciinema.org/a/1rv7qippo0fe7h5k1b6k4nexk" target="_blank">this animated demo</a>.
+ </p>
+
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div></div></nav></article></main></body></html>