You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by jb...@apache.org on 2017/04/12 18:25:09 UTC

[05/51] [partial] incubator-impala git commit: IMPALA-4181 [DOCS] Publish rendered Impala documentation to ASF site

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_smallint.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_smallint.html b/docs/build/html/topics/impala_smallint.html
new file mode 100644
index 0000000..cd48e90
--- /dev/null
+++ b/docs/build/html/topics/impala_smallint.html
@@ -0,0 +1,125 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_datatypes.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="smallint"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>SMALLINT Data Type</title></head><body id="smallint"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">SMALLINT Data Type</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      A 2-byte integer data type used in <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code> statements.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+    <p class="p">
+      In the column definition of a <code class="ph codeph">CREATE TABLE</code> statement:
+    </p>
+
+<pre class="pre codeblock"><code><var class="keyword varname">column_name</var> SMALLINT</code></pre>
+
+    <p class="p">
+      <strong class="ph b">Range:</strong> -32768 .. 32767. There is no <code class="ph codeph">UNSIGNED</code> subtype.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Conversions:</strong> Impala automatically converts to a larger integer type (<code class="ph codeph">INT</code> or
+      <code class="ph codeph">BIGINT</code>) or a floating-point type (<code class="ph codeph">FLOAT</code> or <code class="ph codeph">DOUBLE</code>)
+      automatically. Use <code class="ph codeph">CAST()</code> to convert to <code class="ph codeph">TINYINT</code>, <code class="ph codeph">STRING</code>,
+      or <code class="ph codeph">TIMESTAMP</code>.
+      <span class="ph">Casting an integer or floating-point value <code class="ph codeph">N</code> to
+        <code class="ph codeph">TIMESTAMP</code> produces a value that is <code class="ph codeph">N</code> seconds past the start of the epoch
+        date (January 1, 1970). By default, the result value represents a date and time in the UTC time zone.
+        If the setting <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions=true</code> is in effect,
+        the resulting <code class="ph codeph">TIMESTAMP</code> represents a date and time in the local time zone.</span>
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Usage notes:</strong>
+      </p>
+
+    <p class="p">
+      For a convenient and automated way to check the bounds of the <code class="ph codeph">SMALLINT</code> type, call the
+      functions <code class="ph codeph">MIN_SMALLINT()</code> and <code class="ph codeph">MAX_SMALLINT()</code>.
+    </p>
+
+    <p class="p">
+      If an integer value is too large to be represented as a <code class="ph codeph">SMALLINT</code>, use an
+      <code class="ph codeph">INT</code> instead.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">NULL considerations:</strong> Casting any non-numeric value to this type produces a <code class="ph codeph">NULL</code>
+        value.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+<pre class="pre codeblock"><code>CREATE TABLE t1 (x SMALLINT);
+SELECT CAST(1000 AS SMALLINT);
+</code></pre>
+
+    <p class="p">
+        <strong class="ph b">Parquet considerations:</strong>
+      </p>
+
+
+
+    <p class="p">
+      Physically, Parquet files represent <code class="ph codeph">TINYINT</code> and <code class="ph codeph">SMALLINT</code> values as 32-bit
+      integers. Although Impala rejects attempts to insert out-of-range values into such columns, if you create a
+      new table with the <code class="ph codeph">CREATE TABLE ... LIKE PARQUET</code> syntax, any <code class="ph codeph">TINYINT</code> or
+      <code class="ph codeph">SMALLINT</code> columns in the original table turn into <code class="ph codeph">INT</code> columns in the new
+      table.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Partitioning:</strong> Prefer to use this type for a partition key column. Impala can process the numeric
+        type more efficiently than a <code class="ph codeph">STRING</code> representation of the value.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">HBase considerations:</strong> This data type is fully compatible with HBase tables.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Text table considerations:</strong> Values of this type are potentially larger in text tables than in tables
+        using Parquet or other binary formats.
+      </p>
+
+
+
+    <p class="p">
+        <strong class="ph b">Internal details:</strong> Represented in memory as a 2-byte value.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Added in:</strong> Available in all versions of Impala.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Column statistics considerations:</strong> Because this type has a fixed size, the maximum and average size
+        fields are always filled in for column statistics, even before you run the <code class="ph codeph">COMPUTE STATS</code>
+        statement.
+      </p>
+
+
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      <a class="xref" href="impala_literals.html#numeric_literals">Numeric Literals</a>, <a class="xref" href="impala_tinyint.html#tinyint">TINYINT Data Type</a>,
+      <a class="xref" href="impala_smallint.html#smallint">SMALLINT Data Type</a>, <a class="xref" href="impala_int.html#int">INT Data Type</a>,
+      <a class="xref" href="impala_bigint.html#bigint">BIGINT Data Type</a>, <a class="xref" href="impala_decimal.html#decimal">DECIMAL Data Type (Impala 1.4 or higher only)</a>,
+      <a class="xref" href="impala_math_functions.html#math_functions">Impala Mathematical Functions</a>
+    </p>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_datatypes.html">Data Types</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_ssl.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_ssl.html b/docs/build/html/topics/impala_ssl.html
new file mode 100644
index 0000000..a9b4d25
--- /dev/null
+++ b/docs/build/html/topics/impala_ssl.html
@@ -0,0 +1,119 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_security.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="ssl"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Configuring TLS/SSL for Impala</title></head><body id="ssl"><main role="main"><article role="article" aria-labelledby="ssl__tls">
+
+  <h1 class="title topictitle1" id="ssl__tls">Configuring TLS/SSL for Impala</h1>
+
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      Impala supports TLS/SSL network encryption, between Impala and client
+      programs, and between the Impala-related daemons running on different nodes
+      in the cluster. This feature is important when you also use other features such as Kerberos
+      authentication or Sentry authorization, where credentials are being
+      transmitted back and forth.
+    </p>
+
+  </div>
+
+  <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_security.html">Impala Security</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="ssl__concept_q1p_j2d_rp">
+
+    <h2 class="title topictitle2" id="ariaid-title2">Using the Command Line</h2>
+
+    <div class="body conbody">
+      <p class="p">
+        To enable SSL for when client applications connect to Impala, add both of the following flags to the <span class="keyword cmdname">impalad</span> startup options:
+      </p>
+
+      <ul class="ul" id="concept_q1p_j2d_rp__ul_i2p_m2d_rp">
+        <li class="li">
+          <code class="ph codeph">--ssl_server_certificate</code>: the full path to the server certificate, on the local filesystem.
+        </li>
+
+        <li class="li">
+          <code class="ph codeph">--ssl_private_key</code>: the full path to the server private key, on the local filesystem.
+        </li>
+      </ul>
+
+      <p class="p">
+        In <span class="keyword">Impala 2.3</span> and higher, Impala can also use SSL for its own internal communication between the
+        <span class="keyword cmdname">impalad</span>, <code class="ph codeph">statestored</code>, and <code class="ph codeph">catalogd</code> daemons.
+        To enable this additional SSL encryption, set the <code class="ph codeph">--ssl_server_certificate</code>
+        and <code class="ph codeph">--ssl_private_key</code> flags in the startup options for
+        <span class="keyword cmdname">impalad</span>, <span class="keyword cmdname">catalogd</span>, and <span class="keyword cmdname">statestored</span>,
+        and also add the <code class="ph codeph">--ssl_client_ca_certificate</code> flag for all three of those daemons.
+      </p>
+
+      <div class="note warning note_warning"><span class="note__title warningtitle">Warning:</span> 
+        Prior to <span class="keyword">Impala 2.3.2</span>, you could enable Kerberos authentication between Impala internal components,
+        or SSL encryption between Impala internal components, but not both at the same time.
+        This restriction has now been lifted.
+        See <a class="xref" href="https://issues.apache.org/jira/browse/IMPALA-2598" target="_blank">IMPALA-2598</a>
+        to see the maintenance releases for different levels of Impala where the fix has been published.
+      </div>
+
+      <p class="p">
+        If either of these flags are set, both must be set. In that case, Impala starts listening for Beeswax and HiveServer2 requests on
+        SSL-secured ports only. (The port numbers stay the same; see <a class="xref" href="impala_ports.html#ports">Ports Used by Impala</a> for details.)
+      </p>
+
+      <p class="p">
+        Since Impala uses passphrase-less certificates in PEM format, you can reuse a host's existing Java keystore
+        by using the <code class="ph codeph">openssl</code> toolkit to convert it to the PEM format.
+      </p>
+
+      <section class="section" id="concept_q1p_j2d_rp__secref"><h3 class="title sectiontitle">Configuring TLS/SSL Communication for the Impala Shell</h3>
+
+        
+
+        <p class="p">
+          With SSL enabled for Impala, use the following options when starting the
+          <span class="keyword cmdname">impala-shell</span> interpreter:
+        </p>
+
+        <ul class="ul" id="concept_q1p_j2d_rp__ul_kgp_m2d_rp">
+          <li class="li">
+            <code class="ph codeph">--ssl</code>: enables TLS/SSL for <span class="keyword cmdname">impala-shell</span>.
+          </li>
+
+          <li class="li">
+            <code class="ph codeph">--ca_cert</code>: the local pathname pointing to the third-party CA certificate, or to a copy of the server
+            certificate for self-signed server certificates.
+          </li>
+        </ul>
+
+        <p class="p">
+          If <code class="ph codeph">--ca_cert</code> is not set, <span class="keyword cmdname">impala-shell</span> enables TLS/SSL, but does not validate the server
+          certificate. This is useful for connecting to a known-good Impala that is only running over TLS/SSL, when a copy of the
+          certificate is not available (such as when debugging customer installations).
+        </p>
+
+      </section>
+
+    </div>
+
+  </article>
+
+  <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="ssl__ssl_jdbc_odbc">
+    <h2 class="title topictitle2" id="ariaid-title3">Using TLS/SSL with Business Intelligence Tools</h2>
+    <div class="body conbody">
+      <p class="p">
+        You can use Kerberos authentication, TLS/SSL encryption, or both to secure
+        connections from JDBC and ODBC applications to Impala.
+        See <a class="xref" href="impala_jdbc.html#impala_jdbc">Configuring Impala to Work with JDBC</a> and <a class="xref" href="impala_odbc.html#impala_odbc">Configuring Impala to Work with ODBC</a>
+        for details.
+      </p>
+
+      <p class="p">
+        Prior to <span class="keyword">Impala 2.5</span>, the Hive JDBC driver did not support connections that use both Kerberos authentication
+        and SSL encryption. If your cluster is running an older release that has this restriction,
+        use an alternative JDBC driver that supports
+        both of these security features.
+      </p>
+    </div>
+  </article>
+
+</article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_stddev.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_stddev.html b/docs/build/html/topics/impala_stddev.html
new file mode 100644
index 0000000..4a14e14
--- /dev/null
+++ b/docs/build/html/topics/impala_stddev.html
@@ -0,0 +1,121 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_aggregate_functions.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="stddev"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>STDDEV, STDDEV_SAMP, STDDEV_POP Functions</title></head><body id="stddev"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">STDDEV, STDDEV_SAMP, STDDEV_POP Functions</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      
+      
+      An aggregate function that
+      <a class="xref" href="http://en.wikipedia.org/wiki/Standard_deviation" target="_blank">standard
+      deviation</a> of a set of numbers.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+<pre class="pre codeblock"><code>{ STDDEV | STDDEV_SAMP | STDDEV_POP } ([DISTINCT | ALL] <var class="keyword varname">expression</var>)</code></pre>
+
+    <p class="p">
+      This function works with any numeric data type.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Return type:</strong> <code class="ph codeph">DOUBLE</code> in Impala 2.0 and higher; <code class="ph codeph">STRING</code> in earlier
+        releases
+      </p>
+
+    <p class="p">
+      This function is typically used in mathematical formulas related to probability distributions.
+    </p>
+
+    <p class="p">
+      The <code class="ph codeph">STDDEV_POP()</code> and <code class="ph codeph">STDDEV_SAMP()</code> functions compute the population
+      standard deviation and sample standard deviation, respectively, of the input values.
+      (<code class="ph codeph">STDDEV()</code> is an alias for <code class="ph codeph">STDDEV_SAMP()</code>.) Both functions evaluate all input
+      rows matched by the query. The difference is that <code class="ph codeph">STDDEV_SAMP()</code> is scaled by
+      <code class="ph codeph">1/(N-1)</code> while <code class="ph codeph">STDDEV_POP()</code> is scaled by <code class="ph codeph">1/N</code>.
+    </p>
+
+    <p class="p">
+      If no input rows match the query, the result of any of these functions is <code class="ph codeph">NULL</code>. If a single
+      input row matches the query, the result of any of these functions is <code class="ph codeph">"0.0"</code>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+    <p class="p">
+      This example demonstrates how <code class="ph codeph">STDDEV()</code> and <code class="ph codeph">STDDEV_SAMP()</code> return the same
+      result, while <code class="ph codeph">STDDEV_POP()</code> uses a slightly different calculation to reflect that the input
+      data is considered part of a larger <span class="q">"population"</span>.
+    </p>
+
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select stddev(score) from test_scores;
++---------------+
+| stddev(score) |
++---------------+
+| 28.5          |
++---------------+
+[localhost:21000] &gt; select stddev_samp(score) from test_scores;
++--------------------+
+| stddev_samp(score) |
++--------------------+
+| 28.5               |
++--------------------+
+[localhost:21000] &gt; select stddev_pop(score) from test_scores;
++-------------------+
+| stddev_pop(score) |
++-------------------+
+| 28.4858           |
++-------------------+
+</code></pre>
+
+    <p class="p">
+      This example demonstrates that, because the return value of these aggregate functions is a
+      <code class="ph codeph">STRING</code>, you must currently convert the result with <code class="ph codeph">CAST</code>.
+    </p>
+
+<pre class="pre codeblock"><code>[localhost:21000] &gt; create table score_stats as select cast(stddev(score) as decimal(7,4)) `standard_deviation`, cast(variance(score) as decimal(7,4)) `variance` from test_scores;
++-------------------+
+| summary           |
++-------------------+
+| Inserted 1 row(s) |
++-------------------+
+[localhost:21000] &gt; desc score_stats;
++--------------------+--------------+---------+
+| name               | type         | comment |
++--------------------+--------------+---------+
+| standard_deviation | decimal(7,4) |         |
+| variance           | decimal(7,4) |         |
++--------------------+--------------+---------+
+</code></pre>
+
+    <p class="p">
+        <strong class="ph b">Restrictions:</strong>
+      </p>
+
+    <p class="p">
+        This function cannot be used in an analytic context. That is, the <code class="ph codeph">OVER()</code> clause is not allowed at all with this function.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      The <code class="ph codeph">STDDEV()</code>, <code class="ph codeph">STDDEV_POP()</code>, and <code class="ph codeph">STDDEV_SAMP()</code> functions
+      compute the standard deviation (square root of the variance) based on the results of
+      <code class="ph codeph">VARIANCE()</code>, <code class="ph codeph">VARIANCE_POP()</code>, and <code class="ph codeph">VARIANCE_SAMP()</code>
+      respectively. See <a class="xref" href="impala_variance.html#variance">VARIANCE, VARIANCE_SAMP, VARIANCE_POP, VAR_SAMP, VAR_POP Functions</a> for details about the variance property.
+    </p>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_aggregate_functions.html">Impala Aggregate Functions</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_string.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_string.html b/docs/build/html/topics/impala_string.html
new file mode 100644
index 0000000..60714c7
--- /dev/null
+++ b/docs/build/html/topics/impala_string.html
@@ -0,0 +1,197 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_datatypes.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="string"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>STRING Data Type</title></head><body id="string"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">STRING Data Type</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      A data type used in <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code> statements.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+    <p class="p">
+      In the column definition of a <code class="ph codeph">CREATE TABLE</code> statement:
+    </p>
+
+<pre class="pre codeblock"><code><var class="keyword varname">column_name</var> STRING</code></pre>
+
+    <p class="p">
+      <strong class="ph b">Length:</strong> Maximum of 32,767 bytes. Do not use any length constraint when declaring
+      <code class="ph codeph">STRING</code> columns, as you might be familiar with from <code class="ph codeph">VARCHAR</code>,
+      <code class="ph codeph">CHAR</code>, or similar column types from relational database systems. <span class="ph">If you do
+      need to manipulate string values with precise or maximum lengths, in Impala 2.0 and higher you can declare
+      columns as <code class="ph codeph">VARCHAR(<var class="keyword varname">max_length</var>)</code> or
+      <code class="ph codeph">CHAR(<var class="keyword varname">length</var>)</code>, but for best performance use <code class="ph codeph">STRING</code>
+      where practical.</span>
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Character sets:</strong> For full support in all Impala subsystems, restrict string values to the ASCII
+      character set. Although some UTF-8 character data can be stored in Impala and retrieved through queries, UTF-8 strings
+      containing non-ASCII characters are not guaranteed to work properly in combination with many SQL aspects,
+      including but not limited to:
+    </p>
+    <ul class="ul">
+      <li class="li">
+        String manipulation functions.
+      </li>
+      <li class="li">
+        Comparison operators.
+      </li>
+      <li class="li">
+        The <code class="ph codeph">ORDER BY</code> clause.
+      </li>
+      <li class="li">
+        Values in partition key columns.
+      </li>
+    </ul>
+
+    <p class="p">
+      For any national language aspects such as
+      collation order or interpreting extended ASCII variants such as ISO-8859-1 or ISO-8859-2 encodings, Impala
+      does not include such metadata with the table definition. If you need to sort, manipulate, or display data
+      depending on those national language characteristics of string data, use logic on the application side.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Conversions:</strong>
+    </p>
+
+    <ul class="ul">
+      <li class="li">
+        <p class="p">
+          Impala does not automatically convert <code class="ph codeph">STRING</code> to any numeric type. Impala does
+          automatically convert <code class="ph codeph">STRING</code> to <code class="ph codeph">TIMESTAMP</code> if the value matches one of
+          the accepted <code class="ph codeph">TIMESTAMP</code> formats; see <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for
+          details.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          You can use <code class="ph codeph">CAST()</code> to convert <code class="ph codeph">STRING</code> values to
+          <code class="ph codeph">TINYINT</code>, <code class="ph codeph">SMALLINT</code>, <code class="ph codeph">INT</code>, <code class="ph codeph">BIGINT</code>,
+          <code class="ph codeph">FLOAT</code>, <code class="ph codeph">DOUBLE</code>, or <code class="ph codeph">TIMESTAMP</code>.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          You cannot directly cast a <code class="ph codeph">STRING</code> value to <code class="ph codeph">BOOLEAN</code>. You can use a
+          <code class="ph codeph">CASE</code> expression to evaluate string values such as <code class="ph codeph">'T'</code>,
+          <code class="ph codeph">'true'</code>, and so on and return Boolean <code class="ph codeph">true</code> and <code class="ph codeph">false</code>
+          values as appropriate.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          You can cast a <code class="ph codeph">BOOLEAN</code> value to <code class="ph codeph">STRING</code>, returning <code class="ph codeph">'1'</code>
+          for <code class="ph codeph">true</code> values and <code class="ph codeph">'0'</code> for <code class="ph codeph">false</code> values.
+        </p>
+      </li>
+    </ul>
+
+    <p class="p">
+        <strong class="ph b">Partitioning:</strong>
+      </p>
+
+    <p class="p">
+      Although it might be convenient to use <code class="ph codeph">STRING</code> columns for partition keys, even when those
+      columns contain numbers, for performance and scalability it is much better to use numeric columns as
+      partition keys whenever practical. Although the underlying HDFS directory name might be the same in either
+      case, the in-memory storage for the partition key columns is more compact, and computations are faster, if
+      partition key columns such as <code class="ph codeph">YEAR</code>, <code class="ph codeph">MONTH</code>, <code class="ph codeph">DAY</code> and so on
+      are declared as <code class="ph codeph">INT</code>, <code class="ph codeph">SMALLINT</code>, and so on.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Zero-length strings:</strong> For purposes of clauses such as <code class="ph codeph">DISTINCT</code> and <code class="ph codeph">GROUP
+        BY</code>, Impala considers zero-length strings (<code class="ph codeph">""</code>), <code class="ph codeph">NULL</code>, and space
+        to all be different values.
+      </p>
+
+
+
+
+
+    <p class="p">
+        <strong class="ph b">Text table considerations:</strong> Values of this type are potentially larger in text tables than in tables
+        using Parquet or other binary formats.
+      </p>
+
+    <p class="p"><strong class="ph b">Avro considerations:</strong></p>
+    <p class="p">
+        The Avro specification allows string values up to 2**64 bytes in length.
+        Impala queries for Avro tables use 32-bit integers to hold string lengths.
+        In <span class="keyword">Impala 2.5</span> and higher, Impala truncates <code class="ph codeph">CHAR</code>
+        and <code class="ph codeph">VARCHAR</code> values in Avro tables to (2**31)-1 bytes.
+        If a query encounters a <code class="ph codeph">STRING</code> value longer than (2**31)-1
+        bytes in an Avro table, the query fails. In earlier releases,
+        encountering such long values in an Avro table could cause a crash.
+      </p>
+
+
+
+
+
+
+
+    <p class="p">
+        <strong class="ph b">Column statistics considerations:</strong> Because the values of this type have variable size, none of the
+        column statistics fields are filled in until you run the <code class="ph codeph">COMPUTE STATS</code> statement.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+    <p class="p">
+      The following examples demonstrate double-quoted and single-quoted string literals, and required escaping for
+      quotation marks within string literals:
+    </p>
+
+<pre class="pre codeblock"><code>SELECT 'I am a single-quoted string';
+SELECT "I am a double-quoted string";
+SELECT 'I\'m a single-quoted string with an apostrophe';
+SELECT "I\'m a double-quoted string with an apostrophe";
+SELECT 'I am a "short" single-quoted string containing quotes';
+SELECT "I am a \"short\" double-quoted string containing quotes";
+</code></pre>
+
+    <p class="p">
+      The following examples demonstrate calls to string manipulation functions to concatenate strings, convert
+      numbers to strings, or pull out substrings:
+    </p>
+
+<pre class="pre codeblock"><code>SELECT CONCAT("Once upon a time, there were ", CAST(3 AS STRING), ' little pigs.');
+SELECT SUBSTR("hello world",7,5);
+</code></pre>
+
+    <p class="p">
+      The following examples show how to perform operations on <code class="ph codeph">STRING</code> columns within a table:
+    </p>
+
+<pre class="pre codeblock"><code>CREATE TABLE t1 (s1 STRING, s2 STRING);
+INSERT INTO t1 VALUES ("hello", 'world'), (CAST(7 AS STRING), "wonders");
+SELECT s1, s2, length(s1) FROM t1 WHERE s2 LIKE 'w%';
+</code></pre>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      <a class="xref" href="impala_literals.html#string_literals">String Literals</a>, <a class="xref" href="impala_char.html#char">CHAR Data Type (Impala 2.0 or higher only)</a>,
+      <a class="xref" href="impala_varchar.html#varchar">VARCHAR Data Type (Impala 2.0 or higher only)</a>, <a class="xref" href="impala_string_functions.html#string_functions">Impala String Functions</a>,
+      <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a>
+    </p>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_datatypes.html">Data Types</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_string_functions.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_string_functions.html b/docs/build/html/topics/impala_string_functions.html
new file mode 100644
index 0000000..aab1f35
--- /dev/null
+++ b/docs/build/html/topics/impala_string_functions.html
@@ -0,0 +1,1036 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_functions.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="string_functions"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Impala String Functions</title></head><body id="string_functions"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">Impala String Functions</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <div class="p">
+      String functions are classified as those primarily accepting or returning <code class="ph codeph">STRING</code>,
+      <code class="ph codeph">VARCHAR</code>, or <code class="ph codeph">CHAR</code> data types, for example to measure the length of a string
+      or concatenate two strings together.
+      <ul class="ul">
+        <li class="li">
+          All the functions that accept <code class="ph codeph">STRING</code> arguments also accept the <code class="ph codeph">VARCHAR</code>
+          and <code class="ph codeph">CHAR</code> types introduced in Impala 2.0.
+        </li>
+
+        <li class="li">
+          Whenever <code class="ph codeph">VARCHAR</code> or <code class="ph codeph">CHAR</code> values are passed to a function that returns a
+          string value, the return type is normalized to <code class="ph codeph">STRING</code>. For example, a call to
+          <code class="ph codeph">concat()</code> with a mix of <code class="ph codeph">STRING</code>, <code class="ph codeph">VARCHAR</code>, and
+          <code class="ph codeph">CHAR</code> arguments produces a <code class="ph codeph">STRING</code> result.
+        </li>
+      </ul>
+    </div>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      The string functions operate mainly on these data types: <a class="xref" href="impala_string.html#string">STRING Data Type</a>,
+      <a class="xref" href="impala_varchar.html#varchar">VARCHAR Data Type (Impala 2.0 or higher only)</a>, and <a class="xref" href="impala_char.html#char">CHAR Data Type (Impala 2.0 or higher only)</a>.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Function reference:</strong>
+    </p>
+
+    <p class="p">
+      Impala supports the following string functions:
+    </p>
+
+    <dl class="dl">
+      
+
+        <dt class="dt dlterm" id="string_functions__ascii">
+          <code class="ph codeph">ascii(string str)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the numeric ASCII code of the first character of the argument.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__btrim">
+          <code class="ph codeph">btrim(string a)</code>,
+          <code class="ph codeph">btrim(string a, string chars_to_trim)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Removes all instances of one or more characters
+          from the start and end of a <code class="ph codeph">STRING</code> value.
+          By default, removes only spaces.
+          If a non-<code class="ph codeph">NULL</code> optional second argument is specified, the function removes all
+          occurrences of characters in that second argument from the beginning and
+          end of the string.
+          <p class="p"><strong class="ph b">Return type:</strong> <code class="ph codeph">string</code></p>
+          <p class="p">
+        <strong class="ph b">Added in:</strong> <span class="keyword">Impala 2.3.0</span>
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+          <p class="p">
+            The following examples show the default <code class="ph codeph">btrim()</code> behavior,
+            and what changes when you specify the optional second argument.
+            All the examples bracket the output value with <code class="ph codeph">[ ]</code>
+            so that you can see any leading or trailing spaces in the <code class="ph codeph">btrim()</code> result.
+            By default, the function removes and number of both leading and trailing spaces.
+            When the second argument is specified, any number of occurrences of any
+            character in the second argument are removed from the start and end of the
+            input string; in this case, spaces are not removed (unless they are part of the second
+            argument) and any instances of the characters are not removed if they do not come
+            right at the beginning or end of the string.
+          </p>
+<pre class="pre codeblock"><code>-- Remove multiple spaces before and one space after.
+select concat('[',btrim('    hello '),']');
++---------------------------------------+
+| concat('[', btrim('    hello '), ']') |
++---------------------------------------+
+| [hello]                               |
++---------------------------------------+
+
+-- Remove any instances of x or y or z at beginning or end. Leave spaces alone.
+select concat('[',btrim('xy    hello zyzzxx','xyz'),']');
++------------------------------------------------------+
+| concat('[', btrim('xy    hello zyzzxx', 'xyz'), ']') |
++------------------------------------------------------+
+| [    hello ]                                         |
++------------------------------------------------------+
+
+-- Remove any instances of x or y or z at beginning or end.
+-- Leave x, y, z alone in the middle of the string.
+select concat('[',btrim('xyhelxyzlozyzzxx','xyz'),']');
++----------------------------------------------------+
+| concat('[', btrim('xyhelxyzlozyzzxx', 'xyz'), ']') |
++----------------------------------------------------+
+| [helxyzlo]                                         |
++----------------------------------------------------+
+</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__char_length">
+          <code class="ph codeph">char_length(string a), <span class="ph" id="string_functions__character_length">character_length(string a)</span></code>
+        </dt>
+
+        <dd class="dd">
+          
+          
+          <strong class="ph b">Purpose:</strong> Returns the length in characters of the argument string. Aliases for the
+          <code class="ph codeph">length()</code> function.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__chr">
+          <code class="ph codeph">chr(int character_code)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a character specified by a decimal code point value.
+          The interpretation and display of the resulting character depends on your system locale.
+          Because consistent processing of Impala string values is only guaranteed
+          for values within the ASCII range, only use this function for values
+          corresponding to ASCII characters.
+          In particular, parameter values greater than 255 return an empty string.
+          <p class="p"><strong class="ph b">Return type:</strong> <code class="ph codeph">string</code></p>
+          <p class="p">
+            <strong class="ph b">Usage notes:</strong> Can be used as the inverse of the <code class="ph codeph">ascii()</code> function, which
+            converts a character to its numeric ASCII code.
+          </p>
+          <p class="p">
+        <strong class="ph b">Added in:</strong> <span class="keyword">Impala 2.3.0</span>
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+<pre class="pre codeblock"><code>SELECT chr(65);
++---------+
+| chr(65) |
++---------+
+| A       |
++---------+
+
+SELECT chr(97);
++---------+
+| chr(97) |
++---------+
+| a       |
++---------+
+</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__concat">
+          <code class="ph codeph">concat(string a, string b...)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a single string representing all the argument values joined together.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        <strong class="ph b">Usage notes:</strong> <code class="ph codeph">concat()</code> and <code class="ph codeph">concat_ws()</code> are appropriate for
+        concatenating the values of multiple columns within the same row, while <code class="ph codeph">group_concat()</code>
+        joins together values from different rows.
+      </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__concat_ws">
+          <code class="ph codeph">concat_ws(string sep, string a, string b...)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a single string representing the second and following argument values joined
+          together, delimited by a specified separator.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        <strong class="ph b">Usage notes:</strong> <code class="ph codeph">concat()</code> and <code class="ph codeph">concat_ws()</code> are appropriate for
+        concatenating the values of multiple columns within the same row, while <code class="ph codeph">group_concat()</code>
+        joins together values from different rows.
+      </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__find_in_set">
+          <code class="ph codeph">find_in_set(string str, string strList)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the position (starting from 1) of the first occurrence of a specified string
+          within a comma-separated string. Returns <code class="ph codeph">NULL</code> if either argument is
+          <code class="ph codeph">NULL</code>, 0 if the search string is not found, or 0 if the search string contains a comma.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__group_concat">
+          <code class="ph codeph">group_concat(string s [, string sep])</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a single string representing the argument value concatenated together for each
+          row of the result set. If the optional separator string is specified, the separator is added between each
+          pair of concatenated values.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        <strong class="ph b">Usage notes:</strong> <code class="ph codeph">concat()</code> and <code class="ph codeph">concat_ws()</code> are appropriate for
+        concatenating the values of multiple columns within the same row, while <code class="ph codeph">group_concat()</code>
+        joins together values from different rows.
+      </p>
+          <p class="p">
+            By default, returns a single string covering the whole result set. To include other columns or values
+            in the result set, or to produce multiple concatenated strings for subsets of rows, include a
+            <code class="ph codeph">GROUP BY</code> clause in the query.
+          </p>
+          <p class="p">
+            Strictly speaking, <code class="ph codeph">group_concat()</code> is an aggregate function, not a scalar
+            function like the others in this list.
+            For additional details and examples, see <a class="xref" href="impala_group_concat.html#group_concat">GROUP_CONCAT Function</a>.
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__initcap">
+          <code class="ph codeph">initcap(string str)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the input string with the first letter capitalized.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__instr">
+          <code class="ph codeph">instr(string str, string substr)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the position (starting from 1) of the first occurrence of a substring within a
+          longer string.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__length">
+          <code class="ph codeph">length(string a)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the length in characters of the argument string.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__locate">
+          <code class="ph codeph">locate(string substr, string str[, int pos])</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the position (starting from 1) of the first occurrence of a substring within a
+          longer string, optionally after a particular position.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">int</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__lower">
+          <code class="ph codeph">lower(string a), <span class="ph" id="string_functions__lcase">lcase(string a)</span> </code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string converted to all-lowercase.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        <strong class="ph b">Usage notes:</strong>
+      </p>
+          <p class="p">
+        In <span class="keyword">Impala 2.5</span> and higher, you can simplify queries that
+        use many <code class="ph codeph">UPPER()</code> and <code class="ph codeph">LOWER()</code> calls
+        to do case-insensitive comparisons, by using the <code class="ph codeph">ILIKE</code>
+        or <code class="ph codeph">IREGEXP</code> operators instead. See
+        <a class="xref" href="../shared/../topics/impala_operators.html#ilike">ILIKE Operator</a> and
+        <a class="xref" href="../shared/../topics/impala_operators.html#iregexp">IREGEXP Operator</a> for details.
+      </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__lpad">
+          <code class="ph codeph">lpad(string str, int len, string pad)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a string of a specified length, based on the first argument string. If the
+          specified string is too short, it is padded on the left with a repeating sequence of the characters from
+          the pad string. If the specified string is too long, it is truncated on the right.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__ltrim">
+          <code class="ph codeph">ltrim(string a)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string with any leading spaces removed from the left side.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__parse_url">
+          <code class="ph codeph">parse_url(string urlString, string partToExtract [, string keyToExtract])</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the portion of a URL corresponding to a specified part. The part argument can be
+          <code class="ph codeph">'PROTOCOL'</code>, <code class="ph codeph">'HOST'</code>, <code class="ph codeph">'PATH'</code>, <code class="ph codeph">'REF'</code>,
+          <code class="ph codeph">'AUTHORITY'</code>, <code class="ph codeph">'FILE'</code>, <code class="ph codeph">'USERINFO'</code>, or
+          <code class="ph codeph">'QUERY'</code>. Uppercase is required for these literal values. When requesting the
+          <code class="ph codeph">QUERY</code> portion of the URL, you can optionally specify a key to retrieve just the
+          associated value from the key-value pairs in the query string.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+            <strong class="ph b">Usage notes:</strong> This function is important for the traditional Hadoop use case of interpreting web
+            logs. For example, if the web traffic data features raw URLs not divided into separate table columns,
+            you can count visitors to a particular page by extracting the <code class="ph codeph">'PATH'</code> or
+            <code class="ph codeph">'FILE'</code> field, or analyze search terms by extracting the corresponding key from the
+            <code class="ph codeph">'QUERY'</code> field.
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__regexp_extract">
+          <code class="ph codeph">regexp_extract(string subject, string pattern, int index)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the specified () group from a string based on a regular expression pattern. Group
+          0 refers to the entire extracted string, while group 1, 2, and so on refers to the first, second, and so
+          on <code class="ph codeph">(...)</code> portion.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        In Impala 2.0 and later, the Impala regular expression syntax conforms to the POSIX Extended Regular
+        Expression syntax used by the Google RE2 library. For details, see
+        <a class="xref" href="https://code.google.com/p/re2/" target="_blank">the RE2 documentation</a>. It
+        has most idioms familiar from regular expressions in Perl, Python, and so on, including
+        <code class="ph codeph">.*?</code> for non-greedy matches.
+      </p>
+          <p class="p">
+        In Impala 2.0 and later, a change in the underlying regular expression library could cause changes in the
+        way regular expressions are interpreted by this function. Test any queries that use regular expressions and
+        adjust the expression patterns if necessary. See
+        <a class="xref" href="../shared/../topics/impala_incompatible_changes.html#incompatible_changes_200">Incompatible Changes Introduced in Impala 2.0.0</a> for details.
+      </p>
+          <p class="p">
+        Because the <span class="keyword cmdname">impala-shell</span> interpreter uses the <code class="ph codeph">\</code> character for escaping,
+        use <code class="ph codeph">\\</code> to represent the regular expression escape character in any regular expressions
+        that you submit through <span class="keyword cmdname">impala-shell</span> . You might prefer to use the equivalent character
+        class names, such as <code class="ph codeph">[[:digit:]]</code> instead of <code class="ph codeph">\d</code> which you would have to
+        escape as <code class="ph codeph">\\d</code>.
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+          <p class="p">
+            This example shows how group 0 matches the full pattern string, including the portion outside any
+            <code class="ph codeph">()</code> group:
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_extract('abcdef123ghi456jkl','.*?(\\d+)',0);
++------------------------------------------------------+
+| regexp_extract('abcdef123ghi456jkl', '.*?(\\d+)', 0) |
++------------------------------------------------------+
+| abcdef123ghi456                                      |
++------------------------------------------------------+
+Returned 1 row(s) in 0.11s</code></pre>
+          <p class="p">
+            This example shows how group 1 matches just the contents inside the first <code class="ph codeph">()</code> group in
+            the pattern string:
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_extract('abcdef123ghi456jkl','.*?(\\d+)',1);
++------------------------------------------------------+
+| regexp_extract('abcdef123ghi456jkl', '.*?(\\d+)', 1) |
++------------------------------------------------------+
+| 456                                                  |
++------------------------------------------------------+
+Returned 1 row(s) in 0.11s</code></pre>
+          <p class="p">
+            Unlike in earlier Impala releases, the regular expression library used in Impala 2.0 and later supports
+            the <code class="ph codeph">.*?</code> idiom for non-greedy matches. This example shows how a pattern string starting
+            with <code class="ph codeph">.*?</code> matches the shortest possible portion of the source string, returning the
+            rightmost set of lowercase letters. A pattern string both starting and ending with <code class="ph codeph">.*?</code>
+            finds two potential matches of equal length, and returns the first one found (the leftmost set of
+            lowercase letters).
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_extract('AbcdBCdefGHI','.*?([[:lower:]]+)',1);
++--------------------------------------------------------+
+| regexp_extract('abcdbcdefghi', '.*?([[:lower:]]+)', 1) |
++--------------------------------------------------------+
+| def                                                    |
++--------------------------------------------------------+
+[localhost:21000] &gt; select regexp_extract('AbcdBCdefGHI','.*?([[:lower:]]+).*?',1);
++-----------------------------------------------------------+
+| regexp_extract('abcdbcdefghi', '.*?([[:lower:]]+).*?', 1) |
++-----------------------------------------------------------+
+| bcd                                                       |
++-----------------------------------------------------------+
+</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__regexp_like">
+          <code class="ph codeph">regexp_like(string source, string pattern[, string options])</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns <code class="ph codeph">true</code> or <code class="ph codeph">false</code> to indicate
+          whether the source string contains anywhere inside it the regular expression given by the pattern.
+          The optional third argument consists of letter flags that change how the match is performed,
+          such as <code class="ph codeph">i</code> for case-insensitive matching.
+          <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+          <p class="p">
+            The flags that you can include in the optional third argument are:
+          </p>
+          <ul class="ul">
+          <li class="li">
+          <code class="ph codeph">c</code>: Case-sensitive matching (the default).
+          </li>
+          <li class="li">
+          <code class="ph codeph">i</code>: Case-insensitive matching. If multiple instances of <code class="ph codeph">c</code> and <code class="ph codeph">i</code>
+          are included in the third argument, the last such option takes precedence.
+          </li>
+          <li class="li">
+          <code class="ph codeph">m</code>: Multi-line matching. The <code class="ph codeph">^</code> and <code class="ph codeph">$</code>
+          operators match the start or end of any line within the source string, not the
+          start and end of the entire string.
+          </li>
+          <li class="li">
+          <code class="ph codeph">n</code>: Newline matching. The <code class="ph codeph">.</code> operator can match the
+          newline character. A repetition operator such as <code class="ph codeph">.*</code> can
+          match a portion of the source string that spans multiple lines.
+          </li>
+          </ul>
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">boolean</code>
+          </p>
+          <p class="p">
+        In Impala 2.0 and later, the Impala regular expression syntax conforms to the POSIX Extended Regular
+        Expression syntax used by the Google RE2 library. For details, see
+        <a class="xref" href="https://code.google.com/p/re2/" target="_blank">the RE2 documentation</a>. It
+        has most idioms familiar from regular expressions in Perl, Python, and so on, including
+        <code class="ph codeph">.*?</code> for non-greedy matches.
+      </p>
+          <p class="p">
+        In Impala 2.0 and later, a change in the underlying regular expression library could cause changes in the
+        way regular expressions are interpreted by this function. Test any queries that use regular expressions and
+        adjust the expression patterns if necessary. See
+        <a class="xref" href="../shared/../topics/impala_incompatible_changes.html#incompatible_changes_200">Incompatible Changes Introduced in Impala 2.0.0</a> for details.
+      </p>
+          <p class="p">
+        Because the <span class="keyword cmdname">impala-shell</span> interpreter uses the <code class="ph codeph">\</code> character for escaping,
+        use <code class="ph codeph">\\</code> to represent the regular expression escape character in any regular expressions
+        that you submit through <span class="keyword cmdname">impala-shell</span> . You might prefer to use the equivalent character
+        class names, such as <code class="ph codeph">[[:digit:]]</code> instead of <code class="ph codeph">\d</code> which you would have to
+        escape as <code class="ph codeph">\\d</code>.
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+          <p class="p">
+            This example shows how <code class="ph codeph">regexp_like()</code> can test for the existence
+            of various kinds of regular expression patterns within a source string:
+          </p>
+<pre class="pre codeblock"><code>
+-- Matches because the 'f' appears somewhere in 'foo'.
+select regexp_like('foo','f');
++-------------------------+
+| regexp_like('foo', 'f') |
++-------------------------+
+| true                    |
++-------------------------+
+
+-- Does not match because the comparison is case-sensitive by default.
+select regexp_like('foo','F');
++-------------------------+
+| regexp_like('foo', 'f') |
++-------------------------+
+| false                   |
++-------------------------+
+
+-- The 3rd argument can change the matching logic, such as 'i' meaning case-insensitive.
+select regexp_like('foo','F','i');
++------------------------------+
+| regexp_like('foo', 'f', 'i') |
++------------------------------+
+| true                         |
++------------------------------+
+
+-- The familiar regular expression notations work, such as ^ and $ anchors...
+select regexp_like('foo','f$');
++--------------------------+
+| regexp_like('foo', 'f$') |
++--------------------------+
+| false                    |
++--------------------------+
+
+select regexp_like('foo','o$');
++--------------------------+
+| regexp_like('foo', 'o$') |
++--------------------------+
+| true                     |
++--------------------------+
+
+-- ...and repetition operators such as * and +
+select regexp_like('foooooobar','fo+b');
++-----------------------------------+
+| regexp_like('foooooobar', 'fo+b') |
++-----------------------------------+
+| true                              |
++-----------------------------------+
+
+select regexp_like('foooooobar','fx*y*o*b');
++---------------------------------------+
+| regexp_like('foooooobar', 'fx*y*o*b') |
++---------------------------------------+
+| true                                  |
++---------------------------------------+
+
+</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__regexp_replace">
+          <code class="ph codeph">regexp_replace(string initial, string pattern, string replacement)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the initial argument with the regular expression pattern replaced by the final
+          argument string.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        In Impala 2.0 and later, the Impala regular expression syntax conforms to the POSIX Extended Regular
+        Expression syntax used by the Google RE2 library. For details, see
+        <a class="xref" href="https://code.google.com/p/re2/" target="_blank">the RE2 documentation</a>. It
+        has most idioms familiar from regular expressions in Perl, Python, and so on, including
+        <code class="ph codeph">.*?</code> for non-greedy matches.
+      </p>
+          <p class="p">
+        In Impala 2.0 and later, a change in the underlying regular expression library could cause changes in the
+        way regular expressions are interpreted by this function. Test any queries that use regular expressions and
+        adjust the expression patterns if necessary. See
+        <a class="xref" href="../shared/../topics/impala_incompatible_changes.html#incompatible_changes_200">Incompatible Changes Introduced in Impala 2.0.0</a> for details.
+      </p>
+          <p class="p">
+        Because the <span class="keyword cmdname">impala-shell</span> interpreter uses the <code class="ph codeph">\</code> character for escaping,
+        use <code class="ph codeph">\\</code> to represent the regular expression escape character in any regular expressions
+        that you submit through <span class="keyword cmdname">impala-shell</span> . You might prefer to use the equivalent character
+        class names, such as <code class="ph codeph">[[:digit:]]</code> instead of <code class="ph codeph">\d</code> which you would have to
+        escape as <code class="ph codeph">\\d</code>.
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+          <p class="p">
+            These examples show how you can replace parts of a string matching a pattern with replacement text,
+            which can include backreferences to any <code class="ph codeph">()</code> groups in the pattern string. The
+            backreference numbers start at 1, and any <code class="ph codeph">\</code> characters must be escaped as
+            <code class="ph codeph">\\</code>.
+          </p>
+          <p class="p">
+            Replace a character pattern with new text:
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_replace('aaabbbaaa','b+','xyz');
++------------------------------------------+
+| regexp_replace('aaabbbaaa', 'b+', 'xyz') |
++------------------------------------------+
+| aaaxyzaaa                                |
++------------------------------------------+
+Returned 1 row(s) in 0.11s</code></pre>
+          <p class="p">
+            Replace a character pattern with substitution text that includes the original matching text:
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_replace('aaabbbaaa','(b+)','&lt;\\1&gt;');
++----------------------------------------------+
+| regexp_replace('aaabbbaaa', '(b+)', '&lt;\\1&gt;') |
++----------------------------------------------+
+| aaa&lt;bbb&gt;aaa                                  |
++----------------------------------------------+
+Returned 1 row(s) in 0.11s</code></pre>
+          <p class="p">
+            Remove all characters that are not digits:
+          </p>
+<pre class="pre codeblock"><code>[localhost:21000] &gt; select regexp_replace('123-456-789','[^[:digit:]]','');
++---------------------------------------------------+
+| regexp_replace('123-456-789', '[^[:digit:]]', '') |
++---------------------------------------------------+
+| 123456789                                         |
++---------------------------------------------------+
+Returned 1 row(s) in 0.12s</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__repeat">
+          <code class="ph codeph">repeat(string str, int n)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string repeated a specified number of times.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__reverse">
+          <code class="ph codeph">reverse(string a)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string with characters in reversed order.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__rpad">
+          <code class="ph codeph">rpad(string str, int len, string pad)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a string of a specified length, based on the first argument string. If the
+          specified string is too short, it is padded on the right with a repeating sequence of the characters from
+          the pad string. If the specified string is too long, it is truncated on the right.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__rtrim">
+          <code class="ph codeph">rtrim(string a)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string with any trailing spaces removed from the right side.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__space">
+          <code class="ph codeph">space(int n)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns a concatenated string of the specified number of spaces. Shorthand for
+          <code class="ph codeph">repeat(' ',<var class="keyword varname">n</var>)</code>.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__split_part">
+          <code class="ph codeph">split_part(string source, string delimiter, bigint n)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the nth field within a delimited string.
+          The fields are numbered starting from 1.
+          The delimiter can consist of multiple characters, not just a
+          single character. All matching of the delimiter is done exactly, not using any
+          regular expression patterns.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        In Impala 2.0 and later, the Impala regular expression syntax conforms to the POSIX Extended Regular
+        Expression syntax used by the Google RE2 library. For details, see
+        <a class="xref" href="https://code.google.com/p/re2/" target="_blank">the RE2 documentation</a>. It
+        has most idioms familiar from regular expressions in Perl, Python, and so on, including
+        <code class="ph codeph">.*?</code> for non-greedy matches.
+      </p>
+          <p class="p">
+        In Impala 2.0 and later, a change in the underlying regular expression library could cause changes in the
+        way regular expressions are interpreted by this function. Test any queries that use regular expressions and
+        adjust the expression patterns if necessary. See
+        <a class="xref" href="../shared/../topics/impala_incompatible_changes.html#incompatible_changes_200">Incompatible Changes Introduced in Impala 2.0.0</a> for details.
+      </p>
+          <p class="p">
+        Because the <span class="keyword cmdname">impala-shell</span> interpreter uses the <code class="ph codeph">\</code> character for escaping,
+        use <code class="ph codeph">\\</code> to represent the regular expression escape character in any regular expressions
+        that you submit through <span class="keyword cmdname">impala-shell</span> . You might prefer to use the equivalent character
+        class names, such as <code class="ph codeph">[[:digit:]]</code> instead of <code class="ph codeph">\d</code> which you would have to
+        escape as <code class="ph codeph">\\d</code>.
+      </p>
+          <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+          <p class="p">
+            These examples show how to retrieve the nth field from a delimited string:
+          </p>
+<pre class="pre codeblock"><code>
+select split_part('x,y,z',',',1);
++-----------------------------+
+| split_part('x,y,z', ',', 1) |
++-----------------------------+
+| x                           |
++-----------------------------+
+
+select split_part('x,y,z',',',2);
++-----------------------------+
+| split_part('x,y,z', ',', 2) |
++-----------------------------+
+| y                           |
++-----------------------------+
+
+select split_part('x,y,z',',',3);
++-----------------------------+
+| split_part('x,y,z', ',', 3) |
++-----------------------------+
+| z                           |
++-----------------------------+
+&lt;/codeblock&gt;
+
+          &lt;p&gt;
+            These examples show what happens for out-of-range field positions.
+            Specifying a value less than 1 produces an error. Specifying a value
+            greater than the number of fields returns a zero-length string
+            (which is not the same as &lt;codeph&gt;NULL&lt;/codeph&gt;).
+          &lt;/p&gt;
+&lt;codeblock&gt;&lt;![CDATA[
+select split_part('x,y,z',',',0);
+ERROR: Invalid field position: 0
+
+with t1 as (select split_part('x,y,z',',',4) nonexistent_field)
+  select
+      nonexistent_field
+    , concat('[',nonexistent_field,']')
+    , length(nonexistent_field);
+from t1
++-------------------+-------------------------------------+---------------------------+
+| nonexistent_field | concat('[', nonexistent_field, ']') | length(nonexistent_field) |
++-------------------+-------------------------------------+---------------------------+
+|                   | []                                  | 0                         |
++-------------------+-------------------------------------+---------------------------+
+&lt;/codeblock&gt;
+
+          &lt;p&gt;
+            These examples show how the delimiter can be a multi-character value:
+          &lt;/p&gt;
+&lt;codeblock&gt;&lt;![CDATA[
+select split_part('one***two***three','***',2);
++-------------------------------------------+
+| split_part('one***two***three', '***', 2) |
++-------------------------------------------+
+| two                                       |
++-------------------------------------------+
+
+select split_part('one\|/two\|/three','\|/',3);
++-------------------------------------------+
+| split_part('one\|/two\|/three', '\|/', 3) |
++-------------------------------------------+
+| three                                     |
++-------------------------------------------+
+
+</code></pre>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__strleft">
+          <code class="ph codeph">strleft(string a, int num_chars)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the leftmost characters of the string. Shorthand for a call to
+          <code class="ph codeph">substr()</code> with 2 arguments.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__strright">
+          <code class="ph codeph">strright(string a, int num_chars)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the rightmost characters of the string. Shorthand for a call to
+          <code class="ph codeph">substr()</code> with 2 arguments.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__substr">
+          <code class="ph codeph">substr(string a, int start [, int len]), <span class="ph" id="string_functions__substring">substring(string a, int start [, int
+          len])</span></code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the portion of the string starting at a specified point, optionally with a
+          specified maximum length. The characters in the string are indexed starting at 1.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__translate">
+          <code class="ph codeph">translate(string input, string from, string to)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the input string with a set of characters replaced by another set of characters.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__trim">
+          <code class="ph codeph">trim(string a)</code>
+        </dt>
+
+        <dd class="dd">
+          
+          <strong class="ph b">Purpose:</strong> Returns the input string with both leading and trailing spaces removed. The same as
+          passing the string through both <code class="ph codeph">ltrim()</code> and <code class="ph codeph">rtrim()</code>.
+          <p class="p">
+            <strong class="ph b">Usage notes:</strong> Often used during data cleansing operations during the ETL cycle, if input values might still have surrounding spaces.
+            For a more general-purpose function that can remove other leading and trailing characters besides spaces, see <code class="ph codeph">btrim()</code>.
+          </p>
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+        </dd>
+
+      
+
+      
+
+        <dt class="dt dlterm" id="string_functions__upper">
+          <code class="ph codeph">upper(string a), <span class="ph" id="string_functions__ucase">ucase(string a)</span></code>
+        </dt>
+
+        <dd class="dd">
+          
+          
+          <strong class="ph b">Purpose:</strong> Returns the argument string converted to all-uppercase.
+          <p class="p">
+            <strong class="ph b">Return type:</strong> <code class="ph codeph">string</code>
+          </p>
+          <p class="p">
+        <strong class="ph b">Usage notes:</strong>
+      </p>
+          <p class="p">
+        In <span class="keyword">Impala 2.5</span> and higher, you can simplify queries that
+        use many <code class="ph codeph">UPPER()</code> and <code class="ph codeph">LOWER()</code> calls
+        to do case-insensitive comparisons, by using the <code class="ph codeph">ILIKE</code>
+        or <code class="ph codeph">IREGEXP</code> operators instead. See
+        <a class="xref" href="../shared/../topics/impala_operators.html#ilike">ILIKE Operator</a> and
+        <a class="xref" href="../shared/../topics/impala_operators.html#iregexp">IREGEXP Operator</a> for details.
+      </p>
+        </dd>
+
+      
+    </dl>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_functions.html">Impala Built-In Functions</a></div></div></nav></article></main></body></html>
\ No newline at end of file