Posted to commits@impala.apache.org by mi...@apache.org on 2018/05/09 21:10:49 UTC
[40/51] [partial] impala git commit: [DOCS] Impala doc site update for 3.0
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_count.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_count.html b/docs/build3x/html/topics/impala_count.html
new file mode 100644
index 0000000..a451013
--- /dev/null
+++ b/docs/build3x/html/topics/impala_count.html
@@ -0,0 +1,353 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_aggregate_functions.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="count"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>COUNT Function</title></head><body id="count"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">COUNT Function</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ An aggregate function that returns the number of rows, or the number of rows where the argument is non-<code class="ph codeph">NULL</code>.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Syntax:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>COUNT([DISTINCT | ALL] <var class="keyword varname">expression</var>) [OVER (<var class="keyword varname">analytic_clause</var>)]</code></pre>
+
+ <p class="p">
+ Depending on the argument, <code class="ph codeph">COUNT()</code> considers rows that meet certain conditions:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ The notation <code class="ph codeph">COUNT(*)</code> includes <code class="ph codeph">NULL</code> values in the total.
+ </li>
+
+ <li class="li">
+ The notation <code class="ph codeph">COUNT(<var class="keyword varname">column_name</var>)</code> only considers rows where the column
+ contains a non-<code class="ph codeph">NULL</code> value.
+ </li>
+
+ <li class="li">
+ You can also combine <code class="ph codeph">COUNT</code> with the <code class="ph codeph">DISTINCT</code> operator to eliminate
+ duplicates before counting, and to count the combinations of values across multiple columns.
+ </li>
+ </ul>
+
+ <p class="p">
+ When the query contains a <code class="ph codeph">GROUP BY</code> clause, returns one value for each combination of
+ grouping values.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Return type:</strong> <code class="ph codeph">BIGINT</code>
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Usage notes:</strong>
+ </p>
+
+ <p class="p">
+ If you frequently run aggregate functions such as <code class="ph codeph">MIN()</code>, <code class="ph codeph">MAX()</code>, and
+ <code class="ph codeph">COUNT(DISTINCT)</code> on partition key columns, consider enabling the <code class="ph codeph">OPTIMIZE_PARTITION_KEY_SCANS</code>
+ query option, which optimizes such queries. This feature is available in <span class="keyword">Impala 2.5</span> and higher.
+ See <a class="xref" href="../shared/../topics/impala_optimize_partition_key_scans.html">OPTIMIZE_PARTITION_KEY_SCANS Query Option (Impala 2.5 or higher only)</a>
+ for the kinds of queries that this option applies to, and slight differences in how partitions are
+ evaluated when this query option is enabled.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Complex type considerations:</strong>
+ </p>
+
+ <p class="p">
+ To access a column with a complex type (<code class="ph codeph">ARRAY</code>, <code class="ph codeph">STRUCT</code>, or <code class="ph codeph">MAP</code>)
+ in an aggregation function, you unpack the individual elements using join notation in the query,
+ and then apply the function to the final scalar item, field, key, or value at the bottom of any nested type hierarchy in the column.
+ See <a class="xref" href="../shared/../topics/impala_complex_types.html#complex_types">Complex Types (Impala 2.3 or higher only)</a> for details about using complex types in Impala.
+ </p>
+
+ <div class="p">
+The following example demonstrates calls to several aggregation functions
+using values from a column containing nested complex types
+(an <code class="ph codeph">ARRAY</code> of <code class="ph codeph">STRUCT</code> items).
+The array is unpacked inside the query using join notation.
+The array elements are referenced using the <code class="ph codeph">ITEM</code>
+pseudocolumn, and the structure fields inside the array elements
+are referenced using dot notation.
+Numeric values such as <code class="ph codeph">SUM()</code> and <code class="ph codeph">AVG()</code>
+are computed using the numeric <code class="ph codeph">N_NATIONKEY</code> field, and
+the general-purpose <code class="ph codeph">MAX()</code> and <code class="ph codeph">MIN()</code>
+values are computed from the string <code class="ph codeph">N_NAME</code> field.
+<pre class="pre codeblock"><code>describe region;
++-------------+-------------------------+---------+
+| name | type | comment |
++-------------+-------------------------+---------+
+| r_regionkey | smallint | |
+| r_name | string | |
+| r_comment | string | |
+| r_nations | array<struct< | |
+| | n_nationkey:smallint, | |
+| | n_name:string, | |
+| | n_comment:string | |
+| | >> | |
++-------------+-------------------------+---------+
+
+select r_name, r_nations.item.n_nationkey
+ from region, region.r_nations as r_nations
+order by r_name, r_nations.item.n_nationkey;
++-------------+------------------+
+| r_name | item.n_nationkey |
++-------------+------------------+
+| AFRICA | 0 |
+| AFRICA | 5 |
+| AFRICA | 14 |
+| AFRICA | 15 |
+| AFRICA | 16 |
+| AMERICA | 1 |
+| AMERICA | 2 |
+| AMERICA | 3 |
+| AMERICA | 17 |
+| AMERICA | 24 |
+| ASIA | 8 |
+| ASIA | 9 |
+| ASIA | 12 |
+| ASIA | 18 |
+| ASIA | 21 |
+| EUROPE | 6 |
+| EUROPE | 7 |
+| EUROPE | 19 |
+| EUROPE | 22 |
+| EUROPE | 23 |
+| MIDDLE EAST | 4 |
+| MIDDLE EAST | 10 |
+| MIDDLE EAST | 11 |
+| MIDDLE EAST | 13 |
+| MIDDLE EAST | 20 |
++-------------+------------------+
+
+select
+ r_name,
+ count(r_nations.item.n_nationkey) as count,
+ sum(r_nations.item.n_nationkey) as sum,
+ avg(r_nations.item.n_nationkey) as avg,
+ min(r_nations.item.n_name) as minimum,
+ max(r_nations.item.n_name) as maximum,
+ ndv(r_nations.item.n_nationkey) as distinct_vals
+from
+ region, region.r_nations as r_nations
+group by r_name
+order by r_name;
++-------------+-------+-----+------+-----------+----------------+---------------+
+| r_name | count | sum | avg | minimum | maximum | distinct_vals |
++-------------+-------+-----+------+-----------+----------------+---------------+
+| AFRICA | 5 | 50 | 10 | ALGERIA | MOZAMBIQUE | 5 |
+| AMERICA | 5 | 47 | 9.4 | ARGENTINA | UNITED STATES | 5 |
+| ASIA | 5 | 68 | 13.6 | CHINA | VIETNAM | 5 |
+| EUROPE | 5 | 77 | 15.4 | FRANCE | UNITED KINGDOM | 5 |
+| MIDDLE EAST | 5 | 58 | 11.6 | EGYPT | SAUDI ARABIA | 5 |
++-------------+-------+-----+------+-----------+----------------+---------------+
+</code></pre>
+</div>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>-- How many rows total are in the table, regardless of NULL values?
+select count(*) from t1;
+-- How many rows are in the table with non-NULL values for a column?
+select count(c1) from t1;
+-- Count the rows that meet certain conditions.
+-- Again, * includes NULLs, so COUNT(*) might be greater than COUNT(col).
+select count(*) from t1 where x > 10;
+select count(c1) from t1 where x > 10;
+-- Can also be used in combination with DISTINCT and/or GROUP BY.
+-- Combine COUNT and DISTINCT to find the number of unique values.
+-- Must use column names rather than * with COUNT(DISTINCT ...) syntax.
+-- Rows with NULL values are not counted.
+select count(distinct c1) from t1;
+-- Rows with a NULL value in _either_ column are not counted.
+select count(distinct c1, c2) from t1;
+-- Return more than one result.
+select month, year, count(distinct visitor_id) from web_stats group by month, year;
+</code></pre>
+
+ <div class="p">
+ The following examples show how to use <code class="ph codeph">COUNT()</code> in an analytic context. They use a table
+ containing integers from 1 to 10. Notice how the <code class="ph codeph">COUNT()</code> is reported for each input value, as
+ opposed to the <code class="ph codeph">GROUP BY</code> clause which condenses the result set.
+<pre class="pre codeblock"><code>select x, property, count(x) over (partition by property) as count from int_t where property in ('odd','even');
++----+----------+-------+
+| x | property | count |
++----+----------+-------+
+| 2 | even | 5 |
+| 4 | even | 5 |
+| 6 | even | 5 |
+| 8 | even | 5 |
+| 10 | even | 5 |
+| 1 | odd | 5 |
+| 3 | odd | 5 |
+| 5 | odd | 5 |
+| 7 | odd | 5 |
+| 9 | odd | 5 |
++----+----------+-------+
+</code></pre>
+
+Adding an <code class="ph codeph">ORDER BY</code> clause lets you experiment with results that are cumulative or apply to a moving
+set of rows (the <span class="q">"window"</span>). The following examples use <code class="ph codeph">COUNT()</code> in an analytic context
+(that is, with an <code class="ph codeph">OVER()</code> clause) to produce a running count of all the even values,
+then a running count of all the odd values. The basic <code class="ph codeph">ORDER BY x</code> clause implicitly
+activates a window clause of <code class="ph codeph">RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</code>,
+which is equivalent to <code class="ph codeph">ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</code> when the <code class="ph codeph">ORDER BY</code> values are unique within each partition, as they are here.
+Therefore, all of these examples produce the same results:
+<pre class="pre codeblock"><code>select x, property,
+ count(x) over (partition by property <strong class="ph b">order by x</strong>) as 'cumulative count'
+ from int_t where property in ('odd','even');
++----+----------+------------------+
+| x | property | cumulative count |
++----+----------+------------------+
+| 2 | even | 1 |
+| 4 | even | 2 |
+| 6 | even | 3 |
+| 8 | even | 4 |
+| 10 | even | 5 |
+| 1 | odd | 1 |
+| 3 | odd | 2 |
+| 5 | odd | 3 |
+| 7 | odd | 4 |
+| 9 | odd | 5 |
++----+----------+------------------+
+
+select x, property,
+ count(x) over
+ (
+ partition by property
+ <strong class="ph b">order by x</strong>
+ <strong class="ph b">range between unbounded preceding and current row</strong>
+ ) as 'cumulative count'
+from int_t where property in ('odd','even');
++----+----------+------------------+
+| x | property | cumulative count |
++----+----------+------------------+
+| 2 | even | 1 |
+| 4 | even | 2 |
+| 6 | even | 3 |
+| 8 | even | 4 |
+| 10 | even | 5 |
+| 1 | odd | 1 |
+| 3 | odd | 2 |
+| 5 | odd | 3 |
+| 7 | odd | 4 |
+| 9 | odd | 5 |
++----+----------+------------------+
+
+select x, property,
+ count(x) over
+ (
+ partition by property
+ <strong class="ph b">order by x</strong>
+ <strong class="ph b">rows between unbounded preceding and current row</strong>
+ ) as 'cumulative count'
+ from int_t where property in ('odd','even');
++----+----------+------------------+
+| x | property | cumulative count |
++----+----------+------------------+
+| 2 | even | 1 |
+| 4 | even | 2 |
+| 6 | even | 3 |
+| 8 | even | 4 |
+| 10 | even | 5 |
+| 1 | odd | 1 |
+| 3 | odd | 2 |
+| 5 | odd | 3 |
+| 7 | odd | 4 |
+| 9 | odd | 5 |
++----+----------+------------------+
+</code></pre>
+
+The following examples show how to construct a moving window, with a running count taking into account 1 row before
+and 1 row after the current row, within the same partition (all the even values or all the odd values).
+Therefore, the count is consistently 3 for rows in the middle of each partition, and 2 for
+the first and last rows of each partition, where there is no preceding or no following row.
+Because of a restriction in the Impala <code class="ph codeph">RANGE</code> syntax, this type of
+moving window is possible with the <code class="ph codeph">ROWS BETWEEN</code> clause but not the <code class="ph codeph">RANGE BETWEEN</code>
+clause:
+<pre class="pre codeblock"><code>select x, property,
+ count(x) over
+ (
+ partition by property
+ <strong class="ph b">order by x</strong>
+ <strong class="ph b">rows between 1 preceding and 1 following</strong>
+ ) as 'moving total'
+ from int_t where property in ('odd','even');
++----+----------+--------------+
+| x | property | moving total |
++----+----------+--------------+
+| 2 | even | 2 |
+| 4 | even | 3 |
+| 6 | even | 3 |
+| 8 | even | 3 |
+| 10 | even | 2 |
+| 1 | odd | 2 |
+| 3 | odd | 3 |
+| 5 | odd | 3 |
+| 7 | odd | 3 |
+| 9 | odd | 2 |
++----+----------+--------------+
+
+-- Doesn't work because of syntax restriction on RANGE clause.
+select x, property,
+ count(x) over
+ (
+ partition by property
+ <strong class="ph b">order by x</strong>
+ <strong class="ph b">range between 1 preceding and 1 following</strong>
+ ) as 'moving total'
+from int_t where property in ('odd','even');
+ERROR: AnalysisException: RANGE is only supported with both the lower and upper bounds UNBOUNDED or one UNBOUNDED and the other CURRENT ROW.
+</code></pre>
+ </div>
+
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+ <p class="p">
+ By default, Impala only allows a single <code class="ph codeph">COUNT(DISTINCT <var class="keyword varname">columns</var>)</code>
+ expression in each query.
+ </p>
+ <p class="p">
+ If you do not need precise accuracy, you can produce an estimate of the distinct values for a column by
+ specifying <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>; a query can contain multiple instances of
+ <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>. To make Impala automatically rewrite
+ <code class="ph codeph">COUNT(DISTINCT)</code> expressions to <code class="ph codeph">NDV()</code>, enable the
+ <code class="ph codeph">APPX_COUNT_DISTINCT</code> query option.
+ </p>
+ <p class="p">
+ To produce the same result as multiple <code class="ph codeph">COUNT(DISTINCT)</code> expressions, you can use the
+ following technique for queries involving a single table:
+ </p>
+<pre class="pre codeblock"><code>select v1.c1 result1, v2.c1 result2 from
+ (select count(distinct col1) as c1 from t1) v1
+ cross join
+ (select count(distinct col2) as c1 from t1) v2;
+</code></pre>
+ <p class="p">
+ Because <code class="ph codeph">CROSS JOIN</code> is an expensive operation, prefer to use the <code class="ph codeph">NDV()</code>
+ technique wherever practical.
+ </p>
+ </div>
+
+ <p class="p">
+ <strong class="ph b">Related information:</strong>
+ </p>
+
+ <p class="p">
+ <a class="xref" href="impala_analytic_functions.html#analytic_functions">Impala Analytic Functions</a>
+ </p>
+
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_aggregate_functions.html">Impala Aggregate Functions</a></div></div></nav></article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_create_database.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_create_database.html b/docs/build3x/html/topics/impala_create_database.html
new file mode 100644
index 0000000..14cd785
--- /dev/null
+++ b/docs/build3x/html/topics/impala_create_database.html
@@ -0,0 +1,209 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref_sql.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="create_database"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>CREATE DATABASE Statement</title></head><body id="create_database"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">CREATE DATABASE Statement</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ Creates a new database.
+ </p>
+
+ <p class="p">
+ In Impala, a database is both:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ A logical construct for grouping together related tables, views, and functions within their own namespace.
+ You might use a separate database for each application, set of related tables, or round of experimentation.
+ </li>
+
+ <li class="li">
+ A physical construct represented by a directory tree in HDFS. Tables (internal tables), partitions, and
+ data files are all located under this directory. You can perform HDFS-level operations such as backing it up and measuring space usage,
+ or remove it with a <code class="ph codeph">DROP DATABASE</code> statement.
+ </li>
+ </ul>
+
+ <p class="p">
+ <strong class="ph b">Syntax:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] <var class="keyword varname">database_name</var> [COMMENT '<var class="keyword varname">database_comment</var>']
+ [LOCATION <var class="keyword varname">hdfs_path</var>];</code></pre>
+
+ <p class="p">
+ <strong class="ph b">Statement type:</strong> DDL
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Usage notes:</strong>
+ </p>
+
+ <p class="p">
+ A database is physically represented as a directory in HDFS, with a name ending in <code class="ph codeph">.db</code>,
+ under the main Impala data directory. If the associated HDFS directory does not exist, it is created for you.
+ All databases and their associated directories are top-level objects, with no physical or logical nesting.
+ </p>
+
+ <p class="p">
+ After creating a database, to make it the current database within an <span class="keyword cmdname">impala-shell</span> session,
+ use the <code class="ph codeph">USE</code> statement. You can refer to tables in the current database without prepending
+ any qualifier to their names.
+ </p>
+
+ <p class="p">
+ When you first connect to Impala through <span class="keyword cmdname">impala-shell</span>, the database you start in (before
+ issuing any <code class="ph codeph">CREATE DATABASE</code> or <code class="ph codeph">USE</code> statements) is named
+ <code class="ph codeph">default</code>.
+ </p>
+
+ <div class="p">
+ Impala includes another predefined database, <code class="ph codeph">_impala_builtins</code>, that serves as the location
+ for the <a class="xref" href="../shared/../topics/impala_functions.html#builtins">built-in functions</a>. To see the built-in
+ functions, use a statement like the following:
+<pre class="pre codeblock"><code>show functions in _impala_builtins;
+show functions in _impala_builtins like '*<var class="keyword varname">substring</var>*';
+</code></pre>
+ </div>
+
+ <p class="p">
+ After creating a database, your <span class="keyword cmdname">impala-shell</span> session or another
+ <span class="keyword cmdname">impala-shell</span> connected to the same node can immediately access that database. To access
+ the database through the Impala daemon on a different node, issue the <code class="ph codeph">INVALIDATE METADATA</code>
+ statement first while connected to that other node.
+ </p>
+
+ <p class="p">
+ Setting the <code class="ph codeph">LOCATION</code> attribute for a new database is a way to work with sets of files in an
+ HDFS directory structure outside the default Impala data directory, as opposed to setting the
+ <code class="ph codeph">LOCATION</code> attribute for each individual table.
+ </p>
+
+ <p class="p">
+ If you connect to different Impala nodes within an <span class="keyword cmdname">impala-shell</span> session for
+ load-balancing purposes, you can enable the <code class="ph codeph">SYNC_DDL</code> query option to make each DDL
+ statement wait before returning, until the new or changed metadata has been received by all the Impala
+ nodes. See <a class="xref" href="../shared/../topics/impala_sync_ddl.html#sync_ddl">SYNC_DDL Query Option</a> for details.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Hive considerations:</strong>
+ </p>
+
+ <p class="p">
+ When you create a database in Impala, the database can also be used by Hive.
+ When you create a database in Hive, issue an <code class="ph codeph">INVALIDATE METADATA</code>
+ statement in Impala to make Impala permanently aware of the new database.
+ </p>
+
+ <p class="p">
+ The <code class="ph codeph">SHOW DATABASES</code> statement lists all databases, or the databases whose name
+ matches a wildcard pattern. <span class="ph">In <span class="keyword">Impala 2.5</span> and higher, the
+ <code class="ph codeph">SHOW DATABASES</code> output includes a second column that displays the associated
+ comment, if any, for each database.</span>
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Amazon S3 considerations:</strong>
+ </p>
+
+ <p class="p">
+ To specify that any tables created within a database reside on the Amazon S3 system,
+ you can include an <code class="ph codeph">s3a://</code> prefix on the <code class="ph codeph">LOCATION</code>
+ attribute. In <span class="keyword">Impala 2.6</span> and higher, Impala automatically creates any
+ required folders as the databases, tables, and partitions are created, and removes
+ them when they are dropped.
+ </p>
+
+ <p class="p">
+ In <span class="keyword">Impala 2.6</span> and higher, Impala DDL statements such as
+ <code class="ph codeph">CREATE DATABASE</code>, <code class="ph codeph">CREATE TABLE</code>, <code class="ph codeph">DROP DATABASE CASCADE</code>,
+ <code class="ph codeph">DROP TABLE</code>, and <code class="ph codeph">ALTER TABLE [ADD|DROP] PARTITION</code> can create or remove folders
+ as needed in the Amazon S3 system. Prior to <span class="keyword">Impala 2.6</span>, you had to create folders yourself and point
+ Impala databases, tables, or partitions at them, and manually remove folders when no longer needed.
+ See <a class="xref" href="../shared/../topics/impala_s3.html#s3">Using Impala with the Amazon S3 Filesystem</a> for details about reading and writing S3 data with Impala.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Cancellation:</strong> Cannot be cancelled.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">HDFS permissions:</strong>
+ </p>
+ <p class="p">
+ The user ID that the <span class="keyword cmdname">impalad</span> daemon runs under,
+ typically the <code class="ph codeph">impala</code> user, must have write
+ permission for the parent HDFS directory under which the database
+ is located.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+
+ <pre class="pre codeblock"><code>create database first_db;
+use first_db;
+create table t1 (x int);
+
+create database second_db;
+use second_db;
+-- Each database has its own namespace for tables.
+-- You can reuse the same table names in each database.
+create table t1 (s string);
+
+create database temp;
+
+-- You can either USE a database after creating it,
+-- or qualify all references to the table name with the name of the database.
+-- Here, tables T2 and T3 are both created in the TEMP database.
+
+create table temp.t2 (x int, y int);
+use temp;
+create table t3 (s string);
+
+-- You cannot drop a database while it is selected by the USE statement.
+drop database temp;
+<em class="ph i">ERROR: AnalysisException: Cannot drop current default database: temp</em>
+
+-- The always-available database 'default' is a convenient one to USE
+-- before dropping a database you created.
+use default;
+
+-- Before dropping a database, first drop all the tables inside it,
+<span class="ph">-- or in <span class="keyword">Impala 2.3</span> and higher use the CASCADE clause.</span>
+drop database temp;
+ERROR: ImpalaRuntimeException: Error making 'dropDatabase' RPC to Hive Metastore:
+CAUSED BY: InvalidOperationException: Database temp is not empty
+show tables in temp;
++------+
+| name |
++------+
+| t3 |
++------+
+
+<span class="ph">-- <span class="keyword">Impala 2.3</span> and higher:</span>
+<span class="ph">drop database temp cascade;</span>
+
+-- Earlier releases:
+drop table temp.t3;
+drop database temp;
+</code></pre>
+
+ <p class="p">
+ <strong class="ph b">Related information:</strong>
+ </p>
+
+ <p class="p">
+ <a class="xref" href="impala_databases.html#databases">Overview of Impala Databases</a>, <a class="xref" href="impala_drop_database.html#drop_database">DROP DATABASE Statement</a>,
+ <a class="xref" href="impala_use.html#use">USE Statement</a>, <a class="xref" href="impala_show.html#show_databases">SHOW DATABASES</a>,
+ <a class="xref" href="impala_tables.html#tables">Overview of Impala Tables</a>
+ </p>
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref_sql.html">Impala SQL Statements</a></div></div></nav></article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_create_function.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_create_function.html b/docs/build3x/html/topics/impala_create_function.html
new file mode 100644
index 0000000..9b25620
--- /dev/null
+++ b/docs/build3x/html/topics/impala_create_function.html
@@ -0,0 +1,502 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref_sql.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="create_function"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>CREATE FUNCTION Statement</title></head><body id="create_function"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">CREATE FUNCTION Statement</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+ Creates a user-defined function (UDF), which you can use to implement custom logic during
+ <code class="ph codeph">SELECT</code> or <code class="ph codeph">INSERT</code> operations.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Syntax:</strong>
+ </p>
+
+ <p class="p">
+ The syntax is different depending on whether you create a scalar UDF, which is called once for each row and
+ implemented by a single function, or a user-defined aggregate function (UDA), which is implemented by
+ multiple functions that compute intermediate results across sets of rows.
+ </p>
+
+ <p class="p">
+ In <span class="keyword">Impala 2.5</span> and higher, the syntax is also different for creating or dropping scalar Java-based UDFs.
+ The statements for Java UDFs use a new syntax, without any argument types or return type specified. Java-based UDFs
+ created using the new syntax persist across restarts of the Impala catalog server, and can be shared transparently
+ between Impala and Hive.
+ </p>
+
+ <p class="p">
+ To create a persistent scalar C++ UDF with <code class="ph codeph">CREATE FUNCTION</code>:
+ </p>
+
+<pre class="pre codeblock"><code>CREATE FUNCTION [IF NOT EXISTS] [<var class="keyword varname">db_name</var>.]<var class="keyword varname">function_name</var>([<var class="keyword varname">arg_type</var>[, <var class="keyword varname">arg_type</var>...]])
+ RETURNS <var class="keyword varname">return_type</var>
+ LOCATION '<var class="keyword varname">hdfs_path_to_dot_so</var>'
+ SYMBOL='<var class="keyword varname">symbol_name</var>'</code></pre>
+
+ <div class="p">
+ To create a persistent Java UDF with <code class="ph codeph">CREATE FUNCTION</code>:
+<pre class="pre codeblock"><code>CREATE FUNCTION [IF NOT EXISTS] [<var class="keyword varname">db_name</var>.]<var class="keyword varname">function_name</var>
+ LOCATION '<var class="keyword varname">hdfs_path_to_jar</var>'
+ SYMBOL='<var class="keyword varname">class_name</var>'</code></pre>
+ </div>
+
+
+
+ <p class="p">
+ To create a persistent UDA, which must be written in C++, issue a <code class="ph codeph">CREATE AGGREGATE FUNCTION</code> statement:
+ </p>
+
+<pre class="pre codeblock"><code>CREATE [AGGREGATE] FUNCTION [IF NOT EXISTS] [<var class="keyword varname">db_name</var>.]<var class="keyword varname">function_name</var>([<var class="keyword varname">arg_type</var>[, <var class="keyword varname">arg_type</var>...]])
+ RETURNS <var class="keyword varname">return_type</var>
+ LOCATION '<var class="keyword varname">hdfs_path</var>'
+ [INIT_FN='<var class="keyword varname">function</var>']
+ UPDATE_FN='<var class="keyword varname">function</var>'
+ MERGE_FN='<var class="keyword varname">function</var>'
+ [PREPARE_FN='<var class="keyword varname">function</var>']
+ [CLOSE_FN='<var class="keyword varname">function</var>']
+ <span class="ph">[SERIALIZE_FN='<var class="keyword varname">function</var>']</span>
+ [FINALIZE_FN='<var class="keyword varname">function</var>']
+ <span class="ph">[INTERMEDIATE <var class="keyword varname">type_spec</var>]</span></code></pre>
+
+ <p class="p">
+ <strong class="ph b">Statement type:</strong> DDL
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Varargs notation:</strong>
+ </p>
+
+ <div class="note note note_note"><span class="note__title notetitle">Note:</span>
+ <p class="p">
+ Variable-length argument lists are supported for C++ UDFs, but currently not for Java UDFs.
+ </p>
+ </div>
+
+ <p class="p">
+ If the underlying implementation of your function accepts a variable number of arguments:
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ The variable arguments must go last in the argument list.
+ </li>
+
+ <li class="li">
+ The variable arguments must all be of the same type.
+ </li>
+
+ <li class="li">
+ You must include at least one instance of the variable arguments in every function call invoked from SQL.
+ </li>
+
+ <li class="li">
+ You designate the variable portion of the argument list in the <code class="ph codeph">CREATE FUNCTION</code> statement
+ by including <code class="ph codeph">...</code> immediately after the type name of the first variable argument. For
+ example, to create a function that accepts an <code class="ph codeph">INT</code> argument, followed by a
+ <code class="ph codeph">BOOLEAN</code>, followed by one or more <code class="ph codeph">STRING</code> arguments, your <code class="ph codeph">CREATE
+ FUNCTION</code> statement would look like:
+<pre class="pre codeblock"><code>CREATE FUNCTION <var class="keyword varname">func_name</var> (INT, BOOLEAN, STRING ...)
+ RETURNS <var class="keyword varname">type</var> LOCATION '<var class="keyword varname">path</var>' SYMBOL='<var class="keyword varname">entry_point</var>';
+</code></pre>
+ </li>
+ </ul>
+
+ <p class="p">
+ See <a class="xref" href="impala_udf.html#udf_varargs">Variable-Length Argument Lists</a> for how to code a C++ UDF to accept
+ variable-length argument lists.
+ </p>
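+
+ <p class="p">
+ As a rough illustration of how the variable portion of the argument list might arrive on the C++ side, the
+ following sketch assumes the variable arguments are passed as a count followed by a pointer to the first one.
+ The types <code class="ph codeph">FunctionContext</code>, <code class="ph codeph">IntVal</code>,
+ <code class="ph codeph">BooleanVal</code>, and <code class="ph codeph">StringVal</code> are simplified stand-ins
+ defined inline, not the real Impala UDF header, and the function name <code class="ph codeph">ConcatFirstN</code>
+ is hypothetical. See the linked topic for the actual SDK conventions.
+ </p>
+

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the UDF SDK types, defined here only so the
// sketch compiles on its own. The real types come from the Impala UDF header.
struct FunctionContext {};

struct IntVal {
  bool is_null;
  int val;
  IntVal() : is_null(true), val(0) {}
  explicit IntVal(int v) : is_null(false), val(v) {}
};

struct BooleanVal {
  bool is_null;
  bool val;
  BooleanVal() : is_null(true), val(false) {}
  explicit BooleanVal(bool v) : is_null(false), val(v) {}
};

struct StringVal {
  bool is_null;
  std::string val;
  StringVal() : is_null(true) {}
  explicit StringVal(const std::string& s) : is_null(false), val(s) {}
};

// Hypothetical UDF matching the SQL signature (INT, BOOLEAN, STRING ...).
// The fixed arguments come first; the variable STRING arguments follow as
// a count plus a pointer to the first element (an assumption of this sketch).
// Concatenates at most n of the variable arguments.
StringVal ConcatFirstN(FunctionContext* /*ctx*/, const IntVal& n,
                       const BooleanVal& skip_nulls, int num_var_args,
                       const StringVal* args) {
  // Every SQL call must supply at least one variable argument.
  assert(num_var_args >= 1 && args != nullptr);
  if (n.is_null || skip_nulls.is_null) return StringVal();
  std::string out;
  for (int i = 0; i < num_var_args && i < n.val; ++i) {
    if (args[i].is_null) {
      if (skip_nulls.val) continue;
      return StringVal();  // a NULL input makes the whole result NULL
    }
    out += args[i].val;
  }
  return StringVal(out);
}
```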
+
+ <p class="p">
+ <strong class="ph b">Scalar and aggregate functions:</strong>
+ </p>
+
+ <p class="p">
+ The simplest kind of user-defined function returns a single scalar value each time it is called, typically
+ once for each row in the result set. This general kind of function is what is usually meant by UDF.
+ User-defined aggregate functions (UDAs) are a specialized kind of UDF that produce a single value based on
+ the contents of multiple rows. You usually use UDAs in combination with a <code class="ph codeph">GROUP BY</code> clause to
+ condense a large result set into a smaller one, or even a single row summarizing column values across an
+ entire table.
+ </p>
+
+ <p class="p">
+ You create UDAs by using the <code class="ph codeph">CREATE AGGREGATE FUNCTION</code> syntax. The clauses
+ <code class="ph codeph">INIT_FN</code>, <code class="ph codeph">UPDATE_FN</code>, <code class="ph codeph">MERGE_FN</code>,
+ <span class="ph"><code class="ph codeph">SERIALIZE_FN</code>,</span> <code class="ph codeph">FINALIZE_FN</code>, and
+ <code class="ph codeph">INTERMEDIATE</code> only apply when you create a UDA rather than a scalar UDF.
+ </p>
+
+ <p class="p">
+ The <code class="ph codeph">*_FN</code> clauses specify functions to call at different phases of function processing.
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ <strong class="ph b">Initialize:</strong> The function you specify with the <code class="ph codeph">INIT_FN</code> clause does any initial
+ setup, such as initializing member variables in internal data structures. This function is often a stub for
+ simple UDAs. You can omit this clause and a default (no-op) function will be used.
+ </li>
+
+ <li class="li">
+ <strong class="ph b">Update:</strong> The function you specify with the <code class="ph codeph">UPDATE_FN</code> clause is called once for each
+ row in the original result set, that is, before any <code class="ph codeph">GROUP BY</code> clause is applied. A separate
+ instance of the function is called for each different value returned by the <code class="ph codeph">GROUP BY</code>
+ clause. The final argument passed to this function is a pointer, to which you write an updated value based
+ on its original value and the value of the first argument.
+ </li>
+
+ <li class="li">
+ <strong class="ph b">Merge:</strong> The function you specify with the <code class="ph codeph">MERGE_FN</code> clause is called an arbitrary
+ number of times, to combine intermediate values produced by different nodes or different threads as Impala
+ reads and processes data files in parallel. The final argument passed to this function is a pointer, to
+ which you write an updated value based on its original value and the value of the first argument.
+ </li>
+
+ <li class="li">
+ <strong class="ph b">Serialize:</strong> The function you specify with the <code class="ph codeph">SERIALIZE_FN</code> clause frees memory
+ allocated to intermediate results. It is required if any memory was allocated by the Allocate function in
+ the Init, Update, or Merge functions, or if the intermediate type contains any pointers. See
+ <span class="xref">the UDA code samples</span> for details.
+ </li>
+
+ <li class="li">
+ <strong class="ph b">Finalize:</strong> The function you specify with the <code class="ph codeph">FINALIZE_FN</code> clause does any required
+ teardown for resources acquired by your UDF, such as freeing memory, closing file handles if you explicitly
+ opened any files, and so on. This function is often a stub for simple UDAs. You can omit this clause and a
+ default (no-op) function will be used. It is required in UDAs where the final return type is different from
+ the intermediate type, or if any memory was allocated by the Allocate function in the Init, Update, or
+ Merge functions. See <span class="xref">the UDA code samples</span> for details.
+ </li>
+ </ul>
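+
+ <p class="p">
+ The phases above can be sketched for a simple average, under stated assumptions. The types and function
+ names here (<code class="ph codeph">AvgState</code>, <code class="ph codeph">AvgInit</code>,
+ <code class="ph codeph">AvgUpdate</code>, <code class="ph codeph">AvgMerge</code>,
+ <code class="ph codeph">AvgFinalize</code>, <code class="ph codeph">SimulateAvg</code>) are hypothetical and
+ use inline stand-ins rather than the real Impala UDF header. Because the intermediate state differs from the
+ <code class="ph codeph">DOUBLE</code> result, a Finalize step is needed. The state holds no pointers or
+ allocated memory, so this sketch has no Serialize step.
+ </p>
+

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the UDF SDK types; not the real Impala header.
struct FunctionContext {};

struct DoubleVal {
  bool is_null;
  double val;
  DoubleVal() : is_null(true), val(0.0) {}
  explicit DoubleVal(double v) : is_null(false), val(v) {}
};

// Intermediate state kept between calls. It is not the same type as the
// DOUBLE result, which is why a Finalize function is required.
struct AvgState {
  double sum;
  std::int64_t count;
};

// Initialize: set up the intermediate value.
void AvgInit(FunctionContext* /*ctx*/, AvgState* state) {
  state->sum = 0.0;
  state->count = 0;
}

// Update: called once per input row. The final argument is the pointer
// through which the running state is updated.
void AvgUpdate(FunctionContext* /*ctx*/, const DoubleVal& input, AvgState* state) {
  if (input.is_null) return;  // NULL rows do not contribute
  state->sum += input.val;
  ++state->count;
}

// Merge: fold a partial state from another node or thread into dst.
void AvgMerge(FunctionContext* /*ctx*/, const AvgState& src, AvgState* dst) {
  dst->sum += src.sum;
  dst->count += src.count;
}

// Finalize: convert the intermediate state into the return value.
DoubleVal AvgFinalize(FunctionContext* /*ctx*/, const AvgState& state) {
  if (state.count == 0) return DoubleVal();  // no rows: NULL result
  return DoubleVal(state.sum / static_cast<double>(state.count));
}

// Simulates two partitions that are aggregated independently and then merged.
DoubleVal SimulateAvg(const std::vector<double>& part_a,
                      const std::vector<double>& part_b) {
  FunctionContext ctx;
  AvgState a, b;
  AvgInit(&ctx, &a);
  AvgInit(&ctx, &b);
  for (double v : part_a) AvgUpdate(&ctx, DoubleVal(v), &a);
  for (double v : part_b) AvgUpdate(&ctx, DoubleVal(v), &b);
  AvgMerge(&ctx, b, &a);
  // Sanity check: merging preserves the total number of non-NULL rows.
  assert(a.count == static_cast<std::int64_t>(part_a.size() + part_b.size()));
  return AvgFinalize(&ctx, a);
}
```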
+
+ <p class="p">
+ If you use a consistent naming convention for each of the underlying functions, Impala can automatically
+ determine the names based on the first such clause, so the others are optional.
+ </p>
+
+
+
+ <p class="p">
+ For end-to-end examples of UDAs, see <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a>.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Complex type considerations:</strong>
+ </p>
+
+ <p class="p">
+ Currently, Impala UDFs cannot accept arguments or return values of the Impala complex types
+ (<code class="ph codeph">STRUCT</code>, <code class="ph codeph">ARRAY</code>, or <code class="ph codeph">MAP</code>).
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Usage notes:</strong>
+ </p>
+
+ <ul class="ul">
+ <li class="li">
+ You can write Impala UDFs in either C++ or Java. C++ UDFs are new to Impala, and are the recommended format
+ for high performance utilizing native code. Java-based UDFs are compatible between Impala and Hive, and are
+ most suited to reusing existing Hive UDFs. (Impala can run Java-based Hive UDFs but not Hive UDAs.)
+ </li>
+
+ <li class="li">
+ <span class="keyword">Impala 2.5</span> introduces UDF improvements to persistence for both C++ and Java UDFs,
+ and better compatibility between Impala and Hive for Java UDFs.
+ See <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a> for details.
+ </li>
+
+ <li class="li">
+ The body of the UDF is represented by a <code class="ph codeph">.so</code> or <code class="ph codeph">.jar</code> file, which you store
+ in HDFS and the <code class="ph codeph">CREATE FUNCTION</code> statement distributes to each Impala node.
+ </li>
+
+ <li class="li">
+ Impala calls the underlying code during SQL statement evaluation, as many times as needed to process all
+ the rows from the result set. All UDFs are assumed to be deterministic, that is, to always return the same
+ result when passed the same argument values. Impala might or might not skip some invocations of a UDF if
+ the result value is already known from a previous call. Therefore, do not rely on the UDF being called a
+ specific number of times, and do not return different result values based on some external factor such as
+ the current time, a random number function, or an external data source that could be updated while an
+ Impala query is in progress.
+ </li>
+
+ <li class="li">
+ The names of the function arguments in the UDF are not significant, only their number, positions, and data
+ types.
+ </li>
+
+ <li class="li">
+ You can overload the same function name by creating multiple versions of the function, each with a
+ different argument signature. For security reasons, you cannot make a UDF with the same name as any
+ built-in function.
+ </li>
+
+ <li class="li">
+ In the UDF code, you represent the function return result as a <code class="ph codeph">struct</code>. This
+ <code class="ph codeph">struct</code> contains 2 fields. The first field is a <code class="ph codeph">boolean</code> representing
+ whether the value is <code class="ph codeph">NULL</code> or not. (When this field is <code class="ph codeph">true</code>, the return
+ value is interpreted as <code class="ph codeph">NULL</code>.) The second field is the same type as the specified function
+ return type, and holds the return value when the function returns something other than
+ <code class="ph codeph">NULL</code>.
+ </li>
+
+ <li class="li">
+ In the UDF code, you represent the function arguments as an initial pointer to a UDF context structure,
+ followed by references to zero or more <code class="ph codeph">struct</code>s, corresponding to each of the arguments.
+ Each <code class="ph codeph">struct</code> has the same 2 fields as with the return value, a <code class="ph codeph">boolean</code>
+ field representing whether the argument is <code class="ph codeph">NULL</code>, and a field of the appropriate type
+ holding any non-<code class="ph codeph">NULL</code> argument value.
+ </li>
+
+ <li class="li">
+ For sample code and build instructions for UDFs,
+ see <span class="xref">the sample UDFs in the Impala github repo</span>.
+ </li>
+
+ <li class="li">
+ Because the file representing the body of the UDF is stored in HDFS, it is automatically available to all
+ the Impala nodes. You do not need to manually copy any UDF-related files between servers.
+ </li>
+
+ <li class="li">
+ Because Impala currently does not have any <code class="ph codeph">ALTER FUNCTION</code> statement, if you need to rename
+ a function, move it to a different database, or change its signature or other properties, issue a
+ <code class="ph codeph">DROP FUNCTION</code> statement for the original function followed by a <code class="ph codeph">CREATE
+ FUNCTION</code> with the desired properties.
+ </li>
+
+ <li class="li">
+ Because each UDF is associated with a particular database, either issue a <code class="ph codeph">USE</code> statement
+ before doing any <code class="ph codeph">CREATE FUNCTION</code> statements, or specify the name of the function as
+ <code class="ph codeph"><var class="keyword varname">db_name</var>.<var class="keyword varname">function_name</var></code>.
+ </li>
+ </ul>
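+
+ <p class="p">
+ The return-value and argument conventions described in the list above can be sketched as follows. This is a
+ minimal illustration, assuming simplified stand-in definitions of <code class="ph codeph">FunctionContext</code>
+ and <code class="ph codeph">IntVal</code> written inline; the real definitions come from the Impala UDF
+ header and differ in detail. The function names <code class="ph codeph">AddInts</code> and
+ <code class="ph codeph">AddIntsExample</code> are hypothetical.
+ </p>
+

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the UDF context structure.
struct FunctionContext {};

// Stand-in value struct with the two fields described above: a boolean
// NULL flag, then a field of the SQL type holding any non-NULL value.
struct IntVal {
  bool is_null;
  std::int32_t val;
  IntVal() : is_null(true), val(0) {}
  explicit IntVal(std::int32_t v) : is_null(false), val(v) {}
};

// A scalar UDF shape: a context pointer first, then one reference per
// SQL argument. Returns NULL if either input is NULL.
IntVal AddInts(FunctionContext* /*ctx*/, const IntVal& a, const IntVal& b) {
  if (a.is_null || b.is_null) return IntVal();
  return IntVal(a.val + b.val);
}

// Small usage check for the sketch above.
void AddIntsExample() {
  FunctionContext ctx;
  IntVal sum = AddInts(&ctx, IntVal(2), IntVal(3));
  assert(!sum.is_null && sum.val == 5);
  IntVal none = AddInts(&ctx, IntVal(), IntVal(3));
  assert(none.is_null);
}
```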
+
+ <p class="p">
+ If you connect to different Impala nodes within an <span class="keyword cmdname">impala-shell</span> session for
+ load-balancing purposes, you can enable the <code class="ph codeph">SYNC_DDL</code> query option to make each DDL
+ statement wait before returning, until the new or changed metadata has been received by all the Impala
+ nodes. See <a class="xref" href="../shared/../topics/impala_sync_ddl.html#sync_ddl">SYNC_DDL Query Option</a> for details.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Compatibility:</strong>
+ </p>
+
+ <p class="p">
+ Impala can run UDFs that were created through Hive, as long as they refer to Impala-compatible data types
+ (not composite or nested column types). Hive can run Java-based UDFs that were created through Impala, but
+ not Impala UDFs written in C++.
+ </p>
+
+ <p class="p">
+ The Hive <code class="ph codeph">current_user()</code> function cannot be
+ called from a Java UDF through Impala.
+ </p>
+
+ <p class="p"><strong class="ph b">Persistence:</strong></p>
+
+ <p class="p">
+ In <span class="keyword">Impala 2.5</span> and higher, Impala UDFs and UDAs written in C++ are persisted in the metastore database.
+ Java UDFs are also persisted, if they were created with the new <code class="ph codeph">CREATE FUNCTION</code> syntax for Java UDFs,
+ where the Java function argument and return types are omitted.
+ Java-based UDFs created with the old <code class="ph codeph">CREATE FUNCTION</code> syntax do not persist across restarts
+ because they are held in the memory of the <span class="keyword cmdname">catalogd</span> daemon.
+ Until you re-create such Java UDFs using the new <code class="ph codeph">CREATE FUNCTION</code> syntax,
+ you must reload those Java-based UDFs by running the original <code class="ph codeph">CREATE FUNCTION</code> statements again each time
+ you restart the <span class="keyword cmdname">catalogd</span> daemon.
+ Prior to <span class="keyword">Impala 2.5</span>, the requirement to reload functions after a restart applied to both C++ and Java functions.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Cancellation:</strong> Cannot be cancelled.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">HDFS permissions:</strong> This statement does not touch any HDFS files or directories,
+ therefore no HDFS permissions are required.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Examples:</strong>
+ </p>
+
+ <p class="p">
+ For additional examples of all kinds of user-defined functions, see <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a>.
+ </p>
+
+ <p class="p">
+ The following example shows how to take a Java jar file and make all the functions inside one of its classes
+ into UDFs under a single (overloaded) function name in Impala. Each <code class="ph codeph">CREATE FUNCTION</code> or
+ <code class="ph codeph">DROP FUNCTION</code> statement applies to all the overloaded Java functions with the same name.
+ This example uses the signatureless syntax for <code class="ph codeph">CREATE FUNCTION</code> and <code class="ph codeph">DROP FUNCTION</code>,
+ which is available in <span class="keyword">Impala 2.5</span> and higher.
+ </p>
+ <p class="p">
+ At the start, the jar file is in the local filesystem. Then it is copied into HDFS, so that it is
+ available for Impala to reference through the <code class="ph codeph">CREATE FUNCTION</code> statement and
+ queries that refer to the Impala function name.
+ </p>
+<pre class="pre codeblock"><code>
+$ jar -tvf udf-examples.jar
+ 0 Mon Feb 22 04:06:50 PST 2016 META-INF/
+ 122 Mon Feb 22 04:06:48 PST 2016 META-INF/MANIFEST.MF
+ 0 Mon Feb 22 04:06:46 PST 2016 org/
+ 0 Mon Feb 22 04:06:46 PST 2016 org/apache/
+ 0 Mon Feb 22 04:06:46 PST 2016 org/apache/impala/
+ 2460 Mon Feb 22 04:06:46 PST 2016 org/apache/impala/IncompatibleUdfTest.class
+ 541 Mon Feb 22 04:06:46 PST 2016 org/apache/impala/TestUdfException.class
+ 3438 Mon Feb 22 04:06:46 PST 2016 org/apache/impala/JavaUdfTest.class
+ 5872 Mon Feb 22 04:06:46 PST 2016 org/apache/impala/TestUdf.class
+...
+$ hdfs dfs -put udf-examples.jar /user/impala/udfs
+$ hdfs dfs -ls /user/impala/udfs
+Found 2 items
+-rw-r--r-- 3 jrussell supergroup 853 2015-10-09 14:05 /user/impala/udfs/hello_world.jar
+-rw-r--r-- 3 jrussell supergroup 7366 2016-06-08 14:25 /user/impala/udfs/udf-examples.jar
+</code></pre>
+ <p class="p">
+ In <span class="keyword cmdname">impala-shell</span>, the <code class="ph codeph">CREATE FUNCTION</code> refers to the HDFS path of the jar file
+ and the fully qualified class name inside the jar. Each of the functions inside the class becomes an
+ Impala function, each one overloaded under the specified Impala function name.
+ </p>
+<pre class="pre codeblock"><code>
+[localhost:21000] > create function testudf location '/user/impala/udfs/udf-examples.jar' symbol='org.apache.impala.TestUdf';
+[localhost:21000] > show functions;
++-------------+---------------------------------------+-------------+---------------+
+| return type | signature | binary type | is persistent |
++-------------+---------------------------------------+-------------+---------------+
+| BIGINT | testudf(BIGINT) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN, BOOLEAN) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN, BOOLEAN, BOOLEAN) | JAVA | true |
+| DOUBLE | testudf(DOUBLE) | JAVA | true |
+| DOUBLE | testudf(DOUBLE, DOUBLE) | JAVA | true |
+| DOUBLE | testudf(DOUBLE, DOUBLE, DOUBLE) | JAVA | true |
+| FLOAT | testudf(FLOAT) | JAVA | true |
+| FLOAT | testudf(FLOAT, FLOAT) | JAVA | true |
+| FLOAT | testudf(FLOAT, FLOAT, FLOAT) | JAVA | true |
+| INT | testudf(INT) | JAVA | true |
+| DOUBLE | testudf(INT, DOUBLE) | JAVA | true |
+| INT | testudf(INT, INT) | JAVA | true |
+| INT | testudf(INT, INT, INT) | JAVA | true |
+| SMALLINT | testudf(SMALLINT) | JAVA | true |
+| SMALLINT | testudf(SMALLINT, SMALLINT) | JAVA | true |
+| SMALLINT | testudf(SMALLINT, SMALLINT, SMALLINT) | JAVA | true |
+| STRING | testudf(STRING) | JAVA | true |
+| STRING | testudf(STRING, STRING) | JAVA | true |
+| STRING | testudf(STRING, STRING, STRING) | JAVA | true |
+| TINYINT | testudf(TINYINT) | JAVA | true |
++-------------+---------------------------------------+-------------+---------------+
+</code></pre>
+ <p class="p">
+ These are all simple functions that either return their single argument or
+ combine their multiple arguments, for example by summing or concatenating them. Impala determines which
+ overloaded function to use based on the number and types of the arguments.
+ </p>
+<pre class="pre codeblock"><code>
+insert into bigint_x values (1), (2), (4), (3);
+select testudf(x) from bigint_x;
++-----------------+
+| udfs.testudf(x) |
++-----------------+
+| 1 |
+| 2 |
+| 4 |
+| 3 |
++-----------------+
+
+insert into int_x values (1), (2), (4), (3);
+select testudf(x, x+1, x*x) from int_x;
++-------------------------------+
+| udfs.testudf(x, x + 1, x * x) |
++-------------------------------+
+| 4 |
+| 9 |
+| 25 |
+| 16 |
++-------------------------------+
+
+select testudf(x) from string_x;
++-----------------+
+| udfs.testudf(x) |
++-----------------+
+| one |
+| two |
+| four |
+| three |
++-----------------+
+select testudf(x,x) from string_x;
++--------------------+
+| udfs.testudf(x, x) |
++--------------------+
+| oneone |
+| twotwo |
+| fourfour |
+| threethree |
++--------------------+
+</code></pre>
+
+ <p class="p">
+ The previous example used the same Impala function name as the name of the class.
+ This example shows how the Impala function name is independent of the underlying
+ Java class or function names. A second <code class="ph codeph">CREATE FUNCTION</code> statement
+ results in a set of overloaded functions all named <code class="ph codeph">my_func</code>,
+ to go along with the overloaded functions all named <code class="ph codeph">testudf</code>.
+ </p>
+<pre class="pre codeblock"><code>
+create function my_func location '/user/impala/udfs/udf-examples.jar'
+ symbol='org.apache.impala.TestUdf';
+
+show functions;
++-------------+---------------------------------------+-------------+---------------+
+| return type | signature | binary type | is persistent |
++-------------+---------------------------------------+-------------+---------------+
+| BIGINT | my_func(BIGINT) | JAVA | true |
+| BOOLEAN | my_func(BOOLEAN) | JAVA | true |
+| BOOLEAN | my_func(BOOLEAN, BOOLEAN) | JAVA | true |
+...
+| BIGINT | testudf(BIGINT) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN, BOOLEAN) | JAVA | true |
+...
+</code></pre>
+ <p class="p">
+ The corresponding <code class="ph codeph">DROP FUNCTION</code> statement with no signature
+ drops all the overloaded functions with that name.
+ </p>
+<pre class="pre codeblock"><code>
+drop function my_func;
+show functions;
++-------------+---------------------------------------+-------------+---------------+
+| return type | signature | binary type | is persistent |
++-------------+---------------------------------------+-------------+---------------+
+| BIGINT | testudf(BIGINT) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN) | JAVA | true |
+| BOOLEAN | testudf(BOOLEAN, BOOLEAN) | JAVA | true |
+...
+</code></pre>
+ <p class="p">
+ The signatureless <code class="ph codeph">CREATE FUNCTION</code> syntax for Java UDFs ensures that
+ the functions shown in this example remain available after the Impala service
+ (specifically, the Catalog Server) is restarted.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Related information:</strong>
+ </p>
+
+ <p class="p">
+ <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a> for more background information, usage instructions, and examples for
+ Impala UDFs; <a class="xref" href="impala_drop_function.html#drop_function">DROP FUNCTION Statement</a>
+ </p>
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref_sql.html">Impala SQL Statements</a></div></div></nav></article></main></body></html>
http://git-wip-us.apache.org/repos/asf/impala/blob/fae51ec2/docs/build3x/html/topics/impala_create_role.html
----------------------------------------------------------------------
diff --git a/docs/build3x/html/topics/impala_create_role.html b/docs/build3x/html/topics/impala_create_role.html
new file mode 100644
index 0000000..2930c3a
--- /dev/null
+++ b/docs/build3x/html/topics/impala_create_role.html
@@ -0,0 +1,70 @@
+<!DOCTYPE html
+ SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref_sql.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 3.0.x"><meta name="version" content="Impala 3.0.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="create_role"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>CREATE ROLE Statement (Impala 2.0 or higher only)</title></head><body id="create_role"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+
+ <h1 class="title topictitle1" id="ariaid-title1">CREATE ROLE Statement (<span class="keyword">Impala 2.0</span> or higher only)</h1>
+
+
+
+ <div class="body conbody">
+
+ <p class="p">
+
+
+ The <code class="ph codeph">CREATE ROLE</code> statement creates a role to which privileges can be granted.
+ Roles can then be assigned to users. A user that has been assigned a role can exercise only
+ the privileges of that role. Only users that have administrative privileges can create or drop
+ roles. By default, the <code class="ph codeph">hive</code>, <code class="ph codeph">impala</code>, and <code class="ph codeph">hue</code> users have
+ administrative privileges in Sentry.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Syntax:</strong>
+ </p>
+
+<pre class="pre codeblock"><code>CREATE ROLE <var class="keyword varname">role_name</var>
+</code></pre>
+
+ <p class="p">
+ <strong class="ph b">Required privileges:</strong>
+ </p>
+
+ <p class="p">
+ Only administrative users (those with <code class="ph codeph">ALL</code> privileges on the server, defined in the Sentry
+ policy file) can use this statement.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Compatibility:</strong>
+ </p>
+
+ <p class="p">
+ Impala makes use of any roles and privileges specified by the <code class="ph codeph">GRANT</code> and
+ <code class="ph codeph">REVOKE</code> statements in Hive, and Hive makes use of any roles and privileges specified by the
+ <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements in Impala. The Impala <code class="ph codeph">GRANT</code>
+ and <code class="ph codeph">REVOKE</code> statements for privileges do not require the <code class="ph codeph">ROLE</code> keyword to be
+ repeated before each role name, unlike the equivalent Hive statements.
+ </p>
+
+
+
+ <p class="p">
+ <strong class="ph b">Cancellation:</strong> Cannot be cancelled.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">HDFS permissions:</strong> This statement does not touch any HDFS files or directories,
+ therefore no HDFS permissions are required.
+ </p>
+
+ <p class="p">
+ <strong class="ph b">Related information:</strong>
+ </p>
+
+ <p class="p">
+ <a class="xref" href="impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a>, <a class="xref" href="impala_grant.html#grant">GRANT Statement (Impala 2.0 or higher only)</a>,
+ <a class="xref" href="impala_revoke.html#revoke">REVOKE Statement (Impala 2.0 or higher only)</a>, <a class="xref" href="impala_drop_role.html#drop_role">DROP ROLE Statement (Impala 2.0 or higher only)</a>,
+ <a class="xref" href="impala_show.html#show">SHOW Statement</a>
+ </p>
+ </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref_sql.html">Impala SQL Statements</a></div></div></nav></article></main></body></html>