Posted to issues@impala.apache.org by "Thomas Tauber-Marshall (JIRA)" <ji...@apache.org> on 2018/03/20 23:18:00 UTC

[jira] [Created] (IMPALA-6710) Docs around INSERT into partitioned tables are misleading

Thomas Tauber-Marshall created IMPALA-6710:
----------------------------------------------

             Summary: Docs around INSERT into partitioned tables are misleading
                 Key: IMPALA-6710
                 URL: https://issues.apache.org/jira/browse/IMPALA-6710
             Project: IMPALA
          Issue Type: Bug
          Components: Docs
    Affects Versions: Impala 2.12.0
            Reporter: Thomas Tauber-Marshall


Impala's INSERT statement has an optional "partition" clause where partition columns can be specified.

This clause must be used for static partitioning, i.e. where the partition value is specified after the column:
{noformat}
> insert into t1 partition(x=10, y='a') select c1 from some_other_table;
{noformat}

But it is not required for dynamic partitioning, e.g. the following inserts are equivalent:
{noformat}
> create table foo (c string) partitioned by (p int);
> insert into foo (p, c) values (0, 'c');
> insert into foo (c) partition(p) values ('c', 0);
> insert into foo partition(p) values ('c', 0);
{noformat}
Note that:
- values are assigned to columns in the order the columns appear in the SQL, hence the order of 'c' and 0 being flipped between the first two inserts
- when a partition clause is specified but the column list is omitted, as in the third insert, the non-partition columns are treated as though they had all been listed before the partition clause in the SQL
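
The two rules above can be sketched as a small Python model. This is only an illustration of the behavior described in this issue, not Impala's actual analyzer: the function name `resolve_insert_columns` and its parameters are hypothetical, and it covers only the cases discussed here.

```python
def resolve_insert_columns(data_cols, partition_cols,
                           column_list=None, partition_clause=None):
    """Return the target columns in the order the inserted values map to them.

    Hypothetical model of the rules described in this issue.
    data_cols:        non-partition columns of the table, in table order
    partition_cols:   partition columns of the table
    column_list:      explicit column list after the table name, or None
    partition_clause: list of (name, value) pairs; value None means the
                      partition column is dynamic (no constant given)
    """
    partition_clause = partition_clause or []
    if column_list is None:
        # No column list: every non-partition column is implied, and they
        # all come before anything named in the partition clause.
        column_list = list(data_cols)

    # Dynamic partition columns take their values from VALUES/SELECT;
    # static ones get their constant from the partition clause itself.
    dynamic = [name for name, value in partition_clause if value is None]
    static = {name for name, value in partition_clause if value is not None}

    targets = list(column_list) + dynamic

    # Every partition column has to be mentioned somewhere in the statement.
    mentioned = set(targets) | static
    missing = [p for p in partition_cols if p not in mentioned]
    if missing:
        raise ValueError("partition columns not mentioned: %s" % missing)
    return targets


# insert into foo (p, c) values (0, 'c');
print(resolve_insert_columns(["c"], ["p"], column_list=["p", "c"]))
# insert into foo (c) partition(p) values ('c', 0);
print(resolve_insert_columns(["c"], ["p"], column_list=["c"],
                             partition_clause=[("p", None)]))
# insert into foo partition(p) values ('c', 0);
print(resolve_insert_columns(["c"], ["p"], partition_clause=[("p", None)]))
```

In this model the first call yields the order p, c and the other two yield c, p, which matches the flipped value order in the inserts above. Calling it with neither a column list nor a partition clause raises an error for a partitioned table.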

Confusingly, though, the partition columns must still be mentioned in the query in some form, e.g.:
{noformat}
> insert into foo values ('c', 1);
{noformat}
would be valid for a non-partitioned table, so long as the number and types of its columns match the values clause, but can never be valid for a partitioned table.

The docs around this are not very clear:
http://impala.apache.org/docs/build/html/topics/impala_insert.html
and seem to indicate that partition columns must be specified in the "partition" clause, e.g. the sentence:
{noformat}
Inserting data into partitioned tables requires slightly different syntax that divides the partitioning columns from the others: 
{noformat}
and the examples that follow it.


