Posted to issues@impala.apache.org by "Lars Volker (JIRA)" <ji...@apache.org> on 2017/05/26 16:25:04 UTC

[jira] [Resolved] (IMPALA-4163) Introduce SORTBY plan hint for insert statements

     [ https://issues.apache.org/jira/browse/IMPALA-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Volker resolved IMPALA-4163.
---------------------------------
    Resolution: Won't Fix

> Introduce SORTBY plan hint for insert statements
> ------------------------------------------------
>
>                 Key: IMPALA-4163
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4163
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>    Affects Versions: Impala 2.2, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0
>            Reporter: Alexander Behm
>            Assignee: Lars Volker
>              Labels: performance, ramp-up
>
> To improve compression and/or the effectiveness of min/max pruning, it is desirable to control the order in which rows are inserted into a table (mostly for Parquet).
> To that end, we should introduce a "sortby" plan hint for insert statements. Example:
> {code}
> CREATE TABLE dst (...);
> INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src;
> {code}
> This would produce the following plan:
> SCAN -> SORT(day,hour) -> TABLE SINK
> h4. Syntax and behavior
> {code} INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src; {code}
> - We will not support the legacy hint style with brackets, i.e. {code}[sortby(day,hour)]{code}
> - To keep the "clustered" hint strictly separate from the "sortby" hint, it is only legal to use non-partition columns in "sortby" for HDFS tables.
> - Similarly, it is only legal to mention non-primary-key columns of Kudu tables.
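
To illustrate why the description expects sorted inserts to help min/max pruning, here is a minimal Python sketch. It is not Impala code: the helper names (row_groups, groups_to_scan), the group size of 100 rows, and the synthetic (day, hour) data are all assumptions made for illustration. It models a writer that cuts rows into fixed-size groups and keeps min/max values per group, then counts how many groups a filter on a single day would still have to read.

```python
import random


def row_groups(rows, group_size):
    """Split rows into consecutive fixed-size groups, as a file writer might."""
    return [rows[i:i + group_size] for i in range(0, len(rows), group_size)]


def groups_to_scan(groups, day):
    """Count groups whose [min, max] range on the day column could hold `day`."""
    count = 0
    for group in groups:
        days = [row[0] for row in group]
        if min(days) <= day <= max(days):
            count += 1
    return count


# Hypothetical data: 3000 rows of (day, hour) in arrival order.
random.seed(0)
rows = [(random.randrange(30), random.randrange(24)) for _ in range(3000)]

# Without a sort, every group tends to span the whole range of days.
unsorted_groups = row_groups(rows, 100)
# With a sort on (day, hour), each day is confined to a few adjacent groups.
sorted_groups = row_groups(sorted(rows), 100)

unsorted_hits = groups_to_scan(unsorted_groups, 7)
sorted_hits = groups_to_scan(sorted_groups, 7)
print(f"groups to scan for day = 7: unsorted={unsorted_hits}, sorted={sorted_hits}")
```

In this model the sorted layout confines each day to a short run of adjacent groups, so most groups can be skipped from their min/max values alone. That is the effect the SCAN -> SORT(day,hour) -> TABLE SINK plan is meant to produce.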



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)