Posted to issues@impala.apache.org by "Matthew Jacobs (JIRA)" <ji...@apache.org> on 2017/05/22 19:20:04 UTC
[jira] [Resolved] (IMPALA-5343) Sort by Column(s) added as part of
inserting into Kudu table is incorrect
[ https://issues.apache.org/jira/browse/IMPALA-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthew Jacobs resolved IMPALA-5343.
------------------------------------
Resolution: Not A Problem
Fix Version/s: Impala 2.9.0
The plan and sort are correct. The "KuduPartition" expr is there because multiple partitions can end up at a given sink fragment, and we want the rows inserted into Kudu to be grouped per partition and then ordered by PK within each partition.
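To illustrate the ordering described above, here is a minimal Python sketch, not Impala or Kudu code. The kudu_partition() function, the partition count of 4, and the sample rows are all assumptions made for illustration; Kudu's real hash partitioning differs, and the table in this issue uses 140 partitions. It compares sorting by (partition, PK) with sorting by PK alone.

```python
from itertools import groupby

NUM_PARTITIONS = 4  # hypothetical; the table in this issue uses 140


def kudu_partition(l_orderkey: int) -> int:
    """Stand-in for KuduPartition(l_orderkey); NOT Kudu's real hash."""
    return l_orderkey % NUM_PARTITIONS


# Made-up rows reaching one sink fragment, as
# (l_shipdate, l_orderkey, l_linenumber), i.e. in primary key column order.
rows = [
    ("1995-03-02", 7, 1),
    ("1994-01-15", 6, 2),
    ("1994-01-15", 3, 1),
    ("1993-07-30", 2, 1),
    ("1996-11-11", 6, 1),
]


def sink_sort_key(row):
    """Partition first, then the primary key columns, as in the 02:SORT node."""
    shipdate, orderkey, linenumber = row
    return (kudu_partition(orderkey), shipdate, orderkey, linenumber)


def partition_of(row):
    return kudu_partition(row[1])


# Order used by the plan: rows grouped per partition, PK-ordered inside each.
sorted_rows = sorted(rows, key=sink_sort_key)
partition_runs = [p for p, _ in groupby(sorted_rows, key=partition_of)]

# Order the issue asked for: PK only, which interleaves the partitions.
pk_only_runs = [p for p, _ in groupby(sorted(rows), key=partition_of)]

print(partition_runs)  # [2, 3]
print(pk_only_runs)    # [2, 3, 2, 3, 2]
```

With the partition expr leading the sort key, each partition's rows form a single contiguous run. Sorting on the PK alone makes the sink alternate between partitions, because l_shipdate leads the PK while the table is hash partitioned on l_orderkey.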
> Sort by Column(s) added as part of inserting into Kudu table is incorrect
> --------------------------------------------------------------------------
>
> Key: IMPALA-5343
> URL: https://issues.apache.org/jira/browse/IMPALA-5343
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Mostafa Mokhtar
> Assignee: Thomas Tauber-Marshall
> Priority: Critical
> Labels: kudu
> Fix For: Impala 2.9.0
>
>
> The planner includes KuduPartition(PARTITION_COLUMN) among the columns in the sort's order by clause. The sort should match the columns of the primary key.
> Plan
> {code}
> Query: explain insert into lineitem_kudu_ts select * from lineitem_kudu
> | INSERT INTO KUDU [scan_primitives_tpch_3tb.lineitem_kudu_ts] |
> | | |
> | 02:SORT |
> | | order by: KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey) ASC NULLS LAST, l_shipdate ASC NULLS LAST, l_orderkey ASC NULLS LAST, l_linenumber ASC NULLS LAST |
> | | |
> | 01:EXCHANGE [KUDU(KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey))] |
> | | |
> | 00:SCAN KUDU [scan_primitives_tpch_3tb.lineitem_kudu] |
> {code}
> DDL
> {code}
> [vd1302.halxg.cloudera.com:21000] > show create table scan_primitives_tpch_3tb.lineitem_kudu_ts;
> Query: show create table scan_primitives_tpch_3tb.lineitem_kudu_ts
> CREATE TABLE scan_primitives_tpch_3tb.lineitem_kudu_ts (
> l_shipdate STRING NOT NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_orderkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_linenumber BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_partkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_suppkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_quantity DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_extendedprice DOUBLE NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,
> l_discount DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_tax DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_returnflag STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_linestatus STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_commitdate TIMESTAMP NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
> l_receiptdate STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_shipinstruct STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_shipmode STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
> l_comment STRING NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,
> PRIMARY KEY (l_shipdate, l_orderkey, l_linenumber)
> )
> PARTITION BY HASH (l_orderkey) PARTITIONS 140
> STORED AS KUDU
> TBLPROPERTIES ('kudu.master_addresses'='vd1301.halxg.cloudera.com:7051,vd1128.halxg.cloudera.com:7051')
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)