You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2018/01/26 00:47:00 UTC
[jira] [Commented] (PHOENIX-4560) ORDER BY with GROUP BY doesn't
work if there is WHERE on pk column
[ https://issues.apache.org/jira/browse/PHOENIX-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340363#comment-16340363 ]
James Taylor commented on PHOENIX-4560:
---------------------------------------
Yep, your analysis is spot on, [~sergey.soldatov]. Nice test case. Here's a patch. We can't assume that there's an equality constraint unless there's only a single value being matched against.
> ORDER BY with GROUP BY doesn't work if there is WHERE on pk column
> ------------------------------------------------------------------
>
> Key: PHOENIX-4560
> URL: https://issues.apache.org/jira/browse/PHOENIX-4560
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.0
> Reporter: Sergey Soldatov
> Assignee: James Taylor
> Priority: Major
> Fix For: 4.14.0
>
> Attachments: PHOENIX-4560_v1.patch
>
>
> It's related to the optimizations for group by that were made in PHOENIX-3451. Test case to reproduce:
> {noformat}
> CREATE TABLE IF NOT EXISTS VA_TEST(ID VARCHAR NOT NULL PRIMARY KEY, VAL1 VARCHAR, VAL2 INTEGER);
> UPSERT INTO VA_TEST VALUES('ABC','aa123', 11);
> UPSERT INTO VA_TEST VALUES('ABD','ba124', 1);
> UPSERT INTO VA_TEST VALUES('ABE','cf125', 13);
> UPSERT INTO VA_TEST VALUES('ABF','dan126', 4);
> UPSERT INTO VA_TEST VALUES('ABG','elf127', 15);
> UPSERT INTO VA_TEST VALUES('ABH','fan128', 6);
> UPSERT INTO VA_TEST VALUES('AAA','get211', 100);
> UPSERT INTO VA_TEST VALUES('AAB','hat212', 7);
> UPSERT INTO VA_TEST VALUES('AAC','aap12', 2);
> UPSERT INTO VA_TEST VALUES('AAD','ball12', 3);
> UPSERT INTO VA_TEST VALUES('AAE','inn2110', 13);
> UPSERT INTO VA_TEST VALUES('AAF','key2112', 40);
> select distinct ID, VAL1, VAL2 from VA_TEST where "ID" in ('ABC','ABD','ABE','ABF','ABG','ABH','AAA', 'AAB', 'AAC','AAD','AAE','AAF') order by VAL1 ASC;
> {noformat}
> Execution result :
> {noformat}
> +------+----------+-------+
> | ID | VAL1 | VAL2 |
> +------+----------+-------+
> | AAA | get211 | 100 |
> | AAB | hat212 | 7 |
> | AAC | aap12 | 2 |
> | AAD | ball12 | 3 |
> | AAE | inn2110 | 13 |
> | AAF | key2112 | 40 |
> | ABC | aa123 | 11 |
> | ABD | ba124 | 1 |
> | ABE | cf125 | 13 |
> | ABF | dan126 | 4 |
> | ABG | elf127 | 15 |
> | ABH | fan128 | 6 |
> +------+----------+-------+
> {noformat}
> Explain plan:
> {noformat}
> +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 1-CHUNK 12 ROWS 1908 BYTES PARALLEL 1-WAY POINT LOOKUP ON 12 KEYS OVER VA_TEST | 1908 | 12 | 1516917709099 |
> | SERVER AGGREGATE INTO DISTINCT ROWS BY ["ID", "VAL1", "VAL2"] | 1908 | 12 | 1516917709099 |
> | CLIENT MERGE SORT | 1908 | 12 | 1516917709099 |
> +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
> {noformat}
> As we can see there is no client side sort, so it keeps the original order we get from server which is wrong.
> That happens because of OrderPreservingTracker.hasEqualityConstraints decides that WHERE clause on ID column is constant and let us perform the optimization and skip the client side sort.
> [~jamestaylor], [~chenglei] any thoughts?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)