Posted to dev@phoenix.apache.org by "tu nguyen khac (JIRA)" <ji...@apache.org> on 2016/08/01 11:00:25 UTC

[jira] [Created] (PHOENIX-3131) improve "order by " performance with aggregated query

tu nguyen khac created PHOENIX-3131:
---------------------------------------

             Summary: improve "order by " performance with aggregated query 
                 Key: PHOENIX-3131
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3131
             Project: Phoenix
          Issue Type: Improvement
    Affects Versions: 4.8.0
            Reporter: tu nguyen khac
            Priority: Critical


I created a table in Phoenix on a 4-node cluster (8 GB RAM and 4 cores per node) with this statement:

CREATE TABLE pageview_site (
    url varchar(255) not null,
    pageview bigint,
    dt date not null,
    CONSTRAINT PK PRIMARY KEY (url, dt ROW_TIMESTAMP)
) SALT_BUCKETS = 4;

After that:
1. I upserted about 13 million rows into this table.
2. I ran two queries:

    a. select url,sum(pageview) as pv FROM pageview_site where dt > to_date ('2016-06-01') group by url limit 100 offset 2;
This query takes about 0.5 seconds.

    b. select url,sum(pageview) as pv FROM pageview_site where dt > to_date ('2016-06-01') group by url order by pv desc limit 100 offset 2;

This query takes about 9.5 seconds.

What happens in the second query? I think the performance of "order by" on aggregated queries should be improved.
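
One way to investigate the difference is to compare the two query plans with Phoenix's EXPLAIN statement. The sketch below reuses the two queries from this report, with the apparent typos in query b corrected (an assumption). No plan output is shown because none was captured for this report. A plausible cause, not confirmed here, is that ordering by the aggregate pv forces every group to be aggregated and sorted before the limit applies, whereas query a may be able to stop once enough groups have been returned.

```sql
-- Plan for query a: GROUP BY on the leading primary key column, no ORDER BY.
EXPLAIN
SELECT url, SUM(pageview) AS pv
FROM pageview_site
WHERE dt > TO_DATE('2016-06-01')
GROUP BY url
LIMIT 100 OFFSET 2;

-- Plan for query b: same aggregation, but ordered by the aggregated value.
-- Look for an additional sort / top-N step compared with the plan above.
EXPLAIN
SELECT url, SUM(pageview) AS pv
FROM pageview_site
WHERE dt > TO_DATE('2016-06-01')
GROUP BY url
ORDER BY pv DESC
LIMIT 100 OFFSET 2;
```

Any step that appears only in the second plan is a reasonable first place to look for the extra 9 seconds.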

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)