You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/06/12 17:59:00 UTC

[jira] [Created] (IMPALA-9853) Push rank() predicates into sort

Tim Armstrong created IMPALA-9853:
-------------------------------------

             Summary: Push rank() predicates into sort
                 Key: IMPALA-9853
                 URL: https://issues.apache.org/jira/browse/IMPALA-9853
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Tim Armstrong
            Assignee: Tim Armstrong


TPC-DS Q67 would benefit significantly if we could push the rank() predicate into the sort to do some reduction of unneeded data. The sorter could evaluate this predicate if it had the partition expressions available - as a post-processing step to the in-memory sort for the analytic sort group, it could do a pass over the sorted run, resetting a counter at the start of each partition boundary.

It might be best to start with tackling IMPALA-3471 by applying the limit within sorted runs, since that doesn't require any planner work.

{noformat}
with results as
(     select i_category ,i_class ,i_brand ,i_product_name ,d_year ,d_qoy ,d_moy ,s_store_id
                  ,sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales
            from store_sales ,date_dim ,store ,item
       where  ss_sold_date_sk=d_date_sk
          and ss_item_sk=i_item_sk
          and ss_store_sk = s_store_sk
          and d_month_seq between 1212 and 1212 + 11
       group by i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy,s_store_id)
 ,
 results_rollup as
 (select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id, sumsales
  from results
  union all
  select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy
  union all
  select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class, i_brand, i_product_name, d_year, d_qoy
  union all
  select i_category, i_class, i_brand, i_product_name, d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class, i_brand, i_product_name, d_year
  union all
  select i_category, i_class, i_brand, i_product_name, null d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class, i_brand, i_product_name
  union all
  select i_category, i_class, i_brand, null i_product_name, null d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class, i_brand
  union all
  select i_category, i_class, null i_brand, null i_product_name, null d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category, i_class
  union all
  select i_category, null i_class, null i_brand, null i_product_name, null d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results
  group by i_category
  union all
  select null i_category, null i_class, null i_brand, null i_product_name, null d_year, null d_qoy, null d_moy, null s_store_id, sum(sumsales) sumsales
  from results)

 select  *
from (select i_category
            ,i_class
            ,i_brand
            ,i_product_name
            ,d_year
            ,d_qoy
            ,d_moy
            ,s_store_id
            ,sumsales
            ,rank() over (partition by i_category order by sumsales desc) rk
      from results_rollup) dw2
where rk <= 100
order by i_category
        ,i_class
        ,i_brand
        ,i_product_name
        ,d_year
        ,d_qoy
        ,d_moy
        ,s_store_id
        ,sumsales
        ,rk
limit 100
{noformat}

Assigning to myself to fill in more details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)