You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Seung-Hwan Lim (JIRA)" <ji...@apache.org> on 2018/04/17 15:36:00 UTC
[jira] [Comment Edited] (CALCITE-2168) Implement a General Purpose
Benchmark for Calcite
[ https://issues.apache.org/jira/browse/CALCITE-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441031#comment-16441031 ]
Seung-Hwan Lim edited comment on CALCITE-2168 at 4/17/18 3:35 PM:
------------------------------------------------------------------
While writing TPC-DS queries for Calcite with Postgres backends, I have found couple of issues.
1. date time interval compatibility: postgres' dialect is
``` (cast('2000-08-20' as date) + interval '30 days') ```.
For Calcite with postgres backend , when I tried following:
(cast('2000-08-20' as date) + interval '30' day )
I have UnsupportedOperation Exception:
Caused by: java.lang.UnsupportedOperationException: class org.apache.calcite.sql.SqlSyntax$6: SPECIAL
2. nested aggregation with windows function.
in TPC-DS query 98, we have following troublesome phrase:
```sum(ss."ss_ext_sales_price")*100/sum(sum(ss."ss_ext_sales_price")) over
(partition by i."i_class") as REVENUERATIO```
Which generates:
SUM("t"."ss_ext_sales_price") * 100 / CASE WHEN (COUNT(SUM("t"."ss_ext_sales_price")) OVER (PARTITION BY "t1"."i_class" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) > 0 THEN CAST($SUM0(SUM("t"."ss_ext_sales_price")) OVER (PARTITION BY "t1"."i_class" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS DECIMAL(7, 2)) ELSE NULL END AS "REVENUERATIO"
It causes syntax error in CAST($SUM0(SUM())) part in postgresql.
I'm testing TPC-DS with the version of 1.16.
Thank you,
was (Author: lims1):
While writing TPC-DS queries for Calcite with Postgres backends, I have found couple of issues.
1. date time interval compatibility: postgres' dialect is
``` (cast('2000-08-20' as date) + interval '30 days') ```.
For Calcite with postgres backend , I tried following:
(cast('2000-08-20' as date) + interval '30' day )
I have UnsupportedOperation Exception:
Caused by: java.lang.UnsupportedOperationException: class org.apache.calcite.sql.SqlSyntax$6: SPECIAL
2. nested aggregation with windows function.
in TPC-DS query 98, we have following troublesome phrase:
```sum(ss."ss_ext_sales_price")*100/sum(sum(ss."ss_ext_sales_price")) over
(partition by i."i_class") as REVENUERATIO```
Which generates:
SUM("t"."ss_ext_sales_price") * 100 / CASE WHEN (COUNT(SUM("t"."ss_ext_sales_price")) OVER (PARTITION BY "t1"."i_class" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)) > 0 THEN CAST($SUM0(SUM("t"."ss_ext_sales_price")) OVER (PARTITION BY "t1"."i_class" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS DECIMAL(7, 2)) ELSE NULL END AS "REVENUERATIO"
It causes syntax error in CAST($SUM0(SUM())) part in postgresql.
I'm testing TPC-DS with the version of 1.16.
Thank you,
> Implement a General Purpose Benchmark for Calcite
> --------------------------------------------------
>
> Key: CALCITE-2168
> URL: https://issues.apache.org/jira/browse/CALCITE-2168
> Project: Calcite
> Issue Type: Wish
> Components: core
> Reporter: Edmon Begoli
> Assignee: Edmon Begoli
> Priority: Minor
> Labels: performance
> Original Estimate: 2,688h
> Remaining Estimate: 2,688h
>
> Develop a benchmark that can be used for general purpose benchamrking of Calcite against other frameworks, and databases, and for study,research, and profiling of the framwork.
> Use popular benchmarks such as TCP-DS (or -H) or Star Schema Benchmark (SSB) and measure the performance of optimized vs. unoptimized Calcite queries, and the overhead of going through Calcite adapters vs. natively accessing the target DB
> Look into the existing approaches and do perhaps something similar:
> * https://www.slideshare.net/julianhyde/w-435phyde-3
> * https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html
> * (How much of this is still relevant (Hive 0.14)? Can we use queries/benchmarks?)
> https://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)