Posted to issues-all@impala.apache.org by "Aman Sinha (Jira)" <ji...@apache.org> on 2021/03/30 00:07:00 UTC
[jira] [Resolved] (IMPALA-10116) Builtin cast function's selectivity is different from that of explicit cast
[ https://issues.apache.org/jira/browse/IMPALA-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Sinha resolved IMPALA-10116.
---------------------------------
Fix Version/s: Impala 4.0
Resolution: Fixed
> Builtin cast function's selectivity is different from that of explicit cast
> ---------------------------------------------------------------------------
>
> Key: IMPALA-10116
> URL: https://issues.apache.org/jira/browse/IMPALA-10116
> Project: IMPALA
> Issue Type: Sub-task
> Components: Frontend
> Affects Versions: Impala 3.4.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Priority: Major
> Fix For: Impala 4.0
>
>
> Query 1 below uses 'casttobigint()' in the IS NOT NULL predicate, and the predicate's selectivity is computed as the default 10% of the input rows, resulting in cardinality = 7.30K. The equivalent predicate in Query 2, written with an explicit 'CAST' expr, computes the correct cardinality of 73.05K.
> Query 1:
> {noformat}
> Query: explain select * from date_dim d1, date_dim d2 where d1.d_week_seq = d2.d_week_seq - 52 and casttobigint(d1.d_week_seq) is not null and casttobigint(d2.d_week_seq) is not null
> |
> | 00:SCAN HDFS [tpcds.date_dim d1] |
> | HDFS partitions=1/1 files=1 size=9.84MB |
> | predicates: casttobigint(d1.d_week_seq) IS NOT NULL |
> | runtime filters: RF000 -> d1.d_week_seq |
> | row-size=255B cardinality=7.30K |
> +-------------------------------------------------------------+
> {noformat}
> Query 2:
> {noformat}
> Query: explain select * from date_dim d1, date_dim d2 where d1.d_week_seq = d2.d_week_seq - 52 and cast(d1.d_week_seq as bigint) is not null and cast(d2.d_week_seq as bigint) is not null
> | 00:SCAN HDFS [tpcds.date_dim d1] |
> | HDFS partitions=1/1 files=1 size=9.84MB |
> | predicates: CAST(d1.d_week_seq AS BIGINT) IS NOT NULL |
> | runtime filters: RF000 -> d1.d_week_seq |
> | row-size=255B cardinality=73.05K |
> +-------------------------------------------------------------+
> {noformat}
> Query 1 should ideally produce the same cardinality as Query 2. Note that to run Query 1 I had to comment out the following lines in FunctionCallExpr.java, because a user query is not supposed to call the builtin cast function directly. However, for an external frontend module that calls functions in impala-frontend.jar, this is supported, and we should make the behavior consistent.
> {noformat}
> +// if (isBuiltinCastFunction()) {
> +// throw new AnalysisException(toSql() +
> +// " is reserved for internal use only. Use 'cast(expr AS type)' instead.");
> +// }
> {noformat}
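The gap between the two estimates above can be reproduced with simple arithmetic. The sketch below is only an illustration: the class and method names are hypothetical and are not Impala's planner code, the date_dim row count of 73,049 is assumed from the 73.05K shown in the plan, and the column is assumed to contain no NULLs. It compares the 10% default selectivity with an IS NOT NULL selectivity derived from a null count.

```java
// Illustrative only: hypothetical names, not Impala's actual planner classes.
final class SelectivitySketch {
    // Default selectivity used when a predicate cannot be estimated
    // (10%, as described in the report).
    static final double DEFAULT_SELECTIVITY = 0.1;

    // Selectivity of "col IS NOT NULL" derived from column statistics.
    // Falling back to the default when stats are unknown (negative null
    // count or non-positive row count) is an assumption of this sketch.
    static double isNotNullSelectivity(long numRows, long numNulls) {
        if (numRows <= 0 || numNulls < 0) {
            return DEFAULT_SELECTIVITY;
        }
        return 1.0 - (double) numNulls / numRows;
    }

    // Estimated output rows for a scan with a single predicate.
    static long estimateCardinality(long inputRows, double selectivity) {
        return Math.round(inputRows * selectivity);
    }

    public static void main(String[] args) {
        long dateDimRows = 73_049L; // assumed row count, consistent with 73.05K
        long numNulls = 0L;         // assumed: d_week_seq has no NULLs

        // Query 1 behavior: the predicate is not recognized, so the default applies.
        long withDefault = estimateCardinality(dateDimRows, DEFAULT_SELECTIVITY);

        // Query 2 behavior: the predicate is estimated from the null count.
        long withStats = estimateCardinality(
                dateDimRows, isNotNullSelectivity(dateDimRows, numNulls));

        System.out.println("default selectivity: " + withDefault);
        System.out.println("stats-based selectivity: " + withStats);
    }
}
```

Under these assumptions the first estimate is roughly 7.3K rows and the second is the full 73.05K, matching the two plans in the report.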