Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/04 10:41:00 UTC

[jira] [Commented] (IMPALA-7942) Add query hints for cardinalities and selectivities

    [ https://issues.apache.org/jira/browse/IMPALA-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654386#comment-17654386 ] 

ASF subversion and git services commented on IMPALA-7942:
---------------------------------------------------------

Commit b296567a32c8f678549fe7e40ea87d7669f81a9e in impala's branch refs/heads/master from skyyws
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b296567a3 ]

IMPALA-7942 (part 1): Add query hints for table cardinalities

Currently, we run the 'COMPUTE STATS' command to compute table stats,
which are very useful for query planning. Without these stats, a
query plan may not be optimal. However, these stats may not be
available, up to date, or valid. To work around this problem,
this patch adds a new query hint: 'TABLE_NUM_ROWS'. We can use
this new hint after an HDFS or Kudu table in a query like this:

  * select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala will use this value as the table's scanned row count
when the table has no stats or has corrupt stats. The hint value is
ignored if the table stats are valid.

Testing:
- Added new fe test in 'PlannerTest'
- Added new fe test in 'AnalyzeStmtsTest' for negative cases

Change-Id: I9f0c773f4e67782a1428db64062f68afbd257af7
Reviewed-on: http://gerrit.cloudera.org:8080/18829
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
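
The fallback rule described in the commit message above can be sketched
roughly as follows. This is an illustration only, not Impala's frontend
code: the function name effective_num_rows is hypothetical, and modeling
"no stats" as None and "corrupt stats" as a negative count is an
assumption made for the sketch.

```python
from typing import Optional


def effective_num_rows(stats_num_rows: Optional[int],
                       hint_num_rows: Optional[int]) -> Optional[int]:
    """Pick the row count a planner would use for a table scan.

    Assumptions (not taken from the Impala source):
      - stats_num_rows is None when the table has no stats
      - a negative stats_num_rows stands in for corrupt stats
      - hint_num_rows is None when no TABLE_NUM_ROWS hint was given
    """
    stats_usable = stats_num_rows is not None and stats_num_rows >= 0
    if stats_usable:
        # Valid table stats take precedence; the hint is ignored.
        return stats_num_rows
    # Missing or corrupt stats: fall back to the hint, which may itself
    # be None if the query did not supply one.
    return hint_num_rows
```

With this sketch, effective_num_rows(None, 1000) yields the hinted 1000,
while effective_num_rows(5000, 1000) keeps the 5000 from the stats.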


> Add query hints for cardinalities and selectivities
> ---------------------------------------------------
>
>                 Key: IMPALA-7942
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7942
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>    Affects Versions: Impala 3.2.0
>            Reporter: Lars Volker
>            Assignee: Sheng Wang
>            Priority: Major
>
> The optimizer can pick suboptimal plans when tables don't have statistics. To allow users to help the optimizer, we should support query hints to specify cardinalities of scans, predicates (and possibly joins).
> This could look like the following example.
> {code:sql}
> select x from medium /*+ num_rows(1000000000) */
>   join small /*+ num_rows(1000000) */
>   join (select * from big /*+ num_rows(1000000000) */
>         where c1 < 10 /*+ selectivity(0.00001) */) as big
>   where medium.id = small.id and small.id = big.id;
> {code}
> Instead of cardinalities we could also support specifying the number of rows that pass a predicate (or join).
> We should not rely on the specified cardinalities to be accurate, e.g. the following should still execute a scan:
> {code:sql}
> select count(*) from T /*+ num_rows(100) */
>   where id < 100 /*+ selectivity(0.1) */;
> {code}
> This is a first step towards giving users more control over the planner / optimizer.
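
To make the arithmetic behind the proposed num_rows and selectivity hints
concrete, here is a minimal sketch. It assumes the estimated output of a
filtered scan is simply the hinted row count multiplied by the hinted
selectivity; the issue text does not spell out a formula, and the
function name estimated_output_rows is hypothetical.

```python
def estimated_output_rows(num_rows: int, selectivity: float) -> int:
    """Estimate rows passing a predicate from hinted values.

    Assumption: estimate = num_rows * selectivity, rounded to the
    nearest integer. This is an estimate for planning only; it does
    not let a planner skip the scan itself.
    """
    if num_rows < 0:
        raise ValueError("num_rows must be non-negative")
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be within [0, 1]")
    return round(num_rows * selectivity)


# Values taken from the examples in the issue description.
big_estimate = estimated_output_rows(1_000_000_000, 0.00001)
t_estimate = estimated_output_rows(100, 0.1)
```

Under this assumption the 'big' subquery in the first example is
estimated at 10000 rows, and the second example at 10 rows. Because the
hinted values may be inaccurate, the second query still has to scan T.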


