You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/01/12 18:20:00 UTC

[jira] [Commented] (IMPALA-8022) Add cardinality checks to PlannerTest

    [ https://issues.apache.org/jira/browse/IMPALA-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741350#comment-16741350 ] 

ASF subversion and git services commented on IMPALA-8022:
---------------------------------------------------------

Commit a7ea86b768247ff5388174445e7c91736b99c2de in impala's branch refs/heads/master from paul-rogers
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a7ea86b ]

IMPALA-8021: Add estimated cardinality to EXPLAIN output

Cardinality is vital to understanding why a plan has the form it does,
yet the planner normally emits cardinality information only for the
detailed levels. Unfortunately, most query profiles we see are at the
standard level without this information (except in the summary table),
making it hard to understand what happened.

This patch adds cardinality to the standard EXPLAIN output. It also
changes the displayed cardinality value to be in abbreviated "metric"
form: 1.23K instead of 1234, etc.

Changing the DESCRIBE output has a huge impact on PlannerTest: all the
"golden" test files must change. To avoid doing this twice, this patch
also includes:

IMPALA-7919: Add predicates line in plan output for partition key
predicates

This is also the time to also include:

IMPALA-8022: Add cardinality checks to PlannerTest

The comparison code was changed to allow a set of validators, one of
which compares cardinality to ensure it is within 5% of the expected
value. This should ensure we don't change estimates unintentionally.

While many planner tests are concerned with cardinality, many others are
not. Testing showed that the cardinality is actually unstable within
tests. For such tests, added filters to ignore cardinality. The filter
is enabled by default (for backward compatibility) but disabled (to
allow cardinality verification) for the critical tests.

Rebasing the tests was complicated by a bug in the error-matching code,
so this patch also fixes:

IMPALA-8023: Fix PlannerTest to handle error lines consistently

Now, the error output written to the output "save results" file matches
that expected in the "golden" file -- no more handling these specially.

Testing:

* Added cardinality verification.
* Reran all FE tests.
* Rebased all PlannerTest .test files.
* Adjusted the metadata/test_explain.py test to handle the changed
  EXPLAIN output.

Change-Id: Ie9aa2d715b04cbb279aaffec8c5692686562d986
Reviewed-on: http://gerrit.cloudera.org:8080/12136
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add cardinality checks to PlannerTest
> -------------------------------------
>
>                 Key: IMPALA-8022
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8022
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> IMPALA-8021 suggests that we add cardinality to the standard EXPLAIN output. When we do so, the cardinality will appear in the plans generated by {{PlannerTest}}. We could just ignore the numbers, as we do for other derived values. Or, we could use this opportunity to verify the expected cardinalities.
> The expected cardinality is a critical number when planning queries: we decide join planning based on that number (sometimes multiplied by row width.) To prevent regressions, we want to verify that we've not changed the cardinality estimates except by design.
> Since cardinality estimates may vary a bit, do a range-based comparison, accepting as valid any value within 5% of the expected value. (Except for zero, if zero is expected, the actual estimate must also be zero.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org