Posted to commits@impala.apache.org by jb...@apache.org on 2021/05/16 14:23:46 UTC
[impala] branch branch-4.0.0 updated: IMPALA-10696: fix accuracy problem
This is an automated email from the ASF dual-hosted git repository.
jbapple pushed a commit to branch branch-4.0.0
in repository https://gitbox.apache.org/repos/asf/impala.git
The following commit(s) were added to refs/heads/branch-4.0.0 by this push:
new bc94f3a IMPALA-10696: fix accuracy problem
bc94f3a is described below
commit bc94f3ad57837ee31e7bde528d1edad944d56940
Author: liuyao <54...@163.com>
AuthorDate: Sat May 8 11:23:53 2021 +0800
IMPALA-10696: fix accuracy problem
Table alltypes has no statistics, so its cardinality is estimated
from the HDFS files and the average row size.
PrintUtils.printMetric divides a double by a long, which can cause
precision problems. In most cases the calculated number of rows is
printed as 17.91K, but because of this precision problem the printed
value is sometimes 17.90K.
I modified line 221 of stats-extrapolation.test to match with
row_regex, following the way cardinality is already matched on line
224; in this case the two values are the same.
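
The effect of the relaxed match can be sketched as follows. This is a
minimal illustration rather than the test framework's own code: the
pattern is copied from the updated test line, while the helper name
row_matches, the use of re.fullmatch, and the sample rows (including
their leading whitespace) are assumptions made for the example.

```python
import re

# Pattern copied from the updated test line, without the "row_regex:" prefix.
PATTERN = r".* partitions: 0/24 rows=17\.9.*K"


def row_matches(row: str) -> bool:
    """Return True if the whole row matches PATTERN.

    Assumption: the pattern is applied to the full row; the real test
    framework may apply it differently.
    """
    return re.fullmatch(PATTERN, row) is not None


# Hypothetical plan rows; only the rows= value differs between them.
candidates = [
    "  partitions: 0/24 rows=17.91K",
    "  partitions: 0/24 rows=17.90K",
    "  partitions: 0/24 rows=18.01K",
]

for row in candidates:
    print(repr(row), row_matches(row))
```

Under these assumptions both 17.91K and 17.90K are accepted, while a
value outside the 17.9x range is still rejected.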
Testing:
metadata/test_stats_extrapolation.py
Change-Id: I0a1a3809508c90217517705b2b188b2ccba6f23f
Reviewed-on: http://gerrit.cloudera.org:8080/17411
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Jim Apple <jb...@apache.org>
---
.../functional-query/queries/QueryTest/stats-extrapolation.test | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test b/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
index 1e05208..fc540a5 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
@@ -218,7 +218,7 @@ explain select id from alltypes;
---- RESULTS: VERIFY_IS_SUBSET
' stored statistics:'
' table: rows=unavailable size=unavailable'
-' partitions: 0/24 rows=17.91K'
+row_regex:.* partitions: 0/24 rows=17\.9.*K
' columns: unavailable'
row_regex:.* extrapolated-rows=unavailable.*
row_regex:.* tuple-ids=0 row-size=4B cardinality=17\.9.*K