You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2019/04/30 06:39:00 UTC
[jira] [Commented] (IMPALA-8458) Can't set maxSize and avgSize
column stats with local catalog
[ https://issues.apache.org/jira/browse/IMPALA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829996#comment-16829996 ]
Tim Armstrong commented on IMPALA-8458:
---------------------------------------
Well this explains a lot!
{code}
// Ugly hack: if the catalogd has never gotten any stats from HMS, numDVs will
// be -1, and we'll have to send no stats to the impalad.
// TODO(todd): this breaks test_ddl.test_alter_set_column_stats.
if (!col.getStats().hasNumDistinctValues()) continue;
{code}
> Can't set maxSize and avgSize column stats with local catalog
> -------------------------------------------------------------
>
> Key: IMPALA-8458
> URL: https://issues.apache.org/jira/browse/IMPALA-8458
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Critical
>
> Repro:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s string);
> +-------------------------+
> | summary |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.36s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s | STRING | -1 | -1 | -1 | -1 |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('avgSize'='1234');
> +-----------------------------------------+
> | summary |
> +-----------------------------------------+
> | Updated 0 partition(s) and 1 column(s). |
> +-----------------------------------------+
> Fetched 1 row(s) in 0.14s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s | STRING | -1 | -1 | -1 | -1 |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('maxSize'='1234');
> +-----------------------------------------+
> | summary |
> +-----------------------------------------+
> | Updated 0 partition(s) and 1 column(s). |
> +-----------------------------------------+
> Fetched 1 row(s) in 0.10s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s | STRING | -1 | -1 | -1 | -1 |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> invalidate metadata test_stats2;
> Fetched 0 row(s) in 0.03s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> Query: show column stats test_stats2
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s | STRING | -1 | -1 | -1 | -1 |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.07s
> {noformat}
> I expected that the updates would take effect. Weirdly it doesn't happen for NDV and NULLS:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('numDVs'='1234','numNulls'='12345');
> Query: alter table test_stats2 set column stats s('numDVs'='1234','numNulls'='12345')
> +-----------------------------------------+
> | summary |
> +-----------------------------------------+
> | Updated 0 partition(s) and 1 column(s). |
> +-----------------------------------------+
> Fetched 1 row(s) in 0.12s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
> Query: show column stats test_stats2
> +--------+--------+------------------+--------+----------+----------+
> | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
> +--------+--------+------------------+--------+----------+----------+
> | s | STRING | 1234 | 12345 | -1 | -1 |
> +--------+--------+------------------+--------+----------+----------+
> Fetched 1 row(s) in 0.02s
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org