Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/08/23 19:27:00 UTC
[jira] [Resolved] (IMPALA-1988) show column stats returns different results for beeswax and hs2
[ https://issues.apache.org/jira/browse/IMPALA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-1988.
-----------------------------------
Resolution: Duplicate
> show column stats returns different results for beeswax and hs2
> ---------------------------------------------------------------
>
> Key: IMPALA-1988
> URL: https://issues.apache.org/jira/browse/IMPALA-1988
> Project: IMPALA
> Issue Type: Bug
> Components: Clients
> Affects Versions: Impala 2.2
> Reporter: Alex Leblang
> Labels: compute-stats, hs2
>
> In the Impala shell, "show column stats compute_stats_db.alltypes" works as expected: stats are returned. When the same statement is executed through impyla over HS2, the second-to-last column, Max Size, is always None.
> To reproduce, in the Impala shell:
> [localhost:21000] > show column stats compute_stats_db.alltypes;
> Query: show column stats compute_stats_db.alltypes
> +-----------------+-----------+------------------+--------+----------+----------+
> | Column          | Type      | #Distinct Values | #Nulls | Max Size | Avg Size |
> +-----------------+-----------+------------------+--------+----------+----------+
> | id              | INT       | 8161             | -1     | 4        | 4        |
> | bool_col        | BOOLEAN   | 2                | -1     | 1        | 1        |
> | tinyint_col     | TINYINT   | 10               | -1     | 1        | 1        |
> | smallint_col    | SMALLINT  | 10               | -1     | 2        | 2        |
> | int_col         | INT       | 10               | -1     | 4        | 4        |
> | bigint_col      | BIGINT    | 10               | -1     | 8        | 8        |
> | float_col       | FLOAT     | 10               | -1     | 4        | 4        |
> | double_col      | DOUBLE    | 10               | -1     | 8        | 8        |
> | date_string_col | STRING    | 666              | -1     | 8        | 8        |
> | string_col      | STRING    | 10               | -1     | 1        | 1        |
> | timestamp_col   | TIMESTAMP | 5678             | -1     | 16       | 16       |
> | year            | INT       | 2                | 0      | 4        | 4        |
> | month           | INT       | 12               | 0      | 4        | 4        |
> +-----------------+-----------+------------------+--------+----------+----------+
> Fetched 13 row(s) in 0.01s
> In ipython (normal python also works fine for this):
> In [1]: from impala.dbapi import connect
> In [2]: conn = connect()
> In [3]: cur = conn.cursor()
> In [4]: cur.execute("show column stats compute_stats_db.alltypes")
> In [5]: cur.fetchall()
> Out[5]:
> [('id', 'INT', 8161, -1, None, 4.0),
> ('bool_col', 'BOOLEAN', 2, -1, None, 1.0),
> ('tinyint_col', 'TINYINT', 10, -1, None, 1.0),
> ('smallint_col', 'SMALLINT', 10, -1, None, 2.0),
> ('int_col', 'INT', 10, -1, None, 4.0),
> ('bigint_col', 'BIGINT', 10, -1, None, 8.0),
> ('float_col', 'FLOAT', 10, -1, None, 4.0),
> ('double_col', 'DOUBLE', 10, -1, None, 8.0),
> ('date_string_col', 'STRING', 666, -1, None, 8.0),
> ('string_col', 'STRING', 10, -1, None, 1.0),
> ('timestamp_col', 'TIMESTAMP', 5678, -1, None, 16.0),
> ('year', 'INT', 2, 0, None, 4.0),
> ('month', 'INT', 12, 0, None, 4.0)]
> In [6]: cur.description
> Out[6]:
> [('Column', 'STRING', None, None, None, None, None),
> ('Type', 'STRING', None, None, None, None, None),
> ('#Distinct Values', 'BIGINT', None, None, None, None, None),
> ('#Nulls', 'BIGINT', None, None, None, None, None),
> ('Max Size', 'INT', None, None, None, None, None),
> ('Avg Size', 'DOUBLE', None, None, None, None, None)]
> For those unfamiliar with impyla:
> fetchall() returns the query results; each tuple is a row.
> description returns the column names and types; for example, the first column is named Column and has type STRING.
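The symptom described in the report (one column coming back as None in every row) can be checked mechanically from the two DB-API values shown there. The following is only a minimal sketch: the helper name all_none_columns is hypothetical and not part of impyla, and the sample data is a subset of the rows quoted in the report rather than a live query result.

```python
def all_none_columns(rows, description):
    """Return the names of columns whose value is None in every row.

    `rows` is a list of tuples as returned by cursor.fetchall();
    `description` is cursor.description, where item [0] of each entry
    is the column name. Hypothetical helper, for illustration only.
    """
    names = [col[0] for col in description]
    if not rows:
        # With no rows there is nothing to flag.
        return []
    return [
        name
        for idx, name in enumerate(names)
        if all(row[idx] is None for row in rows)
    ]


# A subset of the HS2 rows and the description quoted in the report.
rows = [
    ("id", "INT", 8161, -1, None, 4.0),
    ("bool_col", "BOOLEAN", 2, -1, None, 1.0),
    ("year", "INT", 2, 0, None, 4.0),
]
description = [
    ("Column", "STRING", None, None, None, None, None),
    ("Type", "STRING", None, None, None, None, None),
    ("#Distinct Values", "BIGINT", None, None, None, None, None),
    ("#Nulls", "BIGINT", None, None, None, None, None),
    ("Max Size", "INT", None, None, None, None, None),
    ("Avg Size", "DOUBLE", None, None, None, None, None),
]

print(all_none_columns(rows, description))  # ['Max Size']
```

Against a live connection, the same helper could be called with cur.fetchall() and cur.description after executing the statement.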
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)