Posted to dev@hive.apache.org by "Rajesh Balamohan (Jira)" <ji...@apache.org> on 2020/12/10 09:27:00 UTC
[jira] [Created] (HIVE-24515) Analyze table job can be skipped when
stats populated are already accurate
Rajesh Balamohan created HIVE-24515:
---------------------------------------
Summary: Analyze table job can be skipped when stats populated are already accurate
Key: HIVE-24515
URL: https://issues.apache.org/jira/browse/HIVE-24515
Project: Hive
Issue Type: Improvement
Reporter: Rajesh Balamohan
For non-partitioned tables, the stats details should be present at the table level, e.g.:
{noformat}
COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"... }}
{noformat}
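As a rough illustration of the table-level check, the following Python sketch parses a {{COLUMN_STATS_ACCURATE}} value like the one above and reports whether basic stats and the requested column stats are all flagged accurate. The function name {{stats_are_accurate}} and the dict-shaped {{table_params}} are assumptions made for this example; this is not Hive's implementation.

```python
import json


def stats_are_accurate(table_params, columns):
    """Return True if BASIC_STATS and every requested column are flagged "true".

    table_params: dict of table parameters (hypothetical shape for this sketch).
    columns: column names whose stats the analyze command would compute.
    """
    raw = table_params.get("COLUMN_STATS_ACCURATE")
    if not raw:
        return False  # property missing: nothing is known to be accurate
    try:
        state = json.loads(raw)
    except ValueError:
        return False  # unparseable value: treat stats as not accurate
    if not isinstance(state, dict):
        return False
    if str(state.get("BASIC_STATS", "")).lower() != "true":
        return False
    col_state = state.get("COLUMN_STATS")
    if not isinstance(col_state, dict):
        col_state = {}
    # With an empty column list this only checks BASIC_STATS.
    return all(str(col_state.get(c, "")).lower() == "true" for c in columns)


params = {
    "COLUMN_STATS_ACCURATE":
        '{"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"}}'
}
covered = stats_are_accurate(params, ["d_current_day"])
# "d_date_sk" is a hypothetical column that has no entry in COLUMN_STATS.
not_covered = stats_are_accurate(params, ["d_current_day", "d_date_sk"])
```

Any missing or malformed value is treated as "not accurate", so the sketch errs on the side of running the analyze job.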
For partitioned tables, the stats details should be present at the partition level, e.g.:
{noformat}
store_sales(ss_sold_date_sk=2451819)
{totalSize=0, numRows=0, rawDataSize=0, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"....}}
{noformat}
When the populated stats are already accurate, {{analyze table tn compute statistics for columns}} should skip launching the job.
For ACID tables, stats are computed automatically, so the analyze command can skip recomputing them when they are already accurate.
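One way to picture the proposed skip for a partitioned table is the Python sketch below: it collects the partitions whose stats are not flagged accurate and launches the job only if that list is non-empty. The names {{partitions_needing_analyze}} and {{should_launch_job}}, the dict-of-dicts input, and the second partition value are all hypothetical; the actual change in Hive may look quite different.

```python
import json


def _accurate(params, columns):
    """True if BASIC_STATS and all requested columns are flagged "true"."""
    try:
        state = json.loads(params.get("COLUMN_STATS_ACCURATE") or "{}")
    except ValueError:
        return False
    if not isinstance(state, dict) or state.get("BASIC_STATS") != "true":
        return False
    col_state = state.get("COLUMN_STATS")
    if not isinstance(col_state, dict):
        col_state = {}
    return all(col_state.get(c) == "true" for c in columns)


def partitions_needing_analyze(partition_params, columns):
    """Return the names of partitions whose stats are not flagged accurate."""
    return [name for name, params in partition_params.items()
            if not _accurate(params, columns)]


def should_launch_job(partition_params, columns):
    """Launch the analyze job only if at least one partition still needs it."""
    return bool(partitions_needing_analyze(partition_params, columns))


partitions = {
    # Mirrors the store_sales partition shown in the description.
    "ss_sold_date_sk=2451819": {
        "totalSize": "0", "numRows": "0", "rawDataSize": "0",
        "COLUMN_STATS_ACCURATE":
            '{"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}',
    },
    # Hypothetical partition without the COLUMN_STATS_ACCURATE property.
    "ss_sold_date_sk=2451820": {"numRows": "10"},
}
pending = partitions_needing_analyze(partitions, ["ss_addr_sk"])
launch = should_launch_job(partitions, ["ss_addr_sk"])
```

In this sketch a single stale partition is enough to launch the job; restricting the job to only the stale partitions would be a further refinement.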
--
This message was sent by Atlassian Jira
(v8.3.4#803005)