Posted to dev@hive.apache.org by "Rajesh Balamohan (Jira)" <ji...@apache.org> on 2020/12/10 09:27:00 UTC

[jira] [Created] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

Rajesh Balamohan created HIVE-24515:
---------------------------------------

             Summary: Analyze table job can be skipped when stats populated are already accurate
                 Key: HIVE-24515
                 URL: https://issues.apache.org/jira/browse/HIVE-24515
             Project: Hive
          Issue Type: Improvement
            Reporter: Rajesh Balamohan


For non-partitioned tables, the stats details should be present at the table level, e.g.:
{noformat}
COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"... }}
{noformat}

For partitioned tables, the stats details should be present at the partition level, e.g.:
{noformat}
store_sales(ss_sold_date_sk=2451819)

{totalSize=0, numRows=0, rawDataSize=0, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"....}}
{noformat}

When the populated stats are already accurate, {{analyze table tn compute statistics for columns}} should skip launching the job.

For ACID tables, stats are computed automatically, so the analyze job can likewise skip recomputing them when the stats are already accurate.
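
To illustrate the proposed check, here is a minimal sketch in Python (not Hive's actual implementation). It assumes the table or partition parameters are available as a plain dict whose COLUMN_STATS_ACCURATE value is the JSON string shown above. The function names {{stats_already_accurate}} and {{partitions_needing_analyze}} are hypothetical and exist only for this example.

```python
import json


def stats_already_accurate(params, columns):
    """Return True if basic stats and stats for every requested column
    are marked accurate in the COLUMN_STATS_ACCURATE parameter.

    `params` is assumed to be a dict of table or partition parameters.
    """
    raw = params.get("COLUMN_STATS_ACCURATE")
    if not raw:
        return False
    try:
        state = json.loads(raw)
    except ValueError:
        # Unparseable state is treated as "not accurate".
        return False
    if not isinstance(state, dict):
        return False
    if str(state.get("BASIC_STATS", "")).lower() != "true":
        return False
    col_state = state.get("COLUMN_STATS") or {}
    if not isinstance(col_state, dict):
        return False
    return all(str(col_state.get(c, "")).lower() == "true" for c in columns)


def partitions_needing_analyze(partition_params, columns):
    """Return the partition names whose stats are not yet accurate.

    `partition_params` is assumed to map a partition name to its
    parameter dict; insertion order is preserved.
    """
    return [
        name
        for name, params in partition_params.items()
        if not stats_already_accurate(params, columns)
    ]
```

Under these assumptions, the analyze job for a non-partitioned table could be skipped when {{stats_already_accurate}} returns true for all requested columns, and for a partitioned table when {{partitions_needing_analyze}} returns an empty list.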

--
This message was sent by Atlassian Jira
(v8.3.4#803005)