You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "David Wayne Birdsall (JIRA)" <ji...@apache.org> on 2019/07/08 22:44:00 UTC

[jira] [Created] (TRAFODION-3316) Fix some issues with incremental UPDATE STATISTICS

David Wayne Birdsall created TRAFODION-3316:
-----------------------------------------------

             Summary: Fix some issues with incremental UPDATE STATISTICS
                 Key: TRAFODION-3316
                 URL: https://issues.apache.org/jira/browse/TRAFODION-3316
             Project: Apache Trafodion
          Issue Type: Improvement
            Reporter: David Wayne Birdsall
            Assignee: David Wayne Birdsall


Fix some issues with incremental UPDATE STATISTICS.
 # The Hive TIMESTAMP datatype has precision 9. When processing a Hive TIMESTAMP column, UPDATE STATISTICS abends.
 # After fixing that, when testing a small example, the sampling ratio for the sample table is made incorrectly small, resulting in 9207 errors (sample is empty).
 # Sometimes, when populating the sample table, the UPSERT fails because garbage is used for the table name. This is because the table name is taken from an NAString that has gone out of scope.

The following test script can be used to test these conditions. In Hive, do the following:
{quote}create database if not exists h_increUpdatestats;
use h_increUpdatestats;
drop table if exists hive_sequencefile ;
create table hive_sequencefile(C_CUSTKEY int,C_NAME char(20),C_ADDRESS varchar(20), C_NATIONKEY int, C_TIMESTAMP timestamp,C_PHONE char(20), C_ACCTBAL string, 
C_MKTSEGMENT char(20), C_DATE date) stored as sequencefile;

insert into hive_sequencefile values 
(1,'x','xx',1,'2018-08-22 12:00:00.123','xxx','xxxx','xxxxx','2018-08-22'),
(2,'x','xx',2,'2018-08-23 12:00:00.123','xxx','xxxx','xxxxx','2018-08-23'),
(3,'x','xx',3,'2018-08-24 12:00:00.123','xxx','xxxx','xxxxx','2018-08-24'),
(4,'x','xx',4,'2018-08-25 12:00:00.123','xxx','xxxx','xxxxx','2018-08-25'),
(5,'x','xx',5,'2018-08-26 12:00:00.123','xxx','xxxx','xxxxx','2018-08-26');
{quote}
Then in Trafodion do the following:
{quote}update statistics for table hive.h_increUpdatestats.hive_sequencefile remove sample;
update statistics for table hive.h_increUpdatestats.hive_sequencefile create sample random 100 percent;
update statistics for table hive.h_increUpdatestats.hive_sequencefile on every column;
update statistics for table hive.h_increUpdatestats.hive_sequencefile on existing columns incremental where C_CUSTKEY>=0;
{quote}
Without any fixes, the third UPDATE STATISTICS command abends.

With a fix for the first problem, the last UPDATE STATISTICS command sometimes fails with error 2109, and always fails with error 9207.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)