You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Venkat Sambath (Jira)" <ji...@apache.org> on 2020/10/02 05:10:00 UTC

[jira] [Created] (IMPALA-10208) Drop stats doesnt remove impala_intermediate_stats_num_chunks from PARTITION_PARAMS

Venkat Sambath created IMPALA-10208:
---------------------------------------

             Summary: Drop stats doesnt remove impala_intermediate_stats_num_chunks from PARTITION_PARAMS
                 Key: IMPALA-10208
                 URL: https://issues.apache.org/jira/browse/IMPALA-10208
             Project: IMPALA
          Issue Type: Bug
            Reporter: Venkat Sambath
         Attachments: image-2020-10-02-10-38-16-153.png, image-2020-10-02-10-38-48-144.png, image-2020-10-02-10-39-18-642.png

Steps to replicate the issue:

Step1: 
CREATE TABLE impala_partition_test1 (                                          
   a INT                                                                               
 )                                                                                     
 PARTITIONED BY (                                                                      
   b STRING                                                                            
 ); alter table impala_partition_test1 add partition(b="part1");
 alter table impala_partition_test1 add partition(b="part2");
 alter table impala_partition_test1 add partition(b="part3");
 alter table impala_partition_test1 add partition(b="part4");

Step2: Populating the partitions
for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part1/test_${i}; done
 for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part2/test_${i}; done
 for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part3/test_${i}; done
 for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part4/test_${i}; done

Step3: Run compute incremental stats impala_partition_test1;

Step4: In HMS DB when you run the below query 

{code:java}
select A.TBL_NAME, B.PART_NAME, C.PARAM_KEY, sum(length(C.PARAM_KEY) + length(C.PARAM_VALUE)) from TBLS A join PARTITIONS B join PARTITION_PARAMS C on A.TBL_ID = B.TBL_ID and C.PART_ID=B.PART_ID and C.PARAM_KEY like "%impala_intermediate_stats%" group by A.TBL_NAME,B.PART_NAME,C.PARAM_KEY;	
{code}

You will be noticing
 !image-2020-10-02-10-39-18-642.png! 

Step5: After you drop the stats [drop stats impala_partition_test1 ] you still be noticing impala_intermediate_stats_num_chunks left unremoved.

 !image-2020-10-02-10-38-48-144.png! 

When you have million partitions this could contribute to 37mb I suppose. Requesting you to remove impala_intermediate_stats_num_chunks while we drop stats from table.









--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org