You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Naresh P R (Jira)" <ji...@apache.org> on 2022/09/09 01:38:00 UTC

[jira] [Updated] (HIVE-26526) MSCK sync is not removing partitions with special characters

     [ https://issues.apache.org/jira/browse/HIVE-26526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-26526:
------------------------------
    Issue Type: Bug  (was: New Feature)

> MSCK sync is not removing partitions with special characters
> ------------------------------------------------------------
>
>                 Key: HIVE-26526
>                 URL: https://issues.apache.org/jira/browse/HIVE-26526
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Priority: Major
>
> PARTITIONS table were having encoding string & PARTITION_KEY_VALS were having original string.
> {code:java}
> hive=> select * from "PARTITION_KEY_VALS" where "PART_ID" IN (46753, 46754, 46755, 46756);
>  PART_ID |    PART_KEY_VAL     | INTEGER_IDX
> ---------+---------------------+-------------
>    46753 | 2022-02-*           |           0
>    46754 | 2011-03-01          |           0
>    46755 | 2022-01-*           |           0
>    46756 | 2010-01-01          |           0
>    
>    
> hive=> select * from "PARTITIONS" where "TBL_ID" = 23567 ;
>  PART_ID | CREATE_TIME | LAST_ACCESS_TIME |       PART_NAME       | SD_ID | TBL_ID | WRITE_ID
> ---------+-------------+------------------+-----------------------+-------+--------+----------
>    46753 |           0 |                0 | part_date=2022-02-%2A | 70195 |  23567 |        0
>    46754 |           0 |                0 | part_date=2011-03-01  | 70196 |  23567 |        0
>    46755 |           0 |                0 | part_date=2022-01-%2A | 70197 |  23567 |        0
>    46756 |           0 |                0 | part_date=2010-01-01  | 70198 |  23567 |        0
> (4 rows){code}
>  
> 1) DirectSQL has a join condition on PARTITION_KEY_VALS.PART_KEY_VAL = "2022-02-%2A" at here
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L883
> 2) Jdo is having filter condition on PARTITIONS.PART_NAME = "part_date=2022-02-%252A" (ie., 2 times url encoded)
> Once from HS2
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java#L353
> 2nd from HMS
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java#L365]
> Above conditions returns 0 partitions, so those are not removed from HMS metadata.
>  
> Attaching repro q file 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)