Posted to dev@hive.apache.org by Marta Kuczora via Review Board <no...@reviews.apache.org> on 2018/03/06 15:32:00 UTC

Re: Review Request 65731: HIVE-18699: Check for duplicate partitions in HiveMetastore.exchange_partitions


> On Feb. 21, 2018, 1:37 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3370 (patched)
> > <https://reviews.apache.org/r/65731/diff/1/?file=1963122#file1963122line3370>
> >
> >     How "expensive" is this call? Is this a simple query? What happens if the destination table has 1m partitions? :)
> 
> Adam Szita wrote:
>     I don't see any other way (currently available) that is more lightweight than this. Under the hood this calls getPartitionNamesNoTxn, which executes an actual "select partitionName from .." statement.

Yeah, as Adam says, it is the least expensive way to do this check. It would mean one extra "select partitionName" query per exchangePartition call.
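
To make the cost concrete, here is a minimal sketch of the kind of check being discussed, not the code from the patch. It assumes the destination table's partition names have already been fetched once (the single "select partitionName" query) and compares them in memory with the names to be exchanged. The class name, method names, error message text and the use of IllegalStateException in place of MetaException are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; it does not exist in HiveMetaStore.
final class DuplicatePartitionCheck {

    // Returns the names from toExchange that already exist in the destination table.
    // existingDestNames stands for the result of the one partition-name query.
    static List<String> findDuplicates(List<String> existingDestNames,
                                       List<String> toExchange) {
        Set<String> existing = new HashSet<>(existingDestNames);
        List<String> duplicates = new ArrayList<>();
        for (String name : toExchange) {
            if (existing.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    // Fails fast with a readable message before any INSERT is attempted.
    // IllegalStateException is a stand-in for MetaException so the sketch
    // compiles without Hive on the classpath.
    static void checkNoDuplicates(List<String> existingDestNames,
                                  List<String> toExchange) {
        List<String> duplicates = findDuplicates(existingDestNames, toExchange);
        if (!duplicates.isEmpty()) {
            throw new IllegalStateException(
                "Partitions already exist in the destination table: " + duplicates);
        }
    }
}
```

With a hash set, the in-memory comparison is linear in the number of names, so for a destination table with very many partitions the dominant cost would be fetching the names, not comparing them.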


- Marta


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65731/#review197848
-----------------------------------------------------------


On Feb. 21, 2018, 11:37 a.m., Marta Kuczora wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65731/
> -----------------------------------------------------------
> 
> (Updated Feb. 21, 2018, 11:37 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18699
>     https://issues.apache.org/jira/browse/HIVE-18699
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Extended the HiveMetastore.exchange_partitions method to check that none of the partitions to be exchanged already exists in the destination table. If one of them already exists, a MetaException with a proper error message is thrown.
> 
> Previously an exception like this (wrapped in a MetaException) was thrown:
> Insert of object
> "org.apache.hadoop.hive.metastore.model.MPartition@4e78fff5" using statement "INSERT INTO PARTITIONS
> (PART_ID,CREATE_TIME,LAST_ACCESS_TIME,PART_NAME,SD_ID,TBL_ID) VALUES (?,?,?,?,?,?)" failed : The statement was
> aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index
> identified by 'UNIQUEPARTITION' defined on 'PARTITIONS'.
> 
> From the user's point of view, the type of the exception is unchanged (MetaException); only the error message is changed to a more understandable one.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 47de215 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java 3a06aec 
> 
> 
> Diff: https://reviews.apache.org/r/65731/diff/1/
> 
> 
> Testing
> -------
> 
> Tests already exist for this use case in TestExchangePartitions:
> - testExchangePartitionsPartAlreadyExists
> - testExchangePartitionPartAlreadyExists
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>