You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Peter Vary via Review Board <no...@reviews.apache.org> on 2018/05/09 13:00:46 UTC

Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------

(Updated May 9, 2018, 1 p.m.)


Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.


Bugs: HIVE-6980
    https://issues.apache.org/jira/browse/HIVE-6980


Repository: hive-git


Description
-------

First version of the patch.

Splits getPartitionsViaSqlFilterInternal to:

getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands

Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.

Modifies the ObjectStore to drop partitions with directsql if possible.


Diffs (updated)
-----

  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55 


Diff: https://reviews.apache.org/r/66800/diff/3/

Changes: https://reviews.apache.org/r/66800/diff/2-3/


Testing
-------

Run the TestDropPartition tests, also checked the database manually, that no object left in the database


Thanks,

Peter Vary


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Peter Vary via Review Board <no...@reviews.apache.org>.

> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > Do you have some way of testing that all the objects are actually deleted from the DB?

I will create an followup jira to find a way to do this.
HIVE-19503


> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Line 655 (original), 667 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line687>
> >
> >     Is it existing bug in the code?

I think it is, but since we do not use this actively, this is not yet surfaced


> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 1090 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line1110>
> >
> >     I think this can be done in a cleaner way using Java stream API. From https://docs.oracle.com/javase/8/docs/api/java/util/StringJoiner.html:
> >     
> >     `
> >     List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
> >      String commaSeparatedNumbers = numbers.stream()
> >          .map(i -> i.toString())
> >          .collect(Collectors.joining(", "));
> >     `

Thanks, I will use this


> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2567 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2587>
> >
> >     Why do we need to do it here? We are done already

Removed the last checkTimeout-s from every method. Added a few ones where they are needed


> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2686 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2706>
> >
> >     Do we need to do it here?

Removed the last checkTimeout-s from every method. Added a few ones where they are needed


> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2721 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2741>
> >
> >     Do we need to do it here?

Removed the last checkTimeout-s from every method. Added a few ones where they are needed


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review202832
-----------------------------------------------------------


On May 9, 2018, 1 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 9, 2018, 1 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/3/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review202832
-----------------------------------------------------------



Do you have some way of testing that all the objects are actually deleted from the DB?


standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Line 655 (original), 667 (patched)
<https://reviews.apache.org/r/66800/#comment284842>

    Is it existing bug in the code?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 1090 (patched)
<https://reviews.apache.org/r/66800/#comment284844>

    I think this can be done in a cleaner way using Java stream API. From https://docs.oracle.com/javase/8/docs/api/java/util/StringJoiner.html:
    
    `
    List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
     String commaSeparatedNumbers = numbers.stream()
         .map(i -> i.toString())
         .collect(Collectors.joining(", "));
    `



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2567 (patched)
<https://reviews.apache.org/r/66800/#comment284845>

    Why do we need to do it here? We are done already



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2686 (patched)
<https://reviews.apache.org/r/66800/#comment284846>

    Do we need to do it here?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2721 (patched)
<https://reviews.apache.org/r/66800/#comment284847>

    Do we need to do it here?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2761 (patched)
<https://reviews.apache.org/r/66800/#comment284850>

    It is usually better to use !isEmpty() in such cases.
    Also, it may be easier to read if you do
    
    ```
    if (danglingColumnDescriptorIdList.isEmpty()) {
      return;
    }
    ```


- Alexander Kolbasov


On May 9, 2018, 1 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 9, 2018, 1 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/3/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203011
-----------------------------------------------------------


Ship it!




Ship It!

- Alexander Kolbasov


On May 11, 2018, 2:13 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 11, 2018, 2:13 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/4/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Peter Vary via Review Board <no...@reviews.apache.org>.

> On May 14, 2018, 7:07 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2545 (patched)
> > <https://reviews.apache.org/r/66800/diff/4/?file=2019957#file2019957line2566>
> >
> >     Why not LOG.error?

My original thought was, that we fall back to JDO solution this case, so the problem is not fatal.
I think this is why the other cases are using LOG.warn too.
Shall I change here, and other places as well?

Thanks for the review!


- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------


On May 11, 2018, 2:13 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 11, 2018, 2:13 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/4/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Vihang Karajgaonkar via Review Board <no...@reviews.apache.org>.

> On May 14, 2018, 7:07 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2545 (patched)
> > <https://reviews.apache.org/r/66800/diff/4/?file=2019957#file2019957line2566>
> >
> >     Why not LOG.error?
> 
> Peter Vary wrote:
>     My original thought was, that we fall back to JDO solution this case, so the problem is not fatal.
>     I think this is why the other cases are using LOG.warn too.
>     Shall I change here, and other places as well?
>     
>     Thanks for the review!

I see, that sounds good to me then. Thanks


- Vihang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------


On May 11, 2018, 2:13 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 11, 2018, 2:13 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/4/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Vihang Karajgaonkar via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------


Fix it, then Ship it!




LGTM. Thanks for making this change. This should help a lot for dropping big tables!


standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2545 (patched)
<https://reviews.apache.org/r/66800/#comment285111>

    Why not LOG.error?


- Vihang Karajgaonkar


On May 11, 2018, 2:13 p.m., Peter Vary wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
> 
> (Updated May 11, 2018, 2:13 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
>     https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -----
> 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/4/
> 
> 
> Testing
> -------
> 
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
> 
> 
> Thanks,
> 
> Peter Vary
> 
>


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------

(Updated May 11, 2018, 2:13 p.m.)


Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.


Changes
-------

Addressed Sasha's comments


Bugs: HIVE-6980
    https://issues.apache.org/jira/browse/HIVE-6980


Repository: hive-git


Description
-------

First version of the patch.

Splits getPartitionsViaSqlFilterInternal to:

getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands

Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.

Modifies the ObjectStore to drop partitions with directsql if possible.


Diffs (updated)
-----

  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f 


Diff: https://reviews.apache.org/r/66800/diff/4/

Changes: https://reviews.apache.org/r/66800/diff/3-4/


Testing
-------

Run the TestDropPartition tests, also checked the database manually, that no object left in the database


Thanks,

Peter Vary