You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Peter Vary via Review Board <no...@reviews.apache.org> on 2018/04/25 17:07:12 UTC
Review Request 66800: HIVE-6980 Drop table by using direct sql
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------
Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
Bugs: HIVE-6980
https://issues.apache.org/jira/browse/HIVE-6980
Repository: hive-git
Description
-------
First version of the patch.
Splits getPartitionsViaSqlFilterInternal to:
getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
Modifies the ObjectStore to drop partitions with directsql if possible.
Diffs
-----
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 997f5fd
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 184ecb6
Diff: https://reviews.apache.org/r/66800/diff/1/
Testing
-------
Run the TestDropPartition tests, also checked the database manually, that no object left in the database
Thanks,
Peter Vary
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > Do you have some way of testing that all the objects are actually deleted from the DB?
I will create an followup jira to find a way to do this.
HIVE-19503
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Line 655 (original), 667 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line687>
> >
> > Is it existing bug in the code?
I think it is, but since we do not use this actively, this is not yet surfaced
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 1090 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line1110>
> >
> > I think this can be done in a cleaner way using Java stream API. From https://docs.oracle.com/javase/8/docs/api/java/util/StringJoiner.html:
> >
> > `
> > List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
> > String commaSeparatedNumbers = numbers.stream()
> > .map(i -> i.toString())
> > .collect(Collectors.joining(", "));
> > `
Thanks, I will use this
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2567 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2587>
> >
> > Why do we need to do it here? We are done already
Removed the last checkTimeout-s from every method. Added a few ones where they are needed
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2686 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2706>
> >
> > Do we need to do it here?
Removed the last checkTimeout-s from every method. Added a few ones where they are needed
> On May 10, 2018, 11:49 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2721 (patched)
> > <https://reviews.apache.org/r/66800/diff/3/?file=2018475#file2018475line2741>
> >
> > Do we need to do it here?
Removed the last checkTimeout-s from every method. Added a few ones where they are needed
- Peter
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review202832
-----------------------------------------------------------
On May 9, 2018, 1 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 9, 2018, 1 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55
>
>
> Diff: https://reviews.apache.org/r/66800/diff/3/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review202832
-----------------------------------------------------------
Do you have some way of testing that all the objects are actually deleted from the DB?
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Line 655 (original), 667 (patched)
<https://reviews.apache.org/r/66800/#comment284842>
Is it existing bug in the code?
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 1090 (patched)
<https://reviews.apache.org/r/66800/#comment284844>
I think this can be done in a cleaner way using Java stream API. From https://docs.oracle.com/javase/8/docs/api/java/util/StringJoiner.html:
`
List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
String commaSeparatedNumbers = numbers.stream()
.map(i -> i.toString())
.collect(Collectors.joining(", "));
`
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2567 (patched)
<https://reviews.apache.org/r/66800/#comment284845>
Why do we need to do it here? We are done already
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2686 (patched)
<https://reviews.apache.org/r/66800/#comment284846>
Do we need to do it here?
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2721 (patched)
<https://reviews.apache.org/r/66800/#comment284847>
Do we need to do it here?
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2761 (patched)
<https://reviews.apache.org/r/66800/#comment284850>
It is usually better to use !isEmpty() in such cases.
Also, it may be easier to read if you do
```
if (danglingColumnDescriptorIdList.isEmpty()) {
return;
}
```
- Alexander Kolbasov
On May 9, 2018, 1 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 9, 2018, 1 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55
>
>
> Diff: https://reviews.apache.org/r/66800/diff/3/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203011
-----------------------------------------------------------
Ship it!
Ship It!
- Alexander Kolbasov
On May 11, 2018, 2:13 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 11, 2018, 2:13 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f
>
>
> Diff: https://reviews.apache.org/r/66800/diff/4/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Vihang Karajgaonkar via Review Board <no...@reviews.apache.org>.
> On May 14, 2018, 7:07 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2545 (patched)
> > <https://reviews.apache.org/r/66800/diff/4/?file=2019957#file2019957line2566>
> >
> > Why not LOG.error?
>
> Peter Vary wrote:
> My original thought was, that we fall back to JDO solution this case, so the problem is not fatal.
> I think this is why the other cases are using LOG.warn too.
> Shall I change here, and other places as well?
>
> Thanks for the review!
I see, that sounds good to me then. Thanks
- Vihang
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------
On May 11, 2018, 2:13 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 11, 2018, 2:13 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f
>
>
> Diff: https://reviews.apache.org/r/66800/diff/4/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
> On May 14, 2018, 7:07 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2545 (patched)
> > <https://reviews.apache.org/r/66800/diff/4/?file=2019957#file2019957line2566>
> >
> > Why not LOG.error?
My original thought was, that we fall back to JDO solution this case, so the problem is not fatal.
I think this is why the other cases are using LOG.warn too.
Shall I change here, and other places as well?
Thanks for the review!
- Peter
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------
On May 11, 2018, 2:13 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 11, 2018, 2:13 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f
>
>
> Diff: https://reviews.apache.org/r/66800/diff/4/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Vihang Karajgaonkar via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
-----------------------------------------------------------
Fix it, then Ship it!
LGTM. Thanks for making this change. This should help a lot for dropping big tables!
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2545 (patched)
<https://reviews.apache.org/r/66800/#comment285111>
Why not LOG.error?
- Vihang Karajgaonkar
On May 11, 2018, 2:13 p.m., Peter Vary wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> -----------------------------------------------------------
>
> (Updated May 11, 2018, 2:13 p.m.)
>
>
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
>
>
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> First version of the patch.
>
> Splits getPartitionsViaSqlFilterInternal to:
>
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
>
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
>
> Modifies the ObjectStore to drop partitions with directsql if possible.
>
>
> Diffs
> -----
>
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f
>
>
> Diff: https://reviews.apache.org/r/66800/diff/4/
>
>
> Testing
> -------
>
> Run the TestDropPartition tests, also checked the database manually, that no object left in the database
>
>
> Thanks,
>
> Peter Vary
>
>
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------
(Updated May 11, 2018, 2:13 p.m.)
Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
Changes
-------
Addressed Sasha's comments
Bugs: HIVE-6980
https://issues.apache.org/jira/browse/HIVE-6980
Repository: hive-git
Description
-------
First version of the patch.
Splits getPartitionsViaSqlFilterInternal to:
getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
Modifies the ObjectStore to drop partitions with directsql if possible.
Diffs (updated)
-----
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 56fbfed
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b0a805f
Diff: https://reviews.apache.org/r/66800/diff/4/
Changes: https://reviews.apache.org/r/66800/diff/3-4/
Testing
-------
Run the TestDropPartition tests, also checked the database manually, that no object left in the database
Thanks,
Peter Vary
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------
(Updated May 9, 2018, 1 p.m.)
Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
Bugs: HIVE-6980
https://issues.apache.org/jira/browse/HIVE-6980
Repository: hive-git
Description
-------
First version of the patch.
Splits getPartitionsViaSqlFilterInternal to:
getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
Modifies the ObjectStore to drop partitions with directsql if possible.
Diffs (updated)
-----
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55
Diff: https://reviews.apache.org/r/66800/diff/3/
Changes: https://reviews.apache.org/r/66800/diff/2-3/
Testing
-------
Run the TestDropPartition tests, also checked the database manually, that no object left in the database
Thanks,
Peter Vary
Re: Review Request 66800: HIVE-6980 Drop table by using direct sql
Posted by Peter Vary via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/
-----------------------------------------------------------
(Updated April 26, 2018, 3:58 p.m.)
Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam Szita, and Vihang Karajgaonkar.
Changes
-------
Rebased the patch
Bugs: HIVE-6980
https://issues.apache.org/jira/browse/HIVE-6980
Repository: hive-git
Description
-------
First version of the patch.
Splits getPartitionsViaSqlFilterInternal to:
getPartitionIdsViaSqlFilter - which returns the partition ids
getPartitionsFromPartitionIds - which returns the partition data for the partitions
Creates dropPartitionsByPartitionIds which drops the partitions by directSQL commands
Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and dropPartitionsByPartitionIds.
Modifies the ObjectStore to drop partitions with directsql if possible.
Diffs (updated)
-----
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 4e0e887
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1abd99d
Diff: https://reviews.apache.org/r/66800/diff/2/
Changes: https://reviews.apache.org/r/66800/diff/1-2/
Testing
-------
Run the TestDropPartition tests, also checked the database manually, that no object left in the database
Thanks,
Peter Vary