You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Barna Zsombor Klara <zs...@cloudera.com> on 2017/07/03 09:05:42 UTC
Review Request 60589: HIVE-17001: Insert overwrite table doesn't clean
partition directory on HDFS if partition is missing from HMS
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/
-----------------------------------------------------------
Review request for hive.
Repository: hive-git
Description
-------
HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b
ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION
ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/60589/diff/1/
Testing
-------
Manual testing and qtests.
Thanks,
Barna Zsombor Klara
Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't
clean partition directory on HDFS if partition is missing from HMS
Posted by Peter Vary <pv...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179498
-----------------------------------------------------------
LGTM +1
Only one little wish, but the patch is still good without it :)
ql/src/test/queries/clientpositive/insert_overwrite_table.q
Lines 2 (patched)
<https://reviews.apache.org/r/60589/#comment254239>
Would you mind renaming the location more test specific, like '${system:test.tmp.dir}/inser_overwrite_table_test'?
It is perfectly good as it is, but I would be able to run most of the tests parallel with BeeLineDriver, and it might cause problems, if other test might use the same directory.
To reiterate - the patch is perfectly good as it is, since it will run with the current fremeworks and we might tackle this problem anyway when we change the BeeLine tests.
- Peter Vary
On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
>
> (Updated July 3, 2017, 9:05 a.m.)
>
>
> Review request for hive.
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b
> ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION
> ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/60589/diff/1/
>
>
> Testing
> -------
>
> Manual testing and qtests.
>
>
> Thanks,
>
> Barna Zsombor Klara
>
>
Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't
clean partition directory on HDFS if partition is missing from HMS
Posted by Barna Zsombor Klara <zs...@cloudera.com>.
> On July 5, 2017, 7:03 p.m., Vihang Karajgaonkar wrote:
> > ql/src/test/queries/clientpositive/insert_overwrite_table.q
> > Lines 1-10 (patched)
> > <https://reviews.apache.org/r/60589/diff/1/?file=1768234#file1768234line1>
> >
> > I don't understand this test case completely. The table is defined as external so it is expected that the drop partition will not delete the HDFS file. The DFS operation is performed without the knowledge of Hive so when it returned 2 rows instead of 1 isn't it the expected behavior?
> >
> > I think the right way to solve this problem to throw an exception when we do a insert overwrite on an external table. Just like truncate table command on an external table doesn't work, I think insert overwrite should also fail on a external table. The behavior of external table is inconsistent in my opinion. We allow it to be overwritten but not truncated.
> >
> > When the table is a managed table, the test works as expected since Hive cleans up the directory after drop partition command.
The issue here is not really about external tables, I used it because I had to manually move/delete files for the testcase and I could not do it with tables on hdfs. The dfs command by default will look for the files on the local file system, and if I add the URI then I need to provide a host/port of a running HDFS instance which I don't have in a qtest. I could not find a working example in other qtests either so I went with the external table.
- Barna Zsombor
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179680
-----------------------------------------------------------
On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
>
> (Updated July 3, 2017, 9:05 a.m.)
>
>
> Review request for hive.
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b
> ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION
> ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/60589/diff/1/
>
>
> Testing
> -------
>
> Manual testing and qtests.
>
>
> Thanks,
>
> Barna Zsombor Klara
>
>
Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't
clean partition directory on HDFS if partition is missing from HMS
Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179680
-----------------------------------------------------------
ql/src/test/queries/clientpositive/insert_overwrite_table.q
Lines 1-10 (patched)
<https://reviews.apache.org/r/60589/#comment254494>
I don't understand this test case completely. The table is defined as external so it is expected that the drop partition will not delete the HDFS file. The DFS operation is performed without the knowledge of Hive so when it returned 2 rows instead of 1 isn't it the expected behavior?
I think the right way to solve this problem to throw an exception when we do a insert overwrite on an external table. Just like truncate table command on an external table doesn't work, I think insert overwrite should also fail on a external table. The behavior of external table is inconsistent in my opinion. We allow it to be overwritten but not truncated.
When the table is a managed table, the test works as expected since Hive cleans up the directory after drop partition command.
- Vihang Karajgaonkar
On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
>
> (Updated July 3, 2017, 9:05 a.m.)
>
>
> Review request for hive.
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b
> ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION
> ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/60589/diff/1/
>
>
> Testing
> -------
>
> Manual testing and qtests.
>
>
> Thanks,
>
> Barna Zsombor Klara
>
>