You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Barna Zsombor Klara <zs...@cloudera.com> on 2017/07/03 09:05:42 UTC

Review Request 60589: HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b 
  ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION 
  ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/60589/diff/1/


Testing
-------

Manual testing and qtests.


Thanks,

Barna Zsombor Klara


Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

Posted by Peter Vary <pv...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179498
-----------------------------------------------------------



LGTM +1
Only one little wish, but the patch is still good without it :)


ql/src/test/queries/clientpositive/insert_overwrite_table.q
Lines 2 (patched)
<https://reviews.apache.org/r/60589/#comment254239>

    Would you mind renaming the location more test specific, like '${system:test.tmp.dir}/inser_overwrite_table_test'?
    
    It is perfectly good as it is, but I would be able to run most of the tests parallel with BeeLineDriver, and it might cause problems, if other test might use the same directory.
    
    To reiterate - the patch is perfectly good as it is, since it will run with the current fremeworks and we might tackle this problem anyway when we change the BeeLine tests.


- Peter Vary


On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 9:05 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b 
>   ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION 
>   ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60589/diff/1/
> 
> 
> Testing
> -------
> 
> Manual testing and qtests.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>


Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

Posted by Barna Zsombor Klara <zs...@cloudera.com>.

> On July 5, 2017, 7:03 p.m., Vihang Karajgaonkar wrote:
> > ql/src/test/queries/clientpositive/insert_overwrite_table.q
> > Lines 1-10 (patched)
> > <https://reviews.apache.org/r/60589/diff/1/?file=1768234#file1768234line1>
> >
> >     I don't understand this test case completely. The table is defined as external so it is expected that the drop partition will not delete the HDFS file. The DFS operation is performed without the knowledge of Hive so when it returned 2 rows instead of 1 isn't it the expected behavior?
> >     
> >     I think the right way to solve this problem to throw an exception when we do a insert overwrite on an external table. Just like truncate table command on an external table doesn't work, I think insert overwrite should also fail on a external table. The behavior of external table is inconsistent in my opinion. We allow it to be overwritten but not truncated.
> >     
> >     When the table is a managed table, the test works as expected since Hive cleans up the directory after drop partition command.

The issue here is not really about external tables, I used it because I had to manually move/delete files for the testcase and I could not do it with tables on hdfs. The dfs command by default will look for the files on the local file system, and if I add the URI then I need to provide a host/port of a running HDFS instance which I don't have in a qtest. I could not find a working example in other qtests either so I went with the external table.


- Barna Zsombor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179680
-----------------------------------------------------------


On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 9:05 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b 
>   ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION 
>   ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60589/diff/1/
> 
> 
> Testing
> -------
> 
> Manual testing and qtests.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>


Re: Review Request 60589: HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60589/#review179680
-----------------------------------------------------------




ql/src/test/queries/clientpositive/insert_overwrite_table.q
Lines 1-10 (patched)
<https://reviews.apache.org/r/60589/#comment254494>

    I don't understand this test case completely. The table is defined as external so it is expected that the drop partition will not delete the HDFS file. The DFS operation is performed without the knowledge of Hive so when it returned 2 rows instead of 1 isn't it the expected behavior?
    
    I think the right way to solve this problem to throw an exception when we do a insert overwrite on an external table. Just like truncate table command on an external table doesn't work, I think insert overwrite should also fail on a external table. The behavior of external table is inconsistent in my opinion. We allow it to be overwritten but not truncated.
    
    When the table is a managed table, the test works as expected since Hive cleans up the directory after drop partition command.


- Vihang Karajgaonkar


On July 3, 2017, 9:05 a.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60589/
> -----------------------------------------------------------
> 
> (Updated July 3, 2017, 9:05 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-17001: Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 73710a7c2917b5268f788f22baaee2d87846961b 
>   ql/src/test/queries/clientpositive/insert_overwrite_table.q PRE-CREATION 
>   ql/src/test/results/clientpositive/insert_overwrite_table.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60589/diff/1/
> 
> 
> Testing
> -------
> 
> Manual testing and qtests.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>