Posted to dev@hive.apache.org by Sahil Takiar <ta...@gmail.com> on 2016/12/06 22:31:37 UTC
Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/
-----------------------------------------------------------
Review request for hive.
Bugs: HIVE-15367
https://issues.apache.org/jira/browse/HIVE-15367
Repository: hive-git
Description
-------
CTAS with LOCATION should write temp data under location directory rather than database location
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8
Diff: https://reviews.apache.org/r/54451/diff/
Testing
-------
Thanks,
Sahil Takiar
Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
Posted by Sahil Takiar <ta...@gmail.com>.
> On Dec. 7, 2016, 6:18 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 2140
> > <https://reviews.apache.org/r/54451/diff/2/?file=1578404#file1578404line2140>
> >
> > Shouldn't we use the parent directory instead of the location? I think CTAS will do a rename() from the temp -> location once it finishes (test blobstores that specify only s3a://bucket/path or s3a://bucket).
> >
> > In this patch, the location will contain a staging temp directory. Because the staging directory is a subdirectory of the location, the final rename will have to be done file by file.
Ever since the introduction of the variable `hive.exec.stagingdir`, the staging directory always gets created inside the table location. Yes, the rename is done file by file, but I think Hive does that intentionally in order to support HDFS encryption. I think it has some other benefits too, like making permissions easier to deal with. HIVE-15215 is an open JIRA to see if this behavior is necessary on blobstores. But until then, this is how Hive does things (at least that's my understanding).
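To illustrate the behavior described above, here is a minimal sketch (not the actual Hive code) of a staging directory created inside the table location, followed by a file-by-file move into that location. It uses java.nio.file on a local filesystem instead of the Hadoop FileSystem API, and the class name, method names, and the "_<queryId>" suffix are all hypothetical. The ".hive-staging" prefix is assumed to match the default relative value of `hive.exec.stagingdir`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only; names and layout are assumptions, not Hive internals.
final class StagingDirSketch {
    // Assumed to mirror the default relative value of hive.exec.stagingdir.
    static final String STAGING_PREFIX = ".hive-staging";

    // Create the staging directory as a child of the table location.
    static Path createStagingDir(Path tableLocation, String queryId) throws IOException {
        return Files.createDirectories(tableLocation.resolve(STAGING_PREFIX + "_" + queryId));
    }

    // The staging directory lives inside the destination, so it cannot simply be
    // renamed onto the destination; each output is moved individually instead.
    static int moveFilesIntoLocation(Path stagingDir, Path tableLocation) throws IOException {
        List<Path> outputs;
        try (Stream<Path> listing = Files.list(stagingDir)) {
            outputs = listing.collect(Collectors.toList());
        }
        for (Path output : outputs) {
            Files.move(output, tableLocation.resolve(output.getFileName()));
        }
        Files.delete(stagingDir); // staging directory is empty at this point
        return outputs.size();
    }

    public static void main(String[] args) throws IOException {
        Path location = Files.createTempDirectory("ctas_location");
        Path staging = createStagingDir(location, "query1");
        Files.write(staging.resolve("000000_0"), "row".getBytes());
        System.out.println("moved " + moveFilesIntoLocation(staging, location) + " file(s)");
    }
}
```

The cost of the final step grows with the number of output files, which is the trade-off being discussed in this review.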
- Sahil
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/#review158356
-----------------------------------------------------------
On Dec. 7, 2016, 10:29 p.m., Sahil Takiar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54451/
> -----------------------------------------------------------
>
> (Updated Dec. 7, 2016, 10:29 p.m.)
>
>
> Review request for hive, Sergio Pena and Yongzhi Chen.
>
>
> Bugs: HIVE-15367
> https://issues.apache.org/jira/browse/HIVE-15367
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> CTAS with LOCATION should write temp data under location directory rather than database location
>
>
> Diffs
> -----
>
> itests/hive-blobstore/src/test/queries/clientpositive/ctas.q PRE-CREATION
> itests/hive-blobstore/src/test/results/clientpositive/ctas.q.out PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8
> ql/src/test/queries/clientpositive/ctas_uses_table_location.q PRE-CREATION
> ql/src/test/results/clientpositive/ctas_uses_table_location.q.out PRE-CREATION
> ql/src/test/results/clientpositive/encrypted/encryption_ctas.q.out 5b503ac
>
> Diff: https://reviews.apache.org/r/54451/diff/
>
>
> Testing
> -------
>
> Added qtests for hive-blobstore and for qtest
>
>
> Thanks,
>
> Sahil Takiar
>
>
Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/#review158356
-----------------------------------------------------------
itests/hive-blobstore/src/test/queries/clientpositive/ctas.q (line 12)
<https://reviews.apache.org/r/54451/#comment229122>
Can we add SELECT queries to verify the data was correctly copied? Same for the rest of CTAS statements.
itests/hive-blobstore/src/test/queries/clientpositive/ctas.q (line 25)
<https://reviews.apache.org/r/54451/#comment229121>
Should we delete the _dst tables too?
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java (line 2140)
<https://reviews.apache.org/r/54451/#comment229124>
Shouldn't we use the parent directory instead of the location? I think CTAS will do a rename() from the temp -> location once it finishes (test blobstores that specify only s3a://bucket/path or s3a://bucket).
In this patch, the location will contain a staging temp directory. Because the staging directory is a subdirectory of the location, the final rename will have to be done file by file.
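The two blobstore cases mentioned above (s3a://bucket/path and s3a://bucket) differ in whether a parent directory exists at all. The following is a minimal sketch of that distinction using java.net.URI rather than Hadoop's Path class. The class and method names are hypothetical and this is not how SemanticAnalyzer resolves paths.

```java
import java.net.URI;
import java.util.Optional;

// Illustrative sketch only; not taken from the patch under review.
final class StagingParentSketch {
    // Returns the parent of a table location, or empty when the location is the
    // root of the filesystem or bucket and therefore has no parent.
    static Optional<URI> parentOf(URI location) {
        String path = location.getPath();
        if (path == null || path.isEmpty() || path.equals("/")) {
            return Optional.empty();
        }
        String trimmed = path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
        int slash = trimmed.lastIndexOf('/');
        String parentPath = slash <= 0 ? "/" : trimmed.substring(0, slash);
        return Optional.of(URI.create(
                location.getScheme() + "://" + location.getAuthority() + parentPath));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"s3a://bucket", "s3a://bucket/path"}) {
            Optional<URI> parent = parentOf(URI.create(s));
            System.out.println(s + " -> " + parent.map(URI::toString).orElse("(no parent)"));
        }
    }
}
```

A staging directory placed under the parent would need a fallback for the bucket-root case, whereas one placed under the location itself always has somewhere to go.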
- Sergio Pena
On Dec. 7, 2016, 1:59 a.m., Sahil Takiar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54451/
> -----------------------------------------------------------
>
> (Updated Dec. 7, 2016, 1:59 a.m.)
>
>
> Review request for hive, Sergio Pena and Yongzhi Chen.
>
>
> Bugs: HIVE-15367
> https://issues.apache.org/jira/browse/HIVE-15367
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> CTAS with LOCATION should write temp data under location directory rather than database location
>
>
> Diffs
> -----
>
> itests/hive-blobstore/src/test/queries/clientpositive/ctas.q PRE-CREATION
> itests/hive-blobstore/src/test/results/clientpositive/ctas.q.out PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8
> ql/src/test/queries/clientpositive/ctas_uses_table_location.q PRE-CREATION
> ql/src/test/results/clientpositive/ctas_uses_table_location.q.out PRE-CREATION
> ql/src/test/results/clientpositive/encrypted/encryption_ctas.q.out 5b503ac
>
> Diff: https://reviews.apache.org/r/54451/diff/
>
>
> Testing
> -------
>
> Added qtests for hive-blobstore and for qtest
>
>
> Thanks,
>
> Sahil Takiar
>
>
Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/#review158430
-----------------------------------------------------------
Ship it!
- Sergio Pena
Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
Posted by Sahil Takiar <ta...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/
-----------------------------------------------------------
(Updated Dec. 7, 2016, 10:29 p.m.)
Review request for hive, Sergio Pena and Yongzhi Chen.
Bugs: HIVE-15367
https://issues.apache.org/jira/browse/HIVE-15367
Repository: hive-git
Description
-------
CTAS with LOCATION should write temp data under location directory rather than database location
Diffs (updated)
-----
itests/hive-blobstore/src/test/queries/clientpositive/ctas.q PRE-CREATION
itests/hive-blobstore/src/test/results/clientpositive/ctas.q.out PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8
ql/src/test/queries/clientpositive/ctas_uses_table_location.q PRE-CREATION
ql/src/test/results/clientpositive/ctas_uses_table_location.q.out PRE-CREATION
ql/src/test/results/clientpositive/encrypted/encryption_ctas.q.out 5b503ac
Diff: https://reviews.apache.org/r/54451/diff/
Testing
-------
Added qtests for hive-blobstore and for qtest
Thanks,
Sahil Takiar
Re: Review Request 54451: HIVE-15367: CTAS with LOCATION should write
temp data under location directory rather than database location
Posted by Sahil Takiar <ta...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54451/
-----------------------------------------------------------
(Updated Dec. 7, 2016, 1:59 a.m.)
Review request for hive, Sergio Pena and Yongzhi Chen.
Bugs: HIVE-15367
https://issues.apache.org/jira/browse/HIVE-15367
Repository: hive-git
Description
-------
CTAS with LOCATION should write temp data under location directory rather than database location
Diffs (updated)
-----
itests/hive-blobstore/src/test/queries/clientpositive/ctas.q PRE-CREATION
itests/hive-blobstore/src/test/results/clientpositive/ctas.q.out PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c88dbc8
ql/src/test/queries/clientpositive/ctas_uses_table_location.q PRE-CREATION
ql/src/test/results/clientpositive/ctas_uses_table_location.q.out PRE-CREATION
ql/src/test/results/clientpositive/encrypted/encryption_ctas.q.out 5b503ac
Diff: https://reviews.apache.org/r/54451/diff/
Testing (updated)
-------
Added qtests for hive-blobstore and for qtest
Thanks,
Sahil Takiar