You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Deepak Jaiswal <dj...@hortonworks.com> on 2018/05/01 04:50:27 UTC
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/
-----------------------------------------------------------
(Updated May 1, 2018, 4:50 a.m.)
Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
Changes
-------
Handle nested subdirs.
Bugs: HIVE-19311
https://issues.apache.org/jira/browse/HIVE-19311
Repository: hive-git
Description
-------
Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
It should be able to launch an insert job to load the data instead.
Diffs (updated)
-----
data/files/load_data_job/bucketing.txt PRE-CREATION
data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/66805/diff/6/
Changes: https://reviews.apache.org/r/66805/diff/5-6/
Testing
-------
Added a unit test.
Thanks,
Deepak Jaiswal
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/#review202219
-----------------------------------------------------------
Ship it!
Ship It!
- Prasanth_J
On May 1, 2018, 9:06 p.m., Deepak Jaiswal wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66805/
> -----------------------------------------------------------
>
> (Updated May 1, 2018, 9:06 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
>
>
> Bugs: HIVE-19311
> https://issues.apache.org/jira/browse/HIVE-19311
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
> It should be able to launch an insert job to load the data instead.
>
>
> Diffs
> -----
>
> data/files/load_data_job/bucketing.txt PRE-CREATION
> data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
> itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
> ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
> ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
> ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
> ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
> ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
> ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
> ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
> ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
> ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
> ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/66805/diff/8/
>
>
> Testing
> -------
>
> Added a unit test.
>
>
> Thanks,
>
> Deepak Jaiswal
>
>
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/
-----------------------------------------------------------
(Updated May 1, 2018, 9:10 p.m.)
Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
Changes
-------
accidentally, uploaded wrong patch. Please ignore version 8.
Bugs: HIVE-19311
https://issues.apache.org/jira/browse/HIVE-19311
Repository: hive-git
Description
-------
Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
It should be able to launch an insert job to load the data instead.
Diffs (updated)
-----
data/files/load_data_job/bucketing.txt PRE-CREATION
data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/66805/diff/9/
Changes: https://reviews.apache.org/r/66805/diff/8-9/
Testing
-------
Added a unit test.
Thanks,
Deepak Jaiswal
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/
-----------------------------------------------------------
(Updated May 1, 2018, 9:06 p.m.)
Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
Changes
-------
Implemented the new review comment.
Bugs: HIVE-19311
https://issues.apache.org/jira/browse/HIVE-19311
Repository: hive-git
Description
-------
Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
It should be able to launch an insert job to load the data instead.
Diffs (updated)
-----
data/files/load_data_job/bucketing.txt PRE-CREATION
data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/66805/diff/8/
Changes: https://reviews.apache.org/r/66805/diff/7-8/
Testing
-------
Added a unit test.
Thanks,
Deepak Jaiswal
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/#review202211
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 1255 (patched)
<https://reviews.apache.org/r/66805/#comment283963>
Instead of doing instanceof check here, you would get the same behaviour if you Override getAllOutputs() methods in LoadSemanticAnalyzer right?
- Prasanth_J
On May 1, 2018, 7:56 p.m., Deepak Jaiswal wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66805/
> -----------------------------------------------------------
>
> (Updated May 1, 2018, 7:56 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
>
>
> Bugs: HIVE-19311
> https://issues.apache.org/jira/browse/HIVE-19311
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
> It should be able to launch an insert job to load the data instead.
>
>
> Diffs
> -----
>
> data/files/load_data_job/bucketing.txt PRE-CREATION
> data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
> itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
> ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
> ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
> ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
> ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
> ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
> ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
> ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
> ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
> ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
> ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/66805/diff/7/
>
>
> Testing
> -------
>
> Added a unit test.
>
>
> Thanks,
>
> Deepak Jaiswal
>
>
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by Deepak Jaiswal <dj...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/
-----------------------------------------------------------
(Updated May 1, 2018, 7:56 p.m.)
Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
Changes
-------
Implemented Prasanth's review comments.
Bugs: HIVE-19311
https://issues.apache.org/jira/browse/HIVE-19311
Repository: hive-git
Description
-------
Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
It should be able to launch an insert job to load the data instead.
Diffs (updated)
-----
data/files/load_data_job/bucketing.txt PRE-CREATION
data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/66805/diff/7/
Changes: https://reviews.apache.org/r/66805/diff/6-7/
Testing
-------
Added a unit test.
Thanks,
Deepak Jaiswal
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by Deepak Jaiswal <dj...@hortonworks.com>.
> On May 1, 2018, 6:23 a.m., Prasanth_J wrote:
> >
Thanks for the review. Working on the review comments.
> On May 1, 2018, 6:23 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
> > Lines 241 (patched)
> > <https://reviews.apache.org/r/66805/diff/6/?file=2014982#file2014982line270>
> >
> > Is useSuper required here? I can see why you are doing this. But can you make this decision based on where tempTable is set in the ctx or not? If tempTable is set in the ctx then it is rewritten load statement. Alternatively you could use a separate boolean in ctx to say if load query is rewritten or not based on which you can construct the if condition.
Makes sense. Will do the changes, thanks.
> On May 1, 2018, 6:23 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
> > Lines 483 (patched)
> > <https://reviews.apache.org/r/66805/diff/6/?file=2014982#file2014982line515>
> >
> > Why is this required? Isn't load data "insert" operation type?
Cleanup issue. Will remove it. Thanks for catching it.
> On May 1, 2018, 6:23 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Lines 1254 (patched)
> > <https://reviews.apache.org/r/66805/diff/6/?file=2014983#file2014983line1255>
> >
> > could you add a comment to say why does load require a shallow copy?
Sure.
> On May 1, 2018, 6:23 a.m., Prasanth_J wrote:
> > ql/src/test/queries/clientpositive/load_data_using_job.q
> > Lines 72 (patched)
> > <https://reviews.apache.org/r/66805/diff/6/?file=2014988#file2014988line72>
> >
> > Could you also add a case for multi-column bucketing?
> > I think Eric also asked the same question, does this handle multi-level subdirs? IIRC load only recurses to 1 or 2 levels.
I can add multi column bucketing.
Eric was talking about multi column partitioning which creates multi-level subdirs. Currently load fails if it sees a subdirs. It is now handled in latest patch.
- Deepak
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/#review202190
-----------------------------------------------------------
On May 1, 2018, 4:50 a.m., Deepak Jaiswal wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66805/
> -----------------------------------------------------------
>
> (Updated May 1, 2018, 4:50 a.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
>
>
> Bugs: HIVE-19311
> https://issues.apache.org/jira/browse/HIVE-19311
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
> It should be able to launch an insert job to load the data instead.
>
>
> Diffs
> -----
>
> data/files/load_data_job/bucketing.txt PRE-CREATION
> data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
> itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
> ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
> ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
> ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
> ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
> ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
> ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
> ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
> ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
> ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
> ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/66805/diff/6/
>
>
> Testing
> -------
>
> Added a unit test.
>
>
> Thanks,
>
> Deepak Jaiswal
>
>
Re: Review Request 66805: HIVE-19311 : Partition and bucketing support for “load data” statement
Posted by j....@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66805/#review202190
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
Lines 241 (patched)
<https://reviews.apache.org/r/66805/#comment283925>
Is useSuper required here? I can see why you are doing this. But can you make this decision based on where tempTable is set in the ctx or not? If tempTable is set in the ctx then it is rewritten load statement. Alternatively you could use a separate boolean in ctx to say if load query is rewritten or not based on which you can construct the if condition.
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
Line 255 (original), 254 (patched)
<https://reviews.apache.org/r/66805/#comment283926>
This can be inlined at the reparse method.
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
Lines 483 (patched)
<https://reviews.apache.org/r/66805/#comment283923>
Why is this required? Isn't load data "insert" operation type?
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 1254 (patched)
<https://reviews.apache.org/r/66805/#comment283927>
could you add a comment to say why does load require a shallow copy?
ql/src/test/queries/clientpositive/load_data_using_job.q
Lines 72 (patched)
<https://reviews.apache.org/r/66805/#comment283928>
Could you also add a case for multi-column bucketing?
I think Eric also asked the same question, does this handle multi-level subdirs? IIRC load only recurses to 1 or 2 levels.
- Prasanth_J
On May 1, 2018, 4:50 a.m., Deepak Jaiswal wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66805/
> -----------------------------------------------------------
>
> (Updated May 1, 2018, 4:50 a.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jesús Camacho Rodríguez, Prasanth_J, and Vineet Garg.
>
>
> Bugs: HIVE-19311
> https://issues.apache.org/jira/browse/HIVE-19311
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Currently, "load data" statement is very limited. It errors out if any of the information is missing such as partitioning info if table is partitioned or appropriate names when table is bucketed.
> It should be able to launch an insert job to load the data instead.
>
>
> Diffs
> -----
>
> data/files/load_data_job/bucketing.txt PRE-CREATION
> data/files/load_data_job/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/load_data_2_partitions.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_1_partition.txt PRE-CREATION
> data/files/load_data_job/partitions/subdir/load_data_2_partitions.txt PRE-CREATION
> itests/src/test/resources/testconfiguration.properties 2ca7b5f63b
> ql/src/java/org/apache/hadoop/hive/ql/Context.java 0fedf0e76e
> ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 94dd63641d
> ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java abd678bb54
> ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c07991d434
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java fad0e5c24a
> ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 2f3b07f4af
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java ec8c1507ec
> ql/src/test/queries/clientnegative/load_part_nospec.q 81517991b2
> ql/src/test/queries/clientnegative/nopart_load.q 966982fd5c
> ql/src/test/queries/clientpositive/load_data_using_job.q PRE-CREATION
> ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92311
> ql/src/test/results/clientnegative/nopart_load.q.out 881514640c
> ql/src/test/results/clientpositive/llap/load_data_using_job.q.out PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/66805/diff/6/
>
>
> Testing
> -------
>
> Added a unit test.
>
>
> Thanks,
>
> Deepak Jaiswal
>
>