Posted to dev@sqoop.apache.org by Qian Xu <qi...@intel.com> on 2015/04/11 19:31:54 UTC
Review Request 33104: SQOOP-Hive import with Parquet should append automatically
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/
-----------------------------------------------------------
Review request for Sqoop.
Bugs: SQOOP-2295
https://issues.apache.org/jira/browse/SQOOP-2295
Repository: sqoop-trunk
Description
-------
Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent. (Note that `--as-avrofile` is not supported; it can be supported similarly to Parquet in a follow-up JIRA.)
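A minimal sketch of the decision logic described above (hypothetical names, not the actual Sqoop code; the real changes live in `DataDrivenImportJob` and `ParquetJob`):

```java
public class HiveImportModeSketch {
    enum WriteMode { CREATE, APPEND, OVERWRITE }

    // datasetExists: whether the target Hive dataset already exists
    // overwrite:     --hive-overwrite was given
    // failIfExists:  --create-hive-table was given
    static WriteMode chooseMode(boolean datasetExists, boolean overwrite,
                                boolean failIfExists) {
        if (!datasetExists) {
            return WriteMode.CREATE;
        }
        if (overwrite) {
            return WriteMode.OVERWRITE;
        }
        if (failIfExists) {
            throw new IllegalStateException("Target Hive table already exists");
        }
        // With neither flag set, an existing Hive table is appended to,
        // matching the documented --as-textfile behavior.
        return WriteMode.APPEND;
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(false, false, false)); // CREATE
        System.out.println(chooseMode(true, false, false));  // APPEND
        System.out.println(chooseMode(true, true, false));   // OVERWRITE
    }
}
```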
Diffs
-----
src/docs/man/hive-args.txt 7d9e427
src/docs/man/sqoop-create-hive-table.txt 7aebcc1
src/docs/user/create-hive-table.txt 3aa34fd
src/docs/user/hive-args.txt 53de92d
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java e70d23c
src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
Diff: https://reviews.apache.org/r/33104/diff/
Testing
-------
Manually tested append, new create and overwrite cases.
Thanks,
Qian Xu
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review79927
-----------------------------------------------------------
Can you also add test cases for this? TestParquetImport
- Abraham Elmahrek
On April 12, 2015, 7:26 a.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated April 12, 2015, 7:26 a.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java e70d23c
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/java/org/apache/sqoop/tool/BaseSqoopTool.java c97bb58
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82349
-----------------------------------------------------------
The patch looks good to me. Just one question - it seems that it's causing two test failures on my machine:
Testcase: testFieldWithHiveDelims took 5.021 sec
Testcase: testGenerateOnly took 0.329 sec
Testcase: testHiveExitFails took 1.7 sec
Testcase: testDate took 1.695 sec
Testcase: testFieldWithHiveDelimsReplacement took 1.617 sec
Testcase: testCustomDelimiters took 1.598 sec
Testcase: testHiveDropAndReplaceOptionValidation took 0.036 sec
Testcase: testCreateOverwriteHiveImport took 0.103 sec
Testcase: testCreateOnlyHiveImport took 0.055 sec
Testcase: testAppendHiveImportAsParquet took 15.383 sec
Caused an ERROR
null
java.util.NoSuchElementException
at org.kitesdk.data.spi.filesystem.MultiFileDatasetReader.next(MultiFileDatasetReader.java:144)
at com.cloudera.sqoop.hive.TestHiveImport.verifyHiveDataset(TestHiveImport.java:292)
at com.cloudera.sqoop.hive.TestHiveImport.testAppendHiveImportAsParquet(TestHiveImport.java:383)
Testcase: testNormalHiveImport took 1.58 sec
Testcase: testNormalHiveImportAsParquet took 3.46 sec
Testcase: testImportWithBadPartitionKey took 3.068 sec
Testcase: testCreateOverwriteHiveImportAsParquet took 4.107 sec
Caused an ERROR
Failure during job; return status 1
java.io.IOException: Failure during job; return status 1
at com.cloudera.sqoop.testutil.ImportJobTestCase.runImport(ImportJobTestCase.java:236)
at com.cloudera.sqoop.testutil.ImportJobTestCase.runImport(ImportJobTestCase.java:210)
at com.cloudera.sqoop.hive.TestHiveImport.runImportTest(TestHiveImport.java:215)
at com.cloudera.sqoop.hive.TestHiveImport.testCreateOverwriteHiveImportAsParquet(TestHiveImport.java:356)
Testcase: testImportHiveWithPartitions took 1.51 sec
Testcase: testNumeric took 1.476 sec
I'm wondering if you see the same failures, Stanley?
- Jarek Cecho
On May 3, 2015, 3:40 p.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated May 3, 2015, 3:40 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
> testdata/hive/scripts/normalImportAsParquet.q e434e9b
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Qian Xu <qi...@intel.com>.
> On May 7, 2015, 9:46 a.m., Abraham Elmahrek wrote:
> > src/test/com/cloudera/sqoop/hive/TestHiveImport.java, lines 296-314
> > <https://reviews.apache.org/r/33104/diff/4-5/?file=948229#file948229line296>
> >
> > Why not just create a Set object and compare sets? It should be much less code.
We will not be able to test a dataset with duplicated records if we use a set instead of a list. `List.remove` removes only the first matched element.
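The difference can be seen in a small sketch (hypothetical data and helper, not the actual test code): a set-based comparison silently collapses duplicate rows, while consuming matches from a list counts them exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateRecordCheck {
    // Multiset-style verification: each actual record must consume one
    // matching element from the expected list. Because List.remove drops
    // only the first match, duplicates are counted, not collapsed.
    static boolean matches(List<String> expected, List<String> actual) {
        List<String> remaining = new ArrayList<>(expected);
        for (String record : actual) {
            if (!remaining.remove(record)) {
                return false; // unexpected or over-duplicated record
            }
        }
        return remaining.isEmpty(); // every expected record was seen
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("row1", "row1", "row2");
        // As sets both sides below would be {row1, row2}, so a set-based
        // comparison would wrongly accept the missing duplicate.
        System.out.println(matches(expected, Arrays.asList("row1", "row2")));
        System.out.println(matches(expected, Arrays.asList("row2", "row1", "row1")));
    }
}
```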
- Qian
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82786
-----------------------------------------------------------
On May 6, 2015, 1:49 p.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated May 6, 2015, 1:49 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/test/com/cloudera/sqoop/TestParquetImport.java 07e140a
> src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
> src/test/com/cloudera/sqoop/testutil/ImportJobTestCase.java 293bf10
> testdata/hive/scripts/normalImportAsParquet.q e434e9b
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82786
-----------------------------------------------------------
src/test/com/cloudera/sqoop/hive/TestHiveImport.java
<https://reviews.apache.org/r/33104/#comment133601>
Why not just create a Set object and compare sets? It should be much less code.
- Abraham Elmahrek
On May 6, 2015, 5:49 a.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated May 6, 2015, 5:49 a.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/test/com/cloudera/sqoop/TestParquetImport.java 07e140a
> src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
> src/test/com/cloudera/sqoop/testutil/ImportJobTestCase.java 293bf10
> testdata/hive/scripts/normalImportAsParquet.q e434e9b
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82981
-----------------------------------------------------------
Ship it!
Ship It!
- Abraham Elmahrek
On May 6, 2015, 5:49 a.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated May 6, 2015, 5:49 a.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/test/com/cloudera/sqoop/TestParquetImport.java 07e140a
> src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
> src/test/com/cloudera/sqoop/testutil/ImportJobTestCase.java 293bf10
> testdata/hive/scripts/normalImportAsParquet.q e434e9b
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Qian Xu <qi...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/
-----------------------------------------------------------
(Updated May 6, 2015, 1:49 p.m.)
Review request for Sqoop.
Changes
-------
The new patch has two changes: (1) it makes sure the table directory is cleaned up in the setup stage for every test case; (2) it makes the records verification stable.
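One way to make the verification stable (an illustrative helper, not the actual test code): since the order in which records come back from the dataset reader is not guaranteed across runs, compare sorted copies of the expected and actual records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class StableVerification {
    // Order-independent comparison: output record order is not
    // deterministic, so sort copies of both sides before comparing.
    static boolean sameRecords(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        System.out.println(sameRecords(
            Arrays.asList("row2", "row1"),
            Arrays.asList("row1", "row2"))); // true regardless of order
    }
}
```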
Bugs: SQOOP-2295
https://issues.apache.org/jira/browse/SQOOP-2295
Repository: sqoop-trunk
Description
-------
Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
Diffs (updated)
-----
src/docs/man/hive-args.txt 7d9e427
src/docs/man/sqoop-create-hive-table.txt 7aebcc1
src/docs/user/create-hive-table.txt 3aa34fd
src/docs/user/hive-args.txt 53de92d
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
src/test/com/cloudera/sqoop/TestParquetImport.java 07e140a
src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
src/test/com/cloudera/sqoop/testutil/ImportJobTestCase.java 293bf10
testdata/hive/scripts/normalImportAsParquet.q e434e9b
Diff: https://reviews.apache.org/r/33104/diff/
Testing
-------
Manually tested append, new create and overwrite cases.
Thanks,
Qian Xu
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Qian Xu <qi...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/
-----------------------------------------------------------
(Updated May 3, 2015, 11:40 p.m.)
Review request for Sqoop.
Changes
-------
Removed the `--create-hive-table`-related code per Jarcec's comments.
Bugs: SQOOP-2295
https://issues.apache.org/jira/browse/SQOOP-2295
Repository: sqoop-trunk
Description
-------
Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
Diffs (updated)
-----
src/docs/man/hive-args.txt 7d9e427
src/docs/man/sqoop-create-hive-table.txt 7aebcc1
src/docs/user/create-hive-table.txt 3aa34fd
src/docs/user/hive-args.txt 53de92d
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
testdata/hive/scripts/normalImportAsParquet.q e434e9b
Diff: https://reviews.apache.org/r/33104/diff/
Testing
-------
Manually tested append, new create and overwrite cases.
Thanks,
Qian Xu
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/#review82254
-----------------------------------------------------------
Just a few notes:
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/33104/#comment132951>
Honestly, this is the first time I'm seeing the "doFailIfHiveTableExists" method :) It seems unused in the current Sqoop code base, so I'm wondering whether it would be better not to use it here (and perhaps drop it completely in a different JIRA).
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/33104/#comment132950>
Just a note for the log output: I believe that the semantics of --create-hive-table is to create the table if it doesn't exist and do nothing if it does exist.
I'm thinking the comment should just mention that this will "append" to existing Hive tables and that the user might consider --hive-overwrite if a rewrite is needed, i.e. no mention of --create-hive-table. What do you think?
src/java/org/apache/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/33104/#comment132952>
The --append parameter doesn't really make sense with Hive, because the Hive import behaves differently than the HDFS one, right?
It's quite unfortunate, but it seems better to preserve the check so as not to confuse people even more?
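For context, the flag combinations under discussion look like this (illustrative invocations only; the connection string and table name are placeholders):

```shell
# Append to an existing Hive table (the default behavior after this patch):
sqoop import --connect jdbc:mysql://db.example.com/corp --table employees \
    --hive-import --as-parquetfile

# Replace the table contents instead of appending:
sqoop import --connect jdbc:mysql://db.example.com/corp --table employees \
    --hive-import --as-parquetfile --hive-overwrite

# Fail the job if the target Hive table already exists:
sqoop import --connect jdbc:mysql://db.example.com/corp --table employees \
    --hive-import --as-parquetfile --create-hive-table
```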
Jarcec
- Jarek Cecho
On April 30, 2015, 5:54 p.m., Qian Xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33104/
> -----------------------------------------------------------
>
> (Updated April 30, 2015, 5:54 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-2295
> https://issues.apache.org/jira/browse/SQOOP-2295
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
>
>
> Diffs
> -----
>
> src/docs/man/hive-args.txt 7d9e427
> src/docs/man/sqoop-create-hive-table.txt 7aebcc1
> src/docs/user/create-hive-table.txt 3aa34fd
> src/docs/user/hive-args.txt 53de92d
> src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
> src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
> src/java/org/apache/sqoop/tool/BaseSqoopTool.java c97bb58
> src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
> src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
> testdata/hive/scripts/normalImportAsParquet.q e434e9b
>
> Diff: https://reviews.apache.org/r/33104/diff/
>
>
> Testing
> -------
>
> Manually tested append, new create and overwrite cases.
>
>
> Thanks,
>
> Qian Xu
>
>
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Qian Xu <qi...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/
-----------------------------------------------------------
(Updated May 1, 2015, 1:54 a.m.)
Review request for Sqoop.
Bugs: SQOOP-2295
https://issues.apache.org/jira/browse/SQOOP-2295
Repository: sqoop-trunk
Description
-------
Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
Diffs (updated)
-----
src/docs/man/hive-args.txt 7d9e427
src/docs/man/sqoop-create-hive-table.txt 7aebcc1
src/docs/user/create-hive-table.txt 3aa34fd
src/docs/user/hive-args.txt 53de92d
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java d5bfae2
src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
src/java/org/apache/sqoop/tool/BaseSqoopTool.java c97bb58
src/test/com/cloudera/sqoop/hive/TestHiveImport.java fa717cb
src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 7934791
testdata/hive/scripts/normalImportAsParquet.q e434e9b
Diff: https://reviews.apache.org/r/33104/diff/
Testing
-------
Manually tested append, new create and overwrite cases.
Thanks,
Qian Xu
Re: Review Request 33104: SQOOP-Hive import with Parquet should append automatically
Posted by Qian Xu <qi...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33104/
-----------------------------------------------------------
(Updated April 12, 2015, 3:26 p.m.)
Review request for Sqoop.
Bugs: SQOOP-2295
https://issues.apache.org/jira/browse/SQOOP-2295
Repository: sqoop-trunk
Description (updated)
-------
Currently, importing into an existing dataset throws an exception. This differs from `--as-textfile`. I've checked the user manual; the handling of HDFS and Hive is indeed different. For HDFS, unless `--append` is specified, the job fails when the destination already exists. For Hive, unless `--create-hive-table` is specified, the job runs in append mode. The patch makes the handling of `--as-textfile` and `--as-parquetfile` consistent.
Diffs (updated)
-----
src/docs/man/hive-args.txt 7d9e427
src/docs/man/sqoop-create-hive-table.txt 7aebcc1
src/docs/user/create-hive-table.txt 3aa34fd
src/docs/user/hive-args.txt 53de92d
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java e70d23c
src/java/org/apache/sqoop/mapreduce/ParquetJob.java df55dbc
src/java/org/apache/sqoop/tool/BaseSqoopTool.java c97bb58
Diff: https://reviews.apache.org/r/33104/diff/
Testing
-------
Manually tested append, new create and overwrite cases.
Thanks,
Qian Xu