Posted to dev@sqoop.apache.org by daniel voros <da...@gmail.com> on 2018/03/27 08:50:38 UTC

Review Request 66300: Upgrade to Hadoop 3.0.0

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/
-----------------------------------------------------------

Review request for Sqoop.


Bugs: SQOOP-3305
    https://issues.apache.org/jira/browse/SQOOP-3305


Repository: sqoop-trunk


Description
-------

To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html


Diffs
-----

  ivy.xml 6be4fa2 
  ivy/libraries.properties c44b50b 
  src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 
  src/java/org/apache/sqoop/hive/HiveImport.java c272911 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 
  src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 
  src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b 
  src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d 
  testdata/hcatalog/conf/hive-site.xml edac7aa 


Diff: https://reviews.apache.org/r/66300/diff/1/


Testing
-------

Normal and third-party unit tests.


Thanks,

daniel voros


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by daniel voros <da...@gmail.com>.

> On March 28, 2018, 3:44 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
> > Lines 69 (patched)
> > <https://reviews.apache.org/r/66300/diff/1/?file=1988993#file1988993line69>
> >
> >     Can we use List interface and diamond operator here?

Fixed. Please note that this file was originally copied from Hive.


- daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200113
-----------------------------------------------------------


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 6be4fa2 
>   ivy/libraries.properties c44b50b 
>   src/java/org/apache/sqoop/SqoopOptions.java 651cebd 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 
>   src/java/org/apache/sqoop/hive/HiveImport.java c272911 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c 
>   testdata/hcatalog/conf/hive-site.xml edac7aa 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/2/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Szabolcs Vasas <va...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200113
-----------------------------------------------------------



Hi Dani,

Thank you for starting this initiative! The changes look good to me. I ran the unit and third-party tests successfully and have left only some minor comments.
I guess we do not want to commit this patch until the final version of Hive 3 is released, am I right? Do you know when it is going to happen?


ivy.xml
Lines 95 (patched)
<https://reviews.apache.org/r/66300/#comment280760>

    Can we move the version to libraries.properties?



src/java/org/apache/sqoop/hive/HiveImport.java
Lines 368 (patched)
<https://reviews.apache.org/r/66300/#comment280761>

    Is it possible that before setting DerbyPolicy another Policy was set? In that case I think we should restore the original Policy.
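
A minimal sketch of the save-and-restore pattern suggested in this comment. `ScopedSetting` and `runWith` are hypothetical names, not Sqoop code; the helper is generic so that it does not depend on any particular global setting.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Sqoop: installs a temporary value and restores the previous one. */
final class ScopedSetting {
    private ScopedSetting() {
    }

    static <T> void runWith(Supplier<T> getter, Consumer<T> setter, T temporary, Runnable action) {
        // Remember whatever was installed before (this may be null).
        T original = getter.get();
        setter.accept(temporary);
        try {
            action.run();
        } finally {
            // Restore the original value even if the action throws.
            setter.accept(original);
        }
    }
}
```

In HiveImport the getter and setter would presumably be `Policy::getPolicy` and `Policy::setPolicy` from `java.security`, with a `DerbyPolicy` instance as the temporary value; that wiring is an assumption and is not taken from the patch.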



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 69 (patched)
<https://reviews.apache.org/r/66300/#comment280767>

    Can we use List interface and diamond operator here?



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 71 (patched)
<https://reviews.apache.org/r/66300/#comment280762>

    Missing @Override annotation.



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 75 (patched)
<https://reviews.apache.org/r/66300/#comment280763>

    Missing @Override annotation.



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 76 (patched)
<https://reviews.apache.org/r/66300/#comment280766>

    Can we use for-each loop here?



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 84 (patched)
<https://reviews.apache.org/r/66300/#comment280764>

    Missing @Override annotation.



src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java
Lines 88 (patched)
<https://reviews.apache.org/r/66300/#comment280765>

    Missing @Override annotation.
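
The DerbyPolicy comments above (List interface, diamond operator, for-each loop, @Override) can be sketched together as follows. `SimplePermissionCollection` is a hypothetical stand-in written for illustration; it is not the actual DerbyPolicy source, whose exact contents are not shown in this thread.

```java
import java.security.Permission;
import java.security.PermissionCollection;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical permission collection illustrating the requested style changes. */
final class SimplePermissionCollection extends PermissionCollection {
    // List interface on the left, diamond operator on the right.
    private final List<Permission> perms = new ArrayList<>();

    @Override
    public void add(Permission permission) {
        perms.add(permission);
    }

    @Override
    public boolean implies(Permission permission) {
        // for-each loop instead of an explicit Iterator.
        for (Permission granted : perms) {
            if (granted.implies(permission)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public Enumeration<Permission> elements() {
        return Collections.enumeration(perms);
    }

    @Override
    public boolean isReadOnly() {
        return false;
    }
}
```

With @Override in place, the compiler rejects the class if a method signature drifts away from the one declared in `PermissionCollection`.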



src/java/org/apache/sqoop/util/SqoopJsonUtil.java
Line 42 (original), 42 (patched)
<https://reviews.apache.org/r/66300/#comment280769>

    Casing: getJsonStringForMap



src/java/org/apache/sqoop/util/SqoopJsonUtil.java
Lines 44 (patched)
<https://reviews.apache.org/r/66300/#comment280768>

    It is a matter of taste, but I am not a big fan of changing the parameter. Maybe we could change it to something like this:
    
    if (map == null) {
        return EMPTY_JSON_MAP;
    }
    
    where EMPTY_JSON_MAP would be a constant ("{}") extracted from the isEmptyJSON method.
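
A self-contained sketch of this suggestion. `JsonMapSketch` is a hypothetical class: the real SqoopJsonUtil is not shown in this thread, so the serialization below uses only the standard library, and the behaviour of `isEmptyJSON` is an assumption.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration of the early-return suggestion; not the SqoopJsonUtil source. */
final class JsonMapSketch {
    static final String EMPTY_JSON_MAP = "{}";

    private JsonMapSketch() {
    }

    static String getJsonStringForMap(Map<String, String> map) {
        // Return early instead of replacing the parameter with an empty map.
        if (map == null) {
            return EMPTY_JSON_MAP;
        }
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> entry : map.entrySet()) {
            json.add(quote(entry.getKey()) + ":" + quote(entry.getValue()));
        }
        return json.toString();
    }

    // Assumed behaviour: null, blank and "{}" all count as empty.
    static boolean isEmptyJSON(String json) {
        return json == null || json.trim().isEmpty() || EMPTY_JSON_MAP.equals(json.trim());
    }

    // Minimal escaping of backslashes and double quotes only.
    private static String quote(String value) {
        if (value == null) {
            return "null";
        }
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Sharing one constant between the two methods keeps the null case and the emptiness check from drifting apart.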


- Szabolcs Vasas


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Boglarka Egyed <bo...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review203766
-----------------------------------------------------------



Hi Daniel,

This is great news! Thanks for the patch update.

Compilation works well for me; however, I have failing unit test cases with various error messages in these test classes:
org.apache.sqoop.hive.TestHiveImport
org.apache.sqoop.hive.TestHiveMiniCluster
org.apache.sqoop.hive.TestHiveServer2TextImport

Could you please check these on your side too?

Many thanks,
Bogi

- Boglarka Egyed


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java fb2ab031 
>   src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e0499 
>   src/java/org/apache/sqoop/mapreduce/ParquetJob.java 46047733 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b7 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20dd 
>   src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java 549a8c6c 
>   src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java 79881f7b 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c1 
>   testdata/hcatalog/conf/hive-site.xml edac7aa9 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/5/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Boglarka Egyed <bo...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review204537
-----------------------------------------------------------



Hi Daniel,

Could you please rebase your patch to the latest trunk?

Thank you very much,
Bogi

- Boglarka Egyed


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java fb2ab031 
>   src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e0499 
>   src/java/org/apache/sqoop/mapreduce/ParquetJob.java 46047733 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b7 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20dd 
>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 19bb7605 
>   src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java 549a8c6c 
>   src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java 79881f7b 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c1 
>   testdata/hcatalog/conf/hive-site.xml edac7aa9 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/6/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Boglarka Egyed <bo...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200992
-----------------------------------------------------------



Hi Dani,

Thank you for taking care of these upgrades!

Would it be possible to split this change into two separate ones: the Hadoop upgrade and the Hive/HBase upgrades? I'm asking because we are now depending only on the Hive 3 release, and I'm wondering if we could proceed with upgrading the Hadoop version in the meantime.

I also agree that we should consider shipping these changes in a major Sqoop release. Could you please start a discussion about it on the sqoop-dev@ mailing list?

Many thanks,
Bogi

- Boglarka Egyed


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 6be4fa2 
>   ivy/libraries.properties c44b50b 
>   src/java/org/apache/sqoop/SqoopOptions.java 651cebd 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 
>   src/java/org/apache/sqoop/hive/HiveImport.java c272911 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c 
>   testdata/hcatalog/conf/hive-site.xml edac7aa 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/3/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Szabolcs Vasas <va...@gmail.com>.

> On April 4, 2018, 8:33 a.m., Szabolcs Vasas wrote:
> > Hi Dani,
> > 
> > Thanks for fixing the issues. One last nit: DerbyPolicy contains 2 newlines at the end of the file, so git gives a warning when you apply the patch. Can you please fix that as well?
> > 
> > I think we should also start thinking about when we want to commit this patch and understand the full impact of it. I think we should wait until we have a stable version of Hive 3 and you have mentioned that the ACID tables might be the default in Hive 3. Can you point us to a mail chain/wiki page/JIRA board which contains the rough timeline and the breaking changes of Hive 3?
> 
> daniel voros wrote:
>     Hi,
>     
>     I've updated the patch to remove the extra newline.
>     
>     Here's the latest thread about the Hive 3.0 release: https://mail-archives.apache.org/mod_mbox/hive-dev/201804.mbox/%3C0AE033EE-3488-4E4A-A919-705E8B4446B7%40hortonworks.com%3E
>     
>     I don't know if there's a list of breaking changes yet, but AFAIK Hive 3.0 will not support Hadoop 2.x.
>     
>     I think we should only release this patch in a major release. I'd prefer calling it 3.0 to be in line with Hadoop and Hive and not to be confused with the earlier Sqoop2 initiative.
>     
>     Should we continue the discussion on the mailing list?
>     
>     Daniel

Thank you!
Yes, I think it would be good to start a discussion about this on the dev list.


- Szabolcs


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200442
-----------------------------------------------------------


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 6be4fa2 
>   ivy/libraries.properties c44b50b 
>   src/java/org/apache/sqoop/SqoopOptions.java 651cebd 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 
>   src/java/org/apache/sqoop/hive/HiveImport.java c272911 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c 
>   testdata/hcatalog/conf/hive-site.xml edac7aa 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/3/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by Szabolcs Vasas <va...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200442
-----------------------------------------------------------



Hi Dani,

Thanks for fixing the issues. One last nit: DerbyPolicy contains 2 newlines at the end of the file, so git gives a warning when you apply the patch. Can you please fix that as well?

I think we should also start thinking about when we want to commit this patch and understand the full impact of it. I think we should wait until we have a stable version of Hive 3 and you have mentioned that the ACID tables might be the default in Hive 3. Can you point us to a mail chain/wiki page/JIRA board which contains the rough timeline and the breaking changes of Hive 3?

- Szabolcs Vasas


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 6be4fa2 
>   ivy/libraries.properties c44b50b 
>   src/java/org/apache/sqoop/SqoopOptions.java 651cebd 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 
>   src/java/org/apache/sqoop/hive/HiveImport.java c272911 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c 
>   testdata/hcatalog/conf/hive-site.xml edac7aa 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/2/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by daniel voros <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review206151
-----------------------------------------------------------



I've been working on the failure of `TestHiveMiniCluster#testInsertedRowCanBeReadFromTable[KerberosAuthenticationConfiguration]` and wanted to give an update.

I think this is the meaningful part of the quite verbose logs:

```
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ExceptionInInitializerError
	at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
	at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
	at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
	at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:125)
	at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:103)
	at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:86)
Caused by: java.lang.RuntimeException: Secure IO is not possible without native code extensions.
	at org.apache.hadoop.io.SecureIOUtils.<clinit>(SecureIOUtils.java:71)
	... 6 more
```

This is happening in MR's shuffle phase. I was trying to find out how tests in Hive don't run into this and found out that Secure MR is not supported there. See here: https://github.com/apache/hive/blob/dceeefbdf5e4f6fea83cb6ca5c11fbac10e77677/itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java#L178-L180
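
The shape of the first trace (an ExceptionInInitializerError whose cause is a RuntimeException thrown at `<clinit>`) is what Java produces when a static initializer fails. The sketch below reproduces that shape with a hypothetical `NativeGuard` class; it is not Hadoop's SecureIOUtils, and the condition it checks is an assumption.

```java
/** Hypothetical class whose static initializer refuses to load; not Hadoop code. */
final class NativeGuard {
    static {
        // Assumed guard: fail class initialization when native support is missing.
        if (!nativeCodeLoaded()) {
            throw new RuntimeException("Secure IO is not possible without native code extensions.");
        }
    }

    private NativeGuard() {
    }

    // Stand-in for a real native-library check; always false in this sketch.
    private static boolean nativeCodeLoaded() {
        return false;
    }

    // Calling any static method triggers class initialization.
    static void touch() {
    }
}

/** Hypothetical helper that walks a cause chain down to its last element. */
final class RootCause {
    private RootCause() {
    }

    static Throwable of(Throwable throwable) {
        Throwable current = throwable;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

The first call to `NativeGuard.touch()` throws ExceptionInInitializerError, and `RootCause.of(...)` on that error returns the RuntimeException carrying the "Secure IO" message. Later uses of the class fail with NoClassDefFoundError instead, which is why only the first failure in a log carries the real cause.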

I'm trying to get Tez working with our HiveMiniCluster, but have not succeeded yet. The Kerberos ticket is not picked up for some reason:

```
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:755)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:718)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:811)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:409)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1552)
        at org.apache.hadoop.ipc.Client.call(Client.java:1383)
        at org.apache.hadoop.ipc.Client.call(Client.java:1347)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy81.getAMStatus(Unknown Source)
        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:772)
        at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:909)
        at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:880)
        at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:434)
        at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:360)
        at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124)
        at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:237)
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:364)
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:191)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2479)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2150)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1561)
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:221)
        at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:313)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:326)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:613)
        at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:409)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:798)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:794)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
        ... 36 more
```

- daniel voros


On March 27, 2018, 8:50 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66300/
> -----------------------------------------------------------
> 
> (Updated March 27, 2018, 8:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3305
>     https://issues.apache.org/jira/browse/SQOOP-3305
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html
> 
> 
> Diffs
> -----
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/config/ConfigurationHelper.java fb2ab031 
>   src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e0499 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java e68bba90 
>   src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b7 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20dd 
>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 19bb7605 
>   src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java 549a8c6c 
>   src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java 79881f7b 
>   src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c1 
>   testdata/hcatalog/conf/hive-site.xml edac7aa9 
> 
> 
> Diff: https://reviews.apache.org/r/66300/diff/7/
> 
> 
> Testing
> -------
> 
> Normal and third-party unit tests.
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Re: Review Request 66300: Upgrade to Hadoop 3.0.0

Posted by daniel voros <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66300/#review200037
-----------------------------------------------------------



Patch #1 is the minimal set of changes required to upgrade to Hadoop 3.0.0 while passing all unit tests. It also updates:
 - Hive to 3.0.0-SNAPSHOT, since the Hive Hadoop shims were unable to handle Hadoop 3.
 - HBase to 2.0.0-beta2, since Hive 3.0.0-SNAPSHOT depends on HBase 2.0.0-alpha4 at the moment.

For the list of other changes and some reasoning behind them see https://github.com/dvoros/sqoop/pull/4.

- daniel voros

