You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Nandor Kollar via Review Board <no...@reviews.apache.org> on 2018/01/19 16:14:39 UTC

Review Request 65239: PIG-5253: Pig Hadoop 3 support

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/
-----------------------------------------------------------

Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.


Repository: pig-git


Description
-------

This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.

Major modifications:
 * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
 * hadoopversion property tells which mode the tests should run, the default is hadoop 2
 * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
 * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
 * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
 * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
 * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
 * ivy.xml: new config for Hadoop 3
 * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
 * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)


Diffs
-----

  bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
  bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
  build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
  ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
  ivy/libraries-common.properties PRE-CREATION 
  ivy/libraries-h2.properties PRE-CREATION 
  ivy/libraries-h3.properties PRE-CREATION 
  ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
  shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
  shims/test/hadoop3/.gitignore PRE-CREATION 
  test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
  test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
  test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
  test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
  test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 


Diff: https://reviews.apache.org/r/65239/diff/1/


Testing
-------


Thanks,

Nandor Kollar


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review221505
-----------------------------------------------------------


Ship it!




It would be good to update to the latest hadoop versions - 2.10.0 and 3.3.0. But have let this one languish for long without review. So lets get this in and update to new versions in a separate jira as 3.3.0 may require some additional changes.

- Rohini Palaniswamy


On April 10, 2019, 11:49 a.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated April 10, 2019, 11:49 a.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 9bb1b125eef3c9381466b8f043ecd31ff0d56dc7 
>   ivy.xml a93da2a9a20edf103dcbf08abc92301a1c2fda9e 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 1abc9677e6c0747bb102c563b4f6642274541476 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/
-----------------------------------------------------------

(Updated April 10, 2019, 11:49 a.m.)


Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.


Changes
-------

- Hadoop 2 jar is created even when build is hadoop 3
- missing important h3 dependencies for local mode


Repository: pig-git


Description
-------

This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.

Major modifications:
 * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
 * hadoopversion property tells which mode the tests should run, the default is hadoop 2
 * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
 * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
 * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
 * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
 * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
 * ivy.xml: new config for Hadoop 3
 * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
 * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)


Diffs (updated)
-----

  bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
  bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
  build.xml 9bb1b125eef3c9381466b8f043ecd31ff0d56dc7 
  ivy.xml a93da2a9a20edf103dcbf08abc92301a1c2fda9e 
  ivy/libraries-h2.properties PRE-CREATION 
  ivy/libraries-h3.properties PRE-CREATION 
  ivy/libraries.properties 1abc9677e6c0747bb102c563b4f6642274541476 
  test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
  test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
  test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
  test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
  test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
  test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
  test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 


Diff: https://reviews.apache.org/r/65239/diff/4/

Changes: https://reviews.apache.org/r/65239/diff/3-4/


Testing
-------


Thanks,

Nandor Kollar


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/
-----------------------------------------------------------

(Updated Sept. 5, 2018, 4:37 p.m.)


Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.


Repository: pig-git


Description
-------

This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.

Major modifications:
 * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
 * hadoopversion property tells which mode the tests should run, the default is hadoop 2
 * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
 * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
 * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
 * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
 * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
 * ivy.xml: new config for Hadoop 3
 * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
 * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)


Diffs (updated)
-----

  bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
  bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
  build.xml 34697ee5fc7a9fff9a0a6359e10ccb9fa3accb47 
  ivy.xml 0902b18040de579362abd2e785bdb42ac189a8b0 
  ivy/libraries-h2.properties PRE-CREATION 
  ivy/libraries-h3.properties PRE-CREATION 
  ivy/libraries.properties ec71472be7e885b8e03b03430c4d274ba84541ea 
  test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
  test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
  test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
  test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
  test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
  test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
  test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 


Diff: https://reviews.apache.org/r/65239/diff/3/

Changes: https://reviews.apache.org/r/65239/diff/2-3/


Testing
-------


Thanks,

Nandor Kollar


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review207039
-----------------------------------------------------------




build.xml
Lines 239 (patched)
<https://reviews.apache.org/r/65239/#comment290201>

    This is no longer needed after PIG-5282


- Rohini Palaniswamy


On Feb. 19, 2018, 1:54 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2018, 1:54 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/
-----------------------------------------------------------

(Updated Feb. 19, 2018, 1:54 p.m.)


Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.


Repository: pig-git


Description
-------

This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.

Major modifications:
 * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
 * hadoopversion property tells which mode the tests should run, the default is hadoop 2
 * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
 * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
 * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
 * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
 * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
 * ivy.xml: new config for Hadoop 3
 * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
 * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)


Diffs (updated)
-----

  bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
  bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
  build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
  ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
  ivy/libraries-h2.properties PRE-CREATION 
  ivy/libraries-h3.properties PRE-CREATION 
  ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
  test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
  test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
  test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
  test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
  test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
  test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
  test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 


Diff: https://reviews.apache.org/r/65239/diff/2/

Changes: https://reviews.apache.org/r/65239/diff/1-2/


Testing
-------


Thanks,

Nandor Kollar


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > bin/pig
> > Line 480 (original), 478 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942582#file1942582line487>
> >
> >     How does it work when both h2 and h3 jars are generated in one target?

Is it possible to build with h2 and h3 into one target? I assume that there's only one jar there, relevant for hadoop 3 or hadoop 2.

If there's two jar there, then pig won't start:
Error: Could not find or load main class org.apache.pig.Main

Should we prepare for this case too? Not sure what is the best approach here.


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On April 10, 2019, 11:49 a.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated April 10, 2019, 11:49 a.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 9bb1b125eef3c9381466b8f043ecd31ff0d56dc7 
>   ivy.xml a93da2a9a20edf103dcbf08abc92301a1c2fda9e 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 1abc9677e6c0747bb102c563b4f6642274541476 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/parser/TestQueryParserUtils.java
> > Line 63 (original), 63 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942594#file1942594line63>
> >
> >     What is swebhdfs? secure webhdfs?
> 
> Nandor Kollar wrote:
>     Yes, it is secure webhdfs (https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/SWebHdfs.html). Not sure why I changed to secure. :) Should we use non-secure instead? And is it fine to test webhdfs instead of hftp here for Hadoop 2 too?

webhdfs is already there in the same test further down. If you could just move the swebdhfs right after webhdfs that would be nice.


> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/MiniCluster.java
> > Line 107 (original), 112-114 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942595#file1942595line112>
> >
> >     Refer to YarnMiniCluster for what is written into each of the files.
> >     
> >     Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.
> 
> Nandor Kollar wrote:
>     How about having a new class (or rename MiniCluster) called MapReduceMiniCluster, which extends YarnMiniCluster, but with a very minimal code in the setup method? This way if any MR specific property has to be set, then only that minicluster class is changed, and it won't have any effect on Tez and Spark exec types.

+1 for the idea. Renaming MiniCluster to MapReduceMiniCluster and making it extend YarnMiniCluster sounds better.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Jan. 19, 2018, 4:14 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 4:14 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-common.properties PRE-CREATION 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
>   shims/test/hadoop3/.gitignore PRE-CREATION 
>   test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/MiniCluster.java
> > Line 107 (original), 112-114 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942595#file1942595line112>
> >
> >     Refer to YarnMiniCluster for what is written into each of the files.
> >     
> >     Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.
> 
> Nandor Kollar wrote:
>     How about having a new class (or rename MiniCluster) called MapReduceMiniCluster, which extends YarnMiniCluster, but with a very minimal code in the setup method? This way if any MR specific property has to be set, then only that minicluster class is changed, and it won't have any effect on Tez and Spark exec types.
> 
> Rohini Palaniswamy wrote:
>     +1 for the idea. Renaming MiniCluster to MapReduceMiniCluster and making it extend YarnMiniCluster sounds better.
> 
> Nandor Kollar wrote:
>     I added a new class, MapReduceMiniCluster which is is used insted of MiniCluster. Unfortunately now one test fails: testStopOnFailure. I searched in the Jira, and I think the failure is similar to PIG-5245, Rohini, but I'm not sure how to fix it in YarnMiniCluster. Rohini, could you please help with this?
> 
> Rohini Palaniswamy wrote:
>     Is this still an issue and do you need me to look at it?

Yes, it is still an outstanding problem. I don't have a solution for it yet.


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Sept. 5, 2018, 4:37 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2018, 4:37 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 34697ee5fc7a9fff9a0a6359e10ccb9fa3accb47 
>   ivy.xml 0902b18040de579362abd2e785bdb42ac189a8b0 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties ec71472be7e885b8e03b03430c4d274ba84541ea 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/MiniCluster.java
> > Line 107 (original), 112-114 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942595#file1942595line112>
> >
> >     Refer to YarnMiniCluster for what is written into each of the files.
> >     
> >     Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.
> 
> Nandor Kollar wrote:
>     How about having a new class (or rename MiniCluster) called MapReduceMiniCluster, which extends YarnMiniCluster, but with a very minimal code in the setup method? This way if any MR specific property has to be set, then only that minicluster class is changed, and it won't have any effect on Tez and Spark exec types.
> 
> Rohini Palaniswamy wrote:
>     +1 for the idea. Renaming MiniCluster to MapReduceMiniCluster and making it extend YarnMiniCluster sounds better.
> 
> Nandor Kollar wrote:
>     I added a new class, MapReduceMiniCluster which is is used insted of MiniCluster. Unfortunately now one test fails: testStopOnFailure. I searched in the Jira, and I think the failure is similar to PIG-5245, Rohini, but I'm not sure how to fix it in YarnMiniCluster. Rohini, could you please help with this?

Is this still an issue and do you need me to look at it?


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Feb. 19, 2018, 1:54 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2018, 1:54 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/parser/TestQueryParserUtils.java
> > Line 63 (original), 63 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942594#file1942594line63>
> >
> >     What is swebhdfs? secure webhdfs?

Yes, it is secure webhdfs (https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/SWebHdfs.html). Not sure why I changed to secure. :) Should we use non-secure instead? And is it fine to test webhdfs instead of hftp here for Hadoop 2 too?


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Jan. 19, 2018, 4:14 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 4:14 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-common.properties PRE-CREATION 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
>   shims/test/hadoop3/.gitignore PRE-CREATION 
>   test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/MiniCluster.java
> > Line 107 (original), 112-114 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942595#file1942595line112>
> >
> >     Refer to YarnMiniCluster for what is written into each of the files.
> >     
> >     Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.

How about having a new class (or rename MiniCluster) called MapReduceMiniCluster, which extends YarnMiniCluster, but with a very minimal code in the setup method? This way if any MR specific property has to be set, then only that minicluster class is changed, and it won't have any effect on Tez and Spark exec types.


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Jan. 19, 2018, 4:14 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 4:14 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-common.properties PRE-CREATION 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
>   shims/test/hadoop3/.gitignore PRE-CREATION 
>   test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/MiniCluster.java
> > Line 107 (original), 112-114 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942595#file1942595line112>
> >
> >     Refer to YarnMiniCluster for what is written into each of the files.
> >     
> >     Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.
> 
> Nandor Kollar wrote:
>     How about having a new class (or rename MiniCluster) called MapReduceMiniCluster, which extends YarnMiniCluster, but with a very minimal code in the setup method? This way if any MR specific property has to be set, then only that minicluster class is changed, and it won't have any effect on Tez and Spark exec types.
> 
> Rohini Palaniswamy wrote:
>     +1 for the idea. Renaming MiniCluster to MapReduceMiniCluster and making it extend YarnMiniCluster sounds better.

I added a new class, MapReduceMiniCluster which is is used insted of MiniCluster. Unfortunately now one test fails: testStopOnFailure. I searched in the Jira, and I think the failure is similar to PIG-5245, Rohini, but I'm not sure how to fix it in YarnMiniCluster. Rohini, could you please help with this?


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------


On Feb. 19, 2018, 1:54 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2018, 1:54 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
-----------------------------------------------------------




bin/pig
Line 480 (original), 478 (patched)
<https://reviews.apache.org/r/65239/#comment275726>

    How does it work when both h2 and h3 jars are generated in one target?



ivy.xml
Line 320 (original), 339 (patched)
<https://reviews.apache.org/r/65239/#comment275712>

    This can be removed. There is already one with htrace.version (3.1.0-incubating) picked up above. Can just increment htrace.version to 3.2.0-incubating.



ivy/libraries-common.properties
Lines 16 (patched)
<https://reviews.apache.org/r/65239/#comment275718>

    Can we keep the original name of libraries.properties as it would not make much difference? This would unnecessarily throw away history.



shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java
Lines 37 (patched)
<https://reviews.apache.org/r/65239/#comment275724>

    Can just point to hadoop2 shims if there are no changes.



test/org/apache/pig/parser/TestQueryParserUtils.java
Line 63 (original), 63 (patched)
<https://reviews.apache.org/r/65239/#comment275717>

    What is swebhdfs? secure webhdfs?



test/org/apache/pig/test/MiniCluster.java
Line 107 (original), 112-114 (patched)
<https://reviews.apache.org/r/65239/#comment275715>

    Refer to YarnMiniCluster for what is written into each of the files.
    
    Actually this whole class can be deleted and YarnMiniCluster used instead for EXECTYPE_MR. There might be one or two settings missed which will have to be copied to YarnMiniCluster. If that works without any issues, would prefer that.


- Rohini Palaniswamy


On Jan. 19, 2018, 4:14 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 4:14 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-common.properties PRE-CREATION 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
>   shims/test/hadoop3/.gitignore PRE-CREATION 
>   test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Nandor Kollar via Review Board <no...@reviews.apache.org>.

> On Jan. 22, 2018, 2:28 p.m., Adam Szita wrote:
> > build.xml
> > Line 1003 (original), 1071 (patched)
> > <https://reviews.apache.org/r/65239/diff/1/?file=1942584#file1942584line1074>
> >
> >     Why is jar listed as new dependency target here?
> >     
> >     AFAIK compile-test already depends on jar-simple, which I think is better than jar as it compiles against one Spark version only.

In 2nd version of this patch, jar is deleted from the dependencies.


- Nandor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review195905
-----------------------------------------------------------


On Feb. 19, 2018, 1:54 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Feb. 19, 2018, 1:54 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>


Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

Posted by Adam Szita via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review195905
-----------------------------------------------------------




build.xml
Line 1003 (original), 1071 (patched)
<https://reviews.apache.org/r/65239/#comment275261>

    Why is jar listed as new dependency target here?
    
    AFAIK compile-test already depends on jar-simple, which I think is better than jar as it compiles against one Spark version only.


- Adam Szita


On Jan. 19, 2018, 4:14 p.m., Nandor Kollar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 4:14 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3, the current code compiles with Hadoop 3
>  * hadoopversion property tells which mode the tests should run, the default is hadoop 2
>  * Hadoop 3 introduced a security fix, only whitelisted environment variables are passed to MiniCluster
>  * In Hadoop 3 hadoop-site.xml is deprecated, and is replaced by core-site.xml, hdfs-site.xml and mapred-site.xml. I decided to write the config into all of these files in MiniCluster.java (into hadoop-site.xml too to stay compatible with Hadoop 2) for the shake of simplicity, we might want to have different files for Hadoop 2 and separate the properties for Hadoop 3.
>  * TestErrorHandling.java: small format change in error message, modified the assert so it works on both on Hadoop 2 and Hadoop 3
>  * HadoopShims: code is identical with Hadoop 2, not sure if we need shims any more. I think we should move it to the src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3 - not sure that this is needed, if we move shims, the I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 2)
> 
> 
> Diffs
> -----
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 8bcbe5e4d60b793412dd5490518928b17308da3b 
>   ivy.xml 3ac675190f15528674361eda924af61fc1d07613 
>   ivy/libraries-common.properties PRE-CREATION 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 800b75edea300d6ff4d0a55481a1b3ed5e3be6ea 
>   shims/src/hadoop3/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java PRE-CREATION 
>   shims/test/hadoop3/.gitignore PRE-CREATION 
>   test/e2e/pig/build.xml 384639dfeddaa31ab7b2891ee1ed3602e1f0c08d 
>   test/org/apache/pig/parser/TestErrorHandling.java 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MiniCluster.java a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>