Posted to dev@oozie.apache.org by Benjamin Zhitomirsky <be...@gmail.com> on 2014/05/04 21:05:42 UTC

Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated May 4, 2014, 7:05 p.m.)


Review request for oozie.


Changes
-------

job-xml now accepts a fully qualified path; otherwise it defaults to the application's file system.
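
As a rough illustration of the job-xml rule above, here is a minimal sketch. It is not code from the patch: the class name JobXmlPathSketch, the method qualify and the host names are hypothetical, and it uses plain java.net.URI instead of Hadoop's Path/FileSystem so that it runs standalone. It only covers absolute and fully qualified paths; relative paths are rejected.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration; not code from the OOZIE-1685 patch.
final class JobXmlPathSketch {

    // Returns jobXml unchanged when it already names a file system (has a scheme);
    // otherwise borrows scheme and authority from the application's file system.
    static URI qualify(String jobXml, URI appFileSystem) {
        URI path = URI.create(jobXml);
        if (path.getScheme() != null) {
            return path; // fully qualified: use as given
        }
        try {
            return new URI(appFileSystem.getScheme(), appFileSystem.getAuthority(),
                    path.getPath(), null, null);
        } catch (URISyntaxException e) {
            // Thrown, for example, when jobXml is a relative path.
            throw new IllegalArgumentException("Not an absolute path: " + jobXml, e);
        }
    }

    public static void main(String[] args) {
        URI appFs = URI.create("hdfs://app-nn:8020"); // assumed application name node
        System.out.println(qualify("/user/joe/job.xml", appFs));
        System.out.println(qualify("hdfs://other-nn:8020/conf/job.xml", appFs));
    }
}
```

With the assumed name nodes, the first call is qualified against app-nn and the second is returned untouched.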


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 59ad143 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45089
-----------------------------------------------------------


Patch looks good. Just two minor comments.


core/src/main/java/org/apache/oozie/service/ShareLibService.java
<https://reviews.apache.org/r/19929/#comment79756>

    Is constructing a new FileSystem required? Can't the local variable "fs" be returned as is? The fs will access the filesystem as the oozie user, but I guess that should be OK as it is only used for reading files.



core/src/test/java/org/apache/oozie/test/XTestCase.java
<https://reviews.apache.org/r/19929/#comment79757>

    create - c lower case.


- Rohini Palaniswamy


On May 30, 2014, 2:32 p.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated May 30, 2014, 2:32 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.

> On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> > <https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96>
> >
> >     You can just return this fs object

This won't be right, because if sysLibPath is a fully qualified path then fs would not be the right filesystem.
Of course, I could check for an authority in sysLibPath and return fs if none is present, otherwise call getFileSystem().


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
-----------------------------------------------------------


On June 14, 2014, 7:27 p.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:27 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.

> On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> > <https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96>
> >
> >     You can just return this fs object
> 
> Benjamin Zhitomirsky wrote:
>     This won't be right, because if sysLibPath is a fully qualified path then fs would not be the right filesystem.
>     Of course, I could check for an authority in sysLibPath and return fs if none is present, otherwise call getFileSystem().
> 
> Rohini Palaniswamy wrote:
>     This fs is constructed by doing fs = FileSystem.get(has.createJobConf(uri.getAuthority())); and should handle both the case where the URI authority is null and the case where the system lib path is fully qualified. Why do you want to construct fs by always passing null for authority?

Hmm... Yes, you are right... My bad! I will fix it as you say.
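
As a rough illustration of the point about uri.getAuthority(), here is a minimal sketch. It is not the ShareLibService code: the class name SysLibAuthoritySketch, the method targetNameNode and the host names are hypothetical, and it uses only java.net.URI, with no Hadoop calls. It shows that an unqualified path has a null authority (so the default file system applies), while a fully qualified path carries its own.

```java
import java.net.URI;

// Hypothetical illustration; not code from ShareLibService.
final class SysLibAuthoritySketch {

    // The authority of the configured path decides which name node is used;
    // a null authority means "fall back to the default file system".
    static String targetNameNode(String sysLibPath, String defaultAuthority) {
        String authority = URI.create(sysLibPath).getAuthority();
        return authority != null ? authority : defaultAuthority;
    }

    public static void main(String[] args) {
        String defaultNn = "default-nn:8020"; // assumed default name node
        System.out.println(targetNameNode("/user/oozie/share/lib", defaultNn));
        System.out.println(targetNameNode("hdfs://other-nn:8020/user/oozie/share/lib", defaultNn));
    }
}
```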


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
-----------------------------------------------------------


On June 14, 2014, 7:27 p.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:27 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On June 15, 2014, 2:03 a.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/service/ShareLibService.java, line 96
> > <https://reviews.apache.org/r/19929/diff/6-7/?file=599794#file599794line96>
> >
> >     You can just return this fs object
> 
> Benjamin Zhitomirsky wrote:
>     This won't be right, because if sysLibPath is a fully qualified path then fs would not be the right filesystem.
>     Of course, I could check for an authority in sysLibPath and return fs if none is present, otherwise call getFileSystem().

This fs is constructed by doing fs = FileSystem.get(has.createJobConf(uri.getAuthority())); and should handle both the case where the URI authority is null and the case where the system lib path is fully qualified. Why do you want to construct fs by always passing null for authority?


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
-----------------------------------------------------------


On June 14, 2014, 7:27 p.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:27 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45702
-----------------------------------------------------------



core/src/main/java/org/apache/oozie/service/ShareLibService.java
<https://reviews.apache.org/r/19929/#comment80658>

    You can just return this fs object


- Rohini Palaniswamy


On June 14, 2014, 7:27 p.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:27 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review45985
-----------------------------------------------------------

Ship it!

- Rohini Palaniswamy


On June 16, 2014, 7:43 a.m., Benjamin Zhitomirsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19929/
> -----------------------------------------------------------
> 
> (Updated June 16, 2014, 7:43 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
> - Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
> - Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.
> 
> Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
>   core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
>   core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
>   core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
>   core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
>   core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
>   core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
>   docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 
> 
> Diff: https://reviews.apache.org/r/19929/diff/
> 
> 
> Testing
> -------
> 
> On deployed Hadoop cluster. Two tests were added.
> 
> 
> Thanks,
> 
> Benjamin Zhitomirsky
> 
>


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated June 16, 2014, 7:43 a.m.)


Review request for oozie.


Changes
-------

ShareLibService#getFileSystem now simply returns fs, as Rohini pointed out.


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated June 14, 2014, 7:27 p.m.)


Review request for oozie.


Changes
-------

Latest CR comments applied: ShareLibService#getFileSystem now simply retrieves the filesystem as the Oozie user instead of as the caller.


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 320af8b 
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated May 30, 2014, 2:32 p.m.)


Review request for oozie.


Changes
-------

Fixed according to CR comments


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382 
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated May 8, 2014, 4:05 p.m.)


Review request for oozie.


Changes
-------

Fixes according to comments. Shared system libraries now default to the default filesystem unless a fully qualified path is specified in the configuration. The previous implementation wrongly assumed that shared libraries should be located in the application's file system.


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382 
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Benjamin Zhitomirsky <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/
-----------------------------------------------------------

(Updated May 8, 2014, 3:53 p.m.)


Review request for oozie.


Changes
-------

Fixes according to CR comments. The system libraries path is now assumed to point to the default file system instead of the application location.


Repository: oozie-git


Description
-------

When the <name-node> element in an Oozie workflow specifies a name node different from the default one (specified in core-site.xml), the following functionality doesn’t work properly:
- Location of libraries specified via oozie.service.WorkflowAppService.system.libpath. Oozie first (during launcher configuration) tries to locate them using the name node specified by the <name-node> element, but later, during job submission, it expects this path to be under the default Oozie name node.
- Processing of the job-xml element if the job XML is specified via an absolute path. Oozie tries to locate it under the default Oozie name node instead of the name node specified in the action.

Specifying a non-default name node makes a lot of sense in an Azure environment, because it allows the same job to be submitted to different Hadoop clusters.


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 40add2c 
  core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java bb68b0e 
  core/src/main/java/org/apache/oozie/service/ShareLibService.java 353b382 
  core/src/main/java/org/apache/oozie/util/JobUtils.java 135b096 
  core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java bc2c1b6 
  core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 390ad3f 
  core/src/test/java/org/apache/oozie/test/XFsTestCase.java 18cb742 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 1536927 
  docs/src/site/twiki/WorkflowFunctionalSpec.twiki f7590d0 

Diff: https://reviews.apache.org/r/19929/diff/


Testing
-------

On deployed Hadoop cluster. Two tests were added.


Thanks,

Benjamin Zhitomirsky


Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On May 5, 2014, 10:59 p.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java, lines 443-458
> > <https://reviews.apache.org/r/19929/diff/3/?file=574188#file574188line443>
> >
> >     Can just do:
> >     Path pathToAdd = new Path(uri.normalize());
> >     Services.get().get(HadoopAccessorService.class).addFileToClassPath(user, pathToAdd, conf);
> >     
> >     
> >     Make the below change to HadoopAccessorService.java:
> >     public void addFileToClassPath(String user, final Path file, final Configuration conf)
> >                 throws IOException {
> >             ParamChecker.notEmpty(user, "user");
> >             try {
> >                 UserGroupInformation ugi = getUGI(user);
> >                 ugi.doAs(new PrivilegedExceptionAction<Void>() {
> >                     public Void run() throws Exception {
> >                         Configuration defaultConf = new Configuration();
> >                         XConfiguration.copy(conf, defaultConf);
> >                         //Doing this NOP add first to have the FS created and cached
> >                         DistributedCache.addFileToClassPath(file, defaultConf);
> >     
> >                         // Hadoop 0.20/1.x.
> >                         if (defaultConf.get("mapred.job.classpath.files") != null) {
> >                             // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
> >                             // Refer OOZIE-1806.
> >                             String filepath = file.toUri().getPath();
> >                             String classpath = conf.get("mapred.job.classpath.files");
> >                             conf.set("mapred.job.classpath.files", classpath == null
> >                                 ? filepath
> >                                 : classpath + System.getProperty("path.separator") + filepath);
> >                             URI uri = file.getFileSystem(defaultConf).makeQualified(file).toUri();
> >                             DistributedCache.addCacheFile(uri, conf);
> >                         }
> >                         else { // Hadoop 0.23/2.x
> >                             DistributedCache.addFileToClassPath(file, conf);
> >                         }
> >     
> >                         return null;
> >                     }
> >                 });
> >     
> >             }
> >             catch (InterruptedException ex) {
> >                 throw new IOException(ex);
> >             }
> >     
> >         }

The code inside the run() method can be replaced with JobUtils.addFileToClasspath(), which was added by OOZIE-1806.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review42205
-----------------------------------------------------------




Re: Review Request 19929: OOZIE-1685: Oozie doesn’t process correctly workflows with a non-default name node

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19929/#review42205
-----------------------------------------------------------



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75952>

    The Hadoop Configuration deprecation feature should handle it. You don't have to check for both explicitly; checking only fs.default.name is enough.



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75953>

    Just getAuthority() != null is enough as it contains host as well
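
To illustrate the point with plain java.net.URI (a minimal sketch; the host names and paths are made up for the example, and the class name is hypothetical): whenever a URI has a host, it also has an authority, so a single null check on the authority covers both cases.

```java
import java.net.URI;

// Hypothetical helper, not part of the patch under review.
class AuthorityCheck {
    // True when the URI names a specific endpoint (host, optionally with a port).
    static boolean hasAuthority(URI uri) {
        return uri.getAuthority() != null;
    }

    public static void main(String[] args) {
        URI qualified = URI.create("hdfs://nn2.example.com:8020/user/test/lib/a.jar");
        URI absolute = URI.create("/user/test/lib/a.jar");

        System.out.println(qualified.getAuthority()); // nn2.example.com:8020
        System.out.println(qualified.getHost());      // nn2.example.com
        System.out.println(hasAuthority(absolute));   // false
    }
}
```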



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75954>

    Since this is in a loop, we can keep a Map<String, FileSystem> from authority to file system and reuse the instances.
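
A rough sketch of such a cache, kept free of Hadoop dependencies so it can be run on its own. The type parameter F stands in for org.apache.hadoop.fs.FileSystem, and the factory function stands in for whatever call the executor uses to create a file system. The class name and both stand-ins are assumptions for illustration, not code from the patch.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache keyed by URI authority; F would be FileSystem in Oozie.
class PerAuthorityCache<F> {
    private final Map<String, F> byAuthority = new HashMap<>();
    private final Function<URI, F> factory;

    PerAuthorityCache(Function<URI, F> factory) {
        this.factory = factory;
    }

    // Returns the cached instance for the URI's authority, creating it on first use.
    // URIs without an authority share one entry under the empty key.
    F get(URI uri) {
        String key = uri.getAuthority() == null ? "" : uri.getAuthority();
        F cached = byAuthority.get(key);
        if (cached == null) {
            cached = factory.apply(uri);
            byAuthority.put(key, cached);
        }
        return cached;
    }

    int size() {
        return byAuthority.size();
    }
}
```

Inside the loop, each path would call get() with its URI, so the factory runs only once per distinct authority.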



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75968>

    Can just do:
    Path pathToAdd = new Path(uri.normalize());
    Services.get().get(HadoopAccessorService.class).addFileToClassPath(user, pathToAdd, conf);
    
    
    Make the below change to HadoopAccessorService.java:
    public void addFileToClassPath(String user, final Path file, final Configuration conf)
                throws IOException {
            ParamChecker.notEmpty(user, "user");
            try {
                UserGroupInformation ugi = getUGI(user);
                ugi.doAs(new PrivilegedExceptionAction<Void>() {
                    public Void run() throws Exception {
                        Configuration defaultConf = new Configuration();
                        XConfiguration.copy(conf, defaultConf);
                        //Doing this NOP add first to have the FS created and cached
                        DistributedCache.addFileToClassPath(file, defaultConf);
    
                        // Hadoop 0.20/1.x.
                        if (defaultConf.get("mapred.job.classpath.files") != null) {
                            // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
                            // Refer OOZIE-1806.
                            String filepath = file.toUri().getPath();
                            String classpath = conf.get("mapred.job.classpath.files");
                            conf.set("mapred.job.classpath.files", classpath == null
                                ? filepath
                                : classpath + System.getProperty("path.separator") + filepath);
                            URI uri = file.getFileSystem(defaultConf).makeQualified(file).toUri();
                            DistributedCache.addCacheFile(uri, conf);
                        }
                        else { // Hadoop 0.23/2.x
                            DistributedCache.addFileToClassPath(file, conf);
                        }
    
                        return null;
                    }
                });
    
            }
            catch (InterruptedException ex) {
                throw new IOException(ex);
            }
    
        }
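
The classpath handling in the Hadoop 0.20/1.x branch above can be looked at in isolation. The sketch below uses a plain Map as a stand-in for the Hadoop Configuration object (an assumption made so the example runs without Hadoop on the classpath); only the property name and the append logic are taken from the snippet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone version of the append step; a Map replaces Configuration.
class ClasspathAppend {
    static final String KEY = "mapred.job.classpath.files";

    // Appends filePath to the classpath property, using the platform path separator
    // between entries and creating the property if it is not set yet.
    static void append(Map<String, String> conf, String filePath) {
        String current = conf.get(KEY);
        conf.put(KEY, current == null
                ? filePath
                : current + System.getProperty("path.separator") + filePath);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        append(conf, "/user/test/lib/a.jar");
        append(conf, "/user/test/lib/b.jar");
        System.out.println(conf.get(KEY));
    }
}
```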



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75959>

    uri.normalize()
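
For reference, a small sketch of what java.net.URI.normalize() does to a path with "." and ".." segments. The URI below is invented for the example and the class name is hypothetical.

```java
import java.net.URI;

// Hypothetical demo class, not part of the patch under review.
class NormalizeDemo {
    // Parses the string and removes "." and "segment/.." pairs from the path.
    static URI normalized(String raw) {
        return URI.create(raw).normalize();
    }

    public static void main(String[] args) {
        // prints hdfs://nn1:8020/user/test/lib/a.jar
        System.out.println(normalized("hdfs://nn1:8020/user/test/./lib/../lib/a.jar"));
    }
}
```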



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75973>

    Can be removed



core/src/test/java/org/apache/oozie/action/hadoop/ActionExecutorTestCase.java
<https://reviews.apache.org/r/19929/#comment75969>

    Can we do this only for TestJavaActionExecutor as it is not needed by other action executor test cases? Even though cluster setup is only done once for all tests, we will unnecessarily keep creating test dirs in both clusters.



core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75972>

    3. Absolute (but not fully qualified) path located in the both filesystems
    
    Comment not in sync with actual test. job3.xml is a fully qualified path. Need to make it absolute and add another job4.xml for fully qualified path.
    
    



core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75977>

    Can you add a comment here and in section 3.2.2.4 (Syntax) of WorkflowFunctionalSpec.twiki, saying that relative and absolute paths for job-xml point to the NameNode of the app path even if a different NameNode is specified for the action, and that to point to a different NameNode the path should be fully qualified?
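
The rule described in this comment can be sketched with java.net.URI alone. This is only an illustration of the intended resolution behavior, not the JavaActionExecutor code; the class name, method name, hosts, and paths are all assumptions.

```java
import java.net.URI;

// Hypothetical illustration of the job-xml resolution rule.
class JobXmlResolution {
    // Resolves a job-xml value against the workflow application directory.
    // Relative and absolute paths stay on the app path's file system;
    // only a fully qualified URI can point at a different name node.
    static URI resolveJobXml(URI appDir, String jobXml) {
        String base = appDir.toString();
        // A trailing slash makes relative values resolve inside the directory.
        URI dir = base.endsWith("/") ? appDir : URI.create(base + "/");
        return dir.resolve(jobXml);
    }

    public static void main(String[] args) {
        URI app = URI.create("hdfs://nn1:8020/user/test/app");
        System.out.println(resolveJobXml(app, "job.xml"));                      // hdfs://nn1:8020/user/test/app/job.xml
        System.out.println(resolveJobXml(app, "/conf/job.xml"));                // hdfs://nn1:8020/conf/job.xml
        System.out.println(resolveJobXml(app, "hdfs://nn2:8020/conf/job.xml")); // hdfs://nn2:8020/conf/job.xml
    }
}
```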



core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
<https://reviews.apache.org/r/19929/#comment75978>

    This will not be ignored if job3.xml is not fully qualified, right? Aren't absolute non-qualified paths picked from the app path file system?



core/src/test/java/org/apache/oozie/test/XFsTestCase.java
<https://reviews.apache.org/r/19929/#comment75970>

    file system of the second cluster



core/src/test/java/org/apache/oozie/test/XFsTestCase.java
<https://reviews.apache.org/r/19929/#comment75971>

    return the FS test working directory in the second cluster


- Rohini Palaniswamy

