Posted to dev@oozie.apache.org by Attila Sasvari via Review Board <no...@reviews.apache.org> on 2017/11/16 13:50:28 UTC
Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
Review request for oozie, Peter Bacsko and Robert Kanter.
Repository: oozie-git
Description
-------
Before Oozie on YARN, ``JobSubmitter`` from MapReduce (more precisely ``TokenCache.obtainTokensForNamenodes``) took care of obtaining delegation tokens for HDFS nodes specified by ``oozie.launcher.mapreduce.job.hdfs-servers`` before submitting the Oozie launcher job.
The Oozie launcher is now a YARN ApplicationMaster. It needs HDFS delegation tokens to copy files between secure clusters via the Oozie DistCp action.
Changes:
- ``JavaActionExecutor`` was modified to handle DistCp-related parameters (``oozie.launcher.mapreduce.job.hdfs-servers`` and ``oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude``)
- ``HDFSCredentials`` was changed to reuse ``TokenCache.obtainTokensForNamenodes`` to obtain HDFS delegation tokens.
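As an illustration of the parsing this involves, here is a minimal, hypothetical sketch (class and method names are mine, not taken from the actual patch): it splits the comma-separated ``oozie.launcher.mapreduce.job.hdfs-servers`` value into the NameNode URIs whose delegation tokens would then be obtained via ``TokenCache.obtainTokensForNamenodes``.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Oozie code: split the
// oozie.launcher.mapreduce.job.hdfs-servers value into NameNode URIs.
// The real patch hands such a list (as Path[] plus the job Configuration)
// to TokenCache.obtainTokensForNamenodes() to fetch delegation tokens.
public class HdfsServersParser {
    public static List<String> parseNameNodes(String hdfsServers) {
        List<String> nameNodes = new ArrayList<>();
        if (hdfsServers == null || hdfsServers.trim().isEmpty()) {
            return nameNodes; // nothing configured, no tokens to fetch
        }
        for (String server : hdfsServers.split(",")) {
            String trimmed = server.trim();
            if (!trimmed.isEmpty()) {
                nameNodes.add(trimmed);
            }
        }
        return nameNodes;
    }
}
```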
Diffs
-----
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 8cb76cf
Diff: https://reviews.apache.org/r/63875/diff/1/
Testing
-------
Tested on a secure cluster that the Oozie DistCp action can copy a file from another secure cluster that uses a different Kerberos realm.
- workflow:
```
<workflow-app name="DistCp" xmlns="uri:oozie:workflow:0.5">
<start to="distcp-3a1f"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="distcp-3a1f">
<distcp xmlns="uri:oozie:distcp-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>oozie.launcher.mapreduce.job.dfs.namenode.kerberos.principal.pattern</name>
<value>*</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.hdfs-servers</name>
<value>hdfs://oozie.test1.com:8020,hdfs://remote.test2.com:8020</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.hdfs-servers.token-renewal.exclude</name>
<value>remote.test2.com</value>
</property>
</configuration>
<arg>hdfs://remote.test2.com:8020/tmp/1</arg>
<arg>hdfs://oozie.test1.com:8020/tmp/2</arg>
</distcp>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
```
Prior to executing the workflow I had to set up cross-realm trust between the secure test clusters. It involved:
- changing the Kerberos configuration ``/etc/krb5.conf`` (adding realms and setting additional properties like ``udp_preference_limit = 1``)
- regenerating service credentials
- changing HDFS settings (such as ``dfs.namenode.kerberos.principal.pattern``) and setting a Hadoop auth-to-local rule like ``RULE:[2:$1](.*)s/(.*)/$1/g``
- additional configuration to enable trust between the test Hadoop clusters
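For illustration, the ``/etc/krb5.conf`` side of that setup might look like the following (realm names, KDC host names, and trust paths are hypothetical; this is a sketch, not a complete configuration):

```
[libdefaults]
    udp_preference_limit = 1

[realms]
    TEST1.COM = {
        kdc = kdc.oozie.test1.com
    }
    TEST2.COM = {
        kdc = kdc.remote.test2.com
    }

[capaths]
    TEST1.COM = {
        TEST2.COM = .
    }
    TEST2.COM = {
        TEST1.COM = .
    }
```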
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Cseh via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191334
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 72-76 (patched)
<https://reviews.apache.org/r/63875/#comment269117>
Can you add some info-level logging or validation of the properties? E.g. what happens if the JOB_NAMENODES property is not null but empty?
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 110 (patched)
<https://reviews.apache.org/r/63875/#comment269116>
Please add logging similar to the other code path here.
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1375 (patched)
<https://reviews.apache.org/r/63875/#comment269119>
Please create a constant for this as well.
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1378 (patched)
<https://reviews.apache.org/r/63875/#comment269118>
You've created a constant for this but haven't used it. Please do so.
- Peter Cseh
On Nov. 16, 2017, 1:50 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Robert Kanter via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191285
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 63-66 (patched)
<https://reviews.apache.org/r/63875/#comment269046>
Given that this stuff is only needed if jobNameNodes is not null, we should move it inside the if block so we don't do it unnecessarily.
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1404 (patched)
<https://reviews.apache.org/r/63875/#comment269050>
This probably shouldn't be hardcoded here :)
- Robert Kanter
On Nov. 16, 2017, 1:50 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Cseh via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191662
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1379-1386 (patched)
<https://reviews.apache.org/r/63875/#comment269504>
Can this part be pushed down to DistcpActionExecutor? It does not feel like other actions would have to work with these properties.
Also, please consider adding some logging here, at least at debug level, to make it easier to see what's happening.
- Peter Cseh
On Nov. 21, 2017, 3:49 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by András Piros via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191832
-----------------------------------------------------------
Ship it!
Ship It!
- András Piros
On Nov. 24, 2017, 10:29 a.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Bacsko via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191831
-----------------------------------------------------------
Ship it!
Ship It!
- Peter Bacsko
On Nov. 24, 2017, 10:29 a.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Bacsko via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191830
-----------------------------------------------------------
Ship it!
Ship It!
- Peter Bacsko
On Nov. 24, 2017, 10:29 a.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Attila Sasvari via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 24, 2017, 10:29 a.m.)
Review request for oozie, Peter Bacsko and Robert Kanter.
Repository: oozie-git
Diffs (updated)
-----
core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java 81e28f722d9ecd0bf972bf2d0a684d207547d165
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38
Diff: https://reviews.apache.org/r/63875/diff/6/
Changes: https://reviews.apache.org/r/63875/diff/5-6/
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Attila Sasvari via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 24, 2017, 10:17 a.m.)
Review request for oozie, Peter Bacsko and Robert Kanter.
Repository: oozie-git
Diffs (updated)
-----
core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java 81e28f722d9ecd0bf972bf2d0a684d207547d165
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38
Diff: https://reviews.apache.org/r/63875/diff/5/
Changes: https://reviews.apache.org/r/63875/diff/4-5/
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Bacsko via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191802
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1405 (patched)
<https://reviews.apache.org/r/63875/#comment269714>
Minor:
1. Please add Javadoc to this method, explaining how and when subclasses should override it
2. Add a "// nop" comment to the method body (to indicate that it is empty on purpose)
- Peter Bacsko
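The pattern being requested might look like the sketch below. The class and method names are assumptions for illustration (the actual method is the one patched at line 1405 of ``JavaActionExecutor``): a documented, deliberately empty hook in the base executor that action-specific subclasses override.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a no-op template-method hook; names are invented.
class BaseExecutorSketch {
    /**
     * Hook called while the launcher configuration is being assembled,
     * before the launcher AM is submitted. Subclasses that need extra
     * launcher properties (e.g. a DistCp executor registering
     * oozie.launcher.mapreduce.job.hdfs-servers) should override this.
     * The base implementation intentionally does nothing.
     */
    void setActionSpecificLauncherProperties(Map<String, String> launcherConf) {
        // nop -- empty on purpose, see Javadoc
    }
}

class DistcpExecutorSketch extends BaseExecutorSketch {
    @Override
    void setActionSpecificLauncherProperties(Map<String, String> launcherConf) {
        launcherConf.put("oozie.launcher.mapreduce.job.hdfs-servers",
                "hdfs://oozie.test1.com:8020,hdfs://remote.test2.com:8020");
    }
}
```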
On Nov. 22, 2017, 3:11 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by András Piros via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191804
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java
Lines 39-42 (patched)
<https://reviews.apache.org/r/63875/#comment269716>
It would be nice to have field level Javadoc here explaining why those are needed. Also linking to Hadoop repo for similar properties would be nice.
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 59 (patched)
<https://reviews.apache.org/r/63875/#comment269718>
Would have the `INFO` log inside the delegate method.
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 98-111 (patched)
<https://reviews.apache.org/r/63875/#comment269717>
An `INFO` level log message stating which tokens are obtained from where, similar to the other method, would be nice.
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
Lines 1375-1377 (patched)
<https://reviews.apache.org/r/63875/#comment269719>
Some `DEBUG` level logging...
- András Piros
On Nov. 22, 2017, 3:11 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Attila Sasvari via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 22, 2017, 3:11 p.m.)
Review request for oozie, Peter Bacsko and Robert Kanter.
Changes
-------
Minor refactoring: moved the DistCp-specific settings required for obtaining HDFS tokens to DistcpActionExecutor
Repository: oozie-git
Diffs (updated)
-----
core/src/main/java/org/apache/oozie/action/hadoop/DistcpActionExecutor.java 81e28f722d9ecd0bf972bf2d0a684d207547d165
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38
Diff: https://reviews.apache.org/r/63875/diff/4/
Changes: https://reviews.apache.org/r/63875/diff/3-4/
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Attila Sasvari via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 21, 2017, 3:49 p.m.)
Review request for oozie, Peter Bacsko and Robert Kanter.
Repository: oozie-git
Diffs (updated)
-----
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38
Diff: https://reviews.apache.org/r/63875/diff/3/
Changes: https://reviews.apache.org/r/63875/diff/2-3/
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Attila Sasvari via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/
-----------------------------------------------------------
(Updated Nov. 20, 2017, 11:21 p.m.)
Review request for oozie, Peter Bacsko and Robert Kanter.
Repository: oozie-git
Diffs (updated)
-----
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java 92a7ebe9a7876b6400d80356d5c826e77575e2ab
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java a1df304914b73d406e986409a8053c2a48e1bd38
Diff: https://reviews.apache.org/r/63875/diff/2/
Changes: https://reviews.apache.org/r/63875/diff/1-2/
Thanks,
Attila Sasvari
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Bacsko via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191343
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 68 (patched)
<https://reviews.apache.org/r/63875/#comment269125>
You can simplify this a bit:
String[] nameNodes = conf.getStrings(MRJobConfig.JOB_NAMENODES);
- Peter Bacsko
On Nov. 16, 2017, 1:50 p.m., Attila Sasvari wrote:
Re: Review Request 63875: OOZIE-2900 Retrieve tokens for
oozie.launcher.mapreduce.job.hdfs-servers before submission
Posted by Peter Cseh via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63875/#review191332
-----------------------------------------------------------
core/src/main/java/org/apache/oozie/action/hadoop/HDFSCredentials.java
Lines 63-64 (patched)
<https://reviews.apache.org/r/63875/#comment269115>
You could use UserGroupInformationService here.
- Peter Cseh
On Nov. 16, 2017, 1:50 p.m., Attila Sasvari wrote: