Posted to dev@falcon.apache.org by "Peeyush Bishnoi (JIRA)" <ji...@apache.org> on 2014/10/02 00:14:34 UTC

[jira] [Updated] (FALCON-510) Inject falcon related properties to job conf

     [ https://issues.apache.org/jira/browse/FALCON-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peeyush Bishnoi updated FALCON-510:
-----------------------------------
    Attachment: falcon-510.txt

Falcon-related properties can be injected into the Hadoop MR job conf by passing them from the materialized coordinator.xml to the respective action's workflow.xml. For Pig and Hive, the required Falcon properties can be specified in the respective process action XML file. For the Oozie MR action, however, the properties are not set in the job conf even though we specified them in the sub-workflow action. I have talked to the Oozie developers about this; they said that Oozie currently does not support propagating properties from coordinator.xml to the MR action's workflow.xml when there is a sub-workflow.

I have attached a patch that follows this approach to inject the required Falcon properties into the Hadoop MR job conf. Please review.
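
For readers without the attachment, the intended effect on the job conf can be sketched roughly as follows. This is only an illustration, not the code in falcon-510.txt: the class and method names are hypothetical, a plain Map stands in for the Hadoop job conf, and the process name and instance time values are made up. The two property keys are the ones requested in the issue description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the Hadoop job conf.
class FalconJobConfSketch {
    // Property keys requested in FALCON-510.
    static final String PROCESS_NAME = "falcon.process.name";
    static final String PROCESS_INSTANCE_TIME = "falcon.process.instancetime";

    // Returns a copy of the given conf with the Falcon context added.
    // The input map is left unmodified.
    static Map<String, String> withFalconContext(Map<String, String> jobConf,
                                                 String processName,
                                                 String instanceTime) {
        Map<String, String> merged = new LinkedHashMap<>(jobConf);
        merged.put(PROCESS_NAME, processName);
        merged.put(PROCESS_INSTANCE_TIME, instanceTime);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("mapred.job.name", "example-mr-job"); // made-up existing entry
        Map<String, String> out =
                withFalconContext(conf, "sample-process", "2014-10-01T00:00Z");
        // Print every property so the injected keys are visible.
        out.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With these properties present in the job conf, a tool that reads only the job conf could look up the process name and instance time directly instead of parsing the job name.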

> Inject falcon related properties to job conf
> --------------------------------------------
>
>                 Key: FALCON-510
>                 URL: https://issues.apache.org/jira/browse/FALCON-510
>             Project: Falcon
>          Issue Type: Improvement
>            Reporter: Shwetha G S
>            Assignee: Peeyush Bishnoi
>         Attachments: falcon-510.txt
>
>
> Currently there is no falcon context injected at MR job level. The job conf has at most the oozie workflow / action ID either in the job name or sometimes in the job conf.
> Therefore there is no way for a tool like hraven, which relies completely on jobconf and job history data, to identify that a particular job maps to a particular falcon process or its instance time, etc. Right now hraven does regex-based job name surgery on a best-effort basis before emitting metrics to graphite.
> Request the following feature in falcon:
> Add the following properties to the job conf (for all jobs - be it a pig action or an MR action):
> falcon.process.name
> falcon.process.instancetime
> While we're at it, we might as well add any other falcon context as a jobconf property (like whether it was a rerun, or the input/output feeds, cluster, validity, any process properties, etc.)
> This will of course inject only at the first job level and cannot ensure that any child jobs get the properties passed on (unless we can figure out a way to do that too).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)