Posted to dev@falcon.apache.org by "Shwetha G S (JIRA)" <ji...@apache.org> on 2014/12/04 06:24:12 UTC

[jira] [Updated] (FALCON-310) Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog

     [ https://issues.apache.org/jira/browse/FALCON-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shwetha G S updated FALCON-310:
-------------------------------
    Assignee:     (was: Shwetha G S)

> Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
> -----------------------------------------------------------------------------------------------
>
>                 Key: FALCON-310
>                 URL: https://issues.apache.org/jira/browse/FALCON-310
>             Project: Falcon
>          Issue Type: Improvement
>            Reporter: Satish Mittal
>
> After HCatalog integration, one can configure new Falcon feeds based on HCatalog tables and then write processes that read/write those HCatalog-based feeds. However, the expectation is that such processes are implemented using HCatalog interfaces (HCatInputFormat/HCatOutputFormat for M/R jobs, or HCatLoader/HCatStorer for Pig scripts). This is easy for new processes.
> However, there are existing processes running in production that are based on HDFS feeds and may never be rewritten against HCatalog interfaces. For such processes, one might want to simply configure HCatalog tables around the existing HDFS feeds and allow the processes to continue running as if they were still working with plain HDFS feeds.
> Behind the scenes, Falcon should be able to find new partitions to read/write, get their corresponding locations, populate the corresponding workflow variables, and register/drop partitions as part of the pre/post-processing steps.
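A minimal sketch of the kind of mapping the description asks for, not Falcon's actual implementation: resolving a feed's HDFS location template against a set of partition key/value pairs, so the resulting concrete path can be handed to a legacy process as a plain workflow variable. The template string and partition keys here are hypothetical examples; in the real flow the location would come from the HCatalog partition's storage descriptor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionPathResolver {

    /**
     * Replaces each ${KEY} placeholder in the location template with the
     * matching partition value, yielding a concrete HDFS path that a
     * non-HCatalog-aware process can consume directly.
     */
    public static String resolve(String template, Map<String, String> partition) {
        String path = template;
        for (Map.Entry<String, String> e : partition.entrySet()) {
            path = path.replace("${" + e.getKey() + "}", e.getValue());
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical partition for a daily feed instance.
        Map<String, String> partition = new LinkedHashMap<>();
        partition.put("YEAR", "2014");
        partition.put("MONTH", "12");
        partition.put("DAY", "04");

        // Hypothetical feed location template, analogous to a Falcon
        // feed's <locations> path element.
        String template = "/data/clicks/${YEAR}/${MONTH}/${DAY}";

        System.out.println(resolve(template, partition));
        // prints /data/clicks/2014/12/04
    }
}
```

Pre-processing would compute such a path for each new partition and export it as a workflow variable; post-processing would then register (or drop) the corresponding HCatalog partition pointing at the same location.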



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)