Posted to dev@falcon.apache.org by "Venkatesh Seetharam (JIRA)" <ji...@apache.org> on 2014/08/11 20:54:12 UTC

[jira] [Commented] (FALCON-310) Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog

    [ https://issues.apache.org/jira/browse/FALCON-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093159#comment-14093159 ] 

Venkatesh Seetharam commented on FALCON-310:
--------------------------------------------

It should be quite straightforward to create external tables in Hive that point to the existing data on HDFS. That should work out of the box (OOTB).
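
As a rough illustration of what "external tables pointing to data on HDFS" could look like, here is a minimal sketch that only builds the HiveQL statements as strings. The helper names, the table name, the column list and the hdfs://nn/... paths are all made up for the example and are not part of Falcon or of this issue. The storage format is left at Hive's default.

```python
# Hypothetical helpers: they only assemble HiveQL text and do not talk to Hive.

def create_external_table_ddl(table, columns, partition_cols, location):
    """Build a CREATE EXTERNAL TABLE statement over data that already sits on HDFS."""
    cols = ", ".join(f"{name} {typ}" for name, typ in columns)
    parts = ", ".join(f"{name} {typ}" for name, typ in partition_cols)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"PARTITIONED BY ({parts}) "
        f"LOCATION '{location}'"
    )


def add_partition_ddl(table, partition_spec, location):
    """Build an ALTER TABLE statement that registers one existing HDFS directory as a partition."""
    spec = ", ".join(f"{key}='{value}'" for key, value in partition_spec.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )
```

For instance, add_partition_ddl("clicks", {"ds": "2014-08-11"}, "hdfs://nn/data/clicks/2014-08-11") yields an ALTER TABLE ... ADD IF NOT EXISTS PARTITION statement for that directory. Because the table is external, dropping it later leaves the underlying HDFS files in place, which is what existing HDFS-based processes would rely on.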

> Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
> -----------------------------------------------------------------------------------------------
>
>                 Key: FALCON-310
>                 URL: https://issues.apache.org/jira/browse/FALCON-310
>             Project: Falcon
>          Issue Type: Improvement
>            Reporter: Satish Mittal
>            Assignee: Shwetha G S
>
> After HCatalog integration, one can configure new Falcon feeds based on HCatalog tables and then write processes that read/write HCat-based feeds. However, the expectation is that these processes will be implemented using HCatalog interfaces (HCatInputFormat/HCatOutputFormat in the case of M/R jobs, or HCatLoader/HCatStorer in the case of Pig scripts). This is easy for new processes.
> However, there will be existing processes running in production that are based on HDFS-based feeds and may not be rewritten to use HCat interfaces. For such processes, one might just want to configure HCatalog tables around their HDFS feeds and provide a way for the existing processes to continue to run as if they were still working with HDFS feeds.
> Behind the scenes, Falcon should be able to find new partitions to read/write, get their corresponding locations, populate the corresponding workflow variables, and register/drop partitions as part of the pre/post-processing step.



--
This message was sent by Atlassian JIRA
(v6.2#6252)