Posted to dev@falcon.apache.org by "John Yu (JIRA)" <ji...@apache.org> on 2014/07/17 21:19:04 UTC
[jira] [Created] (FALCON-511) Support for multiple sources to multiple targets, without partitions
John Yu created FALCON-511:
------------------------------
Summary: Support for multiple sources to multiple targets, without partitions
Key: FALCON-511
URL: https://issues.apache.org/jira/browse/FALCON-511
Project: Falcon
Issue Type: New Feature
Reporter: John Yu
We currently have the following use case:
Colo1 has 1 ETL cluster (Colo1-ETL) and 1 adhoc cluster (Colo1-A)
Colo2 has 1 ETL cluster (Colo2-ETL) and 1 adhoc cluster (Colo2-A)
Due to the bandwidth constraint between the two colos, we are thinking of having the 2 ETL clusters perform the same computation to generate the same dataset, and having each of the 2 adhoc clusters pull from its colo-local ETL cluster.
This can be done currently by specifying 2 different feeds. However, a critical dataset might be computed in different colos simultaneously for both DR and load-balancing purposes. In this scenario, we would like to ease data discovery for end users by having only 1 feed definition, so that end users know the copies are logically the same data and are free to pick whichever one to use.
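
To make the request concrete, here is a minimal Python sketch of the idea: one logical feed that lists several source and target clusters, with each target paired to the source in its own colo. This is only an illustration of the desired behaviour, not Falcon's API or entity format. The class names, the "role" field, the same-colo pairing rule, and the feed name "critical-dataset" are all hypothetical; only the cluster and colo names come from the use case above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cluster:
    name: str
    colo: str
    role: str  # hypothetical field: "source" or "target"


@dataclass
class Feed:
    """One logical feed spanning several source and target clusters."""
    name: str
    clusters: list = field(default_factory=list)

    def local_source_for(self, target):
        # Assumed pairing rule: a target pulls from the single source
        # cluster located in the same colo.
        matches = [c for c in self.clusters
                   if c.role == "source" and c.colo == target.colo]
        if len(matches) != 1:
            raise ValueError(
                f"expected exactly one source in {target.colo}, "
                f"found {len(matches)}")
        return matches[0]

    def replication_pairs(self):
        # (source, target) name pairs, in the order targets are listed.
        return [(self.local_source_for(t).name, t.name)
                for t in self.clusters if t.role == "target"]


feed = Feed("critical-dataset", [
    Cluster("Colo1-ETL", "Colo1", "source"),
    Cluster("Colo1-A", "Colo1", "target"),
    Cluster("Colo2-ETL", "Colo2", "source"),
    Cluster("Colo2-A", "Colo2", "target"),
])

print(feed.replication_pairs())
```

With the four clusters from the use case, each adhoc cluster is paired with the ETL cluster in its own colo, so no replication crosses the constrained inter-colo link, while end users still see a single feed definition.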
--
This message was sent by Atlassian JIRA
(v6.2#6252)