Posted to dev@falcon.apache.org by Sowmya Ramesh <sr...@hortonworks.com> on 2015/06/15 20:10:42 UTC
Review Request 35468: FeedReplicator improvement to include more
DistCP options
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35468/
-----------------------------------------------------------
Review request for Falcon, Pallavi Rao, Srikanth Sundarrajan, and Venkat Ranganathan.
Bugs: FALCON-668
https://issues.apache.org/jira/browse/FALCON-668
Repository: falcon-git
Description
-------
FeedReplicator improvement to include the additional DistCp options listed below. Users are expected to pass these as custom properties in the feed defined for replication, and they will be propagated to the DistCp tool.
* overwrite
* ignore errors
* skip checksum
* remove deleted files
* preserve block size
* preserve replication count
* preserve permissions
Diffs
-----
common/src/main/java/org/apache/falcon/util/ReplicationConstants.java PRE-CREATION
docs/src/site/twiki/EntitySpecification.twiki 1ed2cb5
oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java 49f9e07
oozie/src/main/java/org/apache/falcon/oozie/feed/FSReplicationWorkflowBuilder.java 1d97204
oozie/src/main/java/org/apache/falcon/oozie/feed/HCatReplicationWorkflowBuilder.java 72bbca4
replication/src/main/java/org/apache/falcon/replication/FeedReplicator.java b2175b2
replication/src/test/java/org/apache/falcon/replication/FeedReplicatorTest.java 539d00d
Diff: https://reviews.apache.org/r/35468/diff/
Testing
-------
Verified that the added custom properties were propagated to the FeedReplicator Java action and that the replication succeeded.
Action configuration:
<main-class>org.apache.falcon.replication.FeedReplicator</main-class>
<arg>-Dfalcon.include.path=hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
<arg>-Dmapred.job.queue.name=default</arg>
<arg>-Dmapred.job.priority=NORMAL</arg>
<arg>-maxMaps</arg>
<arg>33</arg>
<arg>-mapBandwidth</arg>
<arg>2</arg>
<arg>-sourcePaths</arg>
<arg>hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
<arg>-targetPath</arg>
<arg>hdfs://sandbox.hortonworks.com:8020/user/ambari-qa/dr/test/backupCluster/feed/output/2015/06/08</arg>
<arg>-falconFeedStorageType</arg>
<arg>FILESYSTEM</arg>
<arg>-availabilityFlag</arg>
<arg>NA</arg>
<arg>-overwrite</arg>
<arg>true</arg>
<arg>-ignoreErrors</arg>
<arg>false</arg>
<arg>-preservePermissions</arg>
<arg>false</arg>
Thanks,
Sowmya Ramesh
Re: Review Request 35468: FeedReplicator improvement to include more
DistCP options
Posted by Pallavi Rao <pa...@inmobi.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35468/#review88351
-----------------------------------------------------------
Ship it!
- Pallavi Rao
On June 16, 2015, 10:18 p.m., Sowmya Ramesh wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35468/
> -----------------------------------------------------------
>
> (Updated June 16, 2015, 10:18 p.m.)
>
>
> Review request for Falcon, Pallavi Rao, Srikanth Sundarrajan, and Venkat Ranganathan.
>
>
> Bugs: FALCON-668
> https://issues.apache.org/jira/browse/FALCON-668
>
>
> Repository: falcon-git
>
>
> Description
> -------
>
> FeedReplicator improvement to include the additional DistCp options listed below. Users are expected to pass these as custom properties in the feed defined for replication, and they will be propagated to the DistCp tool.
>
> * overwrite
> * ignore errors
> * skip checksum
> * remove deleted files
> * preserve block size
> * preserve replication count
> * preserve permissions
>
>
> Diffs
> -----
>
> common/src/main/java/org/apache/falcon/util/ReplicationDistCpOption.java PRE-CREATION
> docs/src/site/twiki/EntitySpecification.twiki 1ed2cb5
> oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java 49f9e07
> oozie/src/main/java/org/apache/falcon/oozie/feed/FSReplicationWorkflowBuilder.java 1d97204
> oozie/src/main/java/org/apache/falcon/oozie/feed/HCatReplicationWorkflowBuilder.java 72bbca4
> replication/src/main/java/org/apache/falcon/replication/FeedReplicator.java b2175b2
> replication/src/test/java/org/apache/falcon/replication/FeedReplicatorTest.java 539d00d
>
> Diff: https://reviews.apache.org/r/35468/diff/
>
>
> Testing
> -------
>
> Verified that the added custom properties were propagated to the FeedReplicator Java action and that the replication succeeded.
>
> Action configuration:
>
> <main-class>org.apache.falcon.replication.FeedReplicator</main-class>
> <arg>-Dfalcon.include.path=hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
> <arg>-Dmapred.job.queue.name=default</arg>
> <arg>-Dmapred.job.priority=NORMAL</arg>
> <arg>-maxMaps</arg>
> <arg>33</arg>
> <arg>-mapBandwidth</arg>
> <arg>2</arg>
> <arg>-sourcePaths</arg>
> <arg>hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
> <arg>-targetPath</arg>
> <arg>hdfs://sandbox.hortonworks.com:8020/user/ambari-qa/dr/test/backupCluster/feed/output/2015/06/08</arg>
> <arg>-falconFeedStorageType</arg>
> <arg>FILESYSTEM</arg>
> <arg>-availabilityFlag</arg>
> <arg>NA</arg>
> <arg>-overwrite</arg>
> <arg>true</arg>
> <arg>-ignoreErrors</arg>
> <arg>false</arg>
> <arg>-preservePermissions</arg>
> <arg>false</arg>
>
>
> Thanks,
>
> Sowmya Ramesh
>
>
Re: Review Request 35468: FeedReplicator improvement to include more
DistCP options
Posted by Sowmya Ramesh <sr...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35468/
-----------------------------------------------------------
(Updated June 16, 2015, 10:18 p.m.)
Review request for Falcon, Pallavi Rao, Srikanth Sundarrajan, and Venkat Ranganathan.
Changes
-------
Applied feedback!
Bugs: FALCON-668
https://issues.apache.org/jira/browse/FALCON-668
Repository: falcon-git
Description
-------
FeedReplicator improvement to include the additional DistCp options listed below. Users are expected to pass these as custom properties in the feed defined for replication, and they will be propagated to the DistCp tool.
* overwrite
* ignore errors
* skip checksum
* remove deleted files
* preserve block size
* preserve replication count
* preserve permissions
Diffs (updated)
-----
common/src/main/java/org/apache/falcon/util/ReplicationDistCpOption.java PRE-CREATION
docs/src/site/twiki/EntitySpecification.twiki 1ed2cb5
oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java 49f9e07
oozie/src/main/java/org/apache/falcon/oozie/feed/FSReplicationWorkflowBuilder.java 1d97204
oozie/src/main/java/org/apache/falcon/oozie/feed/HCatReplicationWorkflowBuilder.java 72bbca4
replication/src/main/java/org/apache/falcon/replication/FeedReplicator.java b2175b2
replication/src/test/java/org/apache/falcon/replication/FeedReplicatorTest.java 539d00d
Diff: https://reviews.apache.org/r/35468/diff/
Testing
-------
Verified that the added custom properties were propagated to the FeedReplicator Java action and that the replication succeeded.
Action configuration:
<main-class>org.apache.falcon.replication.FeedReplicator</main-class>
<arg>-Dfalcon.include.path=hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
<arg>-Dmapred.job.queue.name=default</arg>
<arg>-Dmapred.job.priority=NORMAL</arg>
<arg>-maxMaps</arg>
<arg>33</arg>
<arg>-mapBandwidth</arg>
<arg>2</arg>
<arg>-sourcePaths</arg>
<arg>hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
<arg>-targetPath</arg>
<arg>hdfs://sandbox.hortonworks.com:8020/user/ambari-qa/dr/test/backupCluster/feed/output/2015/06/08</arg>
<arg>-falconFeedStorageType</arg>
<arg>FILESYSTEM</arg>
<arg>-availabilityFlag</arg>
<arg>NA</arg>
<arg>-overwrite</arg>
<arg>true</arg>
<arg>-ignoreErrors</arg>
<arg>false</arg>
<arg>-preservePermissions</arg>
<arg>false</arg>
Thanks,
Sowmya Ramesh
Re: Review Request 35468: FeedReplicator improvement to include more
DistCP options
Posted by Pallavi Rao <pa...@inmobi.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35468/#review88025
-----------------------------------------------------------
docs/src/site/twiki/EntitySpecification.twiki (line 293)
<https://reviews.apache.org/r/35468/#comment140431>
The DistCp options do not exactly match the options you have listed above. For example, -pr in DistCp corresponds to preserveReplicationNumber in Falcon. Given that, I think the Falcon documentation should list the options under the names Falcon supports.
oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java (line 150)
<https://reviews.apache.org/r/35468/#comment140432>
Repetitive code. Maybe we can define the supported replication options as an enum and have a single for loop iterate over the enum values.
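A minimal sketch of the enum-and-loop idea suggested above, not the code in the actual patch. The class name ReplicationOptionsSketch, the method toArgs, and the enum constants are hypothetical. The option names overwrite, ignoreErrors, and preservePermissions come from the action configuration in the review request, and preserveReplicationNumber from the comment above; skipChecksum, removeDeletedFiles, and preserveBlockSize are assumed names for the remaining options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplicationOptionsSketch {

    /** Hypothetical enum of the replication options; one constant per option. */
    public enum ReplicationOption {
        OVERWRITE("overwrite"),
        IGNORE_ERRORS("ignoreErrors"),
        SKIP_CHECKSUM("skipChecksum"),                // assumed name
        REMOVE_DELETED_FILES("removeDeletedFiles"),   // assumed name
        PRESERVE_BLOCK_SIZE("preserveBlockSize"),     // assumed name
        PRESERVE_REPLICATION_NUMBER("preserveReplicationNumber"),
        PRESERVE_PERMISSIONS("preservePermissions");

        private final String name;

        ReplicationOption(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /**
     * Builds the argument list for the supported options with a single loop.
     * Properties that are not set on the feed are skipped, and properties
     * that are not replication options are ignored.
     */
    public static List<String> toArgs(Map<String, String> feedProperties) {
        List<String> args = new ArrayList<>();
        for (ReplicationOption option : ReplicationOption.values()) {
            String value = feedProperties.get(option.getName());
            if (value != null) {
                args.add("-" + option.getName());
                args.add(value);
            }
        }
        return args;
    }

    public static void main(String[] unused) {
        // Illustrative custom properties, mirroring the tested action configuration.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("overwrite", "true");
        props.put("ignoreErrors", "false");
        props.put("preservePermissions", "false");
        System.out.println(toArgs(props));
    }
}
```

With this shape, supporting a new option would only require a new enum constant, and the argument order follows the enum declaration order rather than the order in which the properties were defined.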
- Pallavi Rao
On June 15, 2015, 6:10 p.m., Sowmya Ramesh wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35468/
> -----------------------------------------------------------
>
> (Updated June 15, 2015, 6:10 p.m.)
>
>
> Review request for Falcon, Pallavi Rao, Srikanth Sundarrajan, and Venkat Ranganathan.
>
>
> Bugs: FALCON-668
> https://issues.apache.org/jira/browse/FALCON-668
>
>
> Repository: falcon-git
>
>
> Description
> -------
>
> FeedReplicator improvement to include the additional DistCp options listed below. Users are expected to pass these as custom properties in the feed defined for replication, and they will be propagated to the DistCp tool.
>
> * overwrite
> * ignore errors
> * skip checksum
> * remove deleted files
> * preserve block size
> * preserve replication count
> * preserve permissions
>
>
> Diffs
> -----
>
> common/src/main/java/org/apache/falcon/util/ReplicationConstants.java PRE-CREATION
> docs/src/site/twiki/EntitySpecification.twiki 1ed2cb5
> oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java 49f9e07
> oozie/src/main/java/org/apache/falcon/oozie/feed/FSReplicationWorkflowBuilder.java 1d97204
> oozie/src/main/java/org/apache/falcon/oozie/feed/HCatReplicationWorkflowBuilder.java 72bbca4
> replication/src/main/java/org/apache/falcon/replication/FeedReplicator.java b2175b2
> replication/src/test/java/org/apache/falcon/replication/FeedReplicatorTest.java 539d00d
>
> Diff: https://reviews.apache.org/r/35468/diff/
>
>
> Testing
> -------
>
> Verified that the added custom properties were propagated to the FeedReplicator Java action and that the replication succeeded.
>
> Action configuration:
>
> <main-class>org.apache.falcon.replication.FeedReplicator</main-class>
> <arg>-Dfalcon.include.path=hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
> <arg>-Dmapred.job.queue.name=default</arg>
> <arg>-Dmapred.job.priority=NORMAL</arg>
> <arg>-maxMaps</arg>
> <arg>33</arg>
> <arg>-mapBandwidth</arg>
> <arg>2</arg>
> <arg>-sourcePaths</arg>
> <arg>hftp://sandbox.hortonworks.com:50070/user/ambari-qa/dr/test/primaryCluster/feed/input/2015/06/08</arg>
> <arg>-targetPath</arg>
> <arg>hdfs://sandbox.hortonworks.com:8020/user/ambari-qa/dr/test/backupCluster/feed/output/2015/06/08</arg>
> <arg>-falconFeedStorageType</arg>
> <arg>FILESYSTEM</arg>
> <arg>-availabilityFlag</arg>
> <arg>NA</arg>
> <arg>-overwrite</arg>
> <arg>true</arg>
> <arg>-ignoreErrors</arg>
> <arg>false</arg>
> <arg>-preservePermissions</arg>
> <arg>false</arg>
>
>
> Thanks,
>
> Sowmya Ramesh
>
>