Posted to dev@crunch.apache.org by "Attila Sasvari (JIRA)" <ji...@apache.org> on 2017/03/02 20:42:45 UTC
[jira] [Commented] (CRUNCH-636) Make replication factor for temporary files configurable
[ https://issues.apache.org/jira/browse/CRUNCH-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892952#comment-15892952 ]
Attila Sasvari commented on CRUNCH-636:
---------------------------------------
Thanks a lot for the response. I will spend some more time on this and let you know if I come up with something reasonable; otherwise I will close this ticket as "Won't Fix".
> Make replication factor for temporary files configurable
> --------------------------------------------------------
>
> Key: CRUNCH-636
> URL: https://issues.apache.org/jira/browse/CRUNCH-636
> Project: Crunch
> Issue Type: New Feature
> Reporter: Attila Sasvari
> Assignee: Attila Sasvari
>
> As of now, Crunch does not allow different replication factors for temporary files and non-temporary files (e.g. the final output data of leaf nodes) at the same time. A user with a large amount of data to process (say, hundreds of gigabytes) might want a lower replication factor for the large temporary files written between Crunch jobs.
> We could make this configurable via a new setting (e.g. {{crunch.tmp.dir.replication}}).
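
To make the proposal above concrete, here is a minimal sketch of how such a setting might be resolved. It is not Crunch code: java.util.Properties stands in for a Hadoop Configuration, the class and method names are hypothetical, and {{crunch.tmp.dir.replication}} is only the key suggested in the description. The fallback to {{dfs.replication}} is an assumption about what a sensible default could look like.

```java
import java.util.Properties;

// Sketch only: Properties stands in for a Hadoop Configuration object.
class TempReplicationSketch {
    // Key proposed in the issue description; not an existing Crunch setting.
    static final String TMP_REPLICATION_KEY = "crunch.tmp.dir.replication";
    // Standard HDFS setting for the default block replication.
    static final String DFS_REPLICATION_KEY = "dfs.replication";
    // HDFS ships with a default replication factor of 3.
    static final int HDFS_DEFAULT_REPLICATION = 3;

    // Returns the replication factor to use for a file, depending on
    // whether it is a temporary file or final output.
    static int replicationFor(Properties conf, boolean temporary) {
        int defaultReplication =
            parsePositive(conf.getProperty(DFS_REPLICATION_KEY), HDFS_DEFAULT_REPLICATION);
        if (!temporary) {
            return defaultReplication;
        }
        // Assumed behaviour: temporary files fall back to the normal
        // replication factor when the new key is unset or invalid.
        return parsePositive(conf.getProperty(TMP_REPLICATION_KEY), defaultReplication);
    }

    // Parses a positive integer, returning the fallback for null,
    // non-numeric, or non-positive values.
    static int parsePositive(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DFS_REPLICATION_KEY, "3");
        conf.setProperty(TMP_REPLICATION_KEY, "1");
        System.out.println("temporary: " + replicationFor(conf, true));
        System.out.println("final output: " + replicationFor(conf, false));
    }
}
```

With this shape, users who do not set the new key would see no change in behaviour, while users with large intermediate data could lower the factor for temporary files only.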
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)